Crawler Policy
How VeltimaBot discovers and catalogues public e-commerce websites — what we respect, what we never touch, and how site owners can opt out.
Last updated: June 13, 2026. Questions or opt-out requests: crawler@veltima.app.
User-Agent
Our crawler identifies itself as:
VeltimaBot/1.0 (+https://veltima.app/crawler-policy)
This User-Agent string is stable. If you see a different string claiming to be us, it is not us — please email crawler@veltima.app with the source IP so we can investigate.
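For site owners who want to spot look-alike strings in their access logs, a minimal sketch follows. The function name `classify_user_agent` and the three labels are illustrative assumptions, not part of any Veltima tooling. An exact string match only shows that a request declares itself as VeltimaBot. Because User-Agent headers can be spoofed, it does not prove the request's origin.

```python
# Hypothetical helper for site owners; not an official Veltima tool.
OFFICIAL_UA = "VeltimaBot/1.0 (+https://veltima.app/crawler-policy)"


def classify_user_agent(ua: str) -> str:
    """Label a User-Agent header value from an access log.

    "official"   - exactly the documented string
    "suspicious" - mentions VeltimaBot but differs from the documented string
    "other"      - unrelated to VeltimaBot
    """
    ua = ua.strip()
    if ua == OFFICIAL_UA:
        return "official"
    if "veltimabot" in ua.lower():
        return "suspicious"
    return "other"
```

Requests labelled "suspicious" are the ones worth reporting together with their source IP.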
robots.txt is honored
We fetch and parse robots.txt before every crawl. Every Disallow and Crawl-delay directive targeting VeltimaBot, *, or our fallback identifiers is respected. A site that blocks all crawlers is not crawled again until its robots.txt allows access.
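As an illustration of how such a check can work, the sketch below uses Python's standard `urllib.robotparser`. It is not VeltimaBot's actual implementation. The helper names (`build_parser`, `may_fetch`, `crawl_delay`) are assumptions, and the fallback identifiers mentioned above are not modelled because they are not listed here.

```python
from typing import Optional
from urllib.robotparser import RobotFileParser

USER_AGENT = "VeltimaBot"  # product token from the documented User-Agent


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt content that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def may_fetch(parser: RobotFileParser, url: str) -> bool:
    """True if the rules for VeltimaBot (or the * group) allow this URL."""
    return parser.can_fetch(USER_AGENT, url)


def crawl_delay(parser: RobotFileParser) -> Optional[int]:
    """Crawl-delay in seconds for VeltimaBot, or None if none is declared."""
    return parser.crawl_delay(USER_AGENT)
```

With a robots.txt that disallows /private/ for VeltimaBot and sets Crawl-delay: 20, `may_fetch` would return False for URLs under /private/ and `crawl_delay` would return 20.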
Per-host rate limits
We never hammer a single site. Defaults:
- Maximum 1 request every 10 seconds per host
- Maximum 1 request every 3 seconds per shared-hosting group (IP umbrella)
- Automatic back-off on 429 Too Many Requests and 503 Service Unavailable, obeying Retry-After headers
A server that shows signs of stress is treated like a high-priority opt-out: we pause the crawl and retry later with more conservative pacing.
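The pacing rules above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the production scheduler. The `Pacer` class, the `retry_after_seconds` helper, and the 60-second default fallback are all hypothetical. The two interval constants come from the defaults listed above.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

HOST_INTERVAL = 10.0   # seconds between requests to one host
GROUP_INTERVAL = 3.0   # seconds between requests to one shared-hosting group


class Pacer:
    """Tracks the earliest time each host and hosting group may be hit again."""

    def __init__(self) -> None:
        self._next_host: dict = {}
        self._next_group: dict = {}

    def wait_time(self, host: str, group: str, now: float) -> float:
        """Seconds to wait before requesting `host` in hosting group `group`."""
        ready = max(self._next_host.get(host, 0.0),
                    self._next_group.get(group, 0.0))
        return max(0.0, ready - now)

    def record_request(self, host: str, group: str, now: float) -> None:
        """Register a request sent at time `now`."""
        self._next_host[host] = now + HOST_INTERVAL
        self._next_group[group] = now + GROUP_INTERVAL

    def back_off(self, host: str, now: float, delay: float) -> None:
        """Push the host's next slot out after a 429 or 503 response."""
        self._next_host[host] = max(self._next_host.get(host, 0.0), now + delay)


def retry_after_seconds(value: str, now: datetime, default: float = 60.0) -> float:
    """Interpret a Retry-After value: delta-seconds or an HTTP-date."""
    value = value.strip()
    if value.isdigit():
        return float(value)
    try:
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return default  # unparseable header: fall back to an assumed default
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    return max(0.0, (when - now).total_seconds())
```

Passing the clock in as `now` keeps the sketch deterministic and easy to test. A real crawler would read a monotonic clock for pacing and the current UTC time for HTTP-date comparisons.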
Scope of what we fetch
VeltimaBot fetches public HTML pages only:
- Homepages and linked product / collection / blog pages
- No login walls — we never submit forms or authenticate
- No JavaScript execution — we read what the server returns
- No private endpoints, admin panels, or checkout flows
Everything we index is information the site already publishes to any visitor.
Caching and recrawl cadence
We respect ETag and Last-Modified headers and use conditional requests where supported. Recrawl frequency ranges from 30 to 90 days per site, weighted by how often its signals change. Fast-moving signals (pricing, availability) are rechecked more often; stable metadata is rechecked rarely.
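For readers unfamiliar with conditional requests, here is a minimal sketch of the mechanism. The function names are hypothetical and this is not taken from VeltimaBot's code. A crawler stores the ETag and Last-Modified values from an earlier response and sends them back as If-None-Match and If-Modified-Since. A 304 Not Modified reply means the cached copy is still current and no body is transferred.

```python
from typing import Dict, Optional

USER_AGENT = "VeltimaBot/1.0 (+https://veltima.app/crawler-policy)"


def conditional_headers(etag: Optional[str],
                        last_modified: Optional[str]) -> Dict[str, str]:
    """Build request headers from validators saved on a previous fetch."""
    headers = {"User-Agent": USER_AGENT}
    if etag:
        headers["If-None-Match"] = etag            # echo the stored ETag
    if last_modified:
        headers["If-Modified-Since"] = last_modified  # echo Last-Modified
    return headers


def is_unchanged(status_code: int) -> bool:
    """304 Not Modified: the server confirms the cached copy is current."""
    return status_code == 304
```

When neither validator is stored, the function returns only the User-Agent header and the request becomes an ordinary unconditional GET.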
Opting out
If you do not want your store in our index, email crawler@veltima.app with the host name. We remove the store from the index within 7 business days and add the host to a permanent do-not-crawl list. No justification required.
Alternatively, add this to your robots.txt:
User-agent: VeltimaBot
Disallow: /
VeltimaBot will see the block on its next visit and remove the site from the crawl queue automatically.
Contact
Questions, abuse reports, or security concerns: crawler@veltima.app. We respond within 3 business days.
Built for responsible data use
Veltima is an e-commerce intelligence platform for B2B sales teams — transparent sourcing, clear opt-out, no dark-pattern extraction.