Crawler Policy

How VeltimaBot discovers and catalogues public e-commerce websites — what we respect, what we never touch, and how site owners can opt out.

Last updated: June 13, 2026. Questions or opt-out requests: crawler@veltima.app.

User-Agent

Our crawler identifies itself as:

VeltimaBot/1.0 (+https://veltima.app/crawler-policy)

This User-Agent string is stable. If you see a different string claiming to be us, it is not us — please email crawler@veltima.app with the source IP so we can investigate.

robots.txt is honored

We fetch and parse robots.txt before every crawl and respect every Disallow and Crawl-delay directive addressed to VeltimaBot, to the wildcard *, or to our fallback identifiers. A site that blocks all crawlers is not crawled again until its robots.txt allows access.
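For site owners who want to check how a given robots.txt would be read, here is a minimal sketch using Python's standard urllib.robotparser. It is an illustration only, not VeltimaBot's actual parser; the sample rules and the shop.example URLs are made up for the example.

```python
from urllib.robotparser import RobotFileParser

USER_AGENT = "VeltimaBot"

# Hypothetical robots.txt: VeltimaBot gets its own group, everyone else is blocked.
SAMPLE = """\
User-agent: VeltimaBot
Disallow: /checkout/
Crawl-delay: 20

User-agent: *
Disallow: /
"""


def load_rules(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


rules = load_rules(SAMPLE)

# The VeltimaBot group takes precedence over the wildcard group.
allowed_product = rules.can_fetch(USER_AGENT, "https://shop.example/products/widget")
allowed_checkout = rules.can_fetch(USER_AGENT, "https://shop.example/checkout/cart")
delay = rules.crawl_delay(USER_AGENT)

# A crawler without its own group falls back to the wildcard rules.
other_allowed = rules.can_fetch("OtherBot", "https://shop.example/products/widget")

print(allowed_product, allowed_checkout, delay, other_allowed)
```

With these sample rules, product pages stay crawlable for VeltimaBot, the checkout path is blocked, and the requested delay of 20 seconds is stricter than our 10-second default.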

Per-host rate limits

We never hammer a single site. Defaults:

  • Maximum 1 request every 10 seconds per host
  • Maximum 1 request every 3 seconds per shared-hosting group (IP umbrella)
  • Automatic back-off on 429 Too Many Requests and 503 Service Unavailable, obeying Retry-After headers

We treat signs of server stress as a temporary opt-out: we pause the crawl and retry later with more conservative pacing.
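The pacing rules above can be sketched roughly as follows. This is an illustration under stated assumptions, not our production scheduler: the Pacer class, the helper names, and the doubling back-off schedule are hypothetical. Only the 10-second and 3-second defaults and the handling of 429, 503, and Retry-After come from this policy.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

HOST_INTERVAL = 10.0   # seconds between requests to one host (policy default)
GROUP_INTERVAL = 3.0   # seconds between requests to one shared-hosting group (policy default)


class Pacer:
    """Tracks the earliest time each host and hosting group may be hit again."""

    def __init__(self):
        self._next_ok = {}

    def wait_time(self, host, group, now):
        """Seconds to wait before requesting `host`; the stricter limit wins."""
        host_wait = self._next_ok.get(("host", host), 0.0) - now
        group_wait = self._next_ok.get(("group", group), 0.0) - now
        return max(0.0, host_wait, group_wait)

    def record(self, host, group, now):
        """Note that a request was sent at time `now`."""
        self._next_ok[("host", host)] = now + HOST_INTERVAL
        self._next_ok[("group", group)] = now + GROUP_INTERVAL


def retry_after_seconds(value, now=None):
    """Parse a Retry-After header given as delta-seconds or as an HTTP-date."""
    if not value:
        return None
    value = value.strip()
    if value.isdigit():
        return float(value)
    try:
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return None  # unparseable header: caller falls back to its own schedule
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return max(0.0, (when - now).total_seconds())


def backoff_delay(status, retry_after, attempt):
    """Delay before the next request to a host, given the last response."""
    if status not in (429, 503):
        return HOST_INTERVAL
    # Assumption: double the base interval per failed attempt, and never
    # wait less than the server asked for via Retry-After.
    doubled = HOST_INTERVAL * (2 ** attempt)
    hinted = retry_after_seconds(retry_after) or 0.0
    return max(doubled, hinted)
```

In this sketch, two stores in the same hosting group can be fetched 3 seconds apart, while a second request to the same store waits the full 10 seconds.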

Scope of what we fetch

VeltimaBot fetches public HTML pages only:

  • Homepages and linked product / collection / blog pages
  • No login walls — we never submit forms or authenticate
  • No JavaScript execution — we read what the server returns
  • No private endpoints, admin panels, or checkout flows

Everything we index is information the site already publishes to any visitor.

Caching and recrawl cadence

We respect ETag and Last-Modified headers and use conditional requests where supported. The recrawl interval ranges from 30 to 90 days per site, weighted by how often the site's signals change. Sites with fast-moving signals (pricing, availability) are rechecked toward the 30-day end of that range; sites with stable metadata are rechecked toward the 90-day end.
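As a rough illustration of the paragraph above, the sketch below builds the headers for a conditional request and maps a change rate to a recrawl interval. The function names, the change_rate score between 0 and 1, and the linear mapping are assumptions made for the example; the policy only commits to the 30 to 90 day range.

```python
from typing import Dict, Optional

USER_AGENT = "VeltimaBot/1.0 (+https://veltima.app/crawler-policy)"
MIN_DAYS, MAX_DAYS = 30, 90  # recrawl interval bounds from this policy


def conditional_headers(etag: Optional[str], last_modified: Optional[str]) -> Dict[str, str]:
    """Build request headers that let the server answer 304 Not Modified."""
    headers = {"User-Agent": USER_AGENT}
    if etag:
        headers["If-None-Match"] = etag              # validator from a previous ETag
    if last_modified:
        headers["If-Modified-Since"] = last_modified  # value of a previous Last-Modified
    return headers


def page_changed(status: int) -> bool:
    """A 304 response means the cached copy is still current."""
    return status != 304


def next_recrawl_days(change_rate: float) -> int:
    """Map a hypothetical change rate in [0, 1] onto the 30 to 90 day range.

    Assumption: linear interpolation, where 1.0 (changes constantly)
    gives 30 days and 0.0 (never changes) gives 90 days.
    """
    rate = min(1.0, max(0.0, change_rate))
    return round(MAX_DAYS - (MAX_DAYS - MIN_DAYS) * rate)
```

A server that answers such a request with 304 sends no body, which saves bandwidth on both sides.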

Opting out

If you do not want your store in our index, email crawler@veltima.app with the host name. We remove the store from the index within 7 business days and add the host to a permanent do-not-crawl list. No justification required.

Alternatively, add this to your robots.txt:

User-agent: VeltimaBot
Disallow: /

VeltimaBot will see the block on its next visit and remove the site from the crawl queue automatically.

Contact

Questions, abuse reports, or security concerns: crawler@veltima.app. We respond within 3 business days.

Built for responsible data use

Veltima is an e-commerce intelligence platform for B2B sales teams — transparent sourcing, clear opt-out, no dark-pattern extraction.
