Web Crawler

Discovering and indexing open records from government websites.

Our Promise

The UnGovr crawler is designed to be a good citizen of the web. We:

  • Identify ourselves clearly with a descriptive User-Agent string
  • Respect robots.txt path restrictions
  • Honor Crawl-delay directives in robots.txt when specified
  • Limit request rates to avoid overwhelming servers
  • Provide contact information for site administrators

If you manage a government website and have questions about our crawler, please contact us at crawl@ungovr.org.

What We Crawl

We focus on publicly accessible government documents:

  • Meeting agendas and minutes
  • Budget documents and financial reports
  • Policy documents and ordinances
  • Public notices and announcements
  • Reports and studies

We do not crawl:

  • Login-protected or authenticated content
  • Personal information or private records
  • Non-government websites

Technical Details

User-Agent

UnGovrBot/1.0 (+https://ungovr.org/crawler)
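
Site administrators who want to check how their robots.txt rules apply to this User-Agent can do so with a short script. The following is a minimal sketch using Python's standard urllib.robotparser module. The robots.txt contents and the example.gov URLs are hypothetical, and the script illustrates standard robots.txt matching rather than our crawler's internal implementation.

```python
from urllib.robotparser import RobotFileParser

USER_AGENT = "UnGovrBot/1.0 (+https://ungovr.org/crawler)"

# Hypothetical robots.txt, for illustration only.
SAMPLE_ROBOTS_TXT = """\
User-agent: UnGovrBot
Crawl-delay: 5
Disallow: /internal/

User-agent: *
Disallow: /
"""


def load_rules(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text into a rule set that can be queried."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


rules = load_rules(SAMPLE_ROBOTS_TXT)

# Is a given path open to the crawler? (example.gov is a placeholder host.)
allowed = rules.can_fetch(USER_AGENT, "https://example.gov/meetings/agenda.pdf")
blocked = rules.can_fetch(USER_AGENT, "https://example.gov/internal/draft.pdf")

# Crawl-delay that applies to the crawler, or None if no delay is set.
delay = rules.crawl_delay(USER_AGENT)

print(allowed, blocked, delay)
```

To test a live site, the parser's set_url() and read() methods can load the real robots.txt in place of the sample string.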

Default Behavior

  • At most 1 request per second per domain, unless a robots.txt Crawl-delay directive specifies otherwise
  • Respects robots.txt path restrictions
  • Only follows links within the same domain
  • Verifies external domains before crawling
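
The rate limit and same-domain rule above can be sketched as follows. This is an illustrative sketch, not our production code: the names same_domain and DomainThrottle are hypothetical, "same domain" is assumed to mean an identical hostname (so subdomains count as separate hosts), and a Crawl-delay value is assumed to replace the 1-second default when present.

```python
import time
from urllib.parse import urljoin, urlparse

DEFAULT_DELAY_SECONDS = 1.0  # default of 1 request per second per domain


def same_domain(base_url: str, link: str) -> bool:
    """Return True if link (possibly relative) resolves to base_url's host.

    Assumption: "same domain" means an identical hostname.
    """
    resolved = urljoin(base_url, link)
    return urlparse(resolved).hostname == urlparse(base_url).hostname


class DomainThrottle:
    """Tracks, per hostname, the earliest time the next request may start."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._next_allowed = {}  # hostname -> earliest start time

    def wait_time(self, url: str, crawl_delay=None) -> float:
        """Reserve the next request slot for url's host.

        Returns how many seconds the caller should sleep before sending.
        Assumption: crawl_delay, when given, overrides the default delay.
        """
        host = urlparse(url).hostname
        delay = DEFAULT_DELAY_SECONDS if crawl_delay is None else float(crawl_delay)
        now = self._clock()
        start = max(now, self._next_allowed.get(host, now))
        self._next_allowed[host] = start + delay
        return start - now


throttle = DomainThrottle()
for url in ["https://example.gov/a", "https://example.gov/b"]:
    pause = throttle.wait_time(url)
    time.sleep(pause)  # the second request waits about one second
    # ... the page would be fetched here ...
```

Keeping the clock injectable makes the throttle easy to check without real sleeping, and tracking state per hostname means a slow site never delays requests to a different one.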

Working With Us

If you'd prefer that we access your data differently, we're happy to work with you. Options include:

  • Providing data feeds (JSON, XML, RSS) for us to use instead of crawling
  • Scheduling crawls during off-peak hours
  • Setting up specific crawl rules for your site

Contact us at crawl@ungovr.org to discuss options.