Web Crawler
Discovering and indexing open records from government websites.
Our Promise
The UnGovr crawler is designed to be a good citizen of the web. We:
- Identify ourselves clearly with a descriptive User-Agent string
- Respect robots.txt path restrictions
- Honor Crawl-delay directives in robots.txt when specified
- Limit request rates to avoid overwhelming servers
- Provide contact information for site administrators
If you manage a government website and have questions about our crawler, please contact us at crawl@ungovr.org.
What We Crawl
We focus on publicly accessible government documents:
- Meeting agendas and minutes
- Budget documents and financial reports
- Policy documents and ordinances
- Public notices and announcements
- Reports and studies
We do not crawl:
- Login-protected or authenticated content
- Personal information or private records
- Non-government websites
Technical Details
User-Agent
UnGovrBot/1.0 (+https://ungovr.org/crawler)
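For site administrators who want to preview how rules aimed at this User-Agent might be interpreted, the following is a minimal sketch using Python's standard `urllib.robotparser`. It is not the crawler's actual implementation. The sample robots.txt, the `example.gov` URLs, and the `load_rules` helper are hypothetical, and the sketch assumes the crawler matches on the `UnGovrBot` token taken from the User-Agent string above.

```python
from urllib.robotparser import RobotFileParser

# User-Agent string as published on this page.
USER_AGENT = "UnGovrBot/1.0 (+https://ungovr.org/crawler)"

# Hypothetical robots.txt that a site administrator might publish.
SAMPLE_ROBOTS_TXT = """\
User-agent: UnGovrBot
Crawl-delay: 5
Disallow: /internal/

User-agent: *
Disallow: /drafts/
"""


def load_rules(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text into a standard-library rule set (hypothetical helper)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


rules = load_rules(SAMPLE_ROBOTS_TXT)

# Path restrictions: check each URL against the group that names UnGovrBot.
agenda_allowed = rules.can_fetch(USER_AGENT, "https://example.gov/meetings/agenda.pdf")
internal_allowed = rules.can_fetch(USER_AGENT, "https://example.gov/internal/memo.pdf")

# Crawl-delay: seconds requested between requests, or None if not specified.
delay_seconds = rules.crawl_delay(USER_AGENT)
```

In this sketch, a path under `/internal/` is reported as disallowed for the bot, other paths are allowed, and the requested delay is read from the `Crawl-delay` line in the same group.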
Default Behavior
- Maximum of 1 request per second per domain (unless a Crawl-delay directive specifies otherwise)
- Respects robots.txt path restrictions
- Only follows links within the same domain
- Verifies external domains before crawling
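The rate limit and same-domain rule above can be illustrated with a short sketch. This is one possible way to express that behavior, not a description of the crawler's internals. The names `DomainThrottle` and `same_domain_links` are hypothetical, and the sketch assumes "same domain" means an exact hostname match, which may differ from the crawler's actual rule.

```python
import time
from urllib.parse import urljoin, urlparse

# Default from the list above: at most 1 request per second per domain.
DEFAULT_DELAY_SECONDS = 1.0


class DomainThrottle:
    """Tracks the earliest time each domain may be requested again (hypothetical)."""

    def __init__(self, default_delay=DEFAULT_DELAY_SECONDS, clock=time.monotonic):
        self.default_delay = default_delay
        self.clock = clock
        self.next_allowed = {}  # domain -> earliest time of the next request

    def wait_time(self, url, crawl_delay=None):
        """Return how many seconds to wait before requesting url, and reserve the slot."""
        domain = urlparse(url).netloc.lower()
        # A Crawl-delay value, when given, replaces the default spacing.
        delay = crawl_delay if crawl_delay is not None else self.default_delay
        now = self.clock()
        ready_at = max(now, self.next_allowed.get(domain, now))
        self.next_allowed[domain] = ready_at + delay
        return ready_at - now


def same_domain_links(page_url, hrefs):
    """Resolve hrefs against page_url and keep only links on the same host."""
    host = urlparse(page_url).netloc.lower()
    resolved = (urljoin(page_url, href) for href in hrefs)
    return [link for link in resolved if urlparse(link).netloc.lower() == host]
```

A caller would sleep for the value returned by `wait_time` before each request, and pass the site's Crawl-delay value as `crawl_delay` when robots.txt specifies one.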
Working With Us
If you'd prefer we access your data differently, we're happy to work with you:
- Provide data feeds (JSON, XML, RSS) instead of crawling
- Schedule crawls during off-peak hours
- Set up specific crawl rules for your site
Contact us at crawl@ungovr.org to discuss options.