Platform Architecture
How UnGovr discovers, processes, and serves government information across 200+ countries.
System overview
The data pipeline that powers 321,000+ government entities
High-level data flow: sources are crawled and processed through the pipeline, producing Open Records Extracts (OREs) that are stored in structured databases and served through five application interfaces.
Technology Stack
Python, FastAPI, PostgreSQL, PostGIS, Playwright, GPU-accelerated OCR, Cloudflare, and more.
Request workflows
How requests flow through the system – from the public to the right government agency
Service Request
Resident reports an issue
Photo + description via SMS, RCS, web form, or email. Location is captured from GPS or entered manually.
AI classifies and routes
Issue type is extracted from the description. GPS coordinates are mapped to the correct jurisdiction using the entity graph.
Agency identified
The entity graph resolves overlapping jurisdictions – city vs. county vs. special district – to find the responsible agency.
Request formatted and sent
The request is formatted per the agency's intake method (Open311 API, email, or web form) and submitted.
Confirmation and tracking
Resident receives a confirmation with a tracking reference. Status updates are relayed as they arrive.
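The formatting step above can be sketched as a dispatch on the agency's intake method. This is an illustration only, not UnGovr's actual code: `ServiceRequest`, `format_for_agency`, and the email and web-form payload shapes are hypothetical. The Open311 field names (`service_code`, `lat`, `long`, `description`) follow the GeoReport v2 specification.

```python
from dataclasses import dataclass


@dataclass
class ServiceRequest:
    # Hypothetical shape of a classified request; field names are illustrative.
    issue_type: str
    description: str
    lat: float
    lon: float


def format_for_agency(req: ServiceRequest, intake_method: str) -> dict:
    """Return a payload shaped for the agency's intake channel."""
    if intake_method == "open311":
        # Field names per Open311 GeoReport v2 (POST Service Request).
        return {
            "service_code": req.issue_type,
            "lat": req.lat,
            "long": req.lon,
            "description": req.description,
        }
    if intake_method == "email":
        # Assumed layout for an email intake; real agencies vary.
        return {
            "subject": f"Service request: {req.issue_type}",
            "body": f"{req.description}\n\nLocation: {req.lat:.5f}, {req.lon:.5f}",
        }
    if intake_method == "web_form":
        # Assumed: a later browser-automation step fills these form fields.
        return {"fields": {"category": req.issue_type, "details": req.description}}
    raise ValueError(f"unknown intake method: {intake_method}")
```

Keeping the per-channel shapes in one function means a new intake method only needs a new branch; the classification and routing steps stay unchanged.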
Open Records Request
Resident describes what they need
Plain-language description of the records sought, plus a location or address. No need to know which agency holds the records.
Jurisdiction and law resolved
The location is mapped to the correct jurisdiction using the entity graph. UnGovr then identifies which open records law applies – federal FOIA, state open records act, or international FOI law – across 140+ countries and 57 US jurisdictions.
Request formatted to requirements
Each law has different requirements: response deadlines (3–30+ days), fee structures, residency rules, and appeal processes. The request is drafted to comply.
Submitted to the records officer
Sent via the agency's designated channel – email to the records officer, online portal submission, or postal mail where required.
Deadline tracking and follow-up
UnGovr tracks the statutory response deadline and notifies the requester. If the deadline passes, guidance on appeals is provided.
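Deadline tracking of this kind can be sketched as date arithmetic over a per-law time limit. The function names and the example limits below are hypothetical and are not taken from any specific statute. The sketch also ignores public holidays, which a real deadline calculation would have to handle per jurisdiction.

```python
from datetime import date, timedelta


def add_business_days(start: date, days: int) -> date:
    """Advance `days` weekdays from `start`, skipping Saturdays and Sundays."""
    current = start
    remaining = days
    while remaining > 0:
        current += timedelta(days=1)
        if current.weekday() < 5:  # Monday=0 ... Friday=4
            remaining -= 1
    return current


def response_deadline(submitted: date, limit_days: int, business_days: bool) -> date:
    """Compute the response due date under a calendar-day or business-day rule."""
    if business_days:
        return add_business_days(submitted, limit_days)
    return submitted + timedelta(days=limit_days)


def is_overdue(deadline: date, today: date) -> bool:
    """True once the deadline has passed, which is when appeal guidance would apply."""
    return today > deadline
```

For example, a request submitted on Friday 1 March 2024 under an illustrative five-business-day rule would fall due on Friday 8 March 2024, while a ten-calendar-day rule would give 11 March 2024.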
Under the hood
Entity registry
A comprehensive database of 321,000+ government entities across 200+ countries – from US cities and counties to French communes, Japanese prefectures, and Indigenous nations. Each entity is mapped to its geographic boundaries and connected to its domains, documents, and services.
Web crawler
Our crawler (UnGovrBot) discovers and indexes open records from government websites. We use a three-tier fetch engine – transparent HTTP, stealth browser, and hardened browser – to handle everything from simple sites to those with aggressive bot protection. We respect robots.txt and provide full documentation for webmasters.
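A tiered fetch with a robots.txt check might look roughly like the sketch below. It assumes each tier is a callable that returns the page body, or `None` when it is blocked. The tier names, `fetch_with_escalation`, and `allowed_by_robots` are hypothetical, and the real fetchers (HTTP client, Playwright browsers) are left as stand-ins. The robots.txt check uses Python's standard `urllib.robotparser`.

```python
from typing import Callable, Optional, Sequence
from urllib.robotparser import RobotFileParser

# A fetcher returns the page body, or None if the tier was blocked.
Fetcher = Callable[[str], Optional[str]]


def allowed_by_robots(robots_txt: str, user_agent: str, url: str) -> bool:
    """Check a URL against already-downloaded robots.txt content."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


def fetch_with_escalation(
    url: str, tiers: Sequence[tuple[str, Fetcher]]
) -> tuple[str, str]:
    """Try each tier in order and return (tier_name, body) from the first success."""
    for name, fetch in tiers:
        body = fetch(url)
        if body is not None:
            return name, body
    raise RuntimeError(f"all fetch tiers failed for {url}")
```

Ordering the tiers from cheapest to most expensive means a browser is only launched for sites where plain HTTP fails, and the robots.txt check runs before any tier is tried.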
Document processing and Open Records Extracts
Documents are processed to produce Open Records Extracts (OREs) – the raw text mined from open records responses. OREs are deduplicated, chunked, and vectorized for loading into a RAG-style knowledge system. Each entity gets its own isolated knowledge base, which can be queried individually (e.g. a single city) or merged into larger groupings (e.g. all entities in Ventura County). For PDFs, we use GPU-accelerated OCR where needed.
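The deduplication and chunking steps can be illustrated with a short sketch. It assumes exact-duplicate detection by hashing whitespace-normalized text and fixed-size word windows with overlap; the actual ORE pipeline may use different strategies. Both function names are hypothetical, and vectorization is omitted.

```python
import hashlib


def dedupe(texts: list[str]) -> list[str]:
    """Drop texts that match an earlier one after case and whitespace normalization."""
    seen: set[str] = set()
    unique = []
    for text in texts:
        normalized = " ".join(text.split()).lower()
        digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(text)
    return unique


def chunk_words(text: str, size: int, overlap: int) -> list[str]:
    """Split text into windows of `size` words that share `overlap` words."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last window already reached the end of the text
    return chunks
```

Overlapping windows keep a sentence that straddles a chunk boundary retrievable from at least one chunk, at the cost of some duplicated storage.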
Geographic services
Given a location, we identify all the government entities that serve that area – handling complex overlapping jurisdictions where a single address might fall under a city, county, school district, water district, and fire district simultaneously. This powers entity discovery, service request routing, and open records request targeting.
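The overlapping-jurisdiction lookup amounts to testing one point against many boundaries and keeping every match. The sketch below uses a pure-Python ray-casting test on simple polygons so that it is self-contained. In a PostGIS-backed system the same test would more likely be a spatial query such as `ST_Contains`. The `Entity` class, `entities_serving`, and the toy coordinates are hypothetical.

```python
from dataclasses import dataclass

Point = tuple[float, float]  # (x, y), e.g. (longitude, latitude)


@dataclass
class Entity:
    # Hypothetical record: one government entity with a simple polygon boundary.
    name: str
    kind: str
    boundary: list[Point]


def point_in_polygon(point: Point, polygon: list[Point]) -> bool:
    """Ray casting: count how many edges a ray going right from the point crosses."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge spans the ray's y level
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def entities_serving(point: Point, entities: list[Entity]) -> list[Entity]:
    """Return every entity whose boundary contains the point."""
    return [e for e in entities if point_in_polygon(point, e.boundary)]
```

Returning all matches rather than the first one is what lets a single address resolve to a city, a county, and several special districts at once; choosing the responsible agency for a given issue type is a separate step.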
Messaging relay
UnGovr relays service requests and records requests to government agencies via SMS, RCS, and email on behalf of residents. The relay acts as an intermediary – the resident's personal phone number, email address, and identity are never shared with the agency unless approved by the resident. Government sees the request, not the requester. This protects anonymity while still delivering properly formatted, trackable requests.
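One way to picture the relay is as an alias table: each outbound request gets an opaque reply address, and only the relay can map that address back to the resident. The sketch below is an assumption about how such a relay could work, not a description of UnGovr's implementation. The `Relay` class, the `relay.example` domain, and the in-memory dictionary are hypothetical; a real relay would need persistent, access-controlled storage.

```python
import secrets


class Relay:
    """Minimal in-memory alias relay (illustrative only)."""

    def __init__(self) -> None:
        self._contacts: dict[str, str] = {}  # alias -> resident contact

    def outbound(self, requester_contact: str, message: str) -> dict:
        """Build what the agency sees: the message plus an opaque reply address."""
        alias = secrets.token_urlsafe(8)
        self._contacts[alias] = requester_contact
        return {"reply_to": f"relay+{alias}@relay.example", "body": message}

    def inbound(self, reply_to: str) -> str:
        """Resolve an agency reply back to the resident's real contact."""
        alias = reply_to.split("+", 1)[1].split("@", 1)[0]
        return self._contacts[alias]
```

Because the alias is random rather than derived from the contact details, the agency cannot reverse it to identify the requester.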
Open records engine
A database of open records laws across 140+ countries and 57 US jurisdictions, including response deadlines, fee structures, residency requirements, and appeal processes. This powers the open records request workflow and the public reference pages.
Design principles
Traceable sources
Every piece of data links back to its original source. We store source URLs, capture dates, and provenance information so users can verify information at its official source.
Incremental processing
We detect changes and reprocess only what's needed. This keeps our data current while minimizing load on source websites.
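A common way to detect changes is to compare a content hash against the one stored from the previous crawl. The sketch below assumes that approach; the function names and the in-memory `seen` mapping are hypothetical. Standard HTTP conditional requests (`ETag` and `If-None-Match`) can reduce load further by avoiding the download entirely, though whether UnGovr uses them is not stated here.

```python
import hashlib


def content_hash(body: bytes) -> str:
    """Stable fingerprint of a fetched document."""
    return hashlib.sha256(body).hexdigest()


def needs_reprocessing(url: str, body: bytes, seen: dict[str, str]) -> bool:
    """Return True if the document is new or changed, and record its latest hash."""
    digest = content_hash(body)
    if seen.get(url) == digest:
        return False  # unchanged since last crawl: skip OCR, chunking, and so on
    seen[url] = digest
    return True
```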
Structured extraction
Beyond full-text search, we extract structured information where possible – meeting dates, agenda items, findings and recommendations, response deadlines.
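As a small illustration of structured extraction, the sketch below pulls English long-form dates such as "March 12, 2024" out of free text. It is deliberately narrow: it handles a single date format, the function name is hypothetical, and real extraction of agenda items or findings would need considerably more than a regular expression.

```python
import re
from datetime import date

MONTHS = {
    name: number
    for number, name in enumerate(
        ["January", "February", "March", "April", "May", "June", "July",
         "August", "September", "October", "November", "December"],
        start=1,
    )
}

# Matches dates written like "March 12, 2024".
DATE_PATTERN = re.compile(r"\b(" + "|".join(MONTHS) + r") (\d{1,2}), (\d{4})\b")


def extract_dates(text: str) -> list[date]:
    """Return every valid long-form date in the text, in order of appearance."""
    dates = []
    for month, day, year in DATE_PATTERN.findall(text):
        try:
            dates.append(date(int(year), MONTHS[month], int(day)))
        except ValueError:
            continue  # skip impossible dates such as "February 30, 2024"
    return dates
```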
Privacy by design
User data is minimized and protected. We don't track users across the web or build profiles for advertising purposes.