TXLookup · Civic-data agent · v0.1
Texas civic data, in plain English.
6,061 Texas datasets indexed across 6 open-data portals — Austin, Dallas, San Antonio, Houston, TX state. 11 are deeply curated (full schema, locally mirrored). The rest are answered live: an agent reads catalog metadata, plans a query, runs it on the source-of-truth portal. A smart layer over public data — every claim citable, every step replayable.
The motivation
Texas publishes everything.
Hard to navigate, until now.
The state and its cities run six open-data portals. Together they expose 6,061 datasets covering permits, inspections, 311 calls, code violations, traffic fatalities, franchise tax, contracts, library checkouts — millions of rows refreshed daily. All of it public. To use it directly, you have to hand-write queries against six different APIs.
Six portals
6
different APIs
Austin runs Socrata. San Antonio runs CKAN. Houston runs CKAN. Dallas runs Socrata. Different IDs, different conventions, different filters.
Schema drift
180+
columns just for permits
Each dataset has its own column names, types, code values. permittype vs work_class vs permit_class_mapped — same idea, three columns, three meanings.
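One way to tame that drift is a synonym map that collapses the variant column names to a single canonical key. A minimal Python sketch — the synonym set and the canonical `permit_type` key are illustrative assumptions, not the product's actual mapping:

```python
# Hypothetical synonym set: three portals, three names for the same idea.
PERMIT_TYPE_COLUMNS = {"permittype", "work_class", "permit_class_mapped"}

def normalize_row(row: dict) -> dict:
    """Copy a raw row, collapsing any known permit-type column to one key."""
    out = {k: v for k, v in row.items() if k not in PERMIT_TYPE_COLUMNS}
    for col in PERMIT_TYPE_COLUMNS:
        if col in row:
            out["permit_type"] = row[col]
            break
    return out
```

In practice the three columns don't mean exactly the same thing, so a real mapping would also translate code values per portal — but the shape of the fix is the same: normalize early, query one vocabulary.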
SoQL syntax
Brutal
to hand-write
$select, $where, $group, $order, $limit, date_extract_y, double-quoting strings, escaping single quotes. One typo and the whole query 400s.
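For illustration, a minimal Python sketch of assembling one such query safely — the dataset id `abcd-1234` and the column names are placeholders, not a real catalog entry:

```python
from urllib.parse import urlencode

def soql_str(value: str) -> str:
    """Quote a SoQL string literal: wrap in single quotes, double any embedded ones."""
    return "'" + value.replace("'", "''") + "'"

def soql_url(domain: str, dataset_id: str, **clauses: str) -> str:
    """Build a Socrata SODA URL from $-prefixed SoQL clauses ($select, $where, ...)."""
    query = urlencode({f"${k}": v for k, v in clauses.items()})
    return f"https://{domain}/resource/{dataset_id}.json?{query}"

# 'abcd-1234' is a placeholder dataset id; column names are illustrative.
url = soql_url(
    "data.austintexas.gov",
    "abcd-1234",
    select="permit_type, count(*) AS n",
    where=f"zip = {soql_str('78704')} AND date_extract_y(issued_date) = 2024",
    group="permit_type",
    order="n DESC",
    limit="10",
)
```

Centralizing the quoting and URL-encoding is exactly what keeps a stray apostrophe from 400-ing the whole query.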
Download + sift
Hours
of CSV manual review
The current path: download a 200,000-row CSV, open it in a spreadsheet, filter by hand, hope you didn't miss a column. Most people give up before getting to an answer.
A team of OpenAI-powered agents stands between you and 6,061 datasets. If you can search Google or read a news article, you can ask Texas civic data anything. The planner picks the dataset; the analyst writes the SoQL; the reporter composes the answer; the critic verifies citations. Same data the experts use — now reachable in plain English.
What people ask
Pick a question. Skip the typing.
Click anything below — the agent answers in 7 seconds with a citation. No login, no setup, no SoQL.
Housing
Public health
311 + code
Compare cities
Trends + outliers
Conversational
Want to chat about the data instead? Open /chat — same agent, multi-turn. Ask what we have, what a column means, which dataset fits.
Browse by topic
Six domains, hundreds of datasets.
01
Housing & Permits
Construction, zoning, code enforcement.
02
Public Safety
Crime reports, traffic fatalities, 311.
03
Public Health
Restaurant inspections, food-safety scoring.
04
Transportation
Vision Zero, road incidents, mobility data.
05
311 & Code
Service requests, code violations, response.
06
Economy & Business
Franchise tax, mixed beverage, expenditures.
Local mirror · refreshed every 6h
9 curated, locally mirrored. The other 6,052 answered live.
The 9 datasets behind these tiles are mirrored to a local SQLite store every 6 hours by an autonomous ingestor cron. Pages render from the mirror in milliseconds and survive upstream throttling. The remaining 6,052-dataset catalog across 6 portals is queried on demand — each tile shows a freshness badge so the source is never ambiguous.
Austin · top inspection zip · 30d
—
ecmv-9xxi
Live · just now

Dallas · 311 requests · 30d
643
gc4d-8a49
Live · just now

TX · active franchise permits
—
9cir-efmm
Mirror · 3d ago

Austin · 311 requests · 30d
25,328
xwdj-i9he
Live · just now

Dallas · police active calls
63
9fxf-t2tr
Live · just now

Austin · open code violations
3,397
6wtj-zbtb
Live · just now

Austin permits · 7-day pulse
+1,217
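The badge strings on these tiles ("Live · just now", "Mirror · 3d ago") follow a simple age-to-label rule. A minimal sketch — the function name and thresholds are assumptions, not the shipped logic:

```python
def freshness_badge(age_seconds: float, source: str) -> str:
    """Render a tile badge like 'Live · just now' or 'Mirror · 3d ago'."""
    if age_seconds < 60:
        when = "just now"
    elif age_seconds < 86400:
        when = f"{int(age_seconds // 3600)}h ago"
    else:
        when = f"{int(age_seconds // 86400)}d ago"
    return f"{source} · {when}"
```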
Improvement flywheel
Five agents.
One sourced answer.
The orchestrator dispatches parallel queries. The critic catches a window bug. The reporter composes the answer. The citation locks in. A real run, looped on autoplay.
Live replay · marquee question
cycle: 0.00s / 7.4s
reason: parsing · domain=permits geo=78704 window=2024-Q4
The selling point
Any dataset. Any portal. Knowledge in 24 hours.
The scout + ingestor + multi-agent loop is portable. Texas is the demo corpus — the same pipeline ingests Chicago, NYC, federal data.gov, anywhere with a Socrata-compatible API.
1. Point at portal
Open an issue. The scout's next 6h tick discovers every dataset, scores it, and proposes a catalog entry.
File a portal request →
2. Ingestor populates
The ingestor cron pulls deltas into a local SQLite cache, enabling cross-dataset SQL JOINs that Socrata SoQL can't express.
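The JOIN advantage can be shown in miniature. A self-contained sketch — table and column names are invented for the example; the real cache schema will differ:

```python
import sqlite3

# Two toy "datasets" mirrored into one SQLite store, so they can be
# JOINed locally — something a single SoQL query against one Socrata
# dataset cannot do.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE permits (permit_id TEXT, zip TEXT);
    CREATE TABLE violations (case_id TEXT, zip TEXT);
""")
conn.executemany("INSERT INTO permits VALUES (?, ?)",
                 [("P1", "78704"), ("P2", "78702")])
conn.executemany("INSERT INTO violations VALUES (?, ?)",
                 [("V1", "78704"), ("V2", "78704")])

# Cross-dataset question: zips that have both permits and open violations.
rows = conn.execute("""
    SELECT p.zip, COUNT(DISTINCT v.case_id) AS open_violations
    FROM permits p JOIN violations v ON v.zip = p.zip
    GROUP BY p.zip
""").fetchall()
# rows → [("78704", 2)]
```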
See the ingestor →
3. Anyone asks
Type a question. The orchestrator dispatches. The critic rejects ungrounded claims. The reporter composes. Citation enforced.
Try a question →
Use as agent
Install in 30 seconds.
MCP server + agent skill. Drops into Claude Code, Codex, Cursor. Bounded queries, citation enforced.
# 1. install in claude code
$ claude mcp add txlookup -- python -m mcp.server

# 2. ask
$ claude
> use txlookup: food truck permits 78702 last 6 months

# 3. answer with citation
→ count by month, % change vs prior 6mo
→ cite: dataset_id · portal_url · age_seconds
8 tools · 5,000-row cap · 30s timeout · backoff on 429 · citation enforced