The pitch · 90 seconds end-to-end
Texas civic data, accessible to anyone who can search Google.
6,061 datasets across 6 open-data portals. A team of OpenAI-powered agents picks the right one, writes the query, runs it on the source-of-truth portal, and hands you a sourced answer in plain English. Free. Open source. MIT-licensed. Live at txlookup.vercel.app.
01 · The problem
Civic data is public. Reaching it isn’t.
Sifting through your city's data is hard unless you're a developer, a city official, or a reporter. The state and its cities run six open-data portals exposing 6,061 datasets. The current path: download a CSV, open a spreadsheet, filter by hand, give up. Most people never even try.
Six portals
6
Different APIs (Socrata + CKAN), different IDs, different conventions, different filters.
Schema drift
180+
180+ columns just for permits. permittype vs work_class vs permit_class_mapped — same idea, three columns, three meanings.
SoQL syntax
Brutal
$where, $group, date_extract_y, double-quoting strings, escaping single quotes. One typo and the query 400s.
Download + sift
Hours
200,000-row CSVs in a spreadsheet. Most people give up before they reach an answer.
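What "brutal" means in practice: a minimal sketch of hand-building one safe SoQL query. The helper names and the dataset ID (abcd-1234) are illustrative placeholders, not the project's actual code; the escaping rule (double any literal single quote) and the $-prefixed parameters are Socrata's.

```python
# Hand-building a Socrata SoQL query URL: string literals are
# single-quoted, and a literal single quote must be doubled ('')
# or the request 400s.
from urllib.parse import urlencode

def soql_escape(value: str) -> str:
    """Escape a string literal for use inside a SoQL $where clause."""
    return value.replace("'", "''")

def build_query_url(domain: str, dataset_id: str, work_class: str, year: int) -> str:
    # Illustrative query: count permits by type for one year.
    params = {
        "$select": "permittype, count(*) AS n",
        "$where": (
            f"work_class = '{soql_escape(work_class)}' "
            f"AND date_extract_y(issue_date) = {year}"
        ),
        "$group": "permittype",
        "$limit": 5000,  # the pipeline's row cap
    }
    return f"https://{domain}/resource/{dataset_id}.json?{urlencode(params)}"

url = build_query_url("data.austintexas.gov", "abcd-1234", "O'Neil Addition", 2024)
```

Get any of this wrong by one character (a bare quote, `$groupby`, a mis-typed column) and the portal returns a 400 with no partial result.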
02 · The product
Google search, with a concierge agent.
You type a question. Seven specialist agents work for you. Planner picks the dataset. Analyst writes the SoQL. Critic verifies the citation. Reporter composes plain English. Support handles disambiguation. Two background agents grow the corpus on a six-hour cron.
Planner
Picks the dataset · drafts the plan
Data analyst
Writes SoQL · computes stats with quality flags
Reporter
Composes plain-English answer · grounded in findings
Critic
Reviews plan + answer · forces revision on reject
Support
Disambiguation · meta questions · no SoQL
Scout + ingestor
Cron-driven · grows the indexed catalog every 6h
03 · How it works
Plain-English in. Sourced answer out. Seven seconds.
01
Reason
User question hits /api/agent. Planner parses intent, decides scope, drafts a structured plan with bounded tool calls.
02
Plan + Critic
Critic reviews the plan. Flags ungrounded steps. Forces a revision if the plan won't survive the answer-stage critic.
03
Execute
Tool dispatch fires bounded SoQL via Socrata + CKAN. 5,000-row cap. 30s timeout. 429 fallback to local cache.
04
Doom-loop guard
Pattern-based detector catches identical-3x and [A,B,A,B] cycles in code. Replan path preserves user intent across rewrites. (The patentable bit.)
05
Compose
Reporter takes findings + cited rows, writes plain English. Critic verifies groundedness. Forces a final revision if needed.
06
Cite + Ship
Every answer carries portal + dataset_id + last_refreshed + a replayable SODA URL. Click any citation to reach the source.
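Step 04's detector reduces to pattern matching over the tail of the tool-call history. A minimal sketch, assuming each call is reduced to a signature string; the production replan logic layers intent preservation on top of this check:

```python
# Doom-loop detection over a history of tool-call signatures:
# trigger on three identical calls in a row, or an A,B,A,B cycle.

def is_doom_loop(history: list[str]) -> bool:
    """history holds signatures of recent tool calls, oldest first."""
    # identical-3x: the last three calls are the same
    if len(history) >= 3 and len(set(history[-3:])) == 1:
        return True
    # [A, B, A, B]: the last four calls alternate between two values
    if len(history) >= 4:
        a, b = history[-4], history[-3]
        if a != b and history[-2:] == [a, b]:
            return True
    return False

assert is_doom_loop(["q1", "q2", "q2", "q2"])      # identical-3x
assert is_doom_loop(["q1", "q2", "q1", "q2"])      # A,B,A,B cycle
assert not is_doom_loop(["q1", "q2", "q3"])        # still making progress
```

On a hit, the pipeline abandons the current plan and replans rather than letting the agent burn its tool budget repeating itself.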
04 · The corpus grows itself
11 curated today. 6,061 indexed. More every six hours.
The 11 curated datasets carry full schema knowledge — hand-picked SoQL, glossary entries per key column, locally mirrored every 6h. The rest of the 6,061-dataset universe is answered live: the agent reads catalog metadata, plans a query, hits the source portal, comes back. As the data analyst agent works through more datasets, more graduate into the curated corpus. The system grows itself.
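For the uncurated long tail, "reads catalog metadata" can be as cheap as one call to Socrata's public Discovery API. A sketch, not the project's actual ingestion code; the endpoint and parameters below are the documented Discovery API, the function name is ours:

```python
# Searching one portal's catalog via Socrata's public Discovery API.
from urllib.parse import urlencode

DISCOVERY = "https://api.us.socrata.com/api/catalog/v1"

def catalog_search_url(domain: str, query: str, limit: int = 20) -> str:
    """Build a Discovery API search URL scoped to a single portal domain."""
    return f"{DISCOVERY}?{urlencode({'domains': domain, 'q': query, 'limit': limit})}"

url = catalog_search_url("data.austintexas.gov", "construction permits")
# Fetching this URL returns JSON with a "results" list; each result
# carries the dataset's id, name, and column metadata the planner needs.
```

That metadata is enough to plan a live query; datasets the analyst handles well are the ones that later graduate into the curated 11.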
data.austintexas.gov
2,387
datasets indexed
datahub.austintexas.gov
1,333
datasets indexed
dallasopendata.com
1,044
datasets indexed
data.texas.gov
1,051
datasets indexed
data.sanantonio.gov
163
datasets indexed
data.houstontx.gov
83
datasets indexed
05 · Visual answers via Miro
The board is the answer.
When a question benefits from a visual, the agent hands off to Miro through its live REST API. The result is a real, persistent board: share the link, embed it, keep editing. The same capability is exposed as MCP tools (create_miro_board, add_to_miro), so any MCP client (Claude Code, Cursor, Codex) can drive Miro through TXLookup.
06 · It’s extensible
Use the same agent in your coding tools.
TXLookup ships as an MCP server. Eight tools. Installable in Claude Code, Cursor, Codex — one command. Your coding agent now queries Texas civic data the same way ours does. Skill doc included — teaches any runtime when to call which tool.
Claude Code
claude mcp add txlookup -- python -m mcp.server
Codex
codex mcp add txlookup --command python --args -m --args mcp.server
Cursor — paste into MCP settings
{
"txlookup": {
"command": "python",
"args": ["-m", "mcp.server"]
}
}
Roadmap
Texas today. Same pipeline ingests Chicago, NYC, San Francisco — anywhere there’s a Socrata or CKAN portal. Open source. Anyone can extend it. Add a portal config; the scout starts indexing on the next 6h tick.
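A new portal entry might look like this. This is a hypothetical config shape with illustrative field names, not the repo's actual schema; the domain and the 6h/5,000-row/30s bounds come from the pitch above:

```json
{
  "id": "data.cityofchicago.org",
  "kind": "socrata",
  "base_url": "https://data.cityofchicago.org",
  "index_every": "6h",
  "row_cap": 5000,
  "timeout_s": 30
}
```

Drop a file like this in, and the scout picks it up on its next scheduled tick.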
07 · By the numbers
What landed in 48 hours.
6,061
Datasets indexed
6 portals · Socrata + CKAN
11
Deeply curated
Schema + cached rows + glossary
7
Specialist agents
5 in /q loop · 2 scheduled crons
8
MCP tools
Installable in Claude Code, Cursor, Codex
08 · The team
Four people. Shipped at the AITX × Codex Hackathon, May 2026.
Ravinder Jilkapally · Kunal Vasavada · Godwyn James · Raj Akula. Full bios + LinkedIns.
Try it now
No login. No setup. Click any question, the agent fires.