The pitch · 90 seconds end-to-end

Texas civic data, accessible to anyone who can search Google.

6,061 datasets across 6 open-data portals. A team of OpenAI-powered agents picks the right one, writes the query, runs it on the source-of-truth portal, and hands you a sourced answer in plain English. Free. Open source. MIT-licensed. Live at txlookup.vercel.app.

01 · The problem

Civic data is public. Reaching it isn’t.

Sifting through your city’s data is hard unless you’re a developer, a city official, or a reporter. The state and its cities run six open-data portals exposing 6,061 datasets. The current path: download a CSV, open a spreadsheet, filter by hand, give up. Most people never even try.

Six portals

6

Different APIs (Socrata + CKAN), different IDs, different conventions, different filters.

Schema drift

180+

180+ columns just for permits. permittype vs work_class vs permit_class_mapped — same idea, three columns, three meanings.

SoQL syntax

Brutal

$where, $group, date_extract_y, double-quoting strings, escaping single quotes. One typo and the query 400s.

Download + sift

Hours

200,000-row CSVs in a spreadsheet. Most people give up before they reach an answer.
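The “brutal” part is concrete. A hedged sketch of what one aggregate question costs in raw SoQL against the SODA API — the dataset id (`abcd-1234`) and column names are placeholders, not a real portal schema:

```python
from urllib.parse import urlencode

# Placeholder dataset id and hypothetical column names -- every portal
# names these differently, which is exactly the problem.
BASE = "https://data.austintexas.gov/resource/abcd-1234.json"

params = {
    # Strings are single-quoted; a literal quote must be doubled ('O''Brien').
    # One slip and the API answers with a 400.
    "$where": "permittype = 'Building' AND issued_date > '2020-01-01'",
    "$select": "date_extract_y(issued_date) AS year, count(*) AS permits",
    "$group": "year",
    "$order": "year",
    "$limit": "5000",
}

url = f"{BASE}?{urlencode(params)}"
```

Fetching that URL returns JSON rows — if the dataset id, column names, quoting, and function names were all exactly right on the first try.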

02 · The product

Google search, with a concierge agent.

You type a question. Seven specialist agents work for you. Planner picks the dataset. Analyst writes the SoQL. Critic verifies the citation. Reporter composes plain English. Support handles disambiguation. Two background agents grow the corpus on a six-hour cron.

Planner

Picks the dataset · drafts the plan

Data analyst

Writes SoQL · computes stats with quality flags

Reporter

Composes plain-English answer · grounded in findings

Critic

Reviews plan + answer · forces revision on reject

Support

Disambiguation · meta questions · no SoQL

Scout + ingestor

Cron-driven · grows the indexed catalog every 6h

03 · How it works

Plain-English in. Sourced answer out. Seven seconds.

  1. Reason

    User question hits /api/agent. Planner parses intent, decides scope, drafts a structured plan with bounded tool calls.

  2. Plan + Critic

    Critic reviews the plan. Flags ungrounded steps. Forces a revision if the plan won't survive the answer-stage critic.

  3. Execute

    Tool dispatch fires bounded SoQL via Socrata + CKAN. 5,000-row cap. 30s timeout. 429 fallback to local cache.

  4. Doom-loop guard

    Pattern-based detector catches identical-3x and [A,B,A,B] cycles in code. Replan path preserves user intent across rewrites. (The patentable bit.)

  5. Compose

    Reporter takes findings + cited rows, writes plain English. Critic verifies groundedness. Forces a final revision if needed.

  6. Cite + Ship

    Every answer carries portal + dataset_id + last_refreshed + a replayable SODA URL. Click any citation to reach the source.
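The doom-loop guard in step 04 can be sketched as a pure function over the agent’s recent action history — the function name and signature are illustrative, not the shipped detector:

```python
def is_doom_loop(history: list[str]) -> bool:
    """Detect two stall patterns in an agent's recent actions:
    the same action three times in a row, or an A,B,A,B alternation."""
    # identical-3x: the agent is repeating itself verbatim
    if len(history) >= 3 and history[-1] == history[-2] == history[-3]:
        return True
    # [A, B, A, B]: a two-step cycle that never converges
    if len(history) >= 4:
        a, b, c, d = history[-4:]
        if a == c and b == d and a != b:
            return True
    return False
```

When the detector trips, the replan path rewrites the plan but carries the original user intent forward, so the rewrite cannot drift away from the question that was asked.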

04 · The corpus grows itself

11 curated today. 6,061 indexed. More every six hours.

The 11 curated datasets carry full schema knowledge — hand-picked SoQL, glossary entries per key column, locally mirrored every 6h. The rest of the 6,061-dataset universe is answered live: the agent reads catalog metadata, plans a query, hits the source portal, comes back. As the data analyst agent works through more datasets, more graduate into the curated corpus. The system grows itself.
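The two-tier routing reduces to a lookup — a minimal sketch, with a hypothetical curated set and illustrative names:

```python
# Hypothetical curated ids; the real set is the 11 hand-picked datasets.
CURATED = {"austin-permits", "dallas-crime"}

def route(dataset_id: str) -> str:
    """Curated datasets answer from the local 6h mirror with full schema
    knowledge; everything else is planned live against the source portal."""
    return "local-mirror" if dataset_id in CURATED else "live-portal"
```

Graduating a dataset into the curated corpus is then just adding its id (plus schema notes and glossary entries) to the curated set.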

data.austintexas.gov

2,387

datasets indexed

datahub.austintexas.gov

1,333

datasets indexed

dallasopendata.com

1,044

datasets indexed

data.texas.gov

1,051

datasets indexed

data.sanantonio.gov

163

datasets indexed

data.houstontx.gov

83

datasets indexed

05 · Visual answers via Miro

The board is the answer.

When a question benefits from a visual, the agent hands off to Miro through its live REST API. The result is a real, persistent board — share the link, embed it, keep editing. Same capability is exposed as MCP tools (create_miro_board, add_to_miro) so any MCP client — Claude Code, Cursor, Codex — can drive Miro through TXLookup.

Open in Miro ↗ · Live · agent-generated · MCP-installable

06 · It’s extensible

Use the same agent in your coding tools.

TXLookup ships as an MCP server. Eight tools. Installable in Claude Code, Cursor, Codex — one command. Your coding agent now queries Texas civic data the same way ours does. Skill doc included — teaches any runtime when to call which tool.

Claude Code

~/txlookup
claude mcp add txlookup -- python -m mcp.server

Codex

~/txlookup
codex mcp add txlookup --command python --args -m --args mcp.server

Cursor — paste into MCP settings

~/txlookup
{
  "txlookup": {
    "command": "python",
    "args": ["-m", "mcp.server"]
  }
}
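Once installed, a client only needs a name-to-handler registry to drive the tools. A minimal sketch of the dispatch side — only the two Miro tool names come from this writeup; the decorator and handler bodies are stand-ins:

```python
from typing import Callable

# Registry mapping MCP tool names to handlers.
TOOLS: dict[str, Callable[..., dict]] = {}

def tool(name: str):
    """Register a handler under an MCP tool name (illustrative decorator)."""
    def register(fn: Callable[..., dict]) -> Callable[..., dict]:
        TOOLS[name] = fn
        return fn
    return register

@tool("create_miro_board")
def create_miro_board(title: str) -> dict:
    # Stand-in: the real tool calls the Miro REST API and returns a board link.
    return {"board": title, "status": "created"}

def dispatch(name: str, **kwargs) -> dict:
    if name not in TOOLS:
        raise KeyError(f"unknown tool: {name}")
    return TOOLS[name](**kwargs)
```

The skill doc rides on top of this: it tells the runtime *when* to call which registered name, while the registry handles *how*.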

Roadmap

Texas today. Same pipeline ingests Chicago, NYC, San Francisco — anywhere there’s a Socrata or CKAN portal. Open source. Anyone can extend it. Add a portal config; the scout starts indexing on the next 6h tick.
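A portal config might look like this — the keys below are illustrative, not the repo’s actual schema:

```json
{
  "id": "data.cityofchicago.org",
  "kind": "socrata",
  "base_url": "https://data.cityofchicago.org",
  "refresh_hours": 6
}
```

Drop in an entry like that, and the scout picks it up on its next scheduled run.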

07 · By the numbers

What landed in 48 hours.

6,061

Datasets indexed

6 portals · Socrata + CKAN

11

Deeply curated

Schema + cached rows + glossary

7

Specialist agents

5 in /q loop · 2 scheduled crons

8

MCP tools

Installable in Claude Code, Cursor, Codex

08 · The team

Four people. Shipped at the AITX × Codex Hackathon, May 2026.

Ravinder Jilkapally · Kunal Vasavada · Godwyn James · Raj Akula. Full bios + LinkedIns.

Try it now

No login. No setup. Click any question, the agent fires.