Architecture · the whole system
How the system fits together.
TXLookup is one Codex-driven multi-agent loop surfaced through six flows. All six share a typed plan/dispatch contract, a single skill policy, the same bounded Socrata + CKAN client, and a local-mirror resilience layer. The diagram below shows the layers; the cards walk through each flow end-to-end.
6,061
Datasets indexed
6 portals · Socrata + CKAN
11
Deeply curated
Schema + cached rows + glossary
7
Specialists
5 in /q loop · 2 scheduled crons
8
MCP tools
Installable in Claude Code, Cursor, Codex
Why agents
Each agent does the work you used to do by hand.
The legacy path was: visit a portal, find a dataset, download a 200k-row CSV, open it in a spreadsheet, filter by hand, build a chart. TXLookup replaces every step with a specialist agent powered by OpenAI Codex / GPT-4o. You ask in plain English. The agents do the rest.
Planner
What you used to do
Pick the right dataset, learn its schema, write SoQL.
What this agent does
Reads catalog metadata, picks the dataset, drafts a structured plan with bounded tool calls.
Data analyst
What you used to do
Hand-write group-by + window math + null handling.
What this agent does
Runs the bounded query, computes deltas / top-N / YoY with quality flags (null rate, top concentration, sample factor).
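The analyst's three quality flags can be sketched as one pure function over a sampled column; the function name and the exact flag definitions here are illustrative, not the repo's actual code.

```python
from collections import Counter

def quality_flags(values, population_size=None):
    """Null rate, top-value concentration, and sample factor for one column."""
    n = len(values)
    nulls = sum(v is None for v in values)
    non_null = [v for v in values if v is not None]
    # Share of the single most common value among non-null rows.
    top_share = (Counter(non_null).most_common(1)[0][1] / len(non_null)
                 if non_null else 0.0)
    return {
        "null_rate": nulls / n if n else 1.0,
        "top_concentration": top_share,
        # How many real rows each sampled row stands in for.
        "sample_factor": (population_size / n) if (population_size and n) else 1.0,
    }
```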
Reporter
What you used to do
Skim the spreadsheet, paraphrase, hope you didn't misread.
What this agent does
Composes a plain-English answer grounded in the analyst's findings — no hallucinated numbers.
Critic
What you used to do
Hope the answer is right. No way to check.
What this agent does
Reviews plan + answer for groundedness, scope, citation. Forces a corrective revision on reject.
Support
What you used to do
Re-Google when a column name is unfamiliar.
What this agent does
Handles meta-questions and disambiguation ("south austin" → which zip?). No SoQL fired.
Scout + ingestor
What you used to do
Notice when a portal added a new dataset. (Most people don't.)
What this agent does
Cron-driven. Scout indexes new portal datasets every 6h. Ingestor refreshes the local-mirror cache so pages stay fast and survive throttling.
The user-facing change: if you can search Google or read a news article, you can ask civic data a question. Same data the experts use — reachable from a single search box, with citations on every claim.
Seven layers, top to bottom.
Markdown source: docs/architecture.md
Six flows, one agent.
User asks a question (live agent)
1. Browser POST /api/agent { query }
2. Server SSE stream opens (text/event-stream)
3. phase=reasoning — Codex parses intent
4. phase=planning — Codex returns structured Plan { intent, steps[] }
5. phase=executing — for each step, dispatch tool
6. phase=replanning (if step fails ≤2 times) — Codex emits a new Plan
7. phase=completing — Codex synthesizes final answer
8. phase=done — answer + citation + artifacts streamed back
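The plan/dispatch/replan core of this flow can be sketched as a small loop. `Step`, `Plan`, and `run_plan` are illustrative names standing in for the repo's typed contract, not its actual code.

```python
from dataclasses import dataclass

@dataclass
class Step:
    tool: str   # name of the tool to dispatch
    args: dict  # bounded arguments for that tool

@dataclass
class Plan:
    intent: str
    steps: list

MAX_RETRIES = 2  # mirrors "replanning if a step fails <= 2 times"

def run_plan(plan, tools, replan):
    """Dispatch each step; after repeated failure, ask for a fresh Plan."""
    results = []
    for step in plan.steps:
        for attempt in range(MAX_RETRIES + 1):
            try:
                results.append(tools[step.tool](**step.args))
                break
            except Exception:
                if attempt == MAX_RETRIES:
                    # Sketch-level simplification: restart with the new Plan.
                    return run_plan(replan(plan, step), tools, replan)
    return results
```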
Browse a dataset
1. Server component renders at request time (revalidate 600s)
2. Promise.all fetches /api/views/{id}.json + /resource/{id}.json?$limit=5
3. Schema columns + sample rows + last refresh rendered as static HTML
4. Scoped 'ask about this dataset' search submits back to /q with dataset=<id>
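The two fetches behind that fan-out hit the standard Socrata metadata and resource endpoints; a minimal sketch, with a hypothetical helper and a placeholder dataset id:

```python
from urllib.parse import urlencode

def dataset_page_urls(host: str, dataset_id: str, sample_rows: int = 5):
    """Return (metadata_url, sample_url) for one dataset page."""
    meta = f"https://{host}/api/views/{dataset_id}.json"
    # $limit bounds the sample-row fetch; urlencode percent-encodes the $.
    sample = f"https://{host}/resource/{dataset_id}.json?" + urlencode(
        {"$limit": sample_rows}
    )
    return meta, sample
```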
Live homepage stats
1. Server-render at request time (revalidate 300s)
2. Promise.all fans out 5+ Socrata queries:
   - Austin permits last 7 days (group by day)
   - Austin permits 7d total
   - Top inspection zip last 30 days
   - 311 requests last 30 days
   - Open code violations
   - Per-dataset metadata for the cards
3. Sparkline + ticker render with real numbers
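One of the fanned-out queries (permits grouped by day) might look like this in SoQL; the column name issued_date is an assumption about the permits dataset, not taken from its real schema:

```python
def permits_by_day(date_col: str, since_iso: str) -> dict:
    """SoQL query params: daily permit counts since a cutoff date."""
    return {
        "$select": f"date_trunc_ymd({date_col}) AS day, count(*) AS n",
        "$where": f"{date_col} >= '{since_iso}'",
        "$group": "day",
        "$order": "day",
    }
```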
Cache-resilience layer (the local mirror)
1. GitHub Actions ingestor cron fires every 6h
2. Pulls 5,000 most-recent rows per curated dataset to JSON
3. Commits data/cache/*.json (~5 MB total) to main
4. Vercel build bundles cache files into every serverless function
5. Reader: try cache → on miss, hit live Socrata → on 429/5xx, fall back to stale cache with caveat
6. Each visible stat tile carries a freshness badge (Mirror · Nh ago / Live · just now)
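The reader's fallback chain reduces to a small function; the cache entry shape, the 6-hour freshness window, and the badge strings here are illustrative, not the repo's exact implementation:

```python
import time

def read_with_mirror(key, cache, fetch_live, max_age_s=6 * 3600):
    """cache -> live -> stale-cache fallback; returns (rows, freshness badge)."""
    hit = cache.get(key)  # assumed shape: {"rows": [...], "fetched_at": epoch_s}
    now = time.time()
    if hit and now - hit["fetched_at"] <= max_age_s:
        age_h = int((now - hit["fetched_at"]) // 3600)
        return hit["rows"], f"Mirror · {age_h}h ago"
    try:
        return fetch_live(key), "Live · just now"
    except Exception:
        # Portal throttled or erroring (429 / 5xx): serve stale with a caveat.
        if hit:
            return hit["rows"], "Mirror (stale) · served with caveat"
        raise
```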
External agent installs TXLookup
1. Developer runs claude/codex/cursor mcp add against mcp/server.py
2. FastMCP advertises 8 tools (ask_data, discover_datasets, get_dataset_schema, fetch_data, get_task_status, create_miro_board, add_to_miro, list_known_tools)
3. Skill doc (skills/txlookup/SKILL.md) teaches the runtime when to call each
4. Tool calls land at the same data layer (agent/tools/data.py)
5. Citations enforced — every reply includes portal + dataset_id + last_refreshed
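The citation rule in step 5 amounts to a small guard on every outgoing reply; `cited` is a hypothetical helper sketching the contract, not code from mcp/server.py:

```python
REQUIRED_CITATION_KEYS = {"portal", "dataset_id", "last_refreshed"}

def cited(reply: dict) -> dict:
    """Refuse to return a reply that is missing any citation field."""
    missing = REQUIRED_CITATION_KEYS - set(reply)
    if missing:
        raise ValueError(f"uncited reply, missing: {sorted(missing)}")
    return reply
```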
Agent-to-agent (A2A) — render to Miro
1. Planner emits create_miro_board for visualizable answers
2. Executor calls Miro REST API with title + summary + records
3. Miro returns board_id + view_link
4. View link surfaced as an artifact alongside the answer
5. Judge clicks → opens the live, persistent Miro board
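The board-creation call in step 2 maps onto Miro's v2 REST API (POST /v2/boards) roughly as below; this only builds the request rather than sending it, and `create_board_request` is an illustrative helper, not the repo's executor:

```python
import json
import urllib.request

MIRO_BOARDS_API = "https://api.miro.com/v2/boards"

def create_board_request(token: str, title: str, summary: str):
    """Build the authenticated POST that would create a board."""
    body = json.dumps({"name": title, "description": summary}).encode()
    return urllib.request.Request(
        MIRO_BOARDS_API,
        data=body,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

The response body would carry the board id and a view link, which the executor surfaces as an artifact next to the answer.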
Companion docs.
How it works
End-to-end live trace of the marquee question — every SSE event, every tool call, every Socrata response.
Read on GitHub →
Agents strategy
Codex's five distinct roles in the loop, and the explicit 'why this isn't a wrapper' framing for the Agents Track.
Read on GitHub →
Agent skill
The deliverable agent skill — when to invoke TXLookup, which tool to pick, the safety rules, worked examples.
Read on GitHub →
Integration guide
Install the MCP server in Claude Code / Codex / standalone; full tool catalog with examples.
Read on GitHub →