Every government portal
has a wall.

Every wall has a crack.

01

The Wall

Four barriers governments put between you and public data. Four cracks.

reCAPTCHA

Google's bot detector sits in front of the search form. Click the checkbox, solve the puzzle, prove you're human. Every. Single. Time.

The crack: POST with ajaxRequest:true header. The captcha only guards the form — not the API behind it.
POST /InmateSearch.aspx
Headers: { "ajaxRequest": "true" }
Body: lastName=Smith&firstName=John
// reCAPTCHA never fires. Server returns JSON.
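A minimal sketch of that request in JavaScript, assuming Node 18+ (global fetch) and a hypothetical host, portal.example.gov. Only the path, the ajaxRequest header, and the two form fields come from the example above; the function names and the Content-Type header are illustrative assumptions.

```javascript
// Hypothetical host; the path matches the request shown above.
const SEARCH_URL = 'https://portal.example.gov/InmateSearch.aspx';

// Build the request without sending it, so it can be inspected or logged first.
function buildSearchRequest(lastName, firstName) {
  return {
    url: SEARCH_URL,
    options: {
      method: 'POST',
      headers: {
        ajaxRequest: 'true',
        // Assumption: the endpoint expects a standard form-encoded body.
        'Content-Type': 'application/x-www-form-urlencoded',
      },
      // URLSearchParams form-encodes spaces and punctuation in names.
      body: new URLSearchParams({ lastName, firstName }).toString(),
    },
  };
}

// Send the request and parse the JSON reply.
async function searchInmates(lastName, firstName) {
  const { url, options } = buildSearchRequest(lastName, firstName);
  const res = await fetch(url, options);
  if (!res.ok) throw new Error(`Search failed: HTTP ${res.status}`);
  return res.json();
}
```

Keeping the builder separate from the network call makes the body easy to check before anything leaves the machine.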
ASP.NET ViewState

Microsoft's page-state token, validated server-side. Every form submission requires valid __VIEWSTATE and __EVENTVALIDATION values: hidden form fields that change with every page load.

The crack: GET the page first, extract the tokens, POST them back with cookies.
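// Step 1: GET the page, save cookies
const page = await fetch(url);
const cookies = page.headers.get('set-cookie');
const html = await page.text();

// Step 2: Parse out __VIEWSTATE + __EVENTVALIDATION (capture group 1 is the value)
const vs = html.match(/__VIEWSTATE.*?value="([^"]*)"/)[1];
const ev = html.match(/__EVENTVALIDATION.*?value="([^"]*)"/)[1];

// Step 3: POST with tokens + cookies (URLSearchParams encodes the base64 tokens)
await fetch(url, {
  method: 'POST',
  headers: { cookie: cookies, 'content-type': 'application/x-www-form-urlencoded' },
  body: new URLSearchParams({ __VIEWSTATE: vs, __EVENTVALIDATION: ev, lastName: 'Smith' }),
});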
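WebForms pages often carry more hidden fields than these two (for example __VIEWSTATEGENERATOR), so a more general approach is to echo back every hidden input. The sketch below assumes double-quoted attributes, which is how ASP.NET normally renders them; extractHiddenFields and buildPostBody are illustrative names, not part of any library.

```javascript
// Collect every <input type="hidden"> on the page into a name -> value map.
// Assumes double-quoted attributes; a real HTML parser is safer for messy markup.
function extractHiddenFields(html) {
  const fields = {};
  for (const [tag] of html.matchAll(/<input\b[^>]*>/gi)) {
    if (!/\btype="hidden"/i.test(tag)) continue;
    const name = tag.match(/\bname="([^"]*)"/i);
    const value = tag.match(/\bvalue="([^"]*)"/i);
    if (name) fields[name[1]] = value ? value[1] : '';
  }
  return fields;
}

// Merge the page's hidden fields with the search terms and form-encode the result.
// Tokens are base64, so "+", "/" and "=" must be percent-encoded; URLSearchParams does that.
function buildPostBody(html, userFields) {
  return new URLSearchParams({ ...extractHiddenFields(html), ...userFields }).toString();
}
```

The returned string can be passed as the body of the POST in step 3.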
JavaScript SPA

The entire portal is a single-page app. No HTML to scrape. Data loads asynchronously through JavaScript bundles that render the page in the browser.

The crack: grep the JS bundle for /api/ endpoints. Call them directly. The SPA is just a frontend to a REST API.
// View source. Find the 800KB bundle.js
// Search for fetch( or /api/ or .json
// Found: /api/v1/inmates?search=
// No auth header. No session token. Public.
curl 'https://portal.gov/api/v1/inmates?search=Smith'
// JSON. Every field. No captcha.
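The grep step can be automated. The sketch below scans bundle source for single- or double-quoted string literals containing /api/ and returns the distinct matches. findApiPaths is an illustrative name, and the approach is an assumption about how the bundle is written: endpoints built from template literals or concatenated fragments will be missed.

```javascript
// Return the distinct quoted string literals in a bundle that contain "/api/".
// An optional absolute-URL prefix (https://host) is kept when present.
function findApiPaths(bundleSource) {
  const literal = /["']((?:https?:\/\/[^"'\s]+)?\/api\/[^"'\s]*)["']/g;
  const found = new Set();
  for (const match of bundleSource.matchAll(literal)) {
    found.add(match[1]);
  }
  return [...found].sort();
}
```

Run it over the downloaded bundle text, then check each candidate by hand before calling it.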
Cloudflare / Incapsula WAF

Enterprise web application firewalls. Bot fingerprinting, JavaScript challenges, TLS fingerprint analysis. The real wall.

The crack: forge-browser. Headless Chrome with a real browser fingerprint. If a human browser can see it, so can we.
// forge-browser: headless Chrome on Cloudflare
const browser = await env.BROWSER_SERVICE.fetch(
  'https://portal.gov/search',
  { headers: { 'X-Action': 'render' } }
);
// Real browser. Real fingerprint. Real data.
02

The Crack

Fresno County Sheriff jail blotter. 230 bookings. Here's exactly how we got them.

1
Recon
curl the URL. See what comes back.
# -i prints the response headers ahead of the body
curl -si 'https://www.fresnosheriff.org/mfjail/who-is-in-jail.html' | head -100
# Content-Type: text/html
# Transfer-Encoding: chunked
# Server: Microsoft-IIS/10.0
# No redirect. No captcha challenge. Just... HTML.
2
Discovery
Pre-rendered Blazor HTML. 952KB. No auth. No WAF. The entire jail roster, server-rendered in one page load.
# 952,847 bytes of pre-rendered HTML
# Framework: Blazor Server (ASP.NET)
# Auth: None
# WAF: None
# Rate limit: None
# Every booking. Every charge. Every bail amount.
# Just sitting there.
3
Parse
Split on BookingNumber. Extract fields with regex. Structure it.
const rows = html.split('BookingNumber'); // 230 bookings
for (const row of rows) {
  const name = row.match(/Name.*?>(.*?)(.*?)(.*?)
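The snippet above is cut off, so here is a self-contained sketch of the same split-and-regex idea. It is not the original parser. The markup it expects (each value sitting in the text node right after an attribute that mentions BookingNumber, Name, or Charge) is an assumption, as are the names textAfter and parseBookings; the real page's structure may differ.

```javascript
// Capture the text that follows the first tag mentioning `label`.
// Assumed markup, e.g.: <td class="Name">DOE, JOHN</td>
function textAfter(chunk, label) {
  const m = chunk.match(new RegExp(label + '[^>]*>([^<]*)<'));
  return m ? m[1].trim() : null;
}

// Split the page on the booking marker and pull a few fields from each piece.
function parseBookings(html) {
  return html
    .split('BookingNumber')
    .slice(1) // the first piece is everything before the first booking
    .map((chunk) => ({
      // split() consumed the marker, so put it back before matching
      bookingNumber: textAfter('BookingNumber' + chunk, 'BookingNumber'),
      name: textAfter(chunk, 'Name'),
      charge: textAfter(chunk, 'Charge'),
    }));
}
```

Dropping the first piece matters: splitting a page with 230 markers yields 231 pieces, and the first holds no booking.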
4
Deploy
Cloudflare Worker. Edge-cached. 230 bookings in 0.87 seconds.
// Cloudflare Worker: fresno-jail.js
// Fetches, parses, returns structured JSON
// Edge-cached 10 minutes
// Response time: 0.87s (cold) / 12ms (cached)
GET /api/jail/fresno/bookings?last=Smith
→ { bookings: [...], count: 3, source: "fresno-jail" }
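The two moving parts of that endpoint, a 10-minute cache and a last-name filter, can be sketched in plain JavaScript. This is not the deployed Worker. The in-memory TtlCache stands in for edge caching, the lastName field and exact case-insensitive matching are assumptions, and only the response shape (bookings, count, source) comes from the example above.

```javascript
// In-memory stand-in for the edge cache: entries expire after ttlMs.
// The clock is injectable so expiry can be checked without waiting.
class TtlCache {
  constructor(ttlMs, now = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
    this.entries = new Map();
  }
  get(key) {
    const hit = this.entries.get(key);
    if (!hit || this.now() - hit.storedAt >= this.ttlMs) return undefined;
    return hit.value;
  }
  set(key, value) {
    this.entries.set(key, { value, storedAt: this.now() });
  }
}

// Build the JSON payload for ?last=...; with no filter, return every booking.
// Assumption: each booking object has a lastName string.
function buildPayload(bookings, lastName) {
  const wanted = (lastName || '').trim().toLowerCase();
  const matches = wanted
    ? bookings.filter((b) => b.lastName.toLowerCase() === wanted)
    : bookings;
  return { bookings: matches, count: matches.length, source: 'fresno-jail' };
}
```

In a real Worker the cache lookup would wrap the fetch-and-parse step, so only a cold request pays the 0.87s.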
03

The Terminal

Mac Mini. Overnight. One Claude instance building the entire system.

manhattan-poller.sh — overnight build log
0
Pages
0
APIs
0
Counties
0
Tasks
The data doesn't have an opinion.
foia.tools