/insights_agent

Eight days of research in eight seconds

A question goes in. Cited, themed analysis comes out. Here is what happens in between.

01

Data ingestion

One-time setup

Scrapers pull customer feedback from public sources. No API keys, no permissions needed. Each review, post, and comment becomes a searchable document.

Trustpilot reviews are extracted from __NEXT_DATA__ JSON embedded in server-rendered pages. Reddit posts and comments come via the old.reddit.com JSON API. Each document gets a full-text search index (tsvector) automatically on insert, weighting titles higher than body text.

Example
Huel demo: 200 Trustpilot reviews + 1,016 Reddit posts and comments = 1,216 documents
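The __NEXT_DATA__ extraction step can be sketched as a small parser. This is a minimal illustration, not the actual scraper; the JSON path `props.pageProps.reviews` is an assumption about where Trustpilot keeps review data.

```typescript
// Pull the server-rendered JSON payload out of a Trustpilot page.
// NOTE: the shape of the parsed object (props.pageProps.reviews) is an
// illustrative assumption; the real page structure may differ.
function extractNextData(html: string): any {
  const match = html.match(
    /<script id="__NEXT_DATA__"[^>]*>([\s\S]*?)<\/script>/
  );
  if (!match) throw new Error("__NEXT_DATA__ script tag not found");
  return JSON.parse(match[1]);
}
```

Because the data is embedded in the page itself, one plain HTTP GET per page is enough; no authenticated API is involved.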
02

Your question arrives

0ms

A product manager types a plain English question. "What do customers say about the taste?"

The question hits the /api/ask endpoint. No preprocessing, no keyword extraction on the client side. The raw question goes straight to the agent brain.
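The pass-through behavior can be sketched as follows. The handler and helper names (`handleAsk`, `parseQuestion`) are hypothetical, chosen for illustration; the only point is that the question is validated, not transformed.

```typescript
// Validate the request body and return the question untouched.
// No preprocessing, no keyword extraction: the raw text goes to the agent.
function parseQuestion(body: { question?: unknown }): string {
  if (typeof body.question !== "string" || body.question.trim() === "") {
    throw new Error("question must be a non-empty string");
  }
  return body.question;
}

// Hypothetical shape of the /api/ask route handler (app-router style);
// the downstream agent call is elided.
async function handleAsk(req: Request): Promise<Response> {
  const question = parseQuestion(await req.json());
  // ...hand the raw question to the query-expansion step...
  return Response.json({ question });
}
```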

03

Query expansion

~1 second

Claude Haiku breaks the question into 3-5 Postgres search queries, adding synonyms and related terms the user would never think to search for.

Haiku receives the original question and generates queries in Postgres websearch_to_tsquery syntax. It knows to add synonyms ("taste" becomes taste OR flavor OR texture OR disgusting OR delicious), filter by source when relevant, and use phrase matching for specific concepts.

Example
"What do customers say about taste?" becomes:

taste OR flavor OR texture
"tastes good" OR "tastes bad" OR disgusting
flavor comparison OR "better than"
chocolate OR vanilla OR berry OR banana
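One way to turn the model's reply into a query list is to treat each line as a query. This sketch assumes Haiku is prompted to return one query per line, possibly numbered; the prompt wording and the cap of five are assumptions about the actual implementation.

```typescript
// Parse an LLM response into websearch-style query strings:
// split on newlines, strip list markers ("1. ", "- "), drop blanks, cap at max.
function parseExpandedQueries(modelResponse: string, max = 5): string[] {
  return modelResponse
    .split("\n")
    .map((line) => line.replace(/^[\d.\-\s]+/, "").trim())
    .filter((line) => line.length > 0)
    .slice(0, max);
}
```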
04

Full-text search

~200ms

Each expanded query hits the Postgres full-text search index. Results are deduplicated and ranked by relevance. Top 40 documents pass through.

Supabase runs websearch_to_tsquery against the tsvector index on every document. Title matches rank higher than body matches. Results from all queries are merged, deduplicated by document ID, and sorted by ts_rank_cd score. The top 40 go to synthesis.
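The merge step described above can be sketched as a pure function: combine hits from all expanded queries, keep the best score per document ID, and forward the top N. This is an illustration of the described behavior, not the actual implementation; `SearchHit` is a hypothetical shape.

```typescript
interface SearchHit {
  id: string;
  rank: number; // ts_rank_cd score returned by Postgres
  body: string;
}

// Deduplicate by document ID (keeping the highest score), sort by rank
// descending, and pass the top N documents on to synthesis.
function mergeAndRank(resultSets: SearchHit[][], topN = 40): SearchHit[] {
  const best = new Map<string, SearchHit>();
  for (const hits of resultSets) {
    for (const hit of hits) {
      const seen = best.get(hit.id);
      if (!seen || hit.rank > seen.rank) best.set(hit.id, hit);
    }
  }
  return [...best.values()].sort((a, b) => b.rank - a.rank).slice(0, topN);
}
```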

05

Analysis and synthesis

~5-8 seconds

Claude Sonnet reads all 40 documents, identifies themes, pulls direct quotes, and writes a structured answer with numbered citations linking back to original sources.

Sonnet receives the original question plus all retrieved evidence with metadata (source type, rating, author, date, URL). It groups findings into themes, surfaces patterns, shows both sides when views are mixed, and never invents content. Every claim is backed by a numbered citation.

Example
The output includes:

Themed sections with headers
Direct customer quotes
Source citations [1], [2], [3]
Clickable links to original reviews/posts
Analysis duration timestamp
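Numbered citations work because each retrieved document is assigned an index before it enters the prompt. A minimal sketch of that formatting step, with an assumed `EvidenceDoc` shape and field layout:

```typescript
interface EvidenceDoc {
  source: string; // e.g. "trustpilot" or "reddit"
  author: string;
  date: string;
  url: string;
  rating?: number; // only present for review sources
  body: string;
}

// Number each document so the model can cite [1], [2], ... and the UI can
// map citations back to original URLs. Layout here is illustrative.
function formatEvidence(docs: EvidenceDoc[]): string {
  return docs
    .map((d, i) => {
      const rating = d.rating !== undefined ? ` | ${d.rating}/5` : "";
      return `[${i + 1}] ${d.source} | ${d.author} | ${d.date}${rating} | ${d.url}\n${d.body}`;
    })
    .join("\n\n");
}
```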
06

Answer delivered

Total: 8-12 seconds

The cited answer appears in the chat. Every source is expandable with links to the original review or post. The query is logged for analytics.

The full pipeline replaces what would take a research team eight days of manual review reading. The answer is only as good as the evidence: more documents mean better answers.

The stack

Framework: Next.js on Vercel
Database: Supabase (Postgres + full-text search)
Query planning: Claude Haiku
Synthesis: Claude Sonnet
Data sources: Trustpilot, Reddit, forums, email, CSV
Cost per query: ~$0.01-0.02