Hub-side LLM engine

Server automation is the quarterback of the information system.

One small LLM service on the hub does four jobs: it vets every public-source proposal for PII and content safety, extracts a structured pack draft from free text, drafts the email DueCare sends when curators need expert input, and processes inbound replies through the same boundary scan. Nothing the server decides is final. A human curator signs off before any vetted pack ships.

01 · Four jobs, one engine

Vet, extract, draft, vet again.

The hub now routes crawler proposals, API submissions, research-monitor drafts, and inbound expert replies through one bounded engine: server automation.

Job 01 · Submission vetting

Decide whether a public-source proposal can advance.

Every submission, whether typed into the website form or scraped by the public-source crawler, runs through the same content-safety + PII + intent triage. The verdict is one of accept, needs_curator_review, or reject. The reasons go to the curator alongside the proposal.

  • Detects raw worker case content the regex filter misses.
  • Flags prompt-injection attempts hidden in payload text.
  • Catches operational advice that should never become a pack.
  • Conservative by default. Anything ambiguous goes to a human.
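
The three verdicts and the conservative default can be sketched as follows. This is a minimal illustration, not the hub's actual code: the names `Verdict`, `VettingResult`, and `triage` are hypothetical, and the mapping of flags to verdicts (PII rejects, everything else flagged goes to a curator) is an assumption based on the rules described on this page.

```python
from dataclasses import dataclass, field
from enum import Enum


class Verdict(str, Enum):
    # The three outcomes named in the text.
    ACCEPT = "accept"
    NEEDS_CURATOR_REVIEW = "needs_curator_review"
    REJECT = "reject"


@dataclass
class VettingResult:
    verdict: Verdict
    # Reasons travel to the curator alongside the proposal.
    reasons: list[str] = field(default_factory=list)


def triage(pii_found: bool, injection_suspected: bool,
           operational_advice: bool, ambiguous: bool) -> VettingResult:
    """Hypothetical triage: combine scan flags into one verdict."""
    reasons = []
    if pii_found:
        reasons.append("PII or raw worker case content detected")
    if injection_suspected:
        reasons.append("possible prompt injection in payload text")
    if operational_advice:
        reasons.append("operational advice that should not become a pack")
    if ambiguous:
        reasons.append("ambiguous content")

    if pii_found:
        # Assumption: PII is a hard reject, as in the inbound-email rules.
        return VettingResult(Verdict.REJECT, reasons)
    if reasons:
        # Conservative default: any other flag goes to a human.
        return VettingResult(Verdict.NEEDS_CURATOR_REVIEW, reasons)
    return VettingResult(Verdict.ACCEPT, ["no issues found"])
```

Even an accept verdict here only means the proposal can advance to curator review; nothing in the sketch publishes anything.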

Job 02 · Pack templating

Turn free text into a structured pack draft.

A submission usually arrives as prose plus a public URL. The automation extracts the jurisdiction, the corridor (when stated), the cited URLs, and a JSON payload the curator can paste into a pack manifest with minor edits. A confidence score is attached so that low-confidence drafts get more curator attention.

  • Pulls a draft payload_json the curator can edit.
  • Lifts cited URLs into a cited_urls field.
  • Suggests jurisdiction and corridor when the text supports it.
  • Always optional. The curator can ignore the draft.
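
One possible shape for the draft is sketched below. The field names `payload_json` and `cited_urls` come from the list above; the `PackDraft` class, the URL pattern, the `needs_extra_attention` helper, and its 0.6 threshold are assumptions made for illustration only.

```python
import json
import re
from dataclasses import dataclass, field
from typing import Optional

# Assumed pattern: a plain http(s) URL up to whitespace or a closing quote/bracket.
URL_PATTERN = re.compile(r"https?://[^\s<>\"')]+")


@dataclass
class PackDraft:
    payload_json: str                      # editable JSON text for the manifest
    cited_urls: list[str] = field(default_factory=list)
    jurisdiction: Optional[str] = None     # only set when the text supports it
    corridor: Optional[str] = None         # only set when stated
    confidence: float = 0.0                # 0.0 to 1.0, higher is more certain


def extract_cited_urls(text: str) -> list[str]:
    """Collect unique URLs in order of first appearance."""
    seen: list[str] = []
    for url in URL_PATTERN.findall(text):
        url = url.rstrip(".,;")            # drop trailing sentence punctuation
        if url not in seen:
            seen.append(url)
    return seen


def needs_extra_attention(draft: PackDraft, threshold: float = 0.6) -> bool:
    """Hypothetical rule: low-confidence drafts get flagged for the curator."""
    return draft.confidence < threshold


text = "Fee cap updated, see https://example.org/notice. Mirror: https://example.org/notice"
draft = PackDraft(
    payload_json=json.dumps({"summary": "Fee cap updated"}),
    cited_urls=extract_cited_urls(text),
    confidence=0.4,
)
```

Because the draft is optional, a curator can discard it entirely and write the manifest payload by hand.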

Job 03 · Outbound solicitation

Draft the emails curators send to subscribed experts.

Subscribers to the expert-request mailing list get periodic asks like "do you have a recent fee cap for this corridor?". The automation drafts those emails so curators only have to review and send. The draft names the topic, asks one clear question, and includes a one-click reply link.

  • One topic per email. No marketing tone, no obligation.
  • Curator reviews and edits before send. Nothing auto-sends.
  • Reply links route through the inbound-email vetting loop.
  • Volume cap: ~one email per subscriber per month.

Job 04 · Inbound vetting

Process email replies through the same boundary.

When a subscriber replies, the email gateway POSTs the body to the hub. The automation runs the same PII + safety scan as a website submission, classifies the intent (verification, new information, regulatory change, or unclear), and extracts whatever public-source facts it can. The curator sees the structured result, not raw email.

  • Auto-rejects auto-replies and unsubscribe messages.
  • Same rejection rules as the website form. No special path.
  • Extracted facts become a candidate pack diff for review.
  • Sender address never enters a public pack.
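
The four intent classes and the auto-reply check could look roughly like this. The intent names follow the text; the `Intent` enum, the marker list, and the `is_auto_reply_or_unsubscribe` function are hypothetical, and the real detection (which may well use the model or mail headers) is not described in the source.

```python
from enum import Enum


class Intent(str, Enum):
    # The four classes named in the text.
    VERIFICATION = "verification"
    NEW_INFORMATION = "new_information"
    REGULATORY_CHANGE = "regulatory_change"
    UNCLEAR = "unclear"


# Assumed subject-line markers for automatic replies.
AUTO_REPLY_MARKERS = ("automatic reply", "auto-reply", "autoreply", "out of office")
# Assumed one-word bodies that mean "take me off the list".
UNSUBSCRIBE_WORDS = {"unsubscribe", "stop"}


def is_auto_reply_or_unsubscribe(subject: str, body: str) -> bool:
    """Cheap pre-check so these messages never reach the curator queue."""
    subject_l = subject.strip().lower()
    body_l = body.strip().lower()
    if any(marker in subject_l for marker in AUTO_REPLY_MARKERS):
        return True
    return subject_l in UNSUBSCRIBE_WORDS or body_l in UNSUBSCRIBE_WORDS
```

A message that passes this check still goes through the same PII and safety scan as a website submission before any intent is assigned.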

02 · Submission flow

Same five steps, regardless of where the proposal came from.

Whether a submission landed via the website form, the public-source scraper finding a regulator update, or an inbound email reply, it follows the same path before anything ships.

Step 01

Intake

Form POST, scraped public source, or inbound email payload arrives at the hub.

Step 02 · boundary

Edge regex filter

Cheap and deterministic. Rejects email addresses, phone numbers, ID-like strings, and free text over the size cap before any LLM call runs.
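
A minimal sketch of such an edge filter follows. The patterns, the 4000-character cap, and the reason strings are all assumptions for illustration; the hub's real patterns and size limit are not given on this page.

```python
import re

# Assumed patterns; deliberately simple and easy to audit.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{7,}\d")   # 9+ digit-like characters
ID_LIKE_RE = re.compile(r"\b[A-Z]{1,3}\d{6,}\b")  # e.g. a letter prefix plus digits
MAX_CHARS = 4000                                   # hypothetical size cap


def edge_filter(text: str) -> list[str]:
    """Return rejection reasons; an empty list means the text may proceed."""
    reasons = []
    if len(text) > MAX_CHARS:
        reasons.append("over_size_cap")
    if EMAIL_RE.search(text):
        reasons.append("email_address")
    if PHONE_RE.search(text):
        reasons.append("phone_number")
    if ID_LIKE_RE.search(text):
        reasons.append("id_like_string")
    return reasons
```

Only text with no reasons would move on to the LLM evaluator, which keeps obviously sensitive input away from any third-party model call.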

Step 03 · Server automation

LLM evaluator

The automation classifies intent, scans for prompt injection and operational advice, surfaces PII the regex missed, and writes the structured pack draft.

Step 04

Curator review

A human curator reads the automation verdict, the original text, and the draft. They edit, accept, or reject. Their decision is the source of truth.

Step 05

Vetted pack release

The curator vets the pack, the manifest references the previous version (nothing is overwritten), and clients pull the new version on their next sync.

03 · Stakeholder email loop

How experts get asked, and how their replies come back.

The mailing list at /newsletter is the public side of this loop. The inbound-email endpoint is the reply side.

Step A

Curator queues a topic

"Need verification of the 2026 PHL placement fee cap." The automation drafts the email body, names the corridor, asks one question, and includes a reply link.

Step B

Curator reviews and sends

The draft is editable. Nothing sends without a human click. The send goes through the email provider; the hub never stores the recipient list.

Step C

Subscriber replies

The reply hits the email gateway. The provider sends the subject, body, and sender domain to POST /api/hub/automation/inbound-email on the hub.
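
The webhook body might be assembled on the gateway side along these lines. The JSON field names (`subject`, `body`, `sender_domain`) and the `build_inbound_payload` helper are assumptions; the actual schema of the endpoint is not documented here. The sketch reduces the sender to a domain before anything is sent, so the full address never reaches the hub.

```python
from email.utils import parseaddr


def build_inbound_payload(subject: str, body: str, sender_address: str) -> dict:
    """Hypothetical gateway-side helper: keep only the sender's domain."""
    _, address = parseaddr(sender_address)   # strips any display name
    # If no address can be parsed, send an empty domain rather than raw text.
    domain = address.rpartition("@")[2].lower() if "@" in address else ""
    return {
        "subject": subject.strip(),
        "body": body.strip(),
        "sender_domain": domain,
    }


payload = build_inbound_payload(
    "Re: fee cap ",
    "The cap is unchanged.\n",
    "Jane Doe <jane@Labor.Example.org>",
)
```

The resulting dictionary would be serialized as JSON and sent to the inbound-email endpoint by whatever HTTP client the email provider uses.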

Step D · boundary

Server automation vets the reply

Same rules as a website submission. Auto-replies are dropped. PII triggers reject. Intent gets classified. Public-source facts get extracted.

Step E

Curator reviews structured result

The curator sees the automation summary and the extracted facts, not the raw email. If accepted, the result becomes a candidate pack diff that re-enters the standard review pipeline above.

04 · Which model runs the server automation

Could be Gemma 4. Currently a cloud API.

The automation could run on a self-hosted Gemma 4 instance for screening, PII review, and the templating step. For this submission it routes to a cloud API to keep the public hub CPU-only and avoid provisioning a GPU for a demo.

Provider configured by env var. First one set wins.

The module resolves a provider at request time, makes a single synchronous call, and falls back to a deterministic regex-only verdict if no provider is reachable. The fallback is conservative: anything ambiguous routes to needs_curator_review. The hub never blocks because a model is down.
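
The fallback behavior described above can be sketched like this. The function name `vet_with_fallback`, the dictionary shape, and the rule that regex hits reject while everything else goes to review are assumptions; the only properties taken from the text are that a failed provider never blocks the hub and that the fallback never accepts on its own.

```python
from typing import Callable, Optional


def vet_with_fallback(text: str,
                      call_model: Optional[Callable[[str], dict]],
                      regex_hits: list[str]) -> dict:
    """Try the configured provider once; fall back to a regex-only verdict."""
    if call_model is None:
        failure = "no provider configured"
    else:
        try:
            result = call_model(text)      # single synchronous call
            return {
                "verdict": result["verdict"],
                "reasons": list(result.get("reasons", [])),
                "model": result.get("model", "unknown"),
            }
        except Exception as exc:           # timeout, bad response, network error
            failure = f"provider call failed: {type(exc).__name__}"

    # Regex-only path: never "accept"; a human always sees the record.
    verdict = "reject" if regex_hits else "needs_curator_review"
    return {"verdict": verdict, "reasons": [failure, *regex_hits], "model": "regex-only"}
```

Recording the failure as the first reason lets a curator see at a glance that the verdict came from the fallback rather than a model.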

Switching providers is one env var. The interface the module uses is OpenAI-chat-compatible across all four supported routes.

OpenRouter · DUECARE_AUTOMATION_OPENROUTER_KEY · any chat model on OpenRouter
Mistral API · DUECARE_AUTOMATION_MISTRAL_KEY · default mistral-small-latest
OpenAI · DUECARE_AUTOMATION_OPENAI_KEY · default gpt-4o-mini
Self-hosted Gemma 4 · DUECARE_AUTOMATION_OLLAMA_BASE_URL · default model gemma2:2b (swap to a Gemma 4 tag once available)
None · Regex-only fallback. Every record routes to a curator. No silent acceptance.
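
"First one set wins" can be illustrated with a short resolver. The environment variable names are the ones listed above; the `resolve_provider` function, the short provider labels, and the precedence order (taken from the order of the list) are assumptions.

```python
import os
from typing import Mapping, Optional

# Checked top to bottom; the order mirrors the list above (assumed precedence).
PROVIDER_ENV_VARS = (
    ("openrouter", "DUECARE_AUTOMATION_OPENROUTER_KEY"),
    ("mistral", "DUECARE_AUTOMATION_MISTRAL_KEY"),
    ("openai", "DUECARE_AUTOMATION_OPENAI_KEY"),
    ("ollama", "DUECARE_AUTOMATION_OLLAMA_BASE_URL"),
)


def resolve_provider(env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return the first provider whose variable is set, or None for regex-only."""
    env = os.environ if env is None else env
    for name, var in PROVIDER_ENV_VARS:
        if env.get(var, "").strip():       # empty or whitespace counts as unset
            return name
    return None
```

Calling `resolve_provider()` with no argument reads the process environment at request time; passing a plain dictionary makes the precedence easy to test.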

05 · API surface

Two endpoints call the automation today.

Anything the automation decides is logged with the model name and the verdict reasons so curators can audit and replay.

METHOD · PATH · PURPOSE · CALLS AUTOMATION
POST · /api/hub/opencrawl/updates · Public-source proposal intake. Used by the website form and the public-source scraper. · yes
POST · /api/hub/automation/inbound-email · Email-gateway webhook for replies to expert-request emails. · yes
POST · /api/hub/signals · Anonymized usage signals. Edge regex only; no LLM call. · no
GET · /api/hub/opencrawl/updates · List the queue of proposals (curator UI reads this). · no

The automation has hard limits on purpose.

It never publishes. It never auto-sends email. It never accepts raw worker case content. Every action it takes is logged with the model name and reasons so a curator (or you, reviewing months later) can replay why a proposal was accepted, rejected, or routed for review.
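
An audit entry that supports this kind of replay might look like the sketch below. The model name, verdict, and reasons are the fields the text says are logged; the `audit_record` function, the `proposal_id` and `logged_at` fields, and the JSON-lines format are assumptions.

```python
import json
from datetime import datetime, timezone
from typing import Optional


def audit_record(proposal_id: str, model: str, verdict: str,
                 reasons: list[str], now: Optional[datetime] = None) -> str:
    """Serialize one automation decision as a single JSON line."""
    now = now or datetime.now(timezone.utc)
    return json.dumps(
        {
            "logged_at": now.isoformat(),   # timezone-aware timestamp
            "proposal_id": proposal_id,     # hypothetical identifier
            "model": model,                 # which model (or "regex-only") decided
            "verdict": verdict,
            "reasons": list(reasons),
        },
        sort_keys=True,
    )


line = audit_record(
    "p-1",
    "mistral-small-latest",
    "needs_curator_review",
    ["ambiguous content"],
    now=datetime(2026, 1, 2, tzinfo=timezone.utc),
)
```

One line per decision keeps the log append-only and easy to filter by model or verdict when reviewing months later.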
