DEFI@home — contribute an assessment

DEFI@home assessments are produced by contributors, not crawlers. You run a pinned prompt through the LLM of your choice (Claude, ChatGPT, Gemini, etc.) and submit the JSON output as a pull request. A quorum bot merges your submission once at least 3 independent runs from different models agree — that agreement is what turns a grade into a published claim.

The flow

Open any protocol's page and find the "Audit a dimension yourself · DEFI@home" section under the Risk analysis cards. Click "Copy prompt" on one slice, or use "Audit all" to run the five risk dimensions in one LLM session.
The prompt includes the snapshot timestamp, the protocol's chains, GitHub repos, audit links, and the current prompt_version already pinned in. Paste it into Claude, ChatGPT, Gemini, or any LLM with browsing / tool use.
A slice prompt returns a single JSON object matching the slice-assessment schema. "Audit all" returns a single JSON array containing exactly five normal slice objects. The schema file is named v2 but is forward-compatible: prompts emit schema_version: 4 today, and the validator also accepts older supported versions. The output must contain no markdown fences other than the single JSON code block and no prose outside the JSON.
Click "Submit run ↗" on the same row. It opens GitHub's new-file interface pre-pointed at data/submissions/<slug>/<slice>/ for one slice or data/submissions/<slug>/all/ for the combined array. Paste your JSON, commit, and open the PR.
CI validates your JSON against the schema. The quorum bot then compares it against other submissions; when grades and citations overlap enough to reach consensus, the accepted assessment lands in data/assessments/, which the site reads on its next build.
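
Before pasting into GitHub, a quick local shape check can catch the most common mistakes (stray fences, prose around the JSON, a combined array with the wrong length). The sketch below is not the project's validator; check_shape and its target argument are hypothetical names, and it checks only the object/array shape described above, not the schema itself.

```python
import json

RISK_SLICES = 5  # "Audit all" returns exactly five slice objects


def check_shape(text: str, target: str) -> int:
    """Parse pasted JSON and return how many slice objects it holds.

    target is "slice" for data/submissions/<slug>/<slice>/ or
    "all" for data/submissions/<slug>/all/ (hypothetical argument).
    """
    # json.loads raises a ValueError subclass if fences or prose surround the JSON.
    payload = json.loads(text)
    items = payload if isinstance(payload, list) else [payload]
    if not items or not all(isinstance(item, dict) for item in items):
        raise ValueError("expected a JSON object or a non-empty array of objects")
    if target == "all":
        if not isinstance(payload, list) or len(items) != RISK_SLICES:
            raise ValueError("the all directory takes one array of exactly five objects")
    elif target != "slice":
        raise ValueError(f"unknown target: {target!r}")
    return len(items)
```

Running this on the raw LLM output first means CI only has to reject genuine schema problems, not formatting slips.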

Why the prompt is pinned

The prompt you see on the protocol page bakes in the DeFiLlama snapshot timestamp, analysis date, and prompt version. Pinning these fixes the evidence set, so consensus across re-runs is meaningful and re-verifiable rather than dependent on any single LLM being deterministic — a contributor running the same prompt a week from now works from the same inputs.

The current pin is PROMPT_VERSION = 29. Older versions still merge, just at a lower weight (see the weighting table).

Every submission records the model used (claude-opus-4-7, gpt-5, etc.) and the commit SHA of any repo it cites. A reviewer can re-open the same PR two months later and re-verify every citation.

What counts as evidence

Public block explorers (Etherscan, Basescan, Arbiscan, etc.) for addresses in the protocol's contract set. Slices that touch on-chain state require at least one block-explorer URL.
Commits in the protocol's linked GitHub repositories, cited with a commit SHA.
Audit PDFs or reports linked from DeFiLlama or the protocol's docs.
DeFiLlama's pinned fields — but only for category / chain lists, never for risk assessment.

Anything outside that list (Twitter threads, Medium posts, Discord screenshots) does not count. The review is a PR against a git repo: if a reviewer can't independently re-fetch the URL later, the evidence isn't evidence. Adding fetched_at timestamps to evidence entries earns a small weight bonus, since they make later re-verification cheaper.
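
A contributor can pre-check the block-explorer requirement with a few lines of Python. This is a minimal sketch: the host list is an illustrative, incomplete assumption (the guide only names the explorers, not the hosts the validator accepts), and the function names are hypothetical.

```python
from urllib.parse import urlparse

# Assumption: illustrative, incomplete host list; the real validator may differ.
EXPLORER_HOSTS = {"etherscan.io", "basescan.org", "arbiscan.io"}


def is_explorer_url(url: str) -> bool:
    """True if the URL's host is a listed explorer or one of its subdomains."""
    host = (urlparse(url).hostname or "").lower()
    return any(host == h or host.endswith("." + h) for h in EXPLORER_HOSTS)


def has_explorer_evidence(urls: list[str]) -> bool:
    """True if at least one evidence URL points at a block explorer."""
    return any(is_explorer_url(u) for u in urls)
```

Matching on the parsed hostname rather than a substring keeps a Medium post that merely mentions an explorer in its path from passing the check.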

When to say `unknown`

If you cannot find a signal after checking the sources above, submit grade: "unknown" with at least one entry in unknowns[] describing what you looked for, prefixed with the checklist code. "C3: couldn't find the timelock delay on the proxy admin" is useful to the next contributor; a guessed grade is a defect. Listing unknowns[] alongside a non-unknown grade is also valued — it earns a small bonus because it shows you completed the checklist honestly rather than papering over gaps.
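
The rule above can be checked mechanically before submitting. The sketch below assumes that unknowns[] entries are plain strings and that checklist codes look like "C3:" (one capital letter, digits, a colon); both are inferred from the example in this section, not taken from the schema. The function name is hypothetical.

```python
import re

# Assumption: checklist codes look like "C3:", as in the example above.
CHECKLIST_PREFIX = re.compile(r"^[A-Z]\d+:\s*\S")


def unknowns_problems(assessment: dict) -> list[str]:
    """Return human-readable problems with grade / unknowns[] consistency."""
    problems = []
    unknowns = assessment.get("unknowns") or []
    if assessment.get("grade") == "unknown" and not unknowns:
        problems.append('grade "unknown" needs at least one unknowns[] entry')
    for entry in unknowns:
        if not isinstance(entry, str) or not CHECKLIST_PREFIX.match(entry):
            problems.append(f"unknowns entry lacks a checklist-code prefix: {entry!r}")
    return problems
```

An empty list means the submission satisfies the rule as sketched here; the authoritative check is still the CI schema validation.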

Sharing your conversation

The chat_url field in the JSON output is for a publicly-readable share URL of the conversation that produced the assessment. The quorum bot gives extra weight to submissions where the reasoning chain is independently re-readable.

Important: the default share link your LLM may offer is usually the private kind, requiring viewers to be logged into the same account. You need to explicitly enable public sharing. Per-platform:

Claude (claude.ai): Click the share icon at the top of the conversation → toggle "Share publicly" → copy the resulting https://claude.ai/share/... URL.
ChatGPT: Click "Share" → "Create public link" → copy the https://chatgpt.com/share/... URL. ("Anyone with the link" is the right setting.)
Gemini (gemini.google.com): Click the three-dot menu on the response → "Share & export" → "Create public link" → copy the URL.
Local LLMs / API: No public-share option exists. Leave chat_url as null; the submission is still accepted, just without the public-share weight bonus.

Paste the public URL into the chat_url field of your JSON before opening the PR. The prompt explicitly tells the LLM to leave this field as null — only you can produce a public-share URL, since it requires a user-side toggle the LLM cannot perform.

Batch submissions

A slice-directory submission file can hold either a single JSON object or an array of objects with the same slug and slice but different model values. This is useful when one contributor runs the same prompt through several models in one sitting: each entry in the array is scored independently and counts toward quorum separately. The all directory is reserved for "Audit all": one array with exactly five objects, one for each risk slice. The naming convention for batch files is models-<date>.json.
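
The batch constraint (same slug and slice, different models) is easy to verify before committing. In this minimal sketch the key names slug, slice, and model are assumed to be the JSON field names, and check_batch is a hypothetical helper, not part of the pipeline.

```python
def check_batch(entries: list[dict]) -> list[str]:
    """Return problems found in a batch array meant for one slice directory."""
    problems = []
    # Every entry must target the same (slug, slice) pair.
    if len({(e.get("slug"), e.get("slice")) for e in entries}) > 1:
        problems.append("entries must share one slug and one slice")
    # Every entry must come from a different model.
    models = [str(e.get("model")) for e in entries]
    duplicates = sorted({m for m in models if models.count(m) > 1})
    if duplicates:
        problems.append("duplicate model values: " + ", ".join(duplicates))
    return problems
```

An empty result means the array satisfies the two constraints stated above; each entry still has to pass schema validation on its own.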

Quorum and autorun

Three GitHub Actions run the semi-automated pipeline:

validate-submission — every submission PR is schema-checked, format-cleaned (a markdown-wrapped URL such as [https://example.com](https://example.com) is unwrapped to the bare URL when the link text and the link target are identical), and labeled. Failures block the PR with a structured comment.
quorum — runs daily at 06:00 UTC and after every submission push to main. Computes consensus per (slug, slice): strong = weight share ≥60% with ≥3 submissions, weak = weight share ≥50% with ≥2 submissions. On consensus, opens a PR into data/assessments/. The first submission for a slug lazily opens a persistent aggregation issue (one issue per protocol, not per slice) where per-slice consensus and dissent are tracked as comments.
autorun (third voice) — runs Mondays at 04:00 UTC, picks (slug, slice) pairs stuck at 1–2 submissions ordered by TVL, and runs the same pinned prompt through the Anthropic API. Default model is claude-sonnet-4-6. Submissions get the model name suffixed with (autorun) so their weight is transparent.
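
The quorum thresholds can be sketched as a small function over (grade, weight) pairs for one (slug, slice). This is an illustration, not the bot's code: it assumes the submission count means all submissions for the pair rather than only those backing the winning grade, and it breaks ties by first appearance. The function name is hypothetical.

```python
from collections import defaultdict
from typing import Optional


def consensus(submissions: list[tuple[str, float]]) -> tuple[Optional[str], str]:
    """Return (winning grade or None, "strong" | "weak" | "none")."""
    totals: dict[str, float] = defaultdict(float)
    for grade, weight in submissions:
        totals[grade] += weight
    if not totals:
        return None, "none"
    winner = max(totals, key=totals.get)  # grade with the largest total weight
    share = totals[winner] / sum(totals.values())
    count = len(submissions)  # assumption: counts every submission for the pair
    if share >= 0.60 and count >= 3:
        return winner, "strong"
    if share >= 0.50 and count >= 2:
        return winner, "weak"
    return None, "none"
```

Under these assumptions, three equally weighted submissions split two to one give the majority grade a weight share of about 67% and strong consensus, while a single submission never reaches quorum regardless of its weight.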

How submissions are weighted

Every submission starts from a base weight of 1.0, which is then adjusted by the factors below. The total weight per grade decides which grade wins quorum. The highest-weight submission for the winning grade becomes the canonical rationale in the merged assessment.

Factor	Adjustment
Base	1.0
Public `chat_url` share link	+0.3
Block-explorer URLs in evidence	+0.1 each, capped at +0.3
`fetched_at` on evidence entries	+0.05 each, capped at +0.2
Non-empty `unknowns[]` with a non-unknown grade	+0.15
Autorun submission (model suffixed `(autorun)`)	+0.2
Older `prompt_version`	−0.2 per version behind current
Snapshot mismatch (different `snapshot_generated_at` than current)	−0.1
Non-thinking model (no `thinking` in name and not opus / o-series / gemini-3-pro)	×0.2 (5× penalty)
Hallucination-prone model (claude-haiku-4-5, gemini-3-flash-preview, gpt ≤ 5.3)	×0.05 (20× penalty)
Floor	weight ≥ 0.1 (≥ 0.02 non-thinking, ≥ 0.0025 hallucination-prone)
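
The table can be read as a small calculation. The sketch below is one plausible reading, not the bot's implementation: it assumes additive adjustments are applied first, multipliers second, and the tier-specific floor last, and that both multipliers apply if a model falls into both penalty tiers. The Factors fields are hypothetical inputs; how the bot derives them from a submission is not shown here.

```python
from dataclasses import dataclass


@dataclass
class Factors:
    public_chat_url: bool = False
    explorer_urls: int = 0          # block-explorer URLs in evidence
    fetched_at_entries: int = 0     # evidence entries carrying fetched_at
    unknowns_with_grade: bool = False
    autorun: bool = False
    versions_behind: int = 0        # current prompt_version minus submitted one
    snapshot_mismatch: bool = False
    non_thinking: bool = False
    hallucination_prone: bool = False


def submission_weight(f: Factors) -> float:
    weight = 1.0  # base
    if f.public_chat_url:
        weight += 0.3
    weight += min(0.1 * f.explorer_urls, 0.3)
    weight += min(0.05 * f.fetched_at_entries, 0.2)
    if f.unknowns_with_grade:
        weight += 0.15
    if f.autorun:
        weight += 0.2
    weight -= 0.2 * f.versions_behind
    if f.snapshot_mismatch:
        weight -= 0.1
    # Assumption: multipliers come after the additive terms, floor comes last.
    floor = 0.1
    if f.non_thinking:
        weight *= 0.2
        floor = 0.02
    if f.hallucination_prone:
        weight *= 0.05
        floor = 0.0025
    return max(weight, floor)
```

Under these assumptions, a submission with a public share link, two block-explorer URLs, and a non-empty unknowns[] alongside a non-unknown grade scores 1.0 + 0.3 + 0.2 + 0.15 = 1.65, and the same submission from a non-thinking model scores 0.33.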

Reconcile — the master file

After quorum merges per-slice assessments, a fourth scheduled action — reconcile, Mondays at 06:00 UTC — runs Claude Sonnet over the merged assessments plus the raw submissions and writes a synthesized verdict to data/master/<slug>.json. This feeds the protocol detail page's narrative: findings, steel-man arguments per grade, verdict, noted dissent, and any flags. Reconcile does not re-grade — it consolidates what the quorum already decided into prose. If your submission lost quorum, your dissenting view still surfaces here.