DEFI@home — contribute an assessment
DeFiPunk'd does not run crawlers. Contributors assess protocols by running a pinned prompt through an LLM of their choice (Claude, ChatGPT, Gemini, etc.) and submitting the JSON output as a pull request. A quorum bot merges your submission once at least 3 independent runs from different models agree.
The flow
- Open any protocol's page and find the Audit a dimension yourself · DEFI@home section under the Risk analysis cards. Click Copy prompt on one slice, or use Audit all to run the five risk dimensions in one LLM session.
- The prompt includes the snapshot timestamp, the protocol's chains, GitHub repos, audit links, and the current prompt_version already pinned in. Paste it into Claude, ChatGPT, Gemini, or any LLM with browsing / tool use.
- A slice prompt returns a single JSON object matching the slice-assessment schema. Audit all returns a single JSON array containing exactly five normal slice objects. The schema file is named v2 but is forward-compatible: prompts emit schema_version: 4 today and the validator accepts older supported versions too. No markdown fences outside the single JSON code block, no prose outside the JSON.
- Click Submit run ↗ on the same row. It opens GitHub's new-file interface pre-pointed at data/submissions/<slug>/<slice>/ for one slice or data/submissions/<slug>/all/ for the combined array. Paste your JSON, commit, open the PR.
- CI validates your JSON against the schema. The quorum bot compares it against other submissions; overlapping grades + citations land an accepted assessment into data/assessments/, which the site reads on next build.
Why the prompt is pinned
The prompt you see on the protocol page has the DeFiLlama snapshot timestamp, analysis date, and prompt version baked in, so a contributor running the same prompt a week from now operates on the same inputs. The workflow's determinism comes from consensus across re-runs, not from the LLM being deterministic.
The current pin is PROMPT_VERSION = 29. Older versions still merge, just at a lower weight (see the weighting table).
Every submission records the model used (claude-opus-4-7, gpt-5, etc.) and the commit SHA of any repo it cites. Reviewers can re-open the same PR two months later and re-verify every citation.
What counts as evidence
- Public block explorers (Etherscan, Basescan, Arbiscan, etc.) for addresses in the protocol's contract set. Slices that touch on-chain state require at least one block-explorer URL.
- Commits in the protocol's linked GitHub repositories, cited with a commit SHA.
- Audit PDFs or reports linked from DeFiLlama or the protocol's docs.
- DeFiLlama's pinned fields — but only for category / chain lists, never for risk assessment.
Anything outside that list (Twitter threads, Medium posts, Discord screenshots) does not count. The review is a PR against a git repo; if a reviewer can't independently re-fetch the URL later, the evidence isn't evidence. Adding fetched_at timestamps to evidence entries earns a small weight bonus, since they make later re-verification cheaper.
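The block-explorer requirement can be checked mechanically before opening a PR. The sketch below assumes evidence is available as a plain list of URL strings, and the host list is illustrative only: it covers the three explorers named above, while the real validator may recognise more. The function names are hypothetical.

```python
from urllib.parse import urlparse

# Assumption: illustrative host list; the validator's real list may differ.
EXPLORER_HOSTS = {"etherscan.io", "basescan.org", "arbiscan.io"}


def is_block_explorer_url(url: str) -> bool:
    """True if the URL's host is a listed explorer or a subdomain of one."""
    host = (urlparse(url).hostname or "").lower()
    return any(host == h or host.endswith("." + h) for h in EXPLORER_HOSTS)


def has_explorer_evidence(urls: list) -> bool:
    """True if at least one evidence URL points at a block explorer."""
    return any(is_block_explorer_url(u) for u in urls)
```

Matching on the parsed hostname rather than a substring avoids accepting look-alike domains that merely contain an explorer's name.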
When to say unknown
If you cannot find a signal after checking the sources above, submit grade: "unknown" with at least one entry in unknowns[] describing what you looked for, prefixed with the checklist code. "C3: couldn't find the timelock delay on the proxy admin" is useful to the next contributor; a guessed grade is a defect. Listing unknowns[] alongside a non-unknown grade is also valued — it earns a small bonus because it shows you completed the checklist honestly rather than papering over gaps.
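A quick local check for this rule might look like the following. It is a sketch under two assumptions: that unknowns[] entries are plain strings, and that checklist codes follow the shape of the "C3:" example (one capital letter, digits, then a colon). The function name is hypothetical and the real validator may be stricter or looser.

```python
import re

# Assumption: checklist codes look like "C3:" as in the example above.
CHECKLIST_PREFIX = re.compile(r"^[A-Z]\d+:\s*\S")


def check_unknowns(assessment: dict) -> list:
    """Return a list of problems; an empty list means the rule is satisfied."""
    problems = []
    unknowns = assessment.get("unknowns") or []
    # An "unknown" grade must explain what was searched for.
    if assessment.get("grade") == "unknown" and not unknowns:
        problems.append('grade "unknown" needs at least one unknowns[] entry')
    # Every entry, whatever the grade, should start with a checklist code.
    for i, entry in enumerate(unknowns):
        if not isinstance(entry, str) or not CHECKLIST_PREFIX.match(entry):
            problems.append(f"unknowns[{i}] is missing a checklist-code prefix")
    return problems
```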
Sharing your chat (recommended)
The chat_url field in the JSON output is for a publicly-readable share URL of the conversation that produced the assessment. The quorum bot gives extra weight to submissions where the reasoning chain is independently re-readable.
Important: the default share link your LLM may offer is usually the private kind, requiring viewers to be logged into the same account. You need to explicitly enable public sharing. Per-platform:
- Claude (claude.ai): Click the share icon at the top of the conversation → toggle "Share publicly" → copy the resulting https://claude.ai/share/... URL.
- ChatGPT: Click "Share" → "Create public link" → copy the https://chatgpt.com/share/... URL. ("Anyone with the link" is the right setting.)
- Gemini (gemini.google.com): Click the three-dot menu on the response → "Share & export" → "Create public link" → copy the URL.
- Local LLMs / API: No public-share option exists. Leave chat_url as null; the submission is still accepted, just without the public-share weight bonus.
Paste the public URL into the chat_url field of your JSON before opening the PR. The prompt explicitly tells the LLM to leave this field as null — only you can produce a public-share URL, since it requires a user-side toggle the LLM cannot perform.
Batch submissions
A slice-directory submission file can hold either a single JSON object or an array of objects with the same slug and slice but different model values. Useful when one contributor runs the same prompt through several models in one sitting — each entry in the array is scored independently and counts toward quorum separately. The all directory is reserved for Audit all: one array with exactly five objects, one for each risk slice. Naming convention for batch files: models-<date>.json.
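The three accepted file shapes (single object, same-slice batch, five-slice Audit all array) map to two directories. The sketch below shows one way to decide where a parsed payload belongs. It assumes each object carries slug, slice, and model keys as described above; the function name and error messages are hypothetical, and the real validator may enforce more.

```python
def target_directory(payload) -> str:
    """Pick the submissions directory for a parsed JSON payload."""
    entries = payload if isinstance(payload, list) else [payload]
    if not entries:
        raise ValueError("empty submission")
    slugs = {e["slug"] for e in entries}
    if len(slugs) != 1:
        raise ValueError("all entries must share one slug")
    slug = slugs.pop()
    slices = {e["slice"] for e in entries}
    if len(slices) == 1:
        # Single object, or a batch of several models for one slice.
        models = [e["model"] for e in entries]
        if len(set(models)) != len(models):
            raise ValueError("batch entries must use different models")
        return f"data/submissions/{slug}/{slices.pop()}/"
    if len(entries) == 5 and len(slices) == 5:
        # Audit all: exactly five objects, one per risk slice.
        return f"data/submissions/{slug}/all/"
    raise ValueError("mixed slices are only valid as a five-slice Audit all array")
```

Raising on anything in between (for example, two different slices in one array) catches the most likely mistake: pasting a partial Audit all result into a slice directory.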
Quorum and autorun
Three GitHub Actions run the semi-automated pipeline:
- validate-submission: every submission PR is schema-checked, format-cleaned (markdown-wrapped URLs auto-stripped where inner == outer), and labeled. Failures block the PR with a structured comment.
- quorum: runs daily at 06:00 UTC and after every submission push to main. Computes consensus per (slug, slice): strong = weight share ≥60% with ≥3 submissions, weak = weight share ≥50% with ≥2 submissions. On consensus, opens a PR into data/assessments/. The first submission for a slug lazily opens a persistent aggregation issue (one issue per protocol, not per slice) where per-slice consensus and dissent are tracked as comments.
- autorun (third voice): runs Mondays at 04:00 UTC, picks (slug, slice) pairs stuck at 1–2 submissions ordered by TVL, and runs the same pinned prompt through the Anthropic API. Default model is claude-sonnet-4-6. Submissions get the model name suffixed with (autorun) so their weight is transparent.
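The strong and weak thresholds can be illustrated with a small function. This is a sketch, not the bot's code: it assumes the submission count is the total for the (slug, slice) pair rather than the number backing the winning grade, and that weights have already been computed. The function name is hypothetical.

```python
from collections import defaultdict


def consensus(submissions):
    """submissions: list of (grade, weight) pairs for one (slug, slice).

    Returns (grade, "strong"), (grade, "weak"), or None when no quorum exists.
    """
    if not submissions:
        return None
    totals = defaultdict(float)
    for grade, weight in submissions:
        totals[grade] += weight
    total_weight = sum(totals.values())
    if total_weight <= 0:
        return None
    # The grade holding the largest share of total weight is the candidate.
    grade, top = max(totals.items(), key=lambda kv: kv[1])
    share = top / total_weight
    count = len(submissions)  # assumption: all submissions for the pair
    if share >= 0.6 and count >= 3:
        return grade, "strong"
    if share >= 0.5 and count >= 2:
        return grade, "weak"
    return None
```

For example, three submissions weighted 1.0, 1.3, and 1.0, where the first two agree, give the agreed grade a share of 2.3 / 3.3 (about 70%), which is a strong consensus under these assumptions.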
How submissions are weighted
Every submission starts with a base weight of 1.0, which is then adjusted by the factors below. The total weight per grade decides which grade wins quorum. The highest-weight submission for the winning grade becomes the canonical rationale in the merged assessment.
| Factor | Adjustment |
|---|---|
| Base | 1.0 |
| Public chat_url share link | +0.3 |
| Block-explorer URLs in evidence | +0.1 each, capped at +0.3 |
| fetched_at on evidence entries | +0.05 each, capped at +0.2 |
| Non-empty unknowns[] with a non-unknown grade | +0.15 |
| Autorun submission (model suffixed (autorun)) | +0.2 |
| Older prompt_version | −0.2 per version behind current |
| Snapshot mismatch (different snapshot_generated_at than current) | −0.1 |
| Non-thinking model (no thinking in name and not opus / o-series / gemini-3-pro) | ×0.2 (5× penalty) |
| Hallucination-prone model (claude-haiku-4-5, gemini-3-flash-preview, gpt ≤ 5.3) | ×0.05 (20× penalty) |
| Floor | weight ≥ 0.1 (≥ 0.02 non-thinking, ≥ 0.0025 hallucination-prone) |
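
The table can be read as a small calculation. The sketch below is one plausible interpretation, not the bot's actual code. It assumes additive adjustments are applied first, then a single multiplier, then the floor for that model class, and it assumes the hallucination-prone class takes precedence when both model flags are set. The SubmissionFacts type and its field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SubmissionFacts:
    public_chat_url: bool = False
    explorer_urls: int = 0          # block-explorer URLs in evidence
    fetched_at_entries: int = 0     # evidence entries carrying fetched_at
    unknowns_with_grade: bool = False
    autorun: bool = False
    versions_behind: int = 0        # prompt versions behind the current pin
    snapshot_mismatch: bool = False
    non_thinking: bool = False
    hallucination_prone: bool = False


def submission_weight(f: SubmissionFacts) -> float:
    w = 1.0  # base
    if f.public_chat_url:
        w += 0.3
    w += min(0.1 * f.explorer_urls, 0.3)
    w += min(0.05 * f.fetched_at_entries, 0.2)
    if f.unknowns_with_grade:
        w += 0.15
    if f.autorun:
        w += 0.2
    w -= 0.2 * f.versions_behind
    if f.snapshot_mismatch:
        w -= 0.1
    # Assumption: multiplier after additive terms, then the class floor.
    if f.hallucination_prone:
        return max(w * 0.05, 0.0025)
    if f.non_thinking:
        return max(w * 0.2, 0.02)
    return max(w, 0.1)
```

Under these assumptions, a submission with a public chat link, four explorer URLs, and two fetched_at entries scores 1.0 + 0.3 + 0.3 + 0.1 = 1.7, since the explorer bonus is capped at +0.3.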
Reconcile — the master file
After quorum merges per-slice assessments, a fourth scheduled action — reconcile, Mondays at 06:00 UTC — runs Claude Sonnet over the merged assessments plus the raw submissions and writes a synthesized verdict to data/master/<slug>.json. This is what feeds the protocol detail page's narrative: findings, steel-man arguments per grade, verdict, noted dissent, and any flags. Reconcile does not re-grade — it consolidates what the quorum already decided into prose. If your submission lost quorum, your dissenting view still surfaces here.