realvuln v1.0
02 · Dataset

The corpus

RealVuln favors a smaller set of findings with audit-grade label quality over scale. Studies have found that more than 40% of entries in popular auto-labeled datasets are mislabeled; every entry here was reviewed by hand and ships with its evidence.

2.1 At a glance

Ground truth: 817 hand-labeled findings (697 real vulnerabilities and 120 false-positive traps) across 18 CWE families.

Frameworks (26 repos)
Flask: 16
Django: 3
FastAPI: 3
aiohttp: 1
Tornado: 1
custom: 3

Provenance: 100% human-authored, with high confidence. All projects predate widespread LLM code generation, which helps mitigate data-contamination concerns when evaluating LLM scanners.

2.2 Ground-truth schema

Each target repository carries a manifest pinned to a commit SHA. An is_vulnerable: false entry is a false-positive trap; acceptable_cwes absorbs reasonable CWE ambiguity.

ground-truth.json — example entries
{
  "schema_version": "1.0",
  "repo_id": "realvuln-pygoat",
  "repo_url": "https://github.com/adeyosemanputra/pygoat",
  "commit_sha": "a1b2c3…",          // pinned: prevents ground-truth drift
  "type": 1, "language": "python", "framework": "django",
  "authorship": "human_authored",
  "authorship_confidence": "high",
  "authorship_evidence": "pre-LLM project, established 2018",
  "findings": [
    {
      "id": "pygoat-014", "is_vulnerable": true,
      "vulnerability_class": "sql_injection",
      "primary_cwe": "CWE-89",
      "acceptable_cwes": ["CWE-89", "CWE-564", "CWE-943"],
      "file": "introduction/views.py",
      "location": { "start_line": 42, "end_line": 48, "function": "sql_lab" },
      "severity": "high",
      "evidence": { "source": "manual_review", "cve_id": null,
        "description": "SQL injection via unsanitized parameter" }
    },
    {
      "id": "pygoat-fp-003", "is_vulnerable": false,   // false-positive trap
      "vulnerability_class": "sql_injection",
      "primary_cwe": "CWE-89",
      "evidence": { "source": "manual_review",
        "description": "ORM filter() — auto-parameterized, safe" }
    }
  ]
}
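
The snippet below is a minimal sketch of how a scanner report might be compared against one ground-truth entry. The matching rule (same file, reported line inside the labeled span, reported CWE in acceptable_cwes) is an assumption made for illustration, as are the function names matches and classify; the scoring procedure RealVuln itself uses is not described in this section. The sketch only matches entries that carry file and location fields.

```python
from typing import Optional

# Ground-truth entry copied from the example manifest above (fields trimmed).
ENTRY = {
    "id": "pygoat-014",
    "is_vulnerable": True,
    "primary_cwe": "CWE-89",
    "acceptable_cwes": ["CWE-89", "CWE-564", "CWE-943"],
    "file": "introduction/views.py",
    "location": {"start_line": 42, "end_line": 48, "function": "sql_lab"},
}


def matches(entry: dict, file: str, line: int, cwe: str) -> bool:
    """Assumed rule: same file, line inside the labeled span, CWE accepted."""
    loc = entry.get("location")
    if loc is None or entry.get("file") != file:
        return False
    # Fall back to the primary CWE when no acceptable_cwes list is given.
    accepted = entry.get("acceptable_cwes") or [entry["primary_cwe"]]
    in_span = loc["start_line"] <= line <= loc["end_line"]
    return in_span and cwe in accepted


def classify(entry: dict, file: str, line: int, cwe: str) -> Optional[str]:
    """Return 'true_positive', 'false_positive', or None when nothing matches."""
    if not matches(entry, file, line, cwe):
        return None
    # A report that lands on a trap (is_vulnerable: false) is a false positive.
    return "true_positive" if entry["is_vulnerable"] else "false_positive"


print(classify(ENTRY, "introduction/views.py", 45, "CWE-564"))  # true_positive
```

Under this assumed rule, a scanner that reports CWE-564 instead of CWE-89 for the same span still gets credit, which is the ambiguity acceptable_cwes is meant to absorb.
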

2.3 Repositories

The ground-truth corpus holds 26 repositories and 817 findings (697 vulnerabilities, 120 traps), all scored on the current leaderboard. The full corpus is listed below.

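| Repository | Framework | Vulns | FP traps |
|---|---|---|---|
| pygoat | django | 70 | 10 |
| vulnpy | custom | 78 | 16 |
| vulpy | flask | 54 | 6 |
| djangoat | django | 50 | 6 |
| damn-vulnerable-graphql-application | flask | 35 | 4 |
| dsvpwa | custom | 32 | 6 |
| owasp-web-playground | flask | 29 | 6 |
| extremely-vulnerable-flask-app | flask | 28 | 4 |
| flask-xss | flask | 28 | 5 |
| dsvw | custom | 27 | 4 |
| lets-be-bad-guys | django | 24 | 4 |
| threatbyte | flask | 24 | 5 |
| defdev-app | flask | 22 | 5 |
| dvblab | flask | 22 | 4 |
| dvpwa | aiohttp | 22 | 4 |
| vulnerable-python-apps | flask | 22 | 5 |
| python-app | flask | 20 | 4 |
| vulnerable-flask-app | flask | 20 | 4 |
| damn-vulnerable-flask-application | flask | 15 | 4 |
| vulnerable-api | flask | 14 | 3 |
| vulnerable-tornado-app | tornado | 14 | 3 |
| vampi | flask | 13 | 4 |
| insecure-web | flask | 9 | 2 |
| vfapi | fastapi | 9 | 2 |
| python-insecure-app | fastapi | 8 | 2 |
| intentionally-vulnerable-python-application | flask | 7 | 2 |
| pythonssti | fastapi | 2 | 1 |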
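
As a rough illustration of how the Vulns and FP traps columns relate to the manifest schema, the sketch below tallies one manifest's findings by their is_vulnerable flag. The helper name summarize and the stripping of a "realvuln-" prefix from repo_id are assumptions for illustration; the sample manifest is a trimmed copy of the schema example with only two findings, so its counts are not pygoat's real totals.

```python
# Trimmed copy of the schema example; only the fields used here are kept.
MANIFEST = {
    "repo_id": "realvuln-pygoat",
    "framework": "django",
    "findings": [
        {"id": "pygoat-014", "is_vulnerable": True},
        {"id": "pygoat-fp-003", "is_vulnerable": False},
    ],
}


def summarize(manifest: dict) -> tuple:
    """Return (repository, framework, vulns, fp_traps) for one manifest."""
    findings = manifest.get("findings", [])
    vulns = sum(1 for f in findings if f["is_vulnerable"])
    traps = len(findings) - vulns  # every non-vulnerable entry is a trap

    # Assumption: table names are repo_id without the "realvuln-" prefix.
    name = manifest["repo_id"]
    prefix = "realvuln-"
    if name.startswith(prefix):
        name = name[len(prefix):]
    return (name, manifest["framework"], vulns, traps)


print(summarize(MANIFEST))  # ('pygoat', 'django', 1, 1)
```
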