When Retrieval Hurts: An Honest Evaluation of RAG for Solidity Vulnerability Detection

Sergei Solovev

2026-05-01 · Preprint, Figshare · DOI: 10.6084/m9.figshare.32141182

Abstract

This empirical study reports a sample-size sign reversal in naive RAG for Solidity vulnerability detection: on SolidiFI, a +2.0% Macro-F1 gain at n=100 flips to a -2.7% loss at n=250. It argues that any RAG evaluation should report bootstrap confidence intervals.

Keywords: smart contract security; retrieval-augmented generation; Solidity; vulnerability detection; bootstrap confidence intervals
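
To make the abstract's recommendation concrete, the following is a minimal sketch of a paired percentile bootstrap for the Macro-F1 difference between a RAG run and a no-retrieval baseline scored on the same contracts. It is an illustration under stated assumptions, not the procedure used in the preprint: the function names, the percentile method, the 2000 resamples, and the 95% level are all choices made here for the example.

```python
import random


def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of per-class F1 over `labels`.

    A class with no true positives, false positives, or false negatives
    in the sample contributes 0.0 (an assumption of this sketch).
    """
    scores = []
    for c in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def bootstrap_delta_ci(y_true, pred_base, pred_rag, labels,
                       n_boot=2000, alpha=0.05, seed=0):
    """Paired percentile bootstrap for Macro-F1(RAG) - Macro-F1(baseline).

    Hypothetical helper: the same resampled indices are applied to both
    systems, so the interval reflects the paired difference.
    Returns (point_estimate, lower_bound, upper_bound).
    """
    rng = random.Random(seed)
    n = len(y_true)
    deltas = []
    for _ in range(n_boot):
        # Resample evaluation items with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        yt = [y_true[i] for i in idx]
        pb = [pred_base[i] for i in idx]
        pr = [pred_rag[i] for i in idx]
        deltas.append(macro_f1(yt, pr, labels) - macro_f1(yt, pb, labels))
    deltas.sort()
    lo = deltas[int((alpha / 2) * n_boot)]
    hi = deltas[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    point = (macro_f1(y_true, pred_rag, labels)
             - macro_f1(y_true, pred_base, labels))
    return point, lo, hi
```

Calling `bootstrap_delta_ci(y_true, pred_base, pred_rag, labels)` with per-contract labels and predictions returns the observed difference together with an interval. If that interval spans zero, the sign of the effect is not resolved at the chosen level, which is the situation in which a point estimate can flip as n grows.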