OCR-Based vs. End-to-End Transformer Pipelines for Receipt Information Extraction: A Comparative Study on SROIE 2019

Sergei Solovev

2026-02-26 · Preprint, Figshare · DOI: 10.6084/m9.figshare.31430086


Abstract

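This study compares two pipelines for receipt information extraction on SROIE 2019: an OCR-based pipeline (EasyOCR followed by heuristic rules) and Donut, an end-to-end Transformer. It also presents an error taxonomy for both pipelines under image degradations typical of messenger-grade distortions (compression, rotation, and blur).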

Keywords: OCR; Donut transformer; document understanding; SROIE 2019; receipt extraction; image degradation; error taxonomy
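
To make the "heuristic rules" stage of the OCR-based pipeline concrete, the following is a minimal sketch of rule-based field extraction over OCR text lines. It is an illustration under stated assumptions, not the rules used in the study: the regular expressions, the function names `extract_date` and `extract_total`, and the sample lines are all hypothetical, and the input is assumed to be a list of text strings such as an OCR engine would return.

```python
import re
from typing import List, Optional

# Hypothetical patterns for illustration; the study's actual rules are not given here.
DATE_RE = re.compile(r"\b(\d{1,2}[/-]\d{1,2}[/-]\d{2,4})\b")
# Amounts with two decimals; thousands separators are not handled in this sketch.
AMOUNT_RE = re.compile(r"(\d+[.,]\d{2})\b")


def extract_date(lines: List[str]) -> Optional[str]:
    """Return the first date-like token found in the OCR lines, if any."""
    for line in lines:
        match = DATE_RE.search(line)
        if match:
            return match.group(1)
    return None


def extract_total(lines: List[str]) -> Optional[str]:
    """Return the largest amount on lines that mention 'total', if any."""
    candidates: List[str] = []
    for line in lines:
        if "total" in line.lower():
            candidates.extend(AMOUNT_RE.findall(line))
    if not candidates:
        return None
    # Compare numerically, but return the original string as it appeared.
    return max(candidates, key=lambda s: float(s.replace(",", ".")))


# Made-up OCR output for a receipt (assumed input format: one string per line).
ocr_lines = [
    "ABC STORE SDN BHD",
    "DATE: 25/12/2018",
    "SUBTOTAL 9.00",
    "TOTAL 9.50",
    "CASH 10.00",
]

print(extract_date(ocr_lines))
print(extract_total(ocr_lines))
```

Rules of this kind depend on the OCR text being close to correct, which is one reason degradations such as compression, rotation, and blur are a natural stress test: a single misread character in a date or amount can make a pattern miss the field entirely.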