Corpora Overview

This page serves as an overview (and link collection) of existing corpora and specific subsets of them (e.g., known faults, smell annotations, version history).

Enron

EUSES

FUSE

  • FUSE corpus
  • Subset of FUSE enhanced with type annotations (metadata, headers, attributes, data, derived data)

Payroll/Gradebook

  • Original Forms3 spreadsheets with inserted faults and test verdicts in a log file (PDF; the authors send the corpus on request)
  • Excel version

Info1

  • Corpus with real faults and simulated test verdicts

Integer

  • Collection of spreadsheets with inserted faults and test verdicts. All spreadsheets in this corpus contain only integer values.

Hawaii Kooker

  • Collection of faulty spreadsheets created by undergraduate business students (PDF; the authors send the corpus on request)