Integer Corpus

As the name suggests, the special feature of this corpus is that it consists of spreadsheets with integer values only. The corpus comprises 33 spreadsheets: 21 are real-life spreadsheets and 12 are artificially created. The spreadsheets serve a variety of purposes, e.g. determining the winner of Wimbledon or calculating the lowest price combination for a shopping list. Their formulas use arithmetical and logical operations as well as the functions SUM and IF. The size of the spreadsheets varies from 7 to 233 formulas, with an average of 39 formulas.

From the 33 base spreadsheets, 231 faulty versions were created by randomly selecting formula cells and applying one of the mutation operators defined by Abraham and Erwig to them. We seeded up to three faults into a single copy of a spreadsheet. More details about the creation of the faulty spreadsheets can be found in our paper “The Right Choice Matters! SMT Solving Substantially Improves Model-Based Debugging of Spreadsheets”.
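The seeding procedure described above can be illustrated with a minimal sketch. It is not the tooling used to build the corpus: the dictionary-based spreadsheet representation, the function names `mutate_formula` and `seed_faults`, and the `OPERATOR_SWAPS` table are assumptions made for illustration. In particular, the single operator-replacement mutation shown here is only a stand-in and does not reproduce the mutation operators defined by Abraham and Erwig.

```python
import random

# Hypothetical operator-replacement table used as a stand-in mutation;
# this is NOT the Abraham and Erwig operator set.
OPERATOR_SWAPS = {"+": "-", "-": "+", "*": "+", "<": ">", ">": "<"}


def mutate_formula(formula, rng):
    """Replace one randomly chosen operator character in the formula."""
    positions = [i for i, ch in enumerate(formula) if ch in OPERATOR_SWAPS]
    if not positions:
        return formula  # nothing this sketch knows how to mutate
    i = rng.choice(positions)
    return formula[:i] + OPERATOR_SWAPS[formula[i]] + formula[i + 1:]


def seed_faults(sheet, max_faults=3, seed=None):
    """Return (faulty_copy, mutated_cells) with up to max_faults seeded faults.

    `sheet` maps cell names to either integers or formula strings
    that start with "=" (an assumed, simplified representation).
    """
    rng = random.Random(seed)
    faulty = dict(sheet)
    # Only formula cells containing a mutable operator are candidates.
    candidates = [
        cell for cell, value in sheet.items()
        if isinstance(value, str) and value.startswith("=")
        and any(ch in OPERATOR_SWAPS for ch in value)
    ]
    if not candidates:
        return faulty, []
    n_faults = rng.randint(1, min(max_faults, len(candidates)))
    mutated = rng.sample(candidates, n_faults)
    for cell in mutated:
        faulty[cell] = mutate_formula(sheet[cell], rng)
    return faulty, mutated


# A tiny integer-only spreadsheet using arithmetic, IF and SUM.
sheet = {
    "A1": 4,
    "A2": 7,
    "A3": "=A1+A2",
    "A4": "=IF(A3>10,A1*2,A2-1)",
    "A5": "=SUM(A1:A4)",
}
faulty_sheet, mutated_cells = seed_faults(sheet, seed=0)
```

Passing a fixed `seed` makes a faulty version reproducible, which is convenient when several faulty copies are derived from the same base spreadsheet.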

The Integer corpus can be downloaded here. If you want to cite this corpus, please refer to the initial paper using the following BibTeX entry.

author = {Simon Ausserlechner and Sandra Fruhmann and Wolfgang Wieser and Birgit Hofer and Raphael Spork and Clemens Muhlbacher and Franz Wotawa},
title = {The Right Choice Matters! {SMT} Solving Substantially Improves Model-Based Debugging of Spreadsheets},
booktitle = {2013 13th International Conference on Quality Software (QSIC)},
publisher = {{IEEE}},
isbn = {978-1-4799-0500-3},
pages = {139--148},
year = {2013},
doi = {10.1109/QSIC.2013.46},