Payroll/Gradebook Corpus

The Payroll/Gradebook corpus is an Excel version of Ruthruff et al.’s Forms/3 spreadsheets. The corpus is based on a user study in which 20 participants debugged two spreadsheets, Gradebook and Payroll, by setting testing decisions and using the fault localization technique Nearest Consumer.
The Gradebook spreadsheet computes a student’s course grade at the end of the term from the results of quizzes, midterms and an exam; it contains five faults in five different cells. The Payroll spreadsheet computes an employee’s salary, insurance and tax costs. This slightly larger spreadsheet contains six input cells and 18 formula cells, nine of which contain IF statements, and has five faults in four cells.
All user actions performed in Ruthruff et al.’s study, i.e., input and formula changes as well as all testing decisions, were written to log files. For our testing purposes, we converted the Forms/3 test files and the logging information to our evaluation format. For each study participant, one or more spreadsheets and sets of testing decisions are created, depending on the user actions during the session: each time a user changes either an input cell or a formula cell, an additional spreadsheet and test set is created. A detailed description of the conversion can be found here.
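The splitting rule described above can be illustrated with a minimal Python sketch. It is not the actual conversion tool: the event tuples, the `Snapshot` class and the `split_session` function are hypothetical names introduced here for illustration, and the sketch assumes that each new test set starts out empty, which the description above does not state.

```python
from dataclasses import dataclass, field


@dataclass
class Snapshot:
    """One spreadsheet version plus the testing decisions made on it."""
    cells: dict                                    # cell name -> input value or formula text
    decisions: dict = field(default_factory=dict)  # cell name -> verdict


def split_session(initial_cells, events):
    """Split one participant's session into spreadsheet/test-set pairs.

    `events` uses a hypothetical log format (not the Forms/3 format):
      ("edit", cell, new_content)  an input or formula change
      ("test", cell, verdict)      a testing decision
    """
    snapshots = [Snapshot(dict(initial_cells))]
    for kind, cell, value in events:
        if kind == "edit":
            # Every input or formula change yields an additional spreadsheet.
            # Assumption: its set of testing decisions starts out empty.
            cells = dict(snapshots[-1].cells)
            cells[cell] = value
            snapshots.append(Snapshot(cells))
        elif kind == "test":
            # Testing decisions belong to the spreadsheet version in effect.
            snapshots[-1].decisions[cell] = value
        else:
            raise ValueError(f"unknown event kind: {kind!r}")
    return snapshots


# Usage: one edit in the middle of a session produces two snapshots.
session = [
    ("test", "B2", "correct"),
    ("edit", "A1", "5"),
    ("test", "B2", "wrong"),
]
snaps = split_session({"A1": "3", "B2": "=A1*2"}, session)
print(len(snaps))
```

Under these assumptions, a session with n input or formula changes yields n + 1 spreadsheets, each paired with the testing decisions recorded while that version was in effect.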

Since the testing decisions were set by humans, they contain errors: nearly one third of the values are misclassified (oracle mistakes). The properties files of this corpus indicate, for each testing decision, whether the classification is correct.

The Excel version of this corpus can be downloaded here.