EXCEL GRADER

Dr. Thomas G. Hill

Northeast Mississippi Community College

Booneville, MS

(662) 720-7384

thill@necc.cc.ms.us

 

Abstract: Although program grading has been an active area of research in computer science education, there is little evidence of research on automated grading systems for office applications. This paper describes EXCEL GRADER. EXCEL GRADER is an automated grader for use by instructors in Microsoft Excel and Microsoft Office application courses. EXCEL GRADER performs static analysis on Microsoft Excel workbooks. EXCEL GRADER compares a student workbook with the instructor’s correct version of the workbook. EXCEL GRADER marks the student workbook and assigns a grade. EXCEL GRADER includes reports and a tool for plagiarism detection. The graders have been tested in thirty-nine Microsoft Office applications classes (with about 800 students) over two semesters. Thousands of assignments have been graded. According to a survey of instructors using the programs, the graders were extremely accurate and saved hours of grading time. Students that were surveyed felt the grading was fair and accurate.

 

 

 

1.     INTRODUCTION

 

Throughout the history of computer science education, a number of grading systems for evaluating programming assignments have been developed. These include Kassandra [vM94], Ceilidh/CourseMaster [FHT00], BOSS [JL98], and Curator [McQ01].

 

Although program grading has been an active area of research, there is little evidence of research on automated grading systems for office applications. One commercially available system is SAM (Thomson Learning/Course Technology) [Tho02]. SAM grades Microsoft Office tasks performed in a simulated Office XP environment. SAM assigns students a task, and then records the students’ mouse and keyboard events used in their response. Those events are compared with possible correct actions and the results are graded.

 

This paper describes EXCEL GRADER (see Figure 1). EXCEL GRADER is another automated grader for use by instructors in Microsoft Excel and Microsoft Office application courses. In contrast to SAM, EXCEL GRADER performs static analysis of Microsoft Excel workbooks.

 

Excel courses often require students to create worksheets based on a list of requirements. When the instructor has a correct solution to an assignment, this correct version can be compared to the student’s worksheet. The differences detected can be used as a basis for marking and grading the student’s worksheet.

 

EXCEL GRADER has two goals. The first is to grade assignments more accurately and in greater detail when compared to hand grading via printed copies of assignments. The second is to reduce the time and effort required in grading Excel assignments.

 

 

Figure 1 EXCEL GRADER Screen Shot

 

 

 

EXCEL GRADER grades the following worksheet attributes:

·       text labels

·       numbers

·       formulas

·       fonts (face, size, style: bold/italics/underline, color)

·       cell alignment (left/right/center)

·       cell fill color

·       indent level

·       cell orientation (tilt)

·       number formats (currency, percent, accounting, scientific, etc.)

·       borders (color, thickness,  style: solid, broken, etc.)

·       row height

·       column width

·       merged ranges

·       worksheet names

·       chart  type, subtype, chart location, and chart source data

 

 

EXCEL GRADER does not conditional formatting, embedded objects (such as pictures, clip art, organizational charts, word art, etc.). Neither does it grade drawing objects (shapes, lines, drop shadows, etc.) or controls (buttons, list boxes, text boxes, etc.).

 

 

 

2.     PROGRAM DESIGN

 

EXCEL GRADER is a Microsoft Access application (the grader requires both Excel 2002 (XP) and Access 2002 (XP). EXCEL GRADER extracts attributes from a student workbook and the instructor’s correct version of the workbook. These attributes are placed into one of eight different EXCEL GRADER system tables:  Cell, Format, Border, Formula, Row, Col, Chart, and Worksheet (see Figure 2.)

 

 

Figure 2 EXCEL GRADER Grading Process

 

After attributes are extracted from both the instructor's correct workbook and a student's workbook, the resulting tables are compared. Missing records are first detected and recorded in the errors table. Next, records with common keys are compared a field at a time. For example, when Cell tables are compared, cells with a common worksheet name and the same row and column are connected. The Value, NumberFormat, Font Name, FontSize and FontStyle fields in the two records are compared. Fields that differ are recorded in the errors table.

 

 

3.     MARKING ERRORS

 

EXCEL GRADER marks the original student workbook in two ways. a strike-through format is added to an error cell that has an invalid text label, number, or formula (see Figure 3). Comments are added to cells with error values and/or formatting errors. Errors (for all workbooks) are also recorded in the errors table.

 

 

Figure 3 Cell Marking and a comment

 

Figure 4 gives an example of a correct version of a worksheet. It is followed by a student’s version after being marked in Figure 5. Labels in cells A1 and A4 contain typographical errors. Column C contains six incorrect numeric values. The border weight at the bottom of row 8 should have been thin. The comment in G9 implies a string comparison of the G9 formula and the correct version found the formulas were unequal, but the formulas were algebraically equivalent. Commented fields have a red triangle in the upper right corner of the cell. Multiple comments are shown for illustrative purposes, but a comment only appears when one mouses over the cell.

 

Figure 4 Correct Version of a Worksheet  [SCV00]

 

 

Figure 5 Student's Marked Up Version

 

A grade summary is added to the bottom of the worksheet. This summary gives the number of problems missed and the number of points deducted.

 

 

4.     CHALLENGES: GRADING FORMULAS / RENAMING WORKSHEETS

 

Excel formulas are similar to algebraic expressions, one difference being operands are often cell references. Some examples are =E15-E16, =C9+(10%*C9), =SUM(D8:D12). The last formula uses a built-in Excel function. EXCEL GRADER first compares cell formulas as text strings. When the text strings are equal, the student formula is correct. When the strings are unequal, the formula may still be algebraically equivalent.

 

In the following example (Figure 6), the lower right corner sum may be obtained in different but algebraically equivalent ways. The spreadsheet comes from [SCV00].

 

Figure 6 Two paths to the same sum [SCV00]

 

Since both column F and row 8 are totals, the sum of either is equivalent to =Sum(B3:E7), the numbers in the interior of the spreadsheet. So possible answers for the formula in cell F8 are =Sum(F3:F7), =Sum(B8:E8), or =Sum(B3:E7).

 

The above problem is one example in a category of formulas equivalent because of the order in which the operations are performed (sum by row versus sum by column). Since addition is commutative and associative, addition in either order gives the same answer.

 

Similar problems occur with algebraically equivalent formulas like =B9+(10%*B9) and =B9*(1+10%). Algebraic identities must be applied to convert one formula into the second.

 

EXCEL GRADER solves this type of grading problem by employing the computer algebra system Symbolic. Symbolic is an interpreter for symbolic manipulation of mathematical expressions (like Maple or Mathematica) [Dol01].

 

Symbolic accepts identifiers consisting of at least one letter followed by several letters or digits. This means Symbolic can process cell references (A1 to IV65536) as variables. Symbolic also evaluates functions with the same syntax as Excel:  SQRT(x), EXP(x), LN(x), SIN(x), COS(x), TAN(x), ASIN(x), ACOS(x), ATAN(x), SINH(x), COSH(x), ABS(x).

 

On the other hand, Symbolic does not handle recognize ranges, many Excel functions, including common ones like Sum and Average. Symbolic does not recognize Excel formula symbols like $ and %. These special cases are handled prior to calling Symbolic. Dollar sign symbols are stripped from the formulas, so EXCEL GRADER does not currently mark a formula if they are missing.

 

But most importantly Symbolic includes a useful function: Simplify(instr, outstr). This function can simplify an algebraic expression. If both formulas are equivalent, we should be able to subtract one from the other and have an expression that reduces to zero.

 

Another problem that occurs when grading worksheets is a when a student misnames a worksheet. This problem occurs when an assignment has multiple worksheets and the student is instructed to rename each worksheet. For example, an assignment has three worksheets to be renamed: Payroll, AP, AR. If a student forgets to rename the worksheets, the sheet names retain their default values: Sheet1, Sheet2, Sheet3.

 

Cells cannot be accessed without first knowing the worksheet in which they are embedded. If corresponding worksheets are not matched, the student’s worksheet contents cannot be graded. A minor error would result in major points being deducted.

 

Often the ordinal position of the worksheets will match, but this is unfortunately not always the case. EXCEL GRADER uses a worksheet statistic to match worksheets: the number of non-empty cells in the worksheet.  Worksheets with the closest number of non-empty cells are assumed to correspond. The worksheet name is then renamed in the Access tables used to model the worksheet (Cell, Format, Border, Formula, Row, Col). In order to mark the original error, the Worksheet table entry is not renamed.

 

 

5.      USER INTERFACE

 

EXCEL GRADER requires the instructor to gather all student Excel files. A correct version of the workbook must also be constructed. All student files are placed in the same folder (requiring file names to be unique). The instructor then selects the correct file and the student directory. Browse buttons are provided (see Figure7.)

 

Figure 7 Workbooks to be graded

 

 

EXCEL GRADER allows the instructor to determine the number of points to deduct for general error types: mistyped/missing labels or numeric constants, incorrect formulas, formatting errors. EXCEL GRADER allows user input for the grading scale (see Figure 8.)

 

Figure 8 Points deducted

 

 

Formatting errors are counted on a per cell basis. An incorrect number format applied to a range of 10 cells would count off one point assuming 0.1 point deducted per error.

 

EXCEL GRADER has two options: skip format grading, and deduct extraneous. The “Skip Grading Fonts and Formatting” results in only formulas, text labels, and numbers being graded. This avoids grading advanced features (for example, row height/column width) before being covered in class. The option also avoids "error overload"; hiding details that the instructor might prefer to handle manually or simply ignore. The “Deduct Extraneous Cells/Formats” counts any extra formatting, cells values, or formulas against the student.

 

 

Figure 9 Grading options

 

 

The two numeric values in the options box apply to the "Student Pairs Common Errors" report. This report compares two student workbooks at a time and counts the number of common errors between the two students (common errors are those in the same cell with the same error type and error value). The "Minimum Common Errors" input value allows the instructor to determine the (error count) lower threshold for being included in the report.  The "Minimum Common Error Percent" establishes a lower limit for the percentage the first file’s total errors being also found in the second file.

 

 

Figure 10 Lower Limits for inclusion in Possible Plagiarism report

 

In a test grading of 992 student documents (8 assignments), 85 matches were found in the Possible Plagiarism report. The majority of these were assignments that had been submitted twice. After examining each file, only about 10 student documents were judged to be plagiarized (1%). The plagiarized files had identical typographical errors and/or other errors. Fifteen (1.5%) files were judged not to be plagiarized (false positives). Four files in the report may have been plagiarized, but the results were inconclusive.

 

 

6.     REPORTS

 

EXCEL GRADER grades all the workbooks in a given student folder (these workbooks are all compared against the same correct workbook). Three reports can be generated after grading. The Grade Report (see Figure 11) gives the grade for each file in the directory. Also included in the report is the number of errors of each type: text label/number errors, formula errors, and formatting errors. Student names have been blocked out.

 

Figure 11 Grade Report

 

An Errors List report shows all errors from every student workbook graded. The report shows the location of the error, the type of error, correct and student values.

 

Figure 12 Error List (All Workbooks)

 

The “Possible Plagiarism” report compares two student workbooks at a time (see Figure 13.) Matching errors are those in the same cell with the same error type and incorrect student value. A count of matching errors between the students is displayed.

 

Matching errors do not necessarily imply plagiarism. An instructor may find her best students with common errors. These may occur when steps are not explicit in the assignment, students make one assumption, and the instructor's workbook uses different assumptions.

 

A greater indicator of plagiarism may be the percentage of matching errors. This percentage can be summarized in the question: What percent of a student A's total errors were also made by student B?  A mediocre student worksheet may have 100% of the errors found in a very poor worksheet. However, when student A has 100% of student B’s errors, and student B also has 100% of student A’s errors, the worksheets should be compared by the instructor. Figure 13 gives an example of probable plagiarism (marked in red).

 

Figure 13 Common Errors Report

 

 

7.     EVALUATIONS

 

In Fall of 2003, three instructors who were beta testing EXCEL GRADER were surveyed. The results of the survey follow.

 

        Rate the accuracy of the grader: 

       Extremely accurate (3 for 3)

        Rate the difficulty learning to use the grader: 

       Easy (3 for 3)

        Give the total number of student complaints:

       4 student complaints for all instructors out of 392 total students (1%)

        Give the number of files skipped (the grader was unable to process): 

       2 total files for all instructors (out of more than 1300 graded)

        Estimate of the number of hours the grader saved you per week:

       Instructor 1: 6 hours,  Instructor 2:15 hours,  Instructor 3: 10 hours

 

 

A survey of students was also conducted during the Fall 2003 semester. Sixty students were surveyed. Fifty-nine students (98%) rated EXCEL GRADER as accurate or extremely accurate (see Figure 14.) Forty-eight students (80%) felt the grading was fair. Five students (8%) felt the grading process was harsh (see Figure 15.) Twenty-four students (40%) preferred the automated grader to an instructor. Seven students (12%) preferred the instructor, and 29 (48%) students had no preference (see Figure 16.)

Figure 14 Student Survey: Grader Accuracy

 

 

 

Figure 15 Student Survey: Grader Fairness

 

Figure 16 Student Survey: Grading Preference

 

To evaluate EXCEL GRADER, several assignments previously graded by hand by two instructors were processed by EXCEL GRADER. The comparison yielded the following observations on the strengths of EXCEL GRADER:

·       EXCEL GRADER marks in greater detail than human graders.
An instructor will rarely mark all the details missed by a student who has many errors on an assignment. Details ignored (or missed) by human graders included: alignments, column widths, row heights, missing borders, number formats, merged cells, and bold formatting

·       EXCEL GRADER can detect errors a human grader cannot.
EXCEL GRADER can grade negative number formats for currency cells when the value of the cell is positive. EXCEL GRADER can grade font color and fill color, which may be difficult or impossible if the instructor has a grayscale printout of the worksheet. EXCEL GRADER can differentiate between fonts which look similar but are different.

Experience using EXCEL GRADER has also shown anecdotal evidence that:

·       EXCEL GRADER can reduce the time required to grade assignments.
When using EXCEL GRADER, the Excel instructor will want to verify the results of the program’s grading. Objects not graded (such as charts) must be graded manually. The results must also be recorded by the instructor. Still, anecdotal evidence suggests these two tasks require less time than grading by hand. This timesaving comes primarily from not having to search for errors. Since EXCEL GRADER marks worksheet errors, the instructor only has to evaluate the severity of the errors that have been marked. It is the practice of beta testers to only check documents with low grades. Other workbooks are returned to students without being opened.

 

 

EXCEL GRADER’s weaknesses include:

 

·       EXCEL GRADER requires instructors to gather and redistribute Excel workbook files.
Instructors may choose to have students give them floppy disks, or submit files via ftp. If assignments are submitted as email attachments, Attachment Executive (http://www.madsolutions.com/AE/Main.htm) is a commercially available product for processing attachments and moving them to directories. Blat (http://www.interlog.com/~tcharron/blat.html) is an excellent freeware command line tool for mailing graded Excel files as attachments. GetMail (http://www.interlog.com/~tcharron/getmail.html) is the companion tool for downloading and extracting files from an email account. An overview of the process used by beta testers for gathering, grading, and returning student files is found in Figure 17. This process can be adapted for those with access to distance learning environments. The grader includes a batch file that can be scheduled for overnight grading.

 

 

 

 

Figure 17 Example scenario for acquiring and returning workbooks

 

 

·       EXCEL GRADER’s marking may seem excessive.
Since EXCEL GRADER accepts the instructor’s worksheet attributes as correct, non-equivalent student values are marked as incorrect. Extra spaces in student labels are marked. EXCEL GRADER can differ between number formats and fonts that look almost identical (for example, Currency with no dollar sign versus Accounting with no dollar sign). EXCEL GRADER grades negative number formatting for cells with Currency and Accounting formats.

 

·       EXCEL GRADER may occasionally skip a file.
EXCEL GRADER still does not handle all types of exceptions. When EXCEL GRADER encounters an error, it skips the workbook and continues grading the remainder of the files. Skipped workbooks are given a grade of -1 in the grade report. The instructor must grade these workbooks manually.

 

 

1.     FUTURE WORK

 

Future plans for EXCEL GRADER include continued testing and corrections. The program has been used for four semesters grading thousands of assignments from two different textbooks ([SCV00] and [SCV02]). Exposing the program to more assignments from other textbooks will improve the program’s handling of exceptional conditions.

 

There is also a need for tools to facilitate assignment gathering and redistribution. Students frequently misname documents that are submitted. Student email addresses are also incorrectly submitted. Web pages that require students to properly submit assignments would be useful. Integrating the program with distance learning environments is also needed.

 

Once an assignment is received, the ultimate goal is to have the graders perform their work without any human intervention. The current practice of beta testers is to manually check all assignments that receive a grade less than 80%. Hopefully, the graders will improve to the point that instructors will be content to receive reports of the grading results.

 

Both graders could grade more document attributes. In particular, grading Excel conditional formatting would be useful. Grading more Excel chart attributes and Access form and report features would improve the program. Having the programs deduct points for late submissions has also been suggested.

 

Student and instructor surveys at other educational institutions would advance the research. More studies contrasting the grader results with hand grading would also be constructive.

 

Information about the program can be found at http://www.excelgrader.com. A trial version of the software can be downloaded at no cost.

 


Works Cited

 

 

[Dol01] Jens-Uwe Dolinsky and Thorsten Pawletta. Interpreter for symbolic manipulation of mathematical expressions. Internet WWW page, at URL <http://www.mb.hs-wismar.de/~pawel/Uwe/casE.html> (version current at September 12, 2001).

 

[FHT00] E Foxley, C A Higgins, A Tsintsifas, P Symeonidis, The Ceilidh CourseMaster System, An Introduction, 4th Java in the Curriculum Conference, South Bank University, UK, 24th Jan 2000.

 

[FW65] Automatic grading programs; George E. Forsythe and Niklaus Wirth; Commun. ACM 8, 5 (May. 1965), Pages 275 – 278

 

[HI03] Thomas G. Hill. MEAGER & MADBAGS: Automated Graders for Microsoft Excel and Access Assignments, Unpublished doctoral dissertation, University of Mississippi, Oxford.

 

 [JL98] Mike Joy and Michael Luck. The BOSS System for On-line Submission and Assessment. Monitor: Journal of the CTI Centre for Computing 10 pp. 27-29, 1998

 

[McQ01] William McQuain. Curator Introduction. Internet WWW pages and documentation at http://ei.cs.vt.edu/~eags/Curator.html> (version current at September 8, 2000).

 

[Pie00] Matt Pierce. MDB Diff. Internet WWW page, at URL <http://jupiter.drw.net/matpie/PBSystems/products/retired/MDBDiff.html> (version current at October 7, 2000).

 

[SCV00] Gary B. Shelly, Thomas J. Cashman, and Misty Vermaat. Microsoft Office 2000 Introductory Concepts and Techniques, Course Technology, Cambridge, MA, 2000.

 

[SCV02] Gary B. Shelly, Thomas J. Cashman, and Misty Vermaat. Microsoft Office XP Introductory Concepts and Techniques, Course Technology, Cambridge, MA, 2002.

 

[Tho02] Thomson Learning/Course Technology. Course Technology – SAM. Internet WWW page, at URL <http://www.course.com/assessment/sam/default.cfm> (version current at November 29, 2002).

 

[vM94] Urs von Matt. Kassandra: the automatic grading system. SIGCUE(22):26-40, 1994.