Mathematical skills > Assessment of proof

Question 49
How can the assessment of proof be automated?

What motivates this question?

Proof lies at the heart of mathematics and is a hallmark that distinguishes mathematics from other subjects. So, from the perspective of assessment that is authentic to the discipline, it is natural to seek to assess proof.

Mathematical proof may (or may not) be more constrained and structured than writing in other subject areas. Whereas automatic assessment of the content of an essay seems hopeless, a mathematical proof may be constrained enough for progress to be made.

Bickerton and Sangwin (2021) made some practical proposals; these might be investigated with more specific research questions.

Note that there have been attempts to test aspects of proof, such as laying out a proof and asking students to identify where an error occurs, or giving some phrases and calculations and asking students to order them into a proof. Lawson (2002) says that such approaches are “undoubtedly imaginative use of current technology”, but “cannot be thought of as equivalent to asking a student to prove [a conjecture] from scratch”, though such questions might help students learn “general ideas” about proofs. Greenhow (2015) comments that with this type of approach “one is asking if a student can recognize the correct response when he/she sees it rather than generating it him/herself. This is certainly a necessary skill, but it is not sufficient, and certainly not all we would aspire to in our students”.
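
As a concrete illustration of the second task type, the sketch below shows one way a proof-ordering task might be auto-marked. It is a minimal sketch only: the example proof, the step labels and the partial-credit rule are invented for this illustration and are not taken from any of the systems or papers cited above; a real implementation would also need to handle equally valid alternative orderings and give richer feedback.

```python
# Hypothetical sketch of auto-marking a proof-ordering ("Parsons problem")
# task. The proof, the step labels and the partial-credit rule are all
# invented for illustration; they are not taken from any cited system.
import random

# Correct order of steps for a short proof that the sum of two even
# integers is even. Each step is a (label, text) pair.
STEPS = [
    ("A", "Let m and n be even integers."),
    ("B", "Then m = 2a and n = 2b for some integers a and b."),
    ("C", "So m + n = 2a + 2b = 2(a + b)."),
    ("D", "Since a + b is an integer, m + n is even."),
]
CORRECT_ORDER = [label for label, _ in STEPS]


def present_shuffled(steps, seed=None):
    """Return the steps in a random order for the student to rearrange."""
    shuffled = steps[:]
    random.Random(seed).shuffle(shuffled)
    return shuffled


def mark_ordering(submitted, correct=CORRECT_ORDER):
    """Full credit for an exact match, otherwise partial credit for each
    step label that already sits in its correct position."""
    if submitted == correct:
        return 1.0
    in_place = sum(1 for s, c in zip(submitted, correct) if s == c)
    return in_place / len(correct)


if __name__ == "__main__":
    for label, text in present_shuffled(STEPS, seed=42):
        print(f"[{label}] {text}")
    print(mark_ordering(["A", "B", "C", "D"]))  # 1.0
    print(mark_ordering(["B", "A", "C", "D"]))  # 0.5
```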

What might an answer look like?

Research in this direction might focus on convergent validity, using established measures of students’ understanding of proof as benchmarks against which to compare new automated assessments. These benchmarks could include traditional written assessments, multiple-choice comprehension quizzes (e.g. Mejía-Ramos et al., 2017), proof summaries (Davies et al., 2020) or validation tasks (Selden & Selden, 2003). A study to this end is in development in 2021, with the intention of collecting relevant data in 2022 at The University of Edinburgh and University College London.
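
Comparative judgement approaches such as that of Davies et al. (2020) derive benchmark scores from many pairwise decisions about which of two scripts is better; one common way of turning such decisions into scores is to fit a Bradley-Terry model. The sketch below is a minimal, hypothetical illustration of that fitting step: the script identifiers and judgement data are invented, and the bare-bones iterative routine stands in for the purpose-built tools used in published studies.

```python
# Hedged sketch: fitting a Bradley-Terry model to pairwise comparative
# judgement data, as one way benchmark scores for proof summaries might
# be produced. The judgement data below is invented for illustration.
from collections import defaultdict
from itertools import chain

# Each tuple records one judgement: (preferred_script, other_script).
judgements = [
    ("s1", "s2"), ("s1", "s3"), ("s2", "s3"),
    ("s1", "s4"), ("s3", "s4"), ("s2", "s4"),
    ("s2", "s1"),  # judges need not agree
]


def bradley_terry(judgements, iterations=100):
    """Estimate a quality score per script via the standard iterative
    (minorisation-maximisation) updates for the Bradley-Terry model."""
    scripts = sorted(set(chain.from_iterable(judgements)))
    wins = defaultdict(int)    # number of times each script was preferred
    pairs = defaultdict(int)   # number of comparisons per unordered pair
    for winner, loser in judgements:
        wins[winner] += 1
        pairs[frozenset((winner, loser))] += 1

    score = {s: 1.0 for s in scripts}
    for _ in range(iterations):
        new = {}
        for s in scripts:
            denom = sum(
                pairs[frozenset((s, t))] / (score[s] + score[t])
                for t in scripts if t != s
            )
            new[s] = wins[s] / denom if denom > 0 else score[s]
        # Normalise so scores sum to 1 (the model is scale-invariant).
        total = sum(new.values())
        score = {s: v / total for s, v in new.items()}
    return score


if __name__ == "__main__":
    for script, quality in sorted(bradley_terry(judgements).items(),
                                  key=lambda kv: -kv[1]):
        print(f"{script}: {quality:.3f}")
```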

Another approach to this question, related to Q22 (“What principles should inform the design of e-assessment tasks?”) and Q23 (“E-assessment task designers often convert questions that could be asked on a traditional pen and paper exam: what are the implications, technicalities, affordances and drawbacks of this approach?”), might focus on design principles for writing automated assessment questions in this area. Researchers might draw on design-based research methodology to develop theoretical principles and practical artefacts iteratively, in a cyclic process. Ruth Reynolds and Ben Davies (University College London) have proposed a workshop to this end at the 4th International STACK Conference hosted by TTK University of Applied Sciences in April 2021.

References

Bickerton, R., & Sangwin, C. (2021). Practical online assessment of mathematical proof. International Journal of Mathematical Education in Science and Technology, 1-24. https://doi.org/10.1080/0020739X.2021.1896813

Davies, B., Alcock, L., & Jones, I. (2020). Comparative judgement, proof summaries and proof comprehension. Educational Studies in Mathematics. https://doi.org/10.1007/s10649-020-09984-x

Greenhow, M. (2015). Effective computer-aided assessment of mathematics; principles, practice and results. Teaching Mathematics and its Applications, 34(3), 117-137. https://doi.org/10.1093/teamat/hrv012

Lawson, D. (2002). Computer-aided assessment in mathematics: Panacea or propaganda? International Journal of Innovation in Science and Mathematics Education, 9(1). Retrieved from https://openjournals.library.sydney.edu.au/index.php/CAL/article/view/6095

Mejía-Ramos, J. P., Lew, K., de la Torre, J., & Weber, K. (2017). Developing and validating proof comprehension tests in undergraduate mathematics. Research in Mathematics Education, 19(2), 130-146. https://doi.org/10.1080/14794802.2017.1325776

Selden, A., & Selden, J. (2003). Validations of proofs considered as texts: Can undergraduates tell whether an argument proves a theorem? Journal for Research in Mathematics Education, 34(1), 4-36.