Grading Accuracy Examples

This blog-style page showcases how different grading models evaluate student responses against marking schemes across multiple subjects. It's designed to make scoring behavior easy to audit: you can open an example and see the question, the rubric, the student answer, the model reasoning, and the feedback.

Definitive answers are graded strictly. For questions with one clear target (equations, numeric results, factorisation/simplification, and unambiguous facts), the scorer expects both a correct method and a correct final answer. Small arithmetic or algebra slips are flagged even when the approach is on the right track, just as a careful examiner would flag them.
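
To make the strict rule concrete, here is a minimal sketch of a check for a numeric final answer. It is an illustration only, not the scorer used on this page: the function name `check_definitive`, the `method_ok` input, and the flag labels are all hypothetical, and it assumes answers can be parsed as exact rationals.

```python
from fractions import Fraction


def check_definitive(student_final: str, expected_final: str, method_ok: bool) -> dict:
    """Hypothetical strict check: full credit needs a sound method AND an exact final answer."""
    flags: list[str] = []
    if not method_ok:
        flags.append("method")
    try:
        # Exact rational comparison: "1/2" matches "0.5", but "0.51" does not.
        if Fraction(student_final) != Fraction(expected_final):
            flags.append("final_answer")
    except ValueError:
        # The student's answer could not be parsed as a number at all.
        flags.append("final_answer")
    return {"full_credit": not flags, "flags": flags}
```

With this sketch, a response with a sound method but a final answer of `0.51` against an expected `0.5` is flagged on `final_answer` and does not receive full credit, which mirrors the "small slips are flagged" behavior described above.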

Interpretation is treated as a feature, not a bug. The “Questionable” tab highlights responses where teachers could reasonably disagree about the boundary between full credit and partial credit (e.g., vague phrasing, implicit assumptions, borderline specificity, or alternative valid framings). In real classrooms, two markers might award slightly different scores while still being fair; we surface those cases explicitly to make the system's judgment transparent.
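
One way to picture a "Questionable" case is as a response where fair markers split between full and partial credit. The sketch below assumes you have scores from several independent markers for the same response. The function `is_questionable` and its inputs are hypothetical, and the page does not state that cases are selected this way.

```python
def is_questionable(marker_scores: list[int], max_marks: int) -> bool:
    """Hypothetical rule: markers split between full credit and partial credit."""
    gave_full = any(score == max_marks for score in marker_scores)
    gave_partial = any(0 < score < max_marks for score in marker_scores)
    return gave_full and gave_partial
```

For example, on a 3-mark question, scores of `[3, 2]` from two markers would count as questionable under this rule, while `[3, 3]` or `[2, 2]` would not.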

Each section includes a metrics snapshot that summarizes performance for that subject. Use it as a quick overview, then open individual examples for the full detail.
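
As a rough idea of what a per-subject snapshot could contain, the sketch below compares model-awarded marks with reference marks. The choice of metrics (exact-match rate and mean absolute error) and the name `metrics_snapshot` are assumptions made for illustration. The page does not specify which metrics it reports.

```python
def metrics_snapshot(model_marks: list[int], reference_marks: list[int]) -> dict:
    """Hypothetical summary of how closely model marks track reference marks."""
    if not model_marks or len(model_marks) != len(reference_marks):
        raise ValueError("need two non-empty lists of equal length")
    n = len(model_marks)
    pairs = list(zip(model_marks, reference_marks))
    exact = sum(1 for m, r in pairs if m == r)
    total_abs_error = sum(abs(m - r) for m, r in pairs)
    return {
        "n": n,
        "exact_match_rate": exact / n,           # share of responses marked identically
        "mean_abs_error": total_abs_error / n,   # average size of the mark difference
    }
```

Reporting both numbers is a deliberate choice in this sketch: the exact-match rate shows how often the model agrees outright, while the mean absolute error shows how far off it is when it disagrees.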

Model Grading Showcase

Select a subject to explore worked examples and scoring behavior.