
The Ethics of Grading: Can We Trust AI to Assess Creative Work?

In the early nineteenth century, the Prussian philosopher Johann Gottlieb Fichte argued that education should cultivate “the free, self-active human being.” Two centuries later, his words haunt us as we debate whether algorithms, machines built on binary logic, can meaningfully evaluate the messy, luminous spark of human creativity.

The question is no longer theoretical. Schools across Europe are piloting AI tools to grade essays, poetry, and art portfolios. Proponents praise their efficiency: an algorithm can assess 10,000 essays in the time it takes a teacher to drink a cup of coffee. But beneath the allure of speed lies a thornier dilemma: Can a system trained on past data ever truly understand the future of human expression?


The Ghost of Grading Past: A Brief History of Human Bias

Long before AI entered classrooms, human grading was a flawed art. Consider:

  • In 19th-century Oxford, essays were graded by candlelight, with examiners favoring florid Latin phrases over original thought.
  • A 2012 study found that teachers consistently rated identical essays higher when told the author was from a privileged background.
  • In France, the “baccalauréat” grading scandals of the 1990s revealed how regional biases influenced scores.

“We’ve always had bias in assessment,” says Dr. Elinor Bergmann, a philosopher of education at the Sorbonne. “The danger isn’t that AI will replicate our flaws—it’s that we’ll mistake its judgments for objectivity.”


The Algorithm as Critic: What AI Sees (and Doesn’t)

Modern AI grading tools, whether general-purpose models like OpenAI’s ChatGPT or purpose-built systems like Turnitin’s Revision Assistant, analyze creativity through proxies (sketched in code after this list):

  • Vocabulary complexity: Does the student use “sophisticated” words?
  • Structural patterns: Does the essay follow a five-paragraph template?
  • Sentiment analysis: Is the tone “positive” or “critical”?
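
How do those proxies look in practice? Below is a deliberately crude sketch in Python, not any vendor’s actual model: every word list and threshold is invented for illustration, but the shape of the logic, scoring surface features and calling the result creativity, is the point.

```python
# A toy proxy-based "creativity" scorer mirroring the three proxies above.
# All word lists and thresholds are invented for illustration; no real
# grading product is this simple, but the logic has the same shape.

SOPHISTICATED = {"ephemeral", "liminal", "palimpsest"}  # toy "big words"
POSITIVE = {"hope", "joy", "bright"}                    # toy sentiment lexicon
NEGATIVE = {"despair", "gloom", "futile"}

def proxy_grade(essay: str) -> dict:
    words = [w.strip(".,;:!?").lower() for w in essay.split()]
    paragraphs = [p for p in essay.split("\n\n") if p.strip()]

    # Proxy 1: vocabulary complexity as the share of "sophisticated" words.
    vocabulary = sum(w in SOPHISTICATED for w in words) / max(len(words), 1)

    # Proxy 2: structure, rewarding the five-paragraph template.
    structure = 1.0 if len(paragraphs) == 5 else 0.5

    # Proxy 3: sentiment as a naive positive-minus-negative word count,
    # squashed into the range [0, 1].
    tone = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    sentiment = 0.5 + max(-0.5, min(0.5, tone / 10))

    return {"vocabulary": vocabulary, "structure": structure, "sentiment": sentiment}
```

Notice what never appears in the function: meaning.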

But creativity often defies such metrics. When a Swedish student submitted a poem written entirely in emojis, her AI grader labeled it “incoherent.” A human teacher, however, recognized it as a commentary on digital communication—and gave it an A.

“AI is like a chef who only knows how to measure ingredients,” says Marco Rossi, an AI ethicist in Milan. “It can’t taste the dish.”


The Ethical Minefield

1. The Standardization Trap

AI thrives on uniformity. But as Kafka wrote, a book “must be the axe for the frozen sea within us.” When algorithms reward conformity, students learn to write for machines, not humans. A 2023 EU study found that schools using AI graders saw a 40% drop in experimental writing styles.

2. The Cultural Blind Spot

An AI trained on Shakespeare may dismiss a migrant student’s code-switching poem as “grammatically inconsistent.” In Latvia, a student’s essay blending Latvian folk motifs with cyberpunk themes was flagged for “off-topic content” by an algorithm—yet later won a national youth literature prize.

3. The Death of Nuance

Human teachers can sense when a clunky metaphor is a first draft’s stumble versus a non-native speaker’s struggle. AI reduces such context to numerical scores. As one Dublin teacher lamented: “It’s like judging a sunset by its hex code.”


Case Studies: When Algorithms Fail the Turing Test for Empathy

  • The Van Gogh Incident: In 2022, a Dutch AI art grader rejected a student’s abstract painting for “lack of realism.” The student’s teacher—noting the homage to Van Gogh’s later works—overruled the system.
  • The Hemingway Paradox: A Budapest school’s AI tool downgraded essays for using “short sentences,” penalizing a student emulating Hemingway’s style.
  • The Plagiarism False Positive: A Polish student’s original poem about war was flagged as “plagiarized” because its phrases resembled news headlines in the AI’s database.


A Path Forward: Hybrid Models and Humility

None of this means AI has no role in grading. The solution lies in collaboration, not replacement:

1. AI as First Reader, Not Final Judge

Use algorithms to flag technical errors (spelling, citation formatting) while reserving creative assessment for humans. Finland’s newest EdTech guidelines mandate that AI scores never override teacher evaluations.
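
What might that division of labor look like? Here is a minimal sketch in Python, assuming a hypothetical Review record; the two checks are invented stand-ins for real proofreading tools. The invariant it encodes is the one that matters: the machine may only annotate, and only the teacher assigns the grade.

```python
# A minimal sketch of the "first reader, not final judge" pattern.
# Both checks are toy stand-ins; the design constraint is that the
# machine pass can append flags but can never set the grade field.

import re
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Review:
    flags: list = field(default_factory=list)  # machine annotations only
    grade: Optional[str] = None                # set exclusively by a human

def machine_first_pass(essay: str) -> Review:
    review = Review()
    if re.search(r"\b(\w+) \1\b", essay, re.IGNORECASE):
        review.flags.append("possible duplicated word")
    if re.search(r"\bet al\b(?!\.)", essay):
        review.flags.append("citation formatting: 'et al' missing its period")
    return review

def human_final_pass(review: Review, teacher_grade: str) -> Review:
    # Only this step assigns a grade, mirroring guidelines under which
    # AI scores never override teacher evaluations.
    review.grade = teacher_grade
    return review

review = human_final_pass(machine_first_pass("The the study by Smith et al"), "B+")
print(review.flags, review.grade)
```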

2. Train Algorithms on Diverse Voices

Include marginalized authors, non-Western literature, and avant-garde works in training data. Spain’s “AI for Inclusive Education” initiative now funds datasets featuring Roma poetry and Basque experimental prose.
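
In code terms, this is as much a curation problem as a modeling one. A minimal sketch of a corpus audit follows, assuming hypothetical tradition tags on each training text; the tags and the 10% floor are invented for illustration.

```python
# A toy audit that reports which literary traditions fall below a
# representation floor in a training corpus. The `tradition` tags and
# the floor value are invented; real curation needs richer metadata.

from collections import Counter

corpus = [
    {"title": "Sonnet 18", "tradition": "english_canon"},
    {"title": "Roma ballad (translated)", "tradition": "roma"},
    {"title": "Basque experimental prose", "tradition": "basque"},
    # ... thousands more entries in a real corpus ...
]

def underrepresented(corpus: list, floor: float = 0.10) -> dict:
    counts = Counter(doc["tradition"] for doc in corpus)
    total = sum(counts.values())
    return {t: n / total for t, n in counts.items() if n / total < floor}

print("Below the representation floor:", underrepresented(corpus))
```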

3. Teach Students to “Hack” the System

In a Berlin pilot program, students analyze AI graders’ criteria to create meta-critical art—like a story that deliberately confuses the algorithm while delighting humans.


The Ultimate Question: What Is Grading For?

Grading has always been a means, not an end. Its purpose is to nurture potential, not merely rank it. As we automate assessment, we must ask:

  • Are we measuring creativity—or our ability to replicate the past?
  • Do we want students who write like Dickens or thinkers who reinvent storytelling?

In a Brussels middle school, I recently met a teacher who uses AI feedback as a “provocation.” When her students receive a bland algorithmic score, she challenges them: “Now—go rewrite it to confuse the machine and move the human.”

Perhaps that’s the answer. Let AI handle the arithmetic of education, but never the poetry.
