2026-04-28 · agents · alignment

Ceci n'est pas une explication: Evaluating Explanation Failures as Explainability Pitfalls in Language Learning Systems

Ben Knight, Wm. Matthew Kennedy, Danielle Carvalho, Isaac Pattis, James Edgell

Key claim

AI feedback in language learning can reinforce misconceptions.

The paper examines how AI language learning tools can give misleading feedback that reinforces learners' misconceptions. It introduces L2-Bench, a benchmark that assesses AI feedback quality across six critical dimensions. Its key result is the identification of 'explainability pitfalls' that can harm learning outcomes.

Novelty

8.0/10

The paper introduces a new benchmark for evaluating AI feedback in language education.

Reliability

7.0/10

The analysis is organized around the benchmark's six critical dimensions and discusses specific failures, though the methodology could be more robust.

Deep reliability assessment

The methodology supports the identification of explainability pitfalls in AI-generated feedback for language learning, but the paper may overclaim how far these findings generalize across AI systems in education.

Reproducibility

Discussion questions

How can we ensure that AI systems provide contextually appropriate feedback across diverse cultural backgrounds?
What are the implications of these explainability pitfalls for the design of future AI educational tools?
What specific conditions would need to be met to demonstrate that these identified pitfalls do not occur in a different AI system?

Key figure

The paper does not include any figures or architectural diagrams.

Read on arXiv →