Workshop co-located with EACL 2026, in Rabat, Morocco


Introduction

Large language models are undeniably reshaping language technology. Yet as models are claimed to "support X languages", the research community still lacks clear answers to core questions such as: what does multilinguality really mean, and how should we evaluate it? Counting languages in the training data or translating existing benchmarks is likely not enough. Multilingual evaluation today suffers from duplicated effort, inconsistent practices, limited comparability across works, and a generally poor understanding of the underlying theoretical and practical problems.

The Workshop on Multilingual and Multicultural Evaluation (MME) aims to bring the community together around three goals:

LLMs in every language? Prove it. Showcase your work on rigorous, efficient, scalable, culture-aware multilingual benchmarking.

Call for Papers


We invite submissions on topics including, but not limited to:

  • Evaluation resources beyond English or Western-centric perspectives and materials
  • Annotation methodology and procedures
  • Evaluation protocols: ranking vs. direct assessment, rubric-based vs. reference-based vs. reference-free, prompt variations, etc.
  • Complex and practical tasks: multimodality, fairness, long inputs/outputs, tool use, code-switching, literary text, etc.
  • Sociocultural and cognitive variation affecting the use and evaluation of LLMs across languages
  • Scalable evaluation of cultural and factual knowledge
  • Efficient evaluation of a massive number of languages and tasks
  • Metrics, LLM judges, and reward models
  • Standardization in reporting and comparison of multilingual performance
  • AI-assisted evaluation: data, methods, metrics, and standards
  • Other position, application, or theory contributions

We welcome both archival and non-archival papers; both will be presented at the workshop. Archival papers will additionally be published in the ACL Anthology. An archival submission cannot be under review at, or accepted to, another archival venue. ARR-reviewed papers can be committed directly.

All archival submissions must follow the ACL style guidelines and be anonymized for double-blind review. Short papers may have up to 4 pages and long papers up to 8 pages, excluding references and appendices. Upon acceptance, one additional page will be allowed for the camera-ready version. Non-archival submissions have no formatting or anonymity requirements.

Please submit your work by December 19, 2025 through this link. ARR (meta-)reviewed papers can be committed by January 5, 2026 using this link.

Key Dates


All deadlines are 11:59 PM UTC-12:00 ("Anywhere on Earth").

  • Direct submission deadline: December 19, 2025, direct submission link
  • ARR-reviewed paper submission deadline: January 5, 2026, commitment link
  • Notification of acceptance: January 23, 2026
  • Camera-ready deadline: February 3, 2026

Speakers


  • Sebastian Ruder (Meta): "From Preferences to Proficiency: Evaluating Native-like Quality of LLMs"
  • Freda Shi (University of Waterloo): "Challenges and Opportunities of Language Models as Digital Field Linguists"
  • Wenda Xu (Google): "How to Properly Evaluate LLMs at Multilinguality Space?"

Accepted Papers


Further, we welcome the non-archival presentation of the following works:

Organizers


  • Pinzhen Chen (Queen's University Belfast)
  • Vilém Zouhar (ETH Zurich)
  • Hanxu Hu (University of Zurich)
  • Simran Khanuja (CMU)
  • Wenhao Zhu (ByteDance)
  • Barry Haddow (University of Edinburgh)
  • Alexandra Birch (University of Edinburgh)
  • Alham Fikri Aji (MBZUAI)
  • Rico Sennrich (University of Zurich)
  • Sara Hooker (Adaptable Intelligence)

Please reach out to mme-workshop@googlegroups.com with any questions or inquiries. This workshop follows ACL's Anti-Harassment Policy.

Program Committee