University of Cambridge Language Technology Lab

The Problem: Communities with the greatest linguistic diversity often face severe infrastructure constraints.

[ figure loads here ]

Figure 1. Countries with high linguistic diversity have the most limited network connectivity (upper left). Internet penetration is sourced from ITU (2025), number of living languages (log-scale) from the Ethnologue (SIL International, 2025), and income groups from the World Bank (2025). Hover over a point to see the country.

The field has several names for this: the low-resource double bind (Ahia et al., 2021), the square-one bias (Ruder et al., 2022), Zeno's paradox of language technology (Nigatu et al., 2024), among others.

The Challenge: How can we develop language models that are both multilingual and deployable on-device?

Our Approach: To understand the state of the art and the challenges of combining the two areas, we survey 232 papers that tackle this problem across the language modelling pipeline.

[ figure loads here ]

Figure 2a. Reported language coverage of edge LM papers. We show 78 papers (of 232) that report a concrete number of evaluated languages and bin them into four brackets: monolingual (1), few (2–10), many (11–50), and massive (50+), categorized by research focus.
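The bracketing in Figure 2a can be sketched as a small helper. This is a hypothetical illustration, not code from the paper: the function name `language_bracket` is ours, and the bracket boundaries are assumed from the caption (treating "50+" as strictly more than 50).

```python
def language_bracket(n: int) -> str:
    """Bin a paper's number of evaluated languages into the four
    brackets used in Figure 2a (boundaries assumed from the caption)."""
    if n == 1:
        return "monolingual"
    if n <= 10:
        return "few"
    if n <= 50:
        return "many"
    return "massive"
```

For example, a paper evaluating on 7 languages falls into the "few" bracket, and one evaluating on 100 languages into "massive".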

[ figure loads here ]

Figure 2b. Model sizes (in billions of parameters) of various language models. For each model family in our curated set of released models, we recorded publicly documented parameter counts and plotted the range of available sizes on a log scale.

Deploying on the edge and supporting multilinguality impose competing requirements that create challenges across the language modelling pipeline. Click on each pipeline stage (or requirement) to read about the challenges and the state of the art.

[ pipeline diagram loads here ]

Analysis

We also looked into edge LM systems, which we define as completed efforts that have been integrated into real-world applications. To identify them, we manually classified each of the 232 papers by whether an actual model deployment took place, yielding 36 systems.

To examine how edge LM systems are made, we situate the 36 deployment papers within the broader 232 surveyed papers. We embed each abstract with MiniLM, reduce to 2D with UMAP, and cluster with HDBSCAN; KeyBERT extracts the top keywords per cluster. Hover any cluster to see representative papers.

[ chart loads here ]

Figure 3. Clustering of the 232 surveyed papers by abstract similarity. Real-world deployments (★) tend to concentrate near a few clusters such as model compression and dialog datasets, while clusters like reasoning performance or prompt compression have little to no representation, suggesting that edge LM deployments favor a relatively narrow set of methods.
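The embed–reduce–cluster pipeline behind Figure 3 can be sketched in a few lines. This is a minimal, self-contained analogue using scikit-learn stand-ins rather than the actual stack: TF-IDF in place of MiniLM embeddings, TruncatedSVD in place of UMAP, and DBSCAN in place of HDBSCAN (KeyBERT's keyword step is omitted). The abstracts below are illustrative placeholders, not the surveyed papers.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD
from sklearn.cluster import DBSCAN

# Placeholder abstracts standing in for the 232 surveyed papers.
abstracts = [
    "quantization for on-device language model compression",
    "pruning and distillation for edge model compression",
    "a multilingual dialog dataset for low-resource languages",
    "crowdsourced dialog data collection in local languages",
    "chain-of-thought reasoning performance of small models",
]

# 1) Embed: sparse TF-IDF vectors (stand-in for MiniLM sentence embeddings).
X = TfidfVectorizer().fit_transform(abstracts)

# 2) Reduce each abstract to a 2D point (stand-in for UMAP).
coords = TruncatedSVD(n_components=2, random_state=0).fit_transform(X)

# 3) Density-based clustering (stand-in for HDBSCAN); label -1 marks noise.
labels = DBSCAN(eps=0.5, min_samples=2).fit_predict(coords)

print(labels)  # one cluster label per abstract
```

Swapping the stand-ins for `sentence-transformers` (MiniLM), `umap-learn`, and `hdbscan` recovers the pipeline described above; the overall shape of the code stays the same.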

Recommendations

  1. For NLP researchers and model developers: Evaluate edge models beyond memory (e.g., compute and energy), and explore underrepresented methods since current deployments cluster around a relatively narrow toolkit.
  2. For deployment practitioners and communities at the edge: Build cross-sector collaborations (academia, industry, research collectives, government), and involve local communities as active collaborators in development and deployment.
  3. For policymakers and funders: Invest not only in model development but also in infrastructure and devices that make deployment feasible in linguistically diverse, lower-resource settings; increase public-sector participation in edge LM efforts.

Citation

@misc{miranda2026multilingualityedgedevelopinglanguage,
  title={{M}ultilinguality at the {E}dge: {D}eveloping {L}anguage {M}odels for the {G}lobal {S}outh},
  author={Lester James Validad Miranda and Songbo Hu and Roi Reichart and Anna Korhonen},
  year={2026},
  eprint={2604.21637},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2604.21637},
}

Discussion

Have feedback, questions, or ideas? Join the conversation below.