Lj V. Miranda | A collection of notes, projects, and essays.
Profile

Hi! I'm Lj V. Miranda, and welcome to my website!

I'm a PhD student at the University of Cambridge, where I'm advised by Anna Korhonen and research multilingual and equitable NLP. In the past, I've worked as an engineer, consultant, and researcher, mostly in the fields of NLP and AI.

I'm broadly interested in data-centric approaches to building language technologies at scale. I also have a special interest in improving the state of Filipino NLP. I'm happy to discuss research and collaborate, so feel free to reach out and chat!

 
 

Recent posts

What's new?

    May 2026: Gave a talk at the Analytics and AI Association of the Philippines about FilBench and what it takes to build Filipino-centric LLMs. Here are some notes from my talk.

    May 2026: Proud to release our survey on multilingual edge models (check the website!). If you know me, you know I care a lot about AI in the Global South. I’d like to continue doing this type of sociotechnical research, so reach out if you wanna chat!

    Apr 2026: Sharing my first PhD work, Polyglot Teachers! Here, I studied what makes a good teacher model for generating multilingual data. I’m excited to continue this research agenda on multilingual synthetic data generation!

    Dec 2025: So excited to see the release of OLMo 3! My small contribution was on creating the tool-use SFT mix during my last few months as a pre-doc.

    Oct 2025: I’m starting my PhD at the University of Cambridge's Language Technology Lab, where I will be advised by Anna Korhonen.

    Aug 2025: I’m proud to introduce FilBench, a comprehensive LLM benchmark for Filipino! Accepted at EMNLP 2025 Main. I also share some thoughts in this blog post.

    May 2025: Excited to share that I have three first & co-first author papers accepted at ACL Main: HyPER, M-RewardBench, and UD-NewsCrawl. A large collab project, SEA-VL, also got into Main!