Hi! I'm Lj Miranda, and welcome to my website!
I'm currently a predoctoral researcher on the AllenNLP team at Ai2. In the past, I've worked as an engineer, consultant, and researcher, mostly in the field of NLP and AI.
I'm broadly interested in data-centric approaches to building language technologies at scale. I'm happy to discuss research and collaborate, so feel free to reach out!
Recent Posts
-
The missing pieces in Filipino NLP in the age of LLMs
The rise of LLMs is forcing us to rethink Filipino NLP. But there's still a ton of work to do—just not the stuff you might think. Here's my take on what's worth doing, what's a waste of time, and where Filipino NLP research should be heading.
-
Guest lecture @ DLSU Manila: Artisanal Filipino NLP Resources in the time of Large Language Models
Last month, I gave another guest lecture, this time in Dr. Charibeth Cheng's graduate class at DLSU. Here, I talked about the craft of building small-scale yet effective NLP models for Filipino in the face of today's large language models.
-
A lexical view of contrast pairs in preference datasets
Can we spot differences between preference pairs just by looking at their word embeddings? In this blog post, I want to share my findings from examining lexical distances between chosen and rejected responses in preference datasets.
What's New?
Nov 2024: Happy to have been part of the exciting Tülu 3 release! My main contribution is scaling up our preference dataset using a synthetic data generation pipeline that led to improvements in our DPO models.
Oct 2024: Our paper on routing preference instances to human or LM annotators, Hybrid Preferences, is now available. This is the first work I co-led (with Yizhong Wang) at Ai2!
Oct 2024: Our paper on evaluating reward models in multilingual settings, M-RewardBench, is now available. This was a fun collab with folks from Cohere for AI!
Sep 2024: My cross-institutional collabs, Consent in Crisis and SEACrowd, were accepted to NeurIPS D&B and EMNLP 2024, respectively.
Aug 2024: 🏆 Our work on evaluating reward models in multilingual settings won Silver Prize in Cohere for AI’s Aya Expedition!
Jul 2024: I gave a guest lecture at DLSU about building Filipino NLP resources. Thanks to Dr. Charibeth Cheng for inviting me!
Mar 2024: Universal NER was accepted to NAACL 2024. I hope to still work on linguistic aspects of NLP in the future!
Mar 2024: We released RewardBench, the first benchmark for evaluating reward models.