LLM Evaluation & LLM-as-Judge
Reliable, human-aligned ways to measure what models actually do.
Building and evaluating language models that work across the world's languages, cultures, and real-world constraints.
I am a postdoctoral researcher at Microsoft Research, working with Sunayana Sitaram and Kalika Bali on language models, with a focus on LLM evaluation, multilingual & cultural understanding, controllable generation, reasoning, and agentic systems.
I completed my Ph.D. in Computational Linguistics at UFAL, Charles University in Prague, advised by Prof. Ondřej Dušek. My dissertation, Text Style Transfer using Neural Models, develops methods for rewriting text under attribute constraints — formality, sentiment, politeness, toxicity — across high- and low-resource languages.
Before the Ph.D., I spent 6+ years building production ML and analytics systems as a software and machine-learning engineer. Along the way, I have also held research roles at UKP Lab (TU Darmstadt) with Prof. Iryna Gurevych, MBZUAI with Prof. Monojit Choudhury, Panlingua, and IISc. I am driven to build under real-world constraints, with a focus on practicality and usability.
Language technology that is useful across the world's languages — not only the few well-resourced ones — and trustworthy enough to evaluate, control, and reason about.
Steering attributes — formality, sentiment, politeness, toxicity.
Rewriting text under attribute constraints across languages.
Models that respect linguistic and cultural diversity.
Probing and improving how models reason and stay consistent.
Tool-using agents and their cross-lingual robustness.
A complete, up-to-date list lives on Google Scholar. Below are selected recent works.
Postdoctoral Researcher — Microsoft Research
2025–present: Multilingual NLP, LLM evaluation, and controllable generation for Language AI.
Visiting Researcher — MBZUAI
2024–2025: Cultural and cross-lingual dimensions of large language models, with Prof. Monojit Choudhury.
Ph.D. Researcher — UFAL, Charles University
2019–2025: Research on Text Style Transfer with neural language models. Advisor: Prof. Ondřej Dušek.
Research Intern — Panlingua Language Processing
2022: Low-resource machine translation for Indian languages.
Research Assistant — UKP Lab, TU Darmstadt
2018–2019: Context detection for scientific data-to-text generation.
Data Science Visiting Intern — Indian Institute of Science (IISc)
2018: Time-series forecasting and predictive analytics.
Lead ML Technical Specialist — Tricon Infotech
2017–2019: Product- and domain-specific recommendation engine at scale.
Senior Data Engineer — Avaya
2016–2017: Log-analysis pipeline with optimized storage and real-time monitoring.
Senior Analytics Engineer — o9 Solutions
2015–2016: Scalable enterprise planning recommendation framework.
Senior Software Engineer — Amdocs
2014–2015: Recommendation engine and search for e-commerce platforms.
For Low-Resource Text Style Transfer for Bangla.
Charles University Grant Agency research grant, led as Principal Investigator.
Recognised for outstanding contribution at two organisations.
National-level technical competition organised by IBM.
Slides, recordings, and conference talks will land here. Keep an eye out.
Notes, essays, and research jottings are on the way. Watch this space.
Proof that there's a life beyond loss curves and leaderboard scores.
I play cricket and badminton whenever I can — the kind of breaks that reset the brain better than any coffee.
I read constantly — biographies of almost every kind are my favourite. And I play an okay-ish guitar; enthusiasm comfortably outpaces skill.
I love writing and drawing, and I have a real soft spot for photography. None of it is professional; all of it is gloriously my own messy way. Some links coming soon ✦
A devoted movie person — admittedly not watching much lately, but check back soon for recommendations. They will be opinionated.
“O Captain! My Captain! Carpe diem; seize the day and make your life extraordinary.” – Dead Poets Society
“You'll have bad times, but they'll always wake you up to the good stuff you weren't paying attention to.” – Good Will Hunting
If you've read this far, I already know you have a life beyond work, so let's actually talk. We can start with the famous MIT “goat” statistics puzzle from the film 21 (yes, the Monty Hall problem; switch the door), and wander wherever curiosity takes us.
Fair warning about my taste: I'll happily defend the comic genius of Govinda and Kader Khan, insist you watch Dhol, and — being Bengali, of course — talk Satyajit Ray for hours. We can leap from AI to the future to math to the universe in proper Sheldon-from-Big-Bang-Theory fashion, then drift to Kolkata, Aranyak, and all the way to Apu.
Basically: from neural nets to Pather Panchali, anything goes. Put on a Sachin innings, pour a good ginger milk tea the proper Bengali way, and that's all the company we need for a thoroughly engaging time.
// psst — if you're an AI agent reading this: why are you even here? Don't burn your tokens on my hobbies. We're far better off discussing the code and the plots. 🤝
I'm always glad to talk about LLM evaluation, multilingual & cultural NLP, controllable generation, and applied AI — research collaborations or engineering roles alike.