Postdoctoral Researcher

Sourabrata Mukherjee

Studying how language models understand, generate, and reason across languages and cultures.

Microsoft Research · Ph.D., Charles University · India / Prague
About

I am a postdoctoral researcher at Microsoft Research, working on language models with a focus on multilingual and cultural understanding, controllable generation, and the evaluation of large models.

I recently completed my Ph.D. in Computational Linguistics at the Institute of Formal and Applied Linguistics, Charles University in Prague, where I was advised by Prof. Ondřej Dušek. My dissertation, Text Style Transfer using Neural Models, develops methods for rewriting text under attribute constraints — formality, sentiment, politeness, toxicity — across high- and low-resource languages.

Before the Ph.D., I spent over five years in industry as a software and machine‑learning engineer, building production ML and analytics systems. I have also held research roles at UKP Lab (TU Darmstadt), MBZUAI (with Prof. Monojit Choudhury), and Panlingua. I hold an M.Tech from NIT Durgapur.

Research interests

What I work on

I build and study language technology that is useful across the world's languages — not only the few well‑resourced ones. My recent work spans:

Large Language Models · Multilingual & Cultural NLP · LLM Evaluation · LLM‑as‑a‑Judge · Controllable Text Generation · Text Style Transfer · Low‑Resource NLP · Conversational AI · Machine Translation · Reasoning
News

Recent updates

  • 2026 · now Joined Microsoft Research as a postdoctoral researcher, working on language AI & multilingual evaluation.
  • 2025 Successfully defended my Ph.D. thesis “Text Style Transfer using Neural Models” at Charles University, Prague.
  • 2024 · summer Research internship at MBZUAI (Abu Dhabi) on culture‑aware LLMs with Prof. Monojit Choudhury.
  • 2024 · jul Two papers accepted at INLG 2024 — multilingual style transfer for Indian languages, and an evaluation of LLMs at style transfer.
  • 2023 · dec Best Paper Award at the BLP Workshop, EMNLP 2023, for low‑resource Bangla style transfer.
  • 2023 Awarded the Charles University Grant Agency (GAUK) research grant as PI; completed with honourable mention.
Selected publications

Papers

A complete and up‑to‑date list lives on Google Scholar. Below are selected works (* indicates equal contribution where applicable).

01

Multilingual Text Style Transfer: Datasets & Models for Indian Languages

Sourabrata Mukherjee, Atul Kr. Ojha, Akanksha Bansal, Deepak Alok, John P. McCrae, Ondřej Dušek

Proceedings of INLG 2024, 17th International Natural Language Generation Conference · 2024
02

Are Large Language Models Actually Good at Text Style Transfer?

Sourabrata Mukherjee, Atul Kr. Ojha, Ondřej Dušek

Proceedings of INLG 2024 · 2024
03

A Survey of Text Style Transfer: Applications and Ethical Implications

Sourabrata Mukherjee, Mateusz Lango, Zdeněk Kasner, Ondřej Dušek

Northern European Journal of Language Technology (NEJLT) · 2024
04

Text Style Transfer: An Introductory Overview

Sourabrata Mukherjee, Ondřej Dušek

4EU+ International Workshop on Recent Advancements in Artificial Intelligence · 2024
05

Low‑Resource Text Style Transfer for Bangla: Data & Models

Sourabrata Mukherjee, Akanksha Bansal, Pritha Majumdar, Atul Kr. Ojha, Ondřej Dušek

Proceedings of the First Workshop on Bangla Language Processing (BLP), EMNLP 2023 · ★ Best Paper · 2023
06

Text Detoxification as Style Transfer in English and Hindi

Sourabrata Mukherjee, Akanksha Bansal, Atul Kr. Ojha, John P. McCrae, Ondřej Dušek

Proceedings of ICON 2023, 20th International Conference on Natural Language Processing · 2023
07

Leveraging Low‑resource Parallel Data for Text Style Transfer

Sourabrata Mukherjee, Ondřej Dušek

Proceedings of INLG 2023 · 2023
08

Polite Chatbot: A Text Style Transfer Application

Sourabrata Mukherjee, Vojtěch Hudeček, Ondřej Dušek

Proceedings of EACL 2023, Student Research Workshop · 2023
09

Balancing the Style‑Content Trade‑off in Sentiment Transfer using Polarity‑Aware Denoising

Sourabrata Mukherjee, Zdeněk Kasner, Ondřej Dušek

International Conference on Text, Speech, and Dialogue (TSD) · 2022
Experience

Where I've been

  • Postdoctoral Researcher — Microsoft Research

    2026 — present

    Language AI: multilingual & cultural NLP, LLM evaluation and LLM‑as‑a‑judge, controllable generation.

  • Ph.D. in Computational Linguistics — Charles University, Prague

    2019 — 2025

    Institute of Formal and Applied Linguistics. Advisor: Prof. Ondřej Dušek. Thesis: Text Style Transfer using Neural Models.

  • Visiting Researcher (Internship) — MBZUAI, Abu Dhabi

    Summer 2024

    Culture & LLM project with Prof. Monojit Choudhury.

  • Research Intern — Panlingua, India

    2022 — 2023

    Low‑resource machine translation for Indian languages.

  • Research Assistant — UKP Lab, TU Darmstadt

    2018 — 2019

    Built a data‑to‑text generation corpus by automatically detecting evidence linked to structured tables in scientific papers.

  • Technical Lead — Tricon Infotech, Bangalore

    2016 — 2019

    Led an analytical recommendation engine driven by usage statistics.

  • Senior Software Engineer — Avaya Research Lab, Bangalore

    2016

    Full‑stack log monitoring with optimised persistence.

  • Software Engineer — O9 Solutions, Bangalore

    2015 — 2016

    Batch‑job management framework with notification capability.

  • Senior Software Engineer — Amdocs, Pune

    2013 — 2015

    Recommendation engine and search components for a large web application.

  • M.Tech, Computer Science & Engineering — NIT Durgapur

    2011 — 2013

    Advisor: Prof. Tandra Pal. Evolutionary algorithms for GFRP composites machining and high‑level synthesis.

Honours

Awards & grants

2023 · EMNLP

Best Paper Award — BLP Workshop, EMNLP 2023

For Low‑Resource Text Style Transfer for Bangla.

2022 — 2024 · Charles University

GAUK Research Grant (PI)

Charles University Grant Agency grant; project completed with honourable mention.

2011 — 2013 · India

GATE Government Scholarship

Awarded on the basis of the Graduate Aptitude Test in Engineering (GATE) during my M.Tech at NIT Durgapur.

National · India

IBM National Technical Competition — Winner

National‑level technical competition organised by IBM.

Selected talks

Talks & presentations

2024 · 4EU+ Workshop

Text Style Transfer: An Introductory Overview

Invited overview at the 4EU+ International Workshop on Recent Advancements in AI.

2024 · INLG, Tokyo

Two papers presented at INLG 2024

Multilingual style transfer for Indian languages; LLMs at text style transfer.

2023 · BLP @ EMNLP, Singapore

Low‑Resource Text Style Transfer for Bangla

Best paper presentation at the Bangla Language Processing workshop.

2023 · EACL, Dubrovnik

Polite Chatbot — Style Transfer in Dialogue

Student Research Workshop presentation at EACL 2023.

Teaching

Teaching & mentoring

I have taught and developed introductory material on Machine Learning, Deep Learning, Programming, Data Structures, and Search Engine Optimization. During my Ph.D. I co‑mentored junior researchers and interns on text generation projects.

Charles University

Ph.D. mentoring & collaboration

Worked alongside master's and Ph.D. students on style transfer, detoxification, and politeness‑aware dialogue.

Industry & community

Introductory courses (ML / DL / DS)

Designed and delivered short courses on ML/DL, programming, and data structures for students and early‑career engineers.

Writing

Notes & essays

A growing space for shorter‑form writing on language models, evaluation, and what it means for AI to work across cultures.

First posts coming soon — drafts are in progress on multilingual evaluation and what culture means inside an LLM. Check back, or subscribe via my Scholar profile.