Khondoker I. Islam

NLP Master's student @ University of Groningen & Universität des Saarlandes

khondoker.islam9 [AT] gmail.com

About

I am an NLP Master's student in the European Union Dual Degree Research Program, affiliated with both the University of Groningen and Saarland University. Recently, I worked with the @GroNLP research group to evaluate the reasoning capabilities of LLMs across various languages [1]. In my Bachelor's thesis, I demonstrated the limitations of encoder-only LMs [2], [3]. In between, I contributed to a Multilingual Multi-Modal lab, where I focused on enhancing the linguistic and visual capabilities of VLMs in ancient Hebrew, Dutch, and English to improve OCR for their respective historical documents [4]. Separately, I mapped tonal emotion values into textual representations to improve LMs' comprehension of audio features [5].

News
August 2025     I am honored to receive the Erasmus+ scholarship!
May 2025     I am honored to receive a Google & Zendesk Scholarship to attend LxMLS 2025!
April 2025     Our team won the Best NLP Student Project Award! 🏆
September 2024     I was selected as a Programme Committee Member for the Rema Linguistics Program at RuG!
August 2024     I am honored to receive the NL Scholarship from Nuffic!
April 2024     I am honored to receive the Erasmus Mundus Consortium Award!
December 2023     I won the Outstanding Reviewer Award at the BLP Workshop @ EMNLP 2023! 🏆
September 2022     Our EmoNoBa paper was accepted to AACL-IJCNLP 2022!
August 2021     Our SentNoB paper was accepted to Findings of EMNLP 2021!
December 2020     Our mBERT transfer learning evaluation for a low-resource language was accepted to ICCIT 2020!

Education

University of Groningen, Universität des Saarlandes     September 2024 - now

Master's in NLP

Shahjalal University of Science and Technology     Jan. 2017 - Jan. 2021

B.Sc. (Engg.) in Computer Science & Engineering (advisor: Md Saiful Islam)

Publications

* indicates equal contribution.

Reveal-Bangla: A Dataset for Cross-Lingual Multi-Step Reasoning Evaluation

Khondoker Ittehadul Islam, Gabriele Sarti

arXiv 2025

EmoNoBa: A Dataset for Analyzing Fine-Grained Emotions on Noisy Bangla Texts

Khondoker Ittehadul Islam*, Tanvir Hossain Yuvraz*, Md Saiful Islam, Enamul Hassan

AACL-IJCNLP 2022

SentNoB: A dataset for analysing sentiment on noisy Bangla texts

Khondoker Ittehadul Islam, Sudipta Kar, Md Saiful Islam, Mohammad Ruhul Amin

Findings of EMNLP 2021

Improving OCR for Historical Texts of Multiple Languages

Hylke Westerdijk, Ben Blankenborg, Khondoker Ittehadul Islam

arXiv 2025

Joint Effects of Argumentation Theory, Audio Modality and Data Enrichment on LLM-Based Fallacy Classification

Hongxu Zhou, Hylke Westerdijk, Khondoker Ittehadul Islam

arXiv 2025

Leveraging sentiment for offensive text classification

Khondoker Ittehadul Islam

arXiv 2024

Sentiment analysis in Bengali via transfer learning using multi-lingual BERT

Khondoker Ittehadul Islam, Md Saiful Islam, Mohammad Ruhul Amin

ICCIT 2020