I am a Lecturer in Electrical, Electronic, and Communication Engineering at the Military Institute of Science and Technology (MIST). My research interests center on machine learning (ML) and deep learning (DL) applications in visual and multimodal speech recognition, clinical speech anomaly detection, and physical-domain attack identification in smart grids.
My primary research contribution is the development of LipBengal, the first Bengali visual speech recognition (VSR) dataset designed to address the lack of resources for low-resource languages. This work involved large-scale data collection, manual annotation, and structured dataset design using Python, OpenCV, and MediaPipe. Building on this dataset, I developed a CNN–BiLSTM model with CTC loss for visual-only speech recognition, achieving an accuracy of 84.6%. This work has been published in a Q2 journal (Elsevier Data in Brief).
Beyond speech research, I have co-authored IEEE conference papers on electricity theft detection and grid stability analysis in smart grids, where I applied RNN-based feature extraction and optimized ensemble machine learning models. These experiences reflect my broader interest in data-driven modeling of complex cyber-physical systems.
I am currently pursuing an M.Sc. in Electrical Engineering, where I have focused on digital speech processing, clinical speech anomalies, and multimodal learning. My recent work includes fine-tuning Whisper-based models for Bengali and extending LipBengal toward audio-visual speech recognition (AVSR).
My long-term goal is to pursue doctoral research and an academic career, developing ML/DL frameworks for VSR, AVSR, or smart grid theft and attack identification that can help address problems with national and social impact.
Interests
- Computer Vision
- Speech Signal Processing
- Speech Anomaly Classification
Education
- B.Sc. in Electrical, Electronic and Communication Engineering (EECE), Military Institute of Science & Technology (2020-2024)
Selected Publications
LipBengal: Pioneering Bengali lip-reading dataset for pronunciation mapping through lip gestures
2024
Data in Brief
Abstract
The LipBengal dataset represents a significant advancement in Bengali lip-reading and visual speech recognition research, poised to drive future applications and technological progress. Despite Bengali's global status as the seventh most spoken language with approximately 265 million speakers, linguistically rich and widely spoken languages like Bengali have been largely overlooked by the research community. LipBengal fills this gap by offering a pioneering dataset tailored for Bengali lip-reading, comprising visual data from 150 speakers across 54 classes, encompassing Bengali phonemes, alphabets, and symbols. Captured under diverse and uncontrolled conditions, LipBengal stands as the most extensive Bengali lip-reading dataset to date, designed to facilitate robust benchmarking and validation of novel deep learning architectures. Detailed annotations extend from phoneme-level classifications to full sentence constructions, providing a granular and comprehensive dataset. The primary potential of LipBengal lies in its thorough coverage of Bengali phonemes, capturing diverse lip movements linked to distinct sounds. This rich dataset holds promise for training accurate lip-reading models, with implications for improved accessibility, enhanced speech recognition, silent speech interfaces, and linguistic research. The dataset's diversity in speaker backgrounds enhances its utility, ensuring broader representation of Bengali pronunciation patterns. Meticulous annotation and curation further bolster its quality and reliability, making LipBengal a valuable asset for researchers and developers in the field.
Achievements
- 2021, 2022, 2023
- 2022: Placed among the top 10 students of the Python Programming Certification Course 2022
Experiences
Work Experiences
- Lecturer, Military Institute of Science and Technology (2024-Present)