Welcome! This is Shafi Ud Doula.

Senior AI R&D Engineer

8+ years of experience in Document Intelligence, GenAI, NLP, Computer Vision & Vision-Language Models. Based in Tokyo, Japan.

Md Shafi Ud Doula

About

Shafi Ud Doula is an AI R&D Engineer with over 8 years of professional experience spanning Document Intelligence, Payment Automation, and Digital Transformation. His work sits at the intersection of Computer Vision, Natural Language Processing, Generative AI, and Vision-Language Models, backed by hands-on cloud expertise on AWS and GCP.

Currently based in Tokyo, Japan, he leads GenAI, Agentic AI, and Foundation Model solutions for enterprise clients across healthcare, telecommunications, and insurance domains.

Education

M.Eng. in Data Science & AI

Asian Institute of Technology, Thailand

Thesis: "A multi-modal framework for context-aware plant disease classification and segmentation integrating visual and textual features"

πŸ† Best Thesis Award
BRUR Logo

B.Sc. in Computer Science & Engineering

Begum Rokeya University, Bangladesh

Thesis: "A Language-Independent Deep Parts-of-Speech Tagger"

Work Experience

Senior AI R&D Engineer

2023 – Present

Deloitte Tohmatsu LLC, Tokyo, Japan

  • Lead GenAI, Agentic AI, and Foundation Model solutions for enterprise AI document processing across the healthcare, telecommunications, and insurance domains
  • Drive end-to-end requirements analysis, software specification, and solution architecture
  • Manage cross-functional onshore and offshore engineering teams
πŸ† Best Employee Award 2024 (Culture & Inclusion)

Guest Lecturer

2025 – Present

Department of Computer Science and Engineering, Begum Rokeya University, Rangpur, Bangladesh

  • Course: Soft Skill and Academic Writing

Researcher

2024 – Present

Smart Data Lab, Asian Institute of Technology, Thailand

  • Conducting research in data science and artificial intelligence

AI R&D Engineer

2021 – 2023

Deloitte Touche Tohmatsu LLC, Tokyo, Japan

  • Developed payment automation and legal document AI solutions, as well as applications built on low-code/no-code platforms

Junior AI Engineer

2019 – 2021

Deloitte Touche Tohmatsu LLC, Tokyo, Japan

  • Developed Deep ICR (AI OCR for JP/EN documents)

AI Researcher & Developer

2016 – 2019

Semantics Lab, Begum Rokeya University, Bangladesh

  • Worked on spell correction, facial analysis, emotion detection, Bangla OCR, and a safe driving assistance application

Projects

Industry Experience Highlights

Throughout his career, Shafi has contributed to large-scale enterprise AI systems, including:

Multilingual Document AI

Production-grade OCR/ICR platforms for structured data extraction from scanned documents, integrating object detection, text recognition, NER, and layout analysis.

Payment Automation

Invoice field extraction and normalization pipelines feeding downstream RPA and ERP workflows with validation and exception handling.

Legal Document AI

LLM-powered contract review systems for clause extraction, obligation identification, and risk flagging with evidence-based outputs.

Healthcare GenAI

Retrieval-augmented generation services for policy and clinical document QA with guardrails and compliance controls.

Runtime Optimization

Model-free inference acceleration for document AI pipelines through batching, mixed precision, and I/O tuning strategies.

Low-Code GenAI Enablement

Enterprise-grade low-code AI applications with agent tools, connectors, guardrails, and standardized prompt management.

Open-Source & Public Projects

GenAI

medicalChatBot

Patient-facing medical Q&A comparing Seq2Seq and GPT-2 models on the MedQuAD dataset. Covers the full path from data curation to deployment with Flask.

GPT-2, Flask, HuggingFace, NLTK

NLP

distilBERT Student Layers Init

Comparative study on student layer initialization strategies (top-K, bottom-K, odd/even) for knowledge distillation.

BERT, Distillation, HuggingFace, GLUE

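As a rough illustration of the strategies this project compares, the sketch below picks which teacher layers would seed a shallower student. It is not taken from the repository: the function name `select_teacher_layers`, the 0-based indexing, and the reading of top-K as the last K layers, bottom-K as the first K, and odd/even as alternating layers are all assumptions made for the example.

```python
def select_teacher_layers(num_teacher_layers, num_student_layers, strategy):
    """Return 0-based teacher layer indices to copy into the student.

    Hypothetical helper: the strategy names and their meanings are
    assumptions, not the project's actual definitions.
    """
    if not 0 < num_student_layers <= num_teacher_layers:
        raise ValueError("student depth must be between 1 and the teacher depth")

    layers = list(range(num_teacher_layers))
    if strategy == "bottom":      # first K layers, closest to the input
        chosen = layers[:num_student_layers]
    elif strategy == "top":       # last K layers, closest to the output
        chosen = layers[-num_student_layers:]
    elif strategy == "even":      # layers 0, 2, 4, ...
        chosen = layers[0::2][:num_student_layers]
    elif strategy == "odd":       # layers 1, 3, 5, ...
        chosen = layers[1::2][:num_student_layers]
    else:
        raise ValueError(f"unknown strategy: {strategy!r}")

    if len(chosen) < num_student_layers:
        raise ValueError("not enough teacher layers for this strategy")
    return chosen


if __name__ == "__main__":
    # Example: a 12-layer teacher distilled into a 6-layer student.
    for name in ("bottom", "top", "even", "odd"):
        print(name, select_teacher_layers(12, 6, name))
```

In a real distillation setup the returned indices would decide which teacher weights are copied into each student layer before training starts.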
NLP

Machine Translation EN↔BN

NMT with attention/Transformer ablations for English-Bangla translation. Explores low-resource MT with BLEU/SacreBLEU evaluation.

PyTorch, Transformer, SentencePiece, OPUS-100

GenAI

Customized GPT (AIT Chatbot)

RAG-powered campus assistant with scraping, vectorization, and a Flask UI. Covers the full RAG architecture end to end.

LangChain, FAISS, Flask, RAG

CV

ImageSimilaritySearch

Deep-feature indexing & KNN retrieval using VGG16 embeddings for near-duplicate detection and visual search.

VGG16, FAISS, TensorFlow, KNN

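The retrieval step of this project can be pictured with a small brute-force sketch. It is an illustration only: the project lists VGG16 embeddings and FAISS, whereas the code below uses tiny hand-written vectors and plain Python, and both the function names and the choice of cosine similarity as the metric are assumptions.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def nearest_neighbors(query, index, k=3):
    """Return the k most similar (name, similarity) pairs, best match first.

    `index` is a hypothetical dict mapping an image name to its embedding.
    """
    ranked = sorted(
        ((name, cosine_similarity(query, vector)) for name, vector in index.items()),
        key=lambda pair: pair[1],
        reverse=True,
    )
    return ranked[:k]


if __name__ == "__main__":
    # Toy 2-D "embeddings" standing in for real deep features.
    toy_index = {"cat_1": [1.0, 0.0], "cat_2": [0.9, 0.1], "car_1": [0.0, 1.0]}
    print(nearest_neighbors([1.0, 0.0], toy_index, k=2))
```

An approximate-nearest-neighbor library replaces this linear scan once the collection grows, but the ranking idea stays the same.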
GenAI

LangChain RAG with Ollama

Minimal local RAG with HF embeddings → FAISS → Ollama LLM. Includes optional image-grounding via LLaVA/BakLLaVA.

LangChain, FAISS, Ollama, LLaVA

GenAI

LangGraph Contract Review

Agentic contract analysis with clause extraction, intent classification, and risk flagging, built from LangGraph nodes with policy-driven redlines.

LangGraph, LLaMA-3, Ollama, YAML

GenAI

Agentic RAG Chat

Local RAG chatbot framework integrating LangChain, FAISS, and Ollama. Demonstrates private, cost-efficient LLM deployment.

LangChain, FAISS, Ollama, RAG

NLP

Instruction-Tuning

Full instruction-tuning workflow (data prep → SFT → eval → Flask web app) using DistilGPT-2 on the Alpaca dataset.

SFTTrainer, HuggingFace, Flask, BLEU

NLP

Sentence Embedding with BERT

BERT-based sentence embeddings for retrieval, clustering, and semantic similarity tasks.

BERT, HuggingFace, Embeddings

NLP

Language Model (LSTM)

Trains LSTM language models on Harry Potter text; deploys a Flask text-gen app. Classical LM training with gradient clipping.

PyTorch, LSTM, Flask, torchtext

CV

Image Captioning with Attention

CNN-RNN image captioning pipeline with attention mechanism over visual features. Bridges CV and NLP.

CNN, RNN, Attention, Deep Learning

CV

Debiasing VAE

Bias-aware face detection using Variational Autoencoder and adaptive re-sampling for algorithmic fairness.

VAE, Fairness, Deep Learning

Cloud

CloudPlusNetwork

AWS serverless web app using Cognito, API Gateway, Lambda, DynamoDB, S3, CloudFront. Mapped to AWS Well-Architected pillars.

AWS, Lambda, DynamoDB, Cognito

ML

SalaryPrediction

Tabular ML regression with EDA → feature engineering → baseline models for salary prediction.

scikit-learn, Pandas, EDA

ML

HeartDiseasePrediction

Clinical tabular ML for heart-disease risk classification. Demonstrates responsible ML patterns for health data.

scikit-learn, Clinical ML, Classification

ML

ML Car Pricing Classification

Tabular classification for vehicle pricing bands with end-to-end feature engineering and model selection.

scikit-learn, Classification, Feature Eng.

NLP

Resume Parser / Search Engine

PDF→text→skills extraction (spaCy + regex) and classic IR engine with BM25-style retrieval and evaluation.

spaCy, BM25, IR, Python
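For readers unfamiliar with BM25-style ranking, here is a minimal sketch of the standard Okapi BM25 score over pre-tokenized documents. It is not the project's code: the function name `bm25_scores`, the smoothed IDF variant, and the defaults `k1=1.5` and `b=0.75` are assumptions based on common practice.

```python
import math
from collections import Counter


def bm25_scores(query, documents, k1=1.5, b=0.75):
    """Score each tokenized document against a tokenized query with Okapi BM25.

    `query` is a list of tokens and `documents` is a list of token lists.
    Returns one score per document, in input order.
    """
    n_docs = len(documents)
    if n_docs == 0:
        return []

    avg_len = sum(len(doc) for doc in documents) / n_docs

    # Document frequency: in how many documents each term appears.
    doc_freq = Counter()
    for doc in documents:
        doc_freq.update(set(doc))

    scores = []
    for doc in documents:
        term_freq = Counter(doc)
        score = 0.0
        for term in set(query):
            tf = term_freq.get(term, 0)
            if tf == 0:
                continue
            # Smoothed IDF that stays non-negative for very common terms.
            idf = math.log((n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5) + 1.0)
            # Term-frequency saturation with document-length normalization.
            denom = tf + k1 * (1.0 - b + b * len(doc) / avg_len)
            score += idf * tf * (k1 + 1.0) / denom
        scores.append(score)
    return scores


if __name__ == "__main__":
    # Toy "resumes" reduced to skill tokens.
    resumes = [
        ["python", "nlp", "spacy"],
        ["java", "spring", "backend"],
        ["python", "python", "django"],
    ]
    print(bm25_scores(["python", "nlp"], resumes))
```

Sorting documents by these scores in descending order gives the ranked result list that a retrieval evaluation would then measure.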

Publications

Journal

MMF-Gait: A Multi-Model Fusion-Enhanced Gait Recognition Framework Integrating Convolutional and Attention Networks

Symmetry, Vol. 17 (7), Article 1155, 19 July 2025

DOI: 10.3390/sym17071155

Journal (Pending)

Precision Agriculture Using Deep Learning: A Transformer-Based Framework for Crop Disease Detection

Crop Design, Elsevier (ScienceDirect); pending final editorial approval

Journal

YoLIP Flood Response: Zero-Shot Flood Detection and Prioritization Pipeline

JBRP 2025

Zero-shot flood detection and prioritization pipeline combining YOLOv7 and CLIP to triage urgent scenes from drone imagery and social media streams.

Preprint

Mam-App: A Novel Parameter-Efficient Mamba Model for Apple Leaf Disease Classification

Preprint (arXiv)

DOI: 10.48550/arXiv.2601.21307

Conference

A Vision-Language Approach for Detecting and Classifying Floating Debris on Aquatic Surfaces

ICMVA 2025, Melbourne, Australia (June 12–14, 2025), SPIE Conference Proceedings

DOI: 10.1117/12.3078506

Conference

A Language-Independent Deep Parts-of-Speech Tagger

iCATIS 2019, Dhaka, Bangladesh (International Conference on Applications and Techniques in Information Science)

Talks & Presentations

Guest Speaker

Vision-Language Models in Research

July 6, 2025: Graduate Students in Data Science & AI, Asian Institute of Technology (Vietnam)

Guest Speaker

Cloud Computing and Industrial Research Trends

August 22, 2025: PhD Students, Asian Institute of Technology, Thailand

Guest Speaker

Bridging Classroom Learning with Industry Needs: Preparing for the AI and Cloud Era

October 7, 2025: Department of Computer Science and Engineering, Begum Rokeya University, Rangpur, Bangladesh

Conference Presenter

NLP in Research

January 21, 2019: International Conference on Applications and Techniques in Information Science (iCATIS), Daffodil International University, Dhaka, Bangladesh

Research Collaborations

Shafi's research has been conducted in collaboration with universities, research institutions, and publishers across Asia, Europe, North America, and Australia.

Asian Institute of Technology

Pathum Thani, Thailand

Begum Rokeya University

Rangpur, Bangladesh

University of Tokyo Hospital

Tokyo, Japan

University of East London

London, United Kingdom

Kumamoto University

Kumamoto, Japan

Texas State University

San Marcos, Texas, USA

MDPI

Symmetry Journal, Switzerland

SPIE

ICMVA 2025, Melbourne, Australia

Technical Skills

NLP & GenAI

LLMs, RAG, Translation, Sentiment Analysis, Information Extraction, Summarization, Text Generation

Computer Vision

OCR, Segmentation, Object Detection, Vision-Language Models

ML / Deep Learning

Forecasting, Classification, Anomaly Detection

Dev / Cloud

FastAPI, Docker, Git, AWS (Certified), GCP (Certified)

Frameworks & Tools

PyTorch, TensorFlow, Hugging Face, LangChain, LangGraph, Dify

Project Management

Jira, MS Project, Cross-functional Teams

Languages

Bangla: Native
English: Business
Japanese: Conversational

Get in Touch

Shafi is always open to discussing research collaborations or interesting AI projects. Feel free to reach out!