Welcome! This is Shafi Ud Doula
Senior AI R&D Engineer
8+ years of experience in Document Intelligence, GenAI, NLP, Computer Vision & Vision-Language Models. Based in Tokyo, Japan.
Shafi Ud Doula is an AI R&D Engineer with over 8 years of professional experience spanning Document Intelligence, Payment Automation, and Digital Transformation. His work sits at the intersection of Computer Vision, Natural Language Processing, Generative AI, and Vision-Language Models, backed by hands-on cloud expertise on AWS and GCP.
Currently based in Tokyo, Japan, he leads GenAI, Agentic AI, and Foundation Model solutions for enterprise clients across healthcare, telecommunications, and insurance domains.
Asian Institute of Technology, Thailand
Thesis: "A multi-modal framework for context-aware plant disease classification and segmentation integrating visual and textual features"
Best Thesis Award
Begum Rokeya University, Bangladesh
Thesis: "A Language-Independent Deep Parts-of-Speech Tagger"
Deloitte Touche Tohmatsu LLC, Tokyo, Japan
Department of Computer Science and Engineering, Begum Rokeya University, Rangpur, Bangladesh
Smart Data Lab, Asian Institute of Technology, Thailand
Deloitte Touche Tohmatsu LLC, Tokyo, Japan
Deloitte Touche Tohmatsu LLC, Tokyo, Japan
Semantics Lab, Begum Rokeya University, Bangladesh
Throughout his career, Shafi has contributed to large-scale enterprise AI systems, including:
Production-grade OCR/ICR platforms for structured data extraction from scanned documents, integrating object detection, text recognition, NER, and layout analysis.
Invoice field extraction and normalization pipelines feeding downstream RPA and ERP workflows with validation and exception handling.
LLM-powered contract review systems for clause extraction, obligation identification, and risk flagging with evidence-based outputs.
Retrieval-augmented generation services for policy and clinical document QA with guardrails and compliance controls.
Inference acceleration for document AI pipelines without changes to the underlying models, through batching, mixed precision, and I/O tuning.
Enterprise-grade low-code AI applications with agent tools, connectors, guardrails, and standardized prompt management.
Patient-facing medical Q&A comparing Seq2Seq and GPT-2 models on the MedQuAD dataset. End-to-end, from data curation to deployment with Flask.
Comparative study on student layer initialization strategies (top-K, bottom-K, odd/even) for knowledge distillation.
NMT with attention/Transformer ablations for English-Bangla translation. Explores low-resource MT with BLEU/SacreBLEU evaluation.
RAG-powered campus assistant with scraping, vectorization, and Flask UI. Demonstrates full RAG architecture literacy.
Deep-feature indexing & KNN retrieval using VGG16 embeddings for near-duplicate detection and visual search.
Minimal local RAG with HF embeddings → FAISS → Ollama LLM. Includes optional image-grounding via LLaVA/BakLLaVA.
Agentic contract analysis with clause extraction, intent classification, risk flags using LangGraph nodes with policy-driven redlines.
Local RAG chatbot framework integrating LangChain, FAISS, and Ollama. Demonstrates private, cost-efficient LLM deployment.
Full instruction-tuning workflow (data prep → SFT → eval → Flask web app) using DistilGPT-2 on the Alpaca dataset.
BERT-based sentence embeddings for retrieval, clustering, and semantic similarity tasks.
Trains LSTM language models on Harry Potter text; deploys a Flask text-gen app. Classical LM training with gradient clipping.
CNN-RNN image captioning pipeline with attention mechanism over visual features. Bridges CV and NLP.
Bias-aware face detection using Variational Autoencoder and adaptive re-sampling for algorithmic fairness.
AWS serverless web app using Cognito, API Gateway, Lambda, DynamoDB, S3, CloudFront. Mapped to AWS Well-Architected pillars.
Tabular ML regression with EDA → feature engineering → baseline models for salary prediction.
Clinical tabular ML for heart-disease risk classification. Demonstrates responsible ML patterns for health data.
Tabular classification for vehicle pricing bands with end-to-end feature engineering and model selection.
PDF → text → skills extraction (spaCy + regex) and classic IR engine with BM25-style retrieval and evaluation.
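To illustrate the BM25-style retrieval mentioned in the last project above, here is a minimal sketch in plain Python. It is not taken from the project itself: the function name `bm25_scores`, the parameter defaults `k1=1.5` and `b=0.75`, the smoothed IDF variant, and the toy documents are all assumptions chosen for illustration, and the sketch assumes a non-empty list of pre-tokenized documents.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each tokenized document in `docs` against the tokenized `query`.

    Hypothetical helper for illustration; assumes `docs` is non-empty.
    """
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            # Smoothed IDF; the "+ 1" inside the log keeps it non-negative.
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            # Term-frequency saturation with document-length normalization.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


# Toy corpus (made up for this example).
docs = [
    "python machine learning engineer".split(),
    "java backend developer".split(),
    "python data analyst".split(),
]
scores = bm25_scores("python learning".split(), docs)
# Document indices ordered from most to least relevant.
ranking = sorted(range(len(docs)), key=scores.__getitem__, reverse=True)
print(ranking)
```

The first document matches both query terms and ranks first, the third matches only "python", and the second matches nothing and scores zero. A real engine would add tokenization, an inverted index, and an evaluation harness on top of this scoring step.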
Symmetry, Vol. 17 (7), Article 1155, 19 July 2025
Crop Design, Elsevier (ScienceDirect) - Pending final editorial approval
JBRP 2025
Zero-shot flood detection and prioritization pipeline combining YOLOv7 and CLIP to triage urgent scenes from drone imagery and social media streams.
Preprint (arXiv)
DOI: 10.48550/arXiv.2601.21307
ICMVA 2025, Melbourne, Australia (June 12–14, 2025) - SPIE Conference Proceedings
iCATIS 2019, Dhaka, Bangladesh (International Conference on Applications and Techniques in Information Science)
July 6, 2025 - Graduate Students in Data Science & AI, Asian Institute of Technology (Vietnam)
August 22, 2025 - PhD Students, Asian Institute of Technology, Thailand
October 7, 2025 - Department of Computer Science and Engineering, Begum Rokeya University, Rangpur, Bangladesh
January 21, 2019 - International Conference on Applications and Techniques in Information Science (iCATIS), Daffodil International University, Dhaka, Bangladesh
Research conducted in collaboration with leading universities and research institutions across Asia, Europe, and North America.
Pathum Thani, Thailand
Rangpur, Bangladesh
Tokyo, Japan
London, United Kingdom
Kumamoto, Japan
San Marcos, Texas, USA
Symmetry Journal, Switzerland
ICMVA 2025, Melbourne, Australia
Shafi is always open to discussing research collaborations or interesting AI projects. Feel free to reach out!
mdshafiud.doula@alumni.ait.asia
github.com/shaficse
Shafi Doula
Tokyo, Japan