Building Scalable AI Systems
from First Principles
About Me
The story behind the code.
I’m a Data Scientist at Indium Software, currently working as an external consultant with Uber.
I build and deploy production-grade AI systems, with a focus on transformer models, large language models, and inference efficiency. My work sits at the intersection of research and real-world constraints, turning ideas from papers into systems that are reliable, scalable, and cost-aware.
I graduated from IIT Palakkad, where I learned to approach problems from first principles: understanding why something works before deciding how to build it. I keep up closely with current developments in AI, regularly experimenting with new architectures, tooling, and workflows.
Alongside building systems, I write extensively about machine learning and create small tools to better visualize and understand how models learn and make decisions.
Professional Experience
Building impactful ML systems at scale, from startup velocity to enterprise reliability.
Indium (Client: Uber)
Data Scientist (Current)
Working as an external data scientist for Uber, building production-grade AI systems for global earner document processing and onboarding.
- Built a Transformer-based document auto-transcription system for driver and vehicle documents.
- Fine-tuned and deployed LLMs for low-latency, real-time inference in production.
- Automated manual document review workflows, reducing operational effort and cost.
LTIMindtree
Sr. Software Engineer
Worked on applied AI and Generative AI systems for enterprise automation and analytics use cases.
- Built multimodal AI systems to analyze technical videos and generate summaries.
- Developed speech-based natural language interfaces for database querying.
- Implemented RAG-based GenAI chatbots and improved ML model accuracy in production systems.
- Applied transfer learning to achieve high accuracy with limited training data.
Building Systems That Scale
From research implementations to production-grade ML systems serving millions.
Language Identification Model
83.5M-parameter model optimized for source-code classification. Fine-tuned across 25+ programming languages to deliver high-accuracy language detection at scale. Adopted by 350k+ developers worldwide.
Vision Transformer (ViT) From Scratch
A from-scratch PyTorch implementation of Vision Transformer with intuitive explanations of self-attention, patch embeddings, and training logic — focused on learning fundamentals and clean engineering. Documented end-to-end on Medium.
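To make the patch-embedding idea concrete, here is a minimal sketch of the step, assuming the common strided-Conv2d patchifier (class and parameter names are illustrative, not taken from the repo):

```python
import torch
import torch.nn as nn

class PatchEmbedding(nn.Module):
    """Split an image into fixed-size patches and project each to an embedding.
    A Conv2d with stride == kernel_size is equivalent to slicing the image into
    non-overlapping patches and applying one shared linear projection."""
    def __init__(self, img_size=224, patch_size=16, in_chans=3, embed_dim=768):
        super().__init__()
        self.num_patches = (img_size // patch_size) ** 2
        self.proj = nn.Conv2d(in_chans, embed_dim,
                              kernel_size=patch_size, stride=patch_size)

    def forward(self, x):                      # x: (B, 3, 224, 224)
        x = self.proj(x)                       # (B, 768, 14, 14)
        return x.flatten(2).transpose(1, 2)    # (B, 196, 768): one token per patch

tokens = PatchEmbedding()(torch.randn(1, 3, 224, 224))
print(tokens.shape)  # torch.Size([1, 196, 768])
```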
Transformer From Scratch
A clean PyTorch implementation of Transformer encoder and decoder from first principles — deeply exploring attention mechanics and causal masking. Guided code and narrative help demystify how Transformer blocks actually work.
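As a taste of the causal-masking mechanics the write-up explores, a short sketch (tensor names are illustrative):

```python
import torch

T = 5  # sequence length
# Boolean upper-triangular mask: True marks future positions,
# so token i may attend only to tokens 0..i.
causal_mask = torch.triu(torch.ones(T, T, dtype=torch.bool), diagonal=1)

scores = torch.randn(T, T)                        # raw attention scores
scores = scores.masked_fill(causal_mask, float("-inf"))
weights = torch.softmax(scores, dim=-1)           # masked positions get weight 0
```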
Distributed Training with PyTorch DDP
A practical deep dive into multi-GPU training using PyTorch DistributedDataParallel (DDP). Explains how gradient synchronization works under the hood, how to scale batch sizes correctly, and how to avoid common pitfalls when moving from single-GPU to distributed training.
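A minimal sketch of the setup pattern the deep dive walks through, assuming a single-node launch via torchrun (the model and data here are placeholders):

```python
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

def main():
    # torchrun sets RANK, LOCAL_RANK, and WORLD_SIZE for each process.
    dist.init_process_group(backend="nccl")
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    model = torch.nn.Linear(10, 1).cuda(local_rank)  # placeholder model
    model = DDP(model, device_ids=[local_rank])      # wraps gradient all-reduce

    opt = torch.optim.SGD(model.parameters(), lr=1e-2)
    x = torch.randn(32, 10).cuda(local_rank)         # per-process batch; the
    y = torch.randn(32, 1).cuda(local_rank)          # global batch is 32 * WORLD_SIZE
    loss = torch.nn.functional.mse_loss(model(x), y)
    loss.backward()                                  # DDP syncs gradients here
    opt.step()

    dist.destroy_process_group()

if __name__ == "__main__":
    main()  # launch with: torchrun --nproc_per_node=NUM_GPUS this_script.py
```

Note the batch-size point from the article: each process sees its own shard, so the effective global batch (and often the learning rate) scales with the number of GPUs.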
Self-Attention Explained (From Intuition to Math)
A deep yet intuitive breakdown of self-attention, explaining how tokens interact, why attention works, and how queries, keys, and values emerge — using simple language, diagrams, and minimal math to build real understanding.
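To ground the queries/keys/values story in code, a tiny sketch of scaled dot-product self-attention (dimensions chosen only for illustration):

```python
import torch

B, T, d = 2, 4, 8                             # batch, tokens, model width
x = torch.randn(B, T, d)

# Q, K, and V are just three learned projections of the same token embeddings.
Wq, Wk, Wv = (torch.randn(d, d) for _ in range(3))
Q, K, V = x @ Wq, x @ Wk, x @ Wv

scores = Q @ K.transpose(-2, -1) / d ** 0.5   # how strongly each token attends to each other
weights = torch.softmax(scores, dim=-1)       # each row sums to 1
out = weights @ V                             # every token becomes a weighted mix of values
```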
Knowledge Graph + LLM Chatbot
A RAG-powered conversational system that combines Neo4j knowledge graphs with large language models to answer natural-language queries over structured and unstructured data. Includes Cypher integration and vector search strategies.
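A hedged sketch of the Cypher-retrieval half of that flow, using the official neo4j Python driver; the connection details, graph schema, and `ask_llm` stub are all placeholders, not the project's actual code:

```python
from neo4j import GraphDatabase

# Connection details are placeholders for illustration.
driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))

def ask_llm(prompt: str) -> str:
    # Placeholder: swap in the actual LLM call used by the chatbot.
    return "<answer generated from: " + prompt[:40] + "...>"

def graph_context(question: str) -> list[str]:
    # Structured retrieval: a Cypher query over the knowledge graph.
    # The (:Product)-[:MENTIONED_IN]->(:Doc) schema is hypothetical.
    cypher = (
        "MATCH (p:Product)-[:MENTIONED_IN]->(d:Doc) "
        "WHERE toLower(p.name) CONTAINS toLower($q) "
        "RETURN d.text AS text LIMIT 5"
    )
    with driver.session() as session:
        return [record["text"] for record in session.run(cypher, q=question)]

def answer(question: str) -> str:
    context = "\n".join(graph_context(question))
    # In the full system, vector-search hits over unstructured text would be
    # merged into the context here before prompting.
    return ask_llm(f"Context:\n{context}\n\nQuestion: {question}")
```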
Recent Articles
Thoughts on ML systems, engineering practices, and research insights.