All Projects
A complete archive of my open source work, research implementations, and system designs.
Language Identification Model
83.5M-parameter model optimized for source-code classification. Fine-tuned across 25+ programming languages to deliver high-accuracy language detection at scale. Adopted by 350k+ developers worldwide.
Vision Transformer (ViT) From Scratch
A from-scratch PyTorch implementation of Vision Transformer with intuitive explanations of self-attention, patch embeddings, and training logic — focused on learning fundamentals and clean engineering. Documented end-to-end on Medium.
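Below is a minimal sketch of the patch-embedding step the write-up walks through, assuming 224×224 inputs, 16×16 patches, and a Conv2d-based projection; the class and argument names are illustrative rather than lifted from the repository.

```python
import torch
import torch.nn as nn

class PatchEmbedding(nn.Module):
    """Split an image into fixed-size patches and project each patch to an embedding vector."""
    def __init__(self, img_size=224, patch_size=16, in_channels=3, embed_dim=768):
        super().__init__()
        self.num_patches = (img_size // patch_size) ** 2
        # A strided convolution is equivalent to slicing patches and applying a shared linear layer.
        self.proj = nn.Conv2d(in_channels, embed_dim, kernel_size=patch_size, stride=patch_size)

    def forward(self, x):                      # x: (B, C, H, W)
        x = self.proj(x)                       # (B, embed_dim, H/ps, W/ps)
        return x.flatten(2).transpose(1, 2)    # (B, num_patches, embed_dim)

patches = PatchEmbedding()(torch.randn(1, 3, 224, 224))
print(patches.shape)  # torch.Size([1, 196, 768])
```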
Transformer From Scratch
A clean PyTorch implementation of Transformer encoder and decoder from first principles — deeply exploring attention mechanics and causal masking. Guided code and narrative help demystify how Transformer blocks actually work.
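A small sketch of the causal masking discussed above: a lower-triangular boolean mask applied to the attention logits so each position can only attend to itself and earlier positions (the tensor shapes here are illustrative).

```python
import torch

def causal_mask(seq_len: int) -> torch.Tensor:
    """Lower-triangular mask: position i may only attend to positions <= i."""
    return torch.tril(torch.ones(seq_len, seq_len, dtype=torch.bool))

scores = torch.randn(1, 4, 4)                            # (batch, query, key) attention logits
masked = scores.masked_fill(~causal_mask(4), float("-inf"))
weights = torch.softmax(masked, dim=-1)                   # future positions receive zero weight
print(weights)
```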
Distributed Training with PyTorch DDP
A practical deep dive into multi-GPU training using PyTorch DistributedDataParallel (DDP). Explains how gradient synchronization works under the hood, how to scale batch sizes correctly, and how to avoid common pitfalls when moving from single-GPU to distributed training.
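A minimal single-file sketch of the one-process-per-GPU pattern the article covers, assuming a launch via torchrun with the NCCL backend; the model and dataset below are throwaway placeholders.

```python
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP
from torch.utils.data import DataLoader, TensorDataset
from torch.utils.data.distributed import DistributedSampler

def main():
    # torchrun sets RANK, LOCAL_RANK, and WORLD_SIZE for each spawned process.
    dist.init_process_group(backend="nccl")
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    model = torch.nn.Linear(32, 2).cuda(local_rank)     # placeholder model
    model = DDP(model, device_ids=[local_rank])          # wraps gradient all-reduce

    dataset = TensorDataset(torch.randn(1024, 32), torch.randint(0, 2, (1024,)))
    sampler = DistributedSampler(dataset)                # each rank sees a distinct shard
    loader = DataLoader(dataset, batch_size=64, sampler=sampler)

    optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
    for epoch in range(2):
        sampler.set_epoch(epoch)                         # reshuffle shards every epoch
        for x, y in loader:
            x, y = x.cuda(local_rank), y.cuda(local_rank)
            loss = torch.nn.functional.cross_entropy(model(x), y)
            optimizer.zero_grad()
            loss.backward()                              # gradients are synchronized here
            optimizer.step()

    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```

Launched as `torchrun --nproc_per_node=2 train.py`, each process owns one GPU and DDP averages gradients across ranks during `backward()`.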
Self-Attention Explained (From Intuition to Math)
A deep yet intuitive breakdown of self-attention, explaining how tokens interact, why attention works, and how queries, keys, and values emerge — using simple language, diagrams, and minimal math to build real understanding.
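For reference, a bare-bones sketch of scaled dot-product self-attention as described above; the projection matrices are random placeholders standing in for learned weights.

```python
import torch
import torch.nn.functional as F

def self_attention(x, w_q, w_k, w_v):
    """Scaled dot-product self-attention for one sequence x of shape (seq_len, d_model)."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v           # queries, keys, values
    scores = q @ k.T / k.shape[-1] ** 0.5          # how strongly each token attends to every other
    weights = F.softmax(scores, dim=-1)            # each row sums to 1
    return weights @ v                             # each output is a weighted mix of values

d = 8
x = torch.randn(5, d)
out = self_attention(x, torch.randn(d, d), torch.randn(d, d), torch.randn(d, d))
print(out.shape)  # torch.Size([5, 8])
```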
Knowledge Graph + LLM Chatbot
A RAG-powered conversational system that combines Neo4j knowledge graphs with large language models to answer natural-language queries over both structured and unstructured data. Includes Cypher integration and vector-search strategies.
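A hedged sketch of the Cypher-grounding idea: pull structured facts out of Neo4j, then fold them into the LLM prompt. The connection details, schema, and query below are illustrative placeholders, not the project's actual graph.

```python
from neo4j import GraphDatabase

# Connection details and graph schema are illustrative placeholders.
driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))

CYPHER = """
MATCH (p:Person)-[:WORKS_ON]->(proj:Project)
WHERE proj.name = $project
RETURN p.name AS name
"""

def fetch_context(project: str) -> list[str]:
    """Run a Cypher query and return structured facts used to ground the LLM's answer."""
    with driver.session() as session:
        return [record["name"] for record in session.run(CYPHER, project=project)]

context = fetch_context("Knowledge Graph Chatbot")
prompt = f"Answer using only these facts: {context}\nQuestion: Who works on the chatbot?"
# `prompt` would then be sent to the LLM of choice alongside any vector-search hits.
```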
Efficient LLM Fine-Tuning
A step-by-step, practical demonstration of fine-tuning large language models with parameter-efficient techniques (e.g., LoRA), reducing training cost while maintaining strong performance, backed by hands-on notebooks.
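A short sketch of a LoRA setup using Hugging Face peft; the base model and target modules here are examples and may differ from what the notebooks actually use.

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# Base model and target modules are illustrative choices.
base = AutoModelForCausalLM.from_pretrained("gpt2")

config = LoraConfig(
    r=8,                        # rank of the low-rank update matrices
    lora_alpha=16,              # scaling factor applied to the update
    target_modules=["c_attn"],  # attention projection layers in GPT-2
    lora_dropout=0.05,
)
model = get_peft_model(base, config)
model.print_trainable_parameters()  # only a small fraction of weights is trainable
```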
Understanding LoRA: Theory Behind Efficient LLM Fine-Tuning
A theory-first breakdown of Low-Rank Adaptation (LoRA), derived from the original research paper. Explains the mathematical intuition behind low-rank updates, why full fine-tuning is inefficient, and how LoRA enables scalable adaptation of large models in modern ML systems.
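The core update from the paper, restated: the pretrained weight is frozen and only a low-rank correction is learned.

```latex
% LoRA freezes the pretrained weight W_0 and learns a low-rank update \Delta W = BA.
W_0 \in \mathbb{R}^{d \times k}, \qquad
B \in \mathbb{R}^{d \times r}, \quad
A \in \mathbb{R}^{r \times k}, \quad
r \ll \min(d, k)

h = W_0 x + \Delta W x = W_0 x + \frac{\alpha}{r}\, B A x
```

Because only $A$ and $B$ are trained, the number of updated parameters scales with $r(d + k)$ instead of $dk$.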
Understanding Categorical Correlations
A practical, intuition-first explanation of measuring relationships between categorical variables using the Chi-Square test and Cramér’s V. Covers when correlation metrics fail, how statistical dependence works for categorical data, and how to interpret results correctly in real-world data analysis.
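A small sketch of computing Cramér's V from a contingency table with SciPy; the example counts are made up purely to illustrate the calculation.

```python
import numpy as np
from scipy.stats import chi2_contingency

def cramers_v(table: np.ndarray) -> float:
    """Cramér's V for a contingency table: sqrt(chi2 / (n * (min(rows, cols) - 1)))."""
    chi2, _, _, _ = chi2_contingency(table)
    n = table.sum()
    r, c = table.shape
    return float(np.sqrt(chi2 / (n * (min(r, c) - 1))))

# Example: hypothetical counts of (browser x subscribed) observations.
table = np.array([[120, 90], [60, 130], [30, 70]])
print(round(cramers_v(table), 3))  # value in [0, 1]; closer to 1 means stronger association
```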
Principal Component Analysis (PCA) Explained
A clear, intuition-driven breakdown of Principal Component Analysis (PCA), connecting the underlying linear algebra with practical data insights. Explains variance, eigenvectors, dimensionality reduction, and when PCA helps—or hurts—real-world machine learning pipelines.
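A compact NumPy sketch of PCA via eigendecomposition of the covariance matrix, on synthetic data; the top components are the directions of greatest variance.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))                 # toy data: 200 samples, 5 features

X_centered = X - X.mean(axis=0)               # PCA assumes zero-mean features
cov = np.cov(X_centered, rowvar=False)        # 5x5 covariance matrix
eigvals, eigvecs = np.linalg.eigh(cov)        # eigh: symmetric input, ascending eigenvalues

order = np.argsort(eigvals)[::-1]             # sort components by explained variance
components = eigvecs[:, order[:2]]            # keep the top-2 principal directions
explained = eigvals[order[:2]] / eigvals.sum()

X_reduced = X_centered @ components           # project onto the 2-D subspace
print(X_reduced.shape, explained.round(3))
```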
Video Frame Deduplication for Efficient Processing
An applied computer vision approach to reducing video length by detecting and removing redundant frames. Explains similarity metrics, practical thresholds, and how frame deduplication improves storage efficiency, preprocessing speed, and downstream video ML pipelines.
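A rough sketch of one possible similarity filter, using grayscale histogram correlation with a tunable threshold; the metric and threshold are illustrative, not the project's exact choices.

```python
import cv2

def deduplicate(video_path: str, threshold: float = 0.98):
    """Keep a frame only if its grayscale histogram differs enough from the last kept frame."""
    cap = cv2.VideoCapture(video_path)
    kept, last_hist = [], None
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        hist = cv2.calcHist([gray], [0], None, [64], [0, 256])
        hist = cv2.normalize(hist, hist).flatten()
        if last_hist is None or cv2.compareHist(last_hist, hist, cv2.HISTCMP_CORREL) < threshold:
            kept.append(frame)        # sufficiently different from the previous kept frame
            last_hist = hist
    cap.release()
    return kept

frames = deduplicate("input.mp4")     # path is a placeholder
print(f"kept {len(frames)} frames")
```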
Mathematics of Convolution & Deconvolution in Vision
A first-principles breakdown of convolution and deconvolution operations in computer vision. Explains the underlying mathematics, kernel interactions, stride and padding effects, and how these operations shape feature extraction and reconstruction in modern CNN-based systems.
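A quick shape check of the stride and padding arithmetic using PyTorch layers: a strided convolution halves the spatial size, and a matching transposed convolution restores it.

```python
import torch
import torch.nn as nn

# Convolution output size: floor((H + 2*pad - kernel) / stride) + 1
conv = nn.Conv2d(3, 16, kernel_size=3, stride=2, padding=1)
print(conv(torch.randn(1, 3, 32, 32)).shape)      # torch.Size([1, 16, 16, 16])

# Transposed convolution inverts the shape map: (H - 1)*stride - 2*pad + kernel + output_padding
deconv = nn.ConvTranspose2d(16, 3, kernel_size=3, stride=2, padding=1, output_padding=1)
print(deconv(torch.randn(1, 16, 16, 16)).shape)   # torch.Size([1, 3, 32, 32])
```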