Currently building production AI at GATC Health — architecting a hybrid RAG + agentic research platform that became the primary AI interface for scientific teams. I design systems at the intersection of LLMs, retrieval, and domain-specific ML: sparse–dense retrieval pipelines, LIGHT-style long-context memory, agentic tool orchestration, and GNNs for molecular property prediction. Also building Superscaled as a founder. Nine years of production engineering across Yahoo, upGrad, and Egen.ai underpin everything I ship.
Deep-stack AI engineering — retrieval, reasoning, and domain-specific ML.
Designed hybrid sparse–dense retrieval pipelines (BM25 + dense embeddings, late fusion, cross-encoder reranking) and LIGHT-style multi-million-token memory subsystems. Built agentic tool orchestration so LLMs autonomously invoke specialized ML models and external databases (ChEMBL, PubChem) as callable actions.
Trained GNN architectures for toxicity, ADMET, and blood-brain barrier permeability prediction (F1 ≈ 0.90, AUROC ≈ 0.92). End-to-end pipelines from ChEMBL/PubChem curation to Kubernetes-deployed inference APIs with GPU support.
Post-hoc analysis of a Phase 3 social anxiety trial — integrating longitudinal clinical outcomes, site operations, and speech features extracted from patient visits to uncover placebo response dynamics and habituation effects.
Self-hosted Qwen 2.5 27B on AWS SageMaker via SGLang for high-throughput inference. Evaluation harnesses, fine-tuned instruct models for domain alignment, and modular architectures that support drop-in model upgrades.
Founder-led product. Building in public at superscaled.com.
Visit Superscaled →
Tools I use to build and ship.
AI & LLMs
ML Engineering
Retrieval & Vector DBs
Data Engineering
Cloud & APIs
From BASIC programs to distributed systems at scale.
Senior AI/ML Engineer · Apr 2025 – Present
Hybrid RAG + Agentic Research Platform
Architected and shipped a production-grade internal AI research platform, functioning as an enterprise Perplexity for scientific teams, that became the primary AI interface for lab workflows, improving researcher productivity and driving adoption across the organization.
Designed a hybrid sparse–dense retrieval pipeline combining BM25/TF-IDF with dense embeddings, late fusion, and cross-encoder reranking to achieve high recall on scientific literature across large chemical and biological corpora.
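The late-fusion step described above can be illustrated with reciprocal rank fusion (RRF), one common way to merge a sparse and a dense ranking. This is a minimal sketch under assumptions: the fusion method actually used is not stated here, and the function name, document IDs, and the constant `k=60` are illustrative only.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into a single ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    so documents ranked highly by multiple retrievers rise to the top.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by doc ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical top-3 results from a BM25 index and a dense embedding index.
bm25_hits = ["doc_a", "doc_b", "doc_c"]
dense_hits = ["doc_b", "doc_d", "doc_a"]
fused = reciprocal_rank_fusion([bm25_hits, dense_hits])
print(fused)
```

In a pipeline like the one described, a cross-encoder would then rescore only the top of the fused list, since it is the most expensive stage.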
Built a LIGHT-style memory subsystem that scales to millions of tokens of conversational history using episodic retrieval, structured working memory, and a compressed scratchpad — enabling persistent, context-aware interactions across long multi-session research workflows.
Developed domain-specific agentic tools: toxicity lookup, molecular property prediction, structure normalization, and external database connectors (ChEMBL, PubChem) — allowing the LLM to autonomously invoke specialized ML models and data sources as callable tool actions.
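A tool layer of this kind can be sketched as a registry plus a dispatcher that executes the JSON tool call emitted by the model. Everything below is hypothetical: the registry, the `toxicity_lookup` stub, and the call format are assumptions for illustration, not the platform's actual interfaces.

```python
import json
from typing import Callable, Dict

# Hypothetical registry mapping tool names to Python callables.
TOOLS: Dict[str, Callable[..., dict]] = {}


def tool(name: str):
    """Decorator that registers a function as a callable tool."""
    def register(fn):
        TOOLS[name] = fn
        return fn
    return register


@tool("toxicity_lookup")
def toxicity_lookup(smiles: str) -> dict:
    # Stub only: a real tool would call a trained model or an external database.
    return {"smiles": smiles, "toxic": None, "note": "stub result"}


def dispatch(tool_call_json: str) -> dict:
    """Run a tool call of the assumed form {"name": ..., "arguments": {...}}."""
    call = json.loads(tool_call_json)
    fn = TOOLS.get(call.get("name"))
    if fn is None:
        return {"error": f"unknown tool: {call.get('name')}"}
    try:
        return {"result": fn(**call.get("arguments", {}))}
    except TypeError as exc:
        # Typically raised when arguments are missing or unexpected.
        return {"error": str(exc)}


print(dispatch('{"name": "toxicity_lookup", "arguments": {"smiles": "CCO"}}'))
```

Returning errors as data rather than raising lets the model see the failure and retry with corrected arguments.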
Self-hosted Qwen 2.5 27B on AWS SageMaker using SGLang for high-throughput, low-latency inference. Modular architecture supports drop-in upgrades to larger or newer models. Implemented evaluation harnesses and fine-tuned instruct models for domain-specific alignment.
PAL-3 Post-Hoc Clinical Analytics · VistaGen Therapeutics
Investigated a failed Phase 3 social anxiety trial (negative primary endpoint) by integrating longitudinal clinical outcomes, site operations data, enrollment timing, and turn-level speech features extracted from recorded patient visits.
Discovered that subject speech dynamics during treatment visits (conversation share, utterance length, total talk time) carried a consistent predictive signal for placebo response that held up under leakage-aware validation. Developed a habituation hypothesis explaining unexpected placebo performance.
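The three speech features named above can be derived from turn-level data roughly as follows. This is a sketch under assumptions: the `Turn` record, the speaker labels, and measuring utterance length in words are all hypothetical choices, not the trial's actual data format.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Turn:
    speaker: str   # hypothetical labels: "subject" or "clinician"
    start: float   # turn start time, seconds
    end: float     # turn end time, seconds
    n_words: int   # word count of the utterance


def speech_features(turns: List[Turn], subject: str = "subject") -> Dict[str, float]:
    """Compute per-visit speech features for the subject speaker."""
    total_time = sum(t.end - t.start for t in turns)
    subj_turns = [t for t in turns if t.speaker == subject]
    subj_time = sum(t.end - t.start for t in subj_turns)
    subj_words = sum(t.n_words for t in subj_turns)
    return {
        # Fraction of all speaking time taken by the subject.
        "conversation_share": subj_time / total_time if total_time else 0.0,
        # Average words per subject utterance.
        "mean_utterance_words": subj_words / len(subj_turns) if subj_turns else 0.0,
        # Total seconds the subject spoke.
        "total_talk_time_s": subj_time,
    }


visit = [
    Turn("clinician", 0.0, 10.0, 20),
    Turn("subject", 10.0, 40.0, 60),
    Turn("subject", 45.0, 55.0, 20),
]
features = speech_features(visit)
print(features)
```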
Explored recruitment channel effects, site-level calendar drift, psychometric symptom structure, and clinician vs. patient vocal dynamics as outcome moderators — providing actionable recommendations for future trial design.
Graph Neural Networks for Molecular Property Prediction
Designed and trained GNN architectures for toxicity, ADMET, and blood-brain barrier (BBB) permeability prediction — selecting GNNs over tree-based baselines to capture molecular graph topology and atom-level features. F1 ≈ 0.90 · AUROC ≈ 0.92.
Led end-to-end data curation from ChEMBL and PubChem: cleaning pipelines, deduplication, class-imbalance handling (SMOTE, weighted loss), and PCA-based exploratory analysis.
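The weighted-loss side of class-imbalance handling can be sketched with inverse-frequency class weights applied to a binary log loss. This is an illustrative assumption about the weighting scheme (it mirrors the common "balanced" heuristic of n_samples / (n_classes * class_count)); the function names and the 90/10 label split are made up for the example.

```python
import math
from collections import Counter


def inverse_frequency_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * c) for cls, c in counts.items()}


def weighted_log_loss(y_true, p_pred, weights):
    """Binary cross-entropy where each sample is weighted by its class weight."""
    eps = 1e-12
    total, weight_sum = 0.0, 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        w = weights[y]
        total += -w * (math.log(p) if y == 1 else math.log(1.0 - p))
        weight_sum += w
    return total / weight_sum


# Hypothetical imbalanced label set: 90 negatives, 10 positives.
labels = [0] * 90 + [1] * 10
weights = inverse_frequency_weights(labels)
print(weights)
```

With these weights, an error on the rare positive class costs nine times as much as one on the majority class, which discourages the model from defaulting to the majority label.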
Built scalable Python inference services deployed on Kubernetes with GPU support, enabling low-latency internal consumption via well-documented APIs.
Senior ML Engineer · Sept 2021 – Apr 2025
Financial Risk & Pricing ML · DriveTime · Carvana
Led development of production ML systems for financial risk and pricing products used by enterprise customers including DriveTime and Carvana — directly impacting underwriting decisions at scale across hundreds of thousands of auto loan applications.
Designed, trained, and deployed predictive models for risk-adjusted APR, underwriting, LTV, depreciation curves, and delinquency prediction — achieving F1 scores close to 0.90 across multiple use cases with rigorous backtesting and holdout validation.
Owned customer data pipelines and feature engineering workflows, partnering closely with client data teams on EDA, labeling strategy, feature importance analysis, and model validation — including fairness and bias audits for regulatory alignment.
ML Inference Platform · AWS Kubernetes
Led the team that built a scalable microservices inference platform on AWS Kubernetes with full observability (ELK stack, structured logging, latency monitoring) and auto-scaling to handle burst inference workloads across multiple model families simultaneously.
Authored custom Docker images with CUDA toolkit support and container runtime integration for GPU inference, enabling hardware-accelerated model serving for deep learning workloads with consistent performance across environments.

Team Leadership & ML Best Practices
Managed and mentored a team of engineers, driving ML best practices across data ingestion, experiment tracking, model training, and production deployment — establishing standards for reproducibility, model versioning, and staged rollout that became the team's default workflow.
Lead Software Engineer · Dec 2019 – Sept 2021
LMS Rebuild [Case Study]
Led the full-stack rebuild of a large-scale Learning Management System serving millions of users, achieving an approximately 75% improvement in frontend performance scores through architecture optimization, code splitting, and caching strategies tailored to users on 2G mobile networks across India.

ML Product: “Shorts” — Micro-Learning Feed
Conceived and built an ML-powered micro-learning product — a TikTok-style short-form feed — end-to-end. Implemented the SM-2 (SuperMemo-2) spaced-repetition algorithm enhanced with a Feed-Forward Neural Network to personalize content sequencing and retention scheduling per user, adapting review intervals based on individual performance signals.
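For reference, the base SM-2 update rule can be sketched as below. This covers only the classic algorithm, not the neural-network personalization layered on top of it. The `CardState` type and function name are hypothetical, and implementations differ on details such as whether the easiness factor changes after a failed recall; this variant updates it on every review.

```python
from dataclasses import dataclass


@dataclass
class CardState:
    repetitions: int = 0      # consecutive successful reviews
    interval_days: int = 0    # days until the next review
    easiness: float = 2.5     # SM-2 easiness factor, floored at 1.3


def sm2_update(state: CardState, quality: int) -> CardState:
    """Apply one SM-2 review with recall quality graded from 0 to 5."""
    if not 0 <= quality <= 5:
        raise ValueError("quality must be between 0 and 5")
    if quality >= 3:
        # Successful recall: 1 day, then 6 days, then scale by easiness.
        if state.repetitions == 0:
            interval = 1
        elif state.repetitions == 1:
            interval = 6
        else:
            interval = round(state.interval_days * state.easiness)
        repetitions = state.repetitions + 1
    else:
        # Failed recall: restart the repetition sequence.
        repetitions, interval = 0, 1
    delta = 0.1 - (5 - quality) * (0.08 + (5 - quality) * 0.02)
    easiness = max(1.3, state.easiness + delta)
    return CardState(repetitions, interval, easiness)


state = CardState()
intervals = []
for q in (5, 5, 5, 2):
    state = sm2_update(state, q)
    intervals.append(state.interval_days)
print(intervals)
```

A learned model could plausibly replace or adjust the fixed interval and easiness updates using per-user performance signals, but how that was done is not shown here.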

Student Dropout & Failure Prediction
Developed predictive models that flag students likely to fail or drop out months in advance — based on engagement patterns, attendance data, and social interaction signals — enabling proactive intervention by academic advisors before outcomes became irreversible. Worked cross-functionally with product, design, and analytics teams to integrate model outputs into actionable UX decisions.
Software Engineer · Mar 2018 – Dec 2019
Yahoo Ad.com — Self-Serve SMB Ad Platform
Contributed to Yahoo Ad.com, a self-serve advertising platform enabling small and medium businesses to create and manage ad campaigns across the Yahoo network with minimal setup friction.
Built ML-assisted onboarding flows that used basic business inputs — website URL, name, category — to auto-generate initial campaign configurations, measurably reducing time-to-first-campaign for new advertisers.


Business Knowledge Graph
Built and maintained a business knowledge graph by extracting structured signals from business websites, metadata, and third-party sources — enabling semantic understanding of advertiser intent and business context at scale.
Leveraged the graph to auto-select campaign parameters including CTA, target audience segments, geo-targeting, and bidding strategy — measurably reducing onboarding friction and improving activation rates for first-time advertisers.
Engineering Intern · 2017
Built the CMS infrastructure for Fulfil's marketing team — enabling non-technical stakeholders to manage content, landing pages, and product documentation independently without engineering involvement.
Led a product-wide UI redesign, modernizing the visual language of the B2B ERP platform and improving consistency across core modules. First exposure to production engineering, shipping cadences, and real-user feedback loops at a venture-backed SaaS company.
