Shivang Singh.
I build and scale GenAI systems in production — where latency, token limits, and failure modes matter as much as model quality.
Bodhi Atomize
Production multimodal GenAI platform decomposing 10,000+ marketing assets into 50+ structured signals per asset for Eli Lilly. Multi-stage LLM pipelines with token budgeting, backpressure, and KEDA-autoscaled microservices.
7-agent autonomous job intel pipeline · $0.06/app · LaTeX resume gen
Obsessed with structured outputs, LLM evaluation, and production reliability under burst traffic.
survive production traffic.
I design and operate LLM pipelines that handle real traffic. At Publicis Sapient I lead Bodhi Atomize, a multimodal GenAI platform that turns images, videos, and GIFs into structured signals for enterprise clients like Eli Lilly. Previously, I shipped object detection and defect detection systems that improved accuracy and inference speed at scale.
My work sits at the intersection of GenAI systems, computer vision, and production ML engineering — where latency, token limits, retries, backpressure, and failure modes matter as much as model quality.
at scale.
- Architected Bodhi Atomize — production multimodal GenAI platform cutting marketing asset analysis from hours to ~2 min per asset (95% reduction) across 10,000+ assets for Eli Lilly. Outputs 50+ structured JSON signals per asset.
- Engineered multi-stage LLM inference pipelines with Gemini 2.5 Pro and Pydantic-validated structured outputs. Implemented token budgeting, exponential-backoff retry, and backpressure control to sustain production throughput under rate limits.
- Integrated YOLO and PaddleOCR into LLM workflows, extracting 50+ typed visual components (text, characters, emotions, branding) per asset. Established LLM evaluation with DeepEval (LLM-as-judge, G-Eval).
- Built FastAPI microservices with Redis (caching + task queuing) and Celery. Deployed on Kubernetes with KEDA autoscaling to sustain 1,000+ concurrent requests under burst traffic with low latency.
- Integrated RF-DETR into production pipelines — 1.8× faster inference and +7% mAP50 improvement over YOLOv8 baseline on industrial defect detection.
- Curated and preprocessed 30,000+ industrial images through targeted augmentation and annotation QA pipelines, lifting defect detection accuracy by 10%.
- Led the supervised modelling team predicting urban farming zones in Milan using geospatial data.
- Engineered an XGBoost model achieving 93.68% accuracy. Conducted EDA on 106,000 rows with GeoPandas.
- Implemented real-time predictions, optimising data handling and model efficiency.
- Led the Computer Vision domain for the campus AI/ML club. Mentored juniors, ran workshops, organized hackathons.
- Co-led campus tech club. Organized events, hosted talks, fostered project-driven learning.
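The exponential-backoff retry mentioned in the Bodhi Atomize bullets above can be sketched roughly as follows. This is a minimal illustration, not the production code: `RateLimitError`, `backoff_delay`, `call_with_retry`, and the default `base`, `cap`, and `max_attempts` values are all hypothetical names and numbers chosen for the example, and the jitter strategy (full jitter) is an assumption.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimitError(Exception):
    """Hypothetical stand-in for a provider's rate-limit (HTTP 429) error."""


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 30.0) -> float:
    """Upper bound on the wait before retry `attempt` (0-indexed), doubling each time."""
    return min(cap, base * (2 ** attempt))


def call_with_retry(
    fn: Callable[[], T],
    max_attempts: int = 5,
    base: float = 1.0,
    cap: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `fn`, retrying on RateLimitError with exponential backoff and full jitter."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be >= 1")
    for attempt in range(max_attempts):
        try:
            return fn()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            # Full jitter: wait a random time in [0, exponential bound] so that
            # many workers hitting the same limit do not retry in lockstep.
            sleep(random.uniform(0, backoff_delay(attempt, base, cap)))
    raise AssertionError("unreachable")
```

The `sleep` parameter is injectable so the wrapper can be tested without real waiting. In a real service the wrapped `fn` would be the model call, and only errors known to be transient should be retried.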
shipped.
Dossier
Autonomous Agentic Job Search Intelligence
7-agent autonomous pipeline (Job Discovery, Watchlist, Company Intel, Market Intel, Gap Analysis, Resume Agent, Referral Finder) that discovers, scores, researches, and generates tailored applications end-to-end. LLM scoring runs in parallel across 550+ jobs per run, pre-LLM rule filters cut 65% of API calls, and Claude generates ATS-optimised LaTeX resumes via 3-pass self-evaluation.
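A pre-LLM rule filter of the kind described here might look like the sketch below: cheap deterministic checks run first, and only the jobs that pass are sent on for LLM scoring. The `Job` fields, the keyword lists, and the function names are hypothetical; Dossier's actual rules are not described on this page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    title: str
    location: str
    description: str


# Hypothetical rules, chosen only for illustration.
REQUIRED_KEYWORDS = ("machine learning", "genai", "llm")
BLOCKED_TITLE_WORDS = {"intern", "sales"}
ALLOWED_LOCATIONS = {"remote", "bengaluru"}


def passes_rules(job: Job) -> bool:
    """Return True if the job is worth spending an LLM call on."""
    title = job.title.lower()
    # Word-level match so "internal" is not mistaken for "intern".
    if set(title.split()) & BLOCKED_TITLE_WORDS:
        return False
    if job.location.lower() not in ALLOWED_LOCATIONS:
        return False
    text = f"{title} {job.description.lower()}"
    return any(keyword in text for keyword in REQUIRED_KEYWORDS)


def prefilter(jobs: list[Job]) -> tuple[list[Job], float]:
    """Return the jobs to score and the fraction of LLM calls avoided."""
    kept = [job for job in jobs if passes_rules(job)]
    skipped = 1 - len(kept) / len(jobs) if jobs else 0.0
    return kept, skipped
```

The returned fraction makes the saving measurable per run, which is how a figure like "65% of API calls" could be tracked.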
FedFV-CV
Federated Deep Learning for Biometric Auth
Federated deep learning framework for finger-vein biometric authentication using MobileNetV2. Engineered custom FedWPR aggregation on 122,600 images across 5 clients, outperforming FedAvg benchmarks. B.Tech Thesis, IIIT SriCity.
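For context, the FedAvg baseline that FedWPR is compared against averages client model weights in proportion to each client's sample count. The sketch below shows that baseline only; it is not the custom FedWPR aggregation, whose details are not given here. Weights are flattened to plain lists of floats for simplicity, and the function name `fedavg` is just illustrative.

```python
def fedavg(client_weights: list[list[float]], client_sizes: list[int]) -> list[float]:
    """Sample-size-weighted average of flattened client weight vectors (FedAvg)."""
    if len(client_weights) != len(client_sizes):
        raise ValueError("need exactly one sample count per client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")
    dim = len(client_weights[0])
    aggregated = [0.0] * dim
    for weights, n_samples in zip(client_weights, client_sizes):
        if len(weights) != dim:
            raise ValueError("all clients must share the same model shape")
        for i, value in enumerate(weights):
            # Each client contributes in proportion to its share of the data.
            aggregated[i] += value * n_samples / total
    return aggregated
```

With two clients holding 1 and 3 samples, the second client's weights count three times as much in the global model.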
slackAgent
AI-Powered Slack Bot with RAG
Scalable FastAPI backend with LlamaIndex + ChromaDB semantic search over 20+ documents. Cut query response time by 40% and served 50+ daily queries via Slack API with end-to-end automation through n8n.
RAG-QA on AWS
Retrieval-Augmented QA, fully CI/CD
Retrieval-augmented QA system using LangChain, FAISS, and AWS Bedrock (Llama 3.1 70B). Deployed to AWS ECR + App Runner via Docker with full CI/CD through GitHub Actions.
reach for.
LLM & GenAI: 10 items
Computer Vision: 5 items
MLOps & Backend: 7 items
Cloud & Infra: 8 items
Programming & ML: 6 items
from real teams.
“Shivang treats production AI like a first-class engineering problem — not a research demo. His pipelines actually survive real traffic.”
“One of the few engineers who can hold both the model intuition and the infrastructure tradeoffs in his head at the same time. Rare combo.”
“He delivered Bodhi Atomize end-to-end — from prompt design to KEDA autoscaling. The kind of person you want building the AI layer of your product.”
Latest from the blog.
What I Learned Shipping a Multimodal GenAI Platform to Production
Bodhi Atomize processes 10,000+ marketing assets for Eli Lilly. Here's what actually broke, what scaled, and what I'd build differently next time.
Designing Dossier: A 7-Agent Job-Search Pipeline That Costs $0.06 per Application
I built an autonomous agentic system that researches companies, scores roles, finds referrals, and writes tailored resumes — for less than a coffee. Here's the architecture.
that ships.
Open to conversations around GenAI systems, LLM infrastructure, ML engineering, and production AI challenges. Drop a line — I reply fast.