PK.
Machine Learning Engineer & Data Scientist

Hi, I'm PK.

I build AI agents for businesses at Meta, and the evaluation frameworks that prove they actually work. Data Scientist by training, ML Engineer by habit.

Python · Go · React · PyTorch · GCP / Vertex AI · AWS · Kubernetes · LLMs
Data Scientist 4, Meta · Vancouver, Canada · M.S. Data Science, Northeastern
What I build

Full-stack across the ML lifecycle.

I am comfortable owning a system from the data layer to the deployed service, not just the model in the middle.

01 / Engineering

Systems & infra

Backends in Python and Go, React frontends, distributed services, and ML pipelines on GCP and Vertex AI with Docker, Kubernetes, Airflow, and Terraform.

02 / Modeling

ML & GenAI

Production deep learning, computer vision, and NLP, plus LLM systems built with LangChain: RAG and retrieval over FAISS and Pinecone, fine-tuning, LlamaGuard-style guardrails, and agent tooling on PyTorch, HuggingFace, LLaMA, and GPT.

03 / Science

Evaluation & experiments

Evaluation frameworks, A/B and A/A testing, causal inference, and the metrics that turn model quality into decisions leadership can act on.

Selected impact

Shipped work, measured.

+15%
AI Deep Convos from one product-consolidation launch
Meta
+12%
Data Coverage Rate, the agent org's north-star metric
Meta
90%+
Production accuracy across deep-learning models, CNNs to attention BiLSTMs
Intellect Design Arena
$90M
Inference capacity allocated via an ROI framework
Meta
12%
Business price-correction volume cut by a pricing experiment
Meta
+10%
Moderation lift from automated fake-account detection, A/B tested
Nextdoor
+5%
MAU lift from the notification system I built and shipped
Citizen
+3%
DAU lift from triggering and live-video ranking systems
Citizen
Experience

Where I've worked.

Nov 2024 to now

Meta

Data Scientist 4, Machine Learning · Vancouver, CA
  • Built the evaluation framework for the knowledge base behind Meta's business AI agents, defining metrics (DCR, biz-DCR, Data Conflict Rate) that isolate whether a failure stems from hallucination, model gaps, missing knowledge, or bad retrieval.
  • Proved DCR correlates with business retention and made it the north-star metric for the agent quality org, then led the RCA that recovered a 12% DCR regression to baseline.
  • Owned the agent experimentation stack end to end (A/B, A/A, offline eval, simulation, replay), landing a consolidation launch at WAB +6%, DCR +12%, AIDC +15%.
  • Engineered and shipped Claude Code skills, plugins, and DS-agent swarms adopted by 10+ teams, validated $293M in savings, and owned 2025 to 2030 capacity planning.
Feb to Nov 2024

Citizen

Machine Learning Engineer 3 · San Jose, CA
  • Built and deployed a notification system that uses GPT to rewrite titles dynamically, backed by triggering and user-selection models, lifting DAU 2 to 3% and MAU 5%.
  • Engineered a CLIP and INSTRUCTOR video-distribution system on Vertex AI to surface only high-engagement live video, raising CTR, DAU, and MAU.
  • Stood up a robust ML pipeline on GCP plus an offline-to-online review framework that improved model quality and deployment consistency.
May to Dec 2023

Nextdoor

Software Engineer 2, ML · Moderation · San Jose, CA
  • Engineered an automated fake account detection system and ran a winning A/B test for a 5 to 10% moderation improvement.
  • Augmented data-scarce regions with GPT-generated examples and built a GPT-based Probability-of-Infractions model deployed alongside classical ML and BERT.
Aug 2022 to May 2024

Khoury College, NULab

Research Assistant · Boston, MA
  • Built an NLP pipeline for topic modeling and demographics with a DistilBERT emotion module that cut inference time 45%.
  • Built a semantic retrieval system (Instructor, GTE, RoBERTa with FAISS) and generated synthetic instruction-tuning data with Llama-3.
May to Aug 2022

Meta

Data Scientist Intern · Seattle, WA
  • Built a churn framework with causal inference (X-learners, causal trees) and SHAP, forecasting advertiser churn at 92% accuracy.
Jun 2018 to Aug 2021

Intellect Design Arena

Data Scientist · Chennai, India
  • Designed and shipped CNNs, LSTMs, and attention BiLSTMs to production at 90%+ accuracy, plus a BERT and RoBERTa ensemble classifier.
  • Combined CRAFT with Tesseract for a 5% OCR accuracy gain and extracted MRZ data from passports via image processing.
Selected projects

Things I built end to end.

Answer Engine Optimization

Full-stack SaaS

A distributed B2B platform that tracks and optimizes visibility across LLM answer engines, with real-time prompt ranking, competitive intel dashboards, and AI content recommendations.

Python · Go · React

PromptJesus

Open source

An LLM-powered prompt-optimization platform on Llama with LlamaGuard guardrails, auto-enhancing free-form input through git-style, version-controlled iterative refinement.

Llama · LlamaGuard · Prompt eng.

Vision Transformer

Computer vision

Fine-tuned Vision Transformers for image classification, reaching 95% accuracy on the CIFAR-10 dataset.

ViT · PyTorch · CIFAR-10

FaceCheck

Deployed model

A face-authentication system that flags counterfeits from StyleGAN, DCGAN, and PGGAN using a DenseNet classifier, deployed on GCP.

DenseNet · GANs · GCP

The Song Search

Information retrieval

A music IR system reaching 80% MAP, using MAGENTA's MT3 (a T5 architecture) for transcription over a custom dataset built on GTZAN.

MT3 / T5 · MAGENTA · GTZAN

More on GitHub

Open source ↗

Models, notebooks, and experiments across NLP, computer vision, and LLM tooling, plus everything that did not fit above.

github.com/PraveenKumarSridhar
Toolkit

What I work with.

The full stack I reach for, from raw modeling to the infra that ships and monitors it.

Languages
Python · Go · Scala · R · C++ · Java · SQL
Backend & Web
Go · Python · React · REST APIs · distributed systems · microservices
ML & Deep Learning
PyTorch · TensorFlow · Keras · HuggingFace · scikit-learn · XGBoost · ONNX · OpenCV
GenAI & LLMs
LLaMA · GPT · Claude · Gemini · Mistral · HuggingFace Transformers · Ollama · vLLM
LLM Engineering
LangChain · LlamaIndex · RAG & retrieval · FAISS · Pinecone · embeddings (Instructor, GTE) · fine-tuning & PEFT/LoRA · LlamaGuard · prompt engineering · agents & eval · MCP
MLOps & Orchestration
Docker · Kubernetes · Kubeflow · Airflow · Dagster · MLflow · TFX · Terraform · GitHub Actions
Cloud & Data
GCP / Vertex AI · AWS · Spark · Hadoop · PostgreSQL · MongoDB · Redis · SQL Server
Methods
A/B testing · Causal inference · Statistics · NLP · Computer vision · Information retrieval · Recommender systems
Observability & Tools
Datadog · Grafana · wandb · Tableau · Git · Linux · vim

Education

Northeastern University

M.S., Data Science, Khoury College

GPA 4.0 / 4.0 · Boston, MA · 2021 to 2023

VIT University

B.Tech, Computer Science

Chennai, India · 2014 to 2018

Recognition

  • On the Technical Program Committee for 5+ IEEE conferences
  • Reviewer of academic papers for multiple IEEE conferences
  • Judge for 3+ Globee International Awards
  • Winner, SPOT Award, Nextdoor
  • Winner, Going the Extra Mile Award, Intellect Design Arena
Get in touch

Let's build something measurably good.

© 2026 Praveen Kumar Sridhar · Vancouver, Canada