
About Me

I'm currently interning at Meta (Facebook) as a data scientist. I'm also a grad student at Northeastern University, majoring in Data Science. Previously, I was a Data Scientist at Intellect Design Arena Ltd., where I built ML and deep learning models, along with a Python backend for the UI that captures user feedback. These models are actively used in real time by multiple insurance companies. My interests lie primarily in NLP and computer vision.

A few interesting things about me: I love listening to audiobooks and podcasts. My favorite podcasts right now are NLP Highlights, Making Sense with Sam Harris, and The Data Scientist Show. I'm also an avid gamer and love competitive games like DotA and Overwatch. I push myself to learn something about machine learning, software development, or just random trivia every day.

Contact Details

Praveen Kumar Sridhar
Boston, Massachusetts
sridharpraveenkumar@gmail.com

Skills

Python
R
SQL
Data Cleaning
Data Interpretation
Image Processing
Deep Learning
Machine Learning
Causal Inference
MongoDB
Redis
TensorFlow
Keras
Torch
NLP
OpenCV
Tesseract
EasyOCR
Random Forest
SVM
Linear Regression
Logistic Regression
Naive Bayes
FastAPI
Scala
C/C++
Java
Git
RabbitMQ
SQLAlchemy
Tableau
sklearn
plotly
NEAT

Education

Northeastern University

Master's in Data Science 2021-Present

GPA: 4.0/4.0

VIT University

B.Tech in Computer Science 2014-2018

Cumulative GPA: 8.93/10

Work

Graduate Research Assistant

Khoury College of Computer Sciences, Northeastern University, Aug 2022 – Present

  • Working with Prof. Silvio Amir on characterizing the impact of influential actors on the dynamics of the #MeToo online social movement.
  • Revamping the NLP pipeline, which handles pre-processing, topic modeling, author-demographic inference, and post-processing.

Data Scientist Intern

Meta (Facebook) - Ads & Business, May 2022 – Aug 2022

  • Redefined how the organization looks at advertiser churn by creating a framework to predict churn and identify the levers that reduce it.
  • Designed an exhaustive advertising lifecycle model and used it to construct a churn taxonomy.
  • Developed multiple ML models to forecast churn (85% accuracy) and used calibration curves to demonstrate the models' reliability. Extracted the critical features most likely to drive churn using SHAP values.
  • Built friction vectors to identify and isolate the levers the organization can control to reduce friction (obstacles that lead to churn). Wrote highly optimized SQL queries to create these features.
  • Implemented causal inference models, such as X-learners and causal trees, to measure the impact of each friction on churn.
  • Presented the findings to the organization and leadership. The churn model will be used in the future as part of the broader lifetime model for advertisers.

Graduate Teaching Assistant

Khoury College of Computer Sciences, Northeastern University, Jan 2022 – May 2022

  • Served as a Graduate Teaching Assistant for CS5008 – Data Structures, Algorithms, and Their Applications within Computer Systems under Prof. Sophine Clachar.
  • Helped students resolve conceptual misunderstandings during office hours, and assisted the lecturer in setting up autograders on Gradescope.
  • Graded students' assignments, midterms, and weekly reflections, and assisted the lecturer with the students' final topic presentations.

Data Scientist

Intellect Design Arena Ltd. - R&D team, Jul 2020 - Aug 2021

  • Designed, built, and shipped deep learning models such as LSTMs, bidirectional LSTMs, and bidirectional LSTMs with attention. These models achieved accuracies upward of 90% in the production environment.
  • Designed and built the application's back-end with FastAPI, connecting it to multiple databases. The back-end also captures user feedback for retraining and further improving the models.
  • Experimented with leading OCR engines such as Tesseract, EasyOCR, and PaddleOCR.
  • Combined CRAFT text detection with Tesseract recognition to improve OCR accuracy.
  • Used image processing and Tesseract with CRAFT to extract data from the MRZ (Machine Readable Zone) of passport images.
  • Helped design, build, and ship a complex ensemble text classifier built on BERT and RoBERTa.

Data Engineer

Intellect Design Arena Ltd. - R&D team, Jun 2018 - Jul 2020

  • Built an entire NLP pipeline (from tokenization to spell-checking) using RabbitMQ. It runs across multiple servers, and both the flow and the number of workers/consumers are fully customizable.
  • Optimized T-SQL procedures by reimplementing them as Spark modules written in Scala, complete with auto-spun EMR clusters whose status is actively monitored through custom Spark listeners.

Data Analyst Intern

Allsec Technologies Ltd., Feb 2018 - Mar 2018

  • Worked on employee attrition modeling in both R and Python. Initially tried several prominent algorithms, such as classification trees, SVM, and random forest, then implemented a simple artificial neural network, which yielded better results.

Certifications

  • Natural Language Processing in TensorFlow
  • Neural Networks and Deep Learning
  • Structuring Machine Learning Projects
  • Improving Deep Neural Networks: Hyperparameter tuning, Regularization and Optimization
  • Convolutional Neural Networks
  • Sequence Models

Awards

At Intellect Design Arena

  • Received the GEM award for building models that achieved the accuracy clients expected and for my overall contribution to the organization and team.
  • My team won the Chairman’s Excellence Award for our contribution to the organization.