Projects

Netflix Data Analysis - Tableau
Jan – Feb 2023

  • Analyzed Netflix's content library from 2008 to 2021 using Tableau, uncovering significant trends and patterns
  • Developed a dynamic Tableau dashboard that visualizes and presents data on content distribution, viewership, and user engagement over the specified timeframe
  • Provided actionable insights and recommendations based on the analysis findings to help optimize content strategy for the evolving Netflix landscape
  • GitHub

Charles River Laboratories Stock Analysis - Python, Financial modelling, Deep Learning
Sept – Dec 2022

  • Developed a predictive model for CRL stock using LSTM to create a stock trading tool that recommends optimal trading strategies based on specific technical indicators
  • Implemented Bollinger Bands, CAPM, and the Fama-French 5-factor model to generate trading signals with 90% accuracy
  • Conducted a comprehensive study involving GARCH, Kalman filter, and supervised machine learning methods, culminating in the creation of an XGBoost model

Plant detection using multiple GPUs - Python, PyTorch, Deep learning, Parallel computing
Sept – Dec 2022

  • Compared performance of multiple CNN architectures for plant disease detection tasks on the Kaggle dataset through distributed parallel training on multiple GPUs
  • Evaluated performance of DNNs (e.g. MobileNetV2, VGG16, ResNet34) and ran parallel computations on smaller mini-batches using PyTorch

Image captioning - Python, PyTorch, Deep learning, Image Processing
May – Sept 2022

  • Generated text captions for images in the Flickr 8k Database using a combination of CNN and RNN, with the CNN used for feature extraction
  • Implemented LSTM and GRU for image caption generation. Conducted hyperparameter tuning to optimize the model and evaluated its performance against other models.
  • Evaluated caption accuracy through BLEU scores across multiple models for images from the Flickr 8k Database

Spotify Recommendation and Prediction Analysis - Python, Machine learning, Tensorflow, Recommendation system
May – Aug 2022

  • Evaluated various machine learning models on Spotify API data and selected the best one based on R² score
  • Developed two recommendation systems based on k-means clustering of numerical audio features by genre:
    • Built a content-based recommendation system that ingests data from the user’s playlists using the Spotify API
    • Developed a neural collaborative filtering recommendation system using TensorFlow that generates playlists matching the user’s preferences from model rankings

Twitter marketing campaign analysis - ETL, Python, Airflow, Apache Beam, GCP, AWS, Tableau, Hugging Face, JWT, BigQuery
Jan – May 2022

  • Designed an ETL pipeline for Twitter API data analysis based on keywords, utilizing Airflow and Apache Beam for data transformation and storage in BigQuery
  • Implemented sentiment analysis with Hugging Face models on AWS Lambda and deployed in FastAPI framework on GCP App Engine, with JWT authentication
  • Used locations extracted from named entities to display relevant news via an open-source News API, alongside real-time metrics on a Tableau dashboard embedded in the web app

Storm forecasting using satellite imagery - ETL, Python, Airflow, Apache Beam, GCP, AWS, Power BI, Hugging Face, JWT, BigQuery
Jan – May 2022

  • Created an ETL pipeline and deployed NLP models on Amazon ECR using Lambda functions and Serverless framework to summarize Storm Event descriptions in a Dockerized environment
  • Implemented sentiment analysis on the data and added application features such as real-time data, cached data, and a threshold time to locate the nearest storm
  • Developed a Streamlit UI for the nowcasting model to generate and display storm images as GIFs. Embedded a Power BI dashboard for analysis of storm reports

Political lean analysis across news sources - Python, R, Sentiment analysis, Data mining
Sept – Dec 2021

  • Scraped 80,000 Reddit comments using R and collected official tweets from news sources to analyze their political leanings
  • Used sentiment analysis, bi-grams, and word associations to demonstrate bias in various sources, and visualized the findings
  • Applied multiple ML algorithms to achieve 85% accuracy in classifying articles by political bias