Rule-Based System for Credit Card Fraud Detection
In this project, I used AWS, Hive, Spark, and Sqoop to identify fraudulent transactions in real time,
mitigating financial risk and potential losses for the organization.
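The project's actual rules are not listed here. As a rough illustration of what a rule-based check can look like, the sketch below applies two hypothetical rules: an amount limit relative to the card's average spend, and an implausible travel speed between consecutive transactions. The Transaction fields, the thresholds, and the is_fraudulent function are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class Transaction:
    card_id: str
    amount: float
    minutes_since_last: float  # time since the card's previous transaction
    km_from_last: float        # distance from the previous transaction's location


# Hypothetical thresholds, chosen only for illustration.
MAX_SPEED_KM_PER_MIN = 15.0    # roughly airliner speed
AMOUNT_LIMIT_MULTIPLIER = 3.0  # allowed multiple of the card's average spend


def is_fraudulent(txn: Transaction, avg_amount: float) -> bool:
    """Return True if any of the illustrative rules flags the transaction."""
    # Rule 1 (assumed): amount far above the card's historical average.
    if txn.amount > AMOUNT_LIMIT_MULTIPLIER * avg_amount:
        return True
    # Rule 2 (assumed): the card moved faster than is physically plausible.
    if txn.minutes_since_last > 0:
        speed = txn.km_from_last / txn.minutes_since_last
        if speed > MAX_SPEED_KM_PER_MIN:
            return True
    return False
```

In a streaming setup, a check like this would typically run once per incoming transaction, with the per-card history looked up from a fast store.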
Lead Scoring for an EdTech Company
Identified the most promising leads for an education company to make lead targeting more efficient
and minimise unnecessary contact with low-potential leads.
Built a logistic regression model that predicts a lead score indicating the chance of a lead taking up a course on the platform.
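As a minimal sketch of how a logistic regression output can be turned into a lead score, the example below passes a weighted sum of features through the sigmoid function and scales the resulting probability to 0-100. The feature names, the coefficients, and the 0-100 scale are hypothetical; a fitted model would supply its own values.

```python
import math

# Hypothetical coefficients; a fitted model would supply real ones.
INTERCEPT = -2.0
COEFFICIENTS = {
    "time_on_site_minutes": 0.15,
    "visited_pricing_page": 1.2,
}


def conversion_probability(features: dict) -> float:
    """Logistic model: sigmoid of the intercept plus the weighted feature sum."""
    z = INTERCEPT + sum(
        weight * features.get(name, 0.0) for name, weight in COEFFICIENTS.items()
    )
    return 1.0 / (1.0 + math.exp(-z))


def lead_score(features: dict) -> int:
    """Scale the predicted probability to an integer score from 0 to 100."""
    return round(100 * conversion_probability(features))
```

Leads can then be ranked by score, with a cutoff chosen to balance how many leads are contacted against how many conversions are missed.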
Bank Transactions Batch ETL
Performed batch ETL of ATM transaction data using Apache Sqoop and Apache Spark (PySpark), loaded the table data into Amazon S3, and
warehoused it in Amazon Redshift to analyze ATM withdrawal behaviour and optimise the refill frequency.
NYC Taxi Data ETL and Analysis in EMR
This ETL project demonstrates the process of extracting, transforming, and loading NYC taxi data for analysis.
It covers infrastructure setup, data ingestion into RDS and into HBase using Sqoop, MapReduce processing, and export of the results to RDS.
Bike Rentals Prediction Using Regression
Built a multiple regression model to predict bike rentals using scikit-learn and statsmodels, and analysed the relevant predictors using p-values and the variance inflation factor (VIF).
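For reference, the VIF of a predictor is 1 / (1 - R^2), where R^2 comes from regressing that predictor on all the others. The sketch below is not the project's code: it covers only the simplest case of two predictors, where R^2 equals the squared Pearson correlation, and the function names are made up for this example. For the general case, statsmodels provides variance_inflation_factor in statsmodels.stats.outliers_influence.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def vif_two_predictors(x1, x2):
    """VIF = 1 / (1 - R^2); with exactly two predictors, R^2 is r squared."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r ** 2)
```

A VIF near 1 means the predictor is almost uncorrelated with the other one, while large values signal multicollinearity and suggest dropping or combining predictors.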
Credit Risk Analysis: EDA Case Study
Extensive EDA case study of customer loan applications across various factors, identifying trends among defaulters and non-defaulters.
Movie House Analysis: SQL Case Study
Analysis of the RSVP Movies dataset using advanced SQL.