Rule-based system for credit card fraud detection

In this project, I used AWS, Hive, Spark, and Sqoop to identify fraudulent transactions in real time, reducing financial risk and potential losses for the organization.
👋 Hi there! I'm Sharad, a Data Engineer passionate about leveraging data-driven insights to solve complex problems.
💼 With 2+ years of experience in software engineering, I have built a strong foundation in machine learning, data engineering, and cloud computing.
🔬 I am well-versed in using a wide range of tools and programming languages such as:
Python | SQL | PySpark | Hive | Hadoop |
AWS | Airflow | Hue | Impala | Docker |
HBase | Sqoop | Grafana | InfluxDB | Linux |
🌟 If you are looking for a data engineer who combines strong technical skills, a passion for problem-solving, and a collaborative mindset, I'd love to connect and explore potential opportunities. Let's make data-driven decisions together and unlock new possibilities!
Identifying the most promising leads for an education company to make lead targeting more efficient and to minimise unnecessary contact with unlikely leads. Built a logistic regression model that predicts a lead score indicating how likely a lead is to take up a course on the platform.
Batch ETL of ATM transaction data using Apache Sqoop and PySpark, loading the table data into Amazon S3 and warehousing it in Amazon Redshift to analyze ATM withdrawal behaviour and optimize refill frequency.
This ETL project demonstrates extracting, transforming, and loading NYC taxi data for analysis. It covers infrastructure setup, data ingestion into RDS and into HBase using Sqoop, MapReduce processing, and exporting the results to RDS.
Multiple regression model for predicting bike rentals, built with scikit-learn and statsmodels, with analysis of relevant predictors using p-values and VIF.
Extensive EDA case study of customer loan applications across various factors, identifying trends among defaulters and non-defaulters.
Analysis of the RSVP Movies dataset using advanced SQL.
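
As a rough illustration of the lead-scoring project described above, here is a minimal sketch of how a logistic regression output can be turned into a 0-100 lead score. The feature names, coefficients, and intercept are hypothetical placeholders, not values from the actual model; a real model would learn them from historical lead data.

```python
import math

# Hypothetical coefficients for illustration only; a trained model would
# estimate these from data.
COEFFICIENTS = {
    "total_visits": 0.3,
    "minutes_on_site": 0.05,
    "opened_email": 1.2,
}
INTERCEPT = -3.0


def lead_score(features):
    """Return a 0-100 lead score from a logistic (sigmoid) model.

    `features` maps feature names to numeric values; missing features
    are treated as 0.
    """
    # Linear combination of the features (the log-odds of conversion).
    z = INTERCEPT + sum(
        weight * features.get(name, 0.0) for name, weight in COEFFICIENTS.items()
    )
    # The sigmoid maps log-odds to a probability between 0 and 1.
    probability = 1.0 / (1.0 + math.exp(-z))
    # Scale the probability to an integer score between 0 and 100.
    return round(100 * probability)
```

Leads can then be ranked by score so that outreach focuses on those above a chosen cutoff; the cutoff itself is a business decision and is not specified here.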