Jai Bhagat (@JaiBhagat)
  • Stars: 19
  • Global Rank: 658,060 (Top 23%)
  • Followers: 10
  • Following: 2
  • Registered: about 7 years ago
  • Most used languages: R (53.8%), Python (15.4%)
  • Location: 🇮🇳 India
  • Country Total Rank: 36,757
  • Country Ranking (R): 115

Top repositories

1. END-Session-7 (Python, 2 stars)
2. Retail-Giant-Sales-Forecasting (R, 2 stars)

“Global Mart” is an online retail giant with worldwide operations. It takes orders and delivers across the globe, dealing in all the major product categories: consumer, corporate and home office. As a sales/operations manager, you want to finalise the plan for the next 6 months, so you need to forecast sales and demand for that period in order to manage revenue and inventory accordingly. The store caters to 7 different market segments across 3 major categories, and you want to forecast at this granular level, so you subset the data into 21 (7 × 3) buckets before analysing it. Not all 21 market buckets are equally important from the store's point of view, however, so you first need to find the 2 most profitable (and most consistent) segments among them, then forecast sales and demand for those segments.
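The bucket-selection step described above can be sketched as follows. This is a toy illustration with invented monthly profit figures, written in Python for brevity (the repository itself works in R over the real order data); the coefficient-of-variation ranking is one common way to operationalise "profitable and consistent", assumed here rather than taken from the repo.

```python
import statistics

# Hypothetical monthly profits for a few of the 21 (market x category) buckets.
# The real project computes these from the Global Mart order data.
monthly_profit = {
    ("APAC", "Consumer"):    [4200, 4400, 4100, 4600, 4300, 4500],
    ("EU", "Consumer"):      [3900, 4000, 4100, 3800, 4200, 4050],
    ("Africa", "Corporate"): [900, 2500, -300, 1800, 100, 2200],
}

def profitability(bucket):
    """Total profit over the observed months."""
    return sum(monthly_profit[bucket])

def cov(bucket):
    """Coefficient of variation of monthly profit: lower = more consistent."""
    values = monthly_profit[bucket]
    return statistics.stdev(values) / statistics.mean(values)

# Rank buckets: most consistent first, breaking ties toward higher profit.
ranked = sorted(monthly_profit, key=lambda b: (cov(b), -profitability(b)))
top_two = ranked[:2]
```

With these made-up numbers, the volatile Africa/Corporate bucket is rejected despite occasional good months, which is the point of the consistency criterion.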
3. Chinese-automobile-pricing-problem (R, 2 stars)

Geely Auto, a Chinese automobile company, aspires to enter the US market by setting up a manufacturing unit there and producing cars locally, to compete with its US and European counterparts. It has contracted an automobile consulting firm to understand the factors on which car pricing depends, specifically the factors affecting car prices in the American market, since these may be very different from the Chinese market. Essentially, the company wants to know: which variables are significant in predicting the price of a car, and how well those variables describe a car's price. Based on various market surveys, the consulting firm has gathered a large dataset covering different types of cars across the American market.
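The core modelling idea, price as a function of explanatory variables, can be shown with closed-form simple linear regression on one made-up variable. The (engine size, price) pairs below are hypothetical; the actual project fits a multiple linear regression over the full car dataset in R.

```python
# Hypothetical (engine_size_litres, price_usd) pairs.
data = [(1.0, 8000), (1.5, 11000), (2.0, 14500), (2.5, 17000), (3.0, 20500)]

n = len(data)
mean_x = sum(x for x, _ in data) / n
mean_y = sum(y for _, y in data) / n

# Ordinary least squares for one predictor: slope = cov(x, y) / var(x).
slope = sum((x - mean_x) * (y - mean_y) for x, y in data) / \
        sum((x - mean_x) ** 2 for x, _ in data)
intercept = mean_y - slope * mean_x

# Predict the price of a hypothetical 2.2 L car.
predicted_price = intercept + slope * 2.2
```

The fitted slope answers the "which variables are significant" question in miniature: each extra litre of engine displacement adds `slope` dollars to the predicted price under this toy fit.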
4. ERA-Session-7- (Jupyter Notebook, 1 star)
5. ERA---Session-8 (Jupyter Notebook, 1 star)
6. ERA-Session5 (Jupyter Notebook, 1 star)
7. END-Session-3 (Python, 1 star)
8. END-Session-6 (1 star)
9. ERA-Sesssion-6 (Jupyter Notebook, 1 star)
10. HR-Analytics-CaseStudy (R, 1 star)

A large company named XYZ employs, at any given point of time, around 4,000 employees. However, every year around 15% of its employees leave the company and need to be replaced from the talent pool available in the job market. Management believes that this level of attrition (employees leaving, either on their own or because they got fired) is bad for the company, for the following reasons:
  • The former employees' projects get delayed, making it difficult to meet timelines and resulting in a loss of reputation among consumers and partners.
  • A sizeable department has to be maintained for the purpose of recruiting new talent.
  • More often than not, new employees have to be trained for the job and/or given time to acclimatise themselves to the company.
Hence, management has contracted an HR analytics firm to understand which factors they should focus on in order to curb attrition. In other words, they want to know what changes they should make to the workplace in order to get most of their employees to stay, and which of these variables is most important and needs to be addressed right away. Since you are one of the star analysts at the firm, this project has been given to you.
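A first exploratory step in an attrition study like this is simply comparing attrition rates across candidate factors. The sketch below uses invented employee records and an invented `overtime` field; the real case study works on XYZ's full HR dataset in R.

```python
from collections import defaultdict

# Hypothetical employee records: did they work overtime, and did they leave?
employees = [
    {"overtime": True,  "left": True},
    {"overtime": True,  "left": True},
    {"overtime": True,  "left": False},
    {"overtime": False, "left": True},
    {"overtime": False, "left": False},
    {"overtime": False, "left": False},
    {"overtime": False, "left": False},
    {"overtime": False, "left": False},
]

# group value -> [number who left, group size]
counts = defaultdict(lambda: [0, 0])
for e in employees:
    group = counts[e["overtime"]]
    group[1] += 1
    if e["left"]:
        group[0] += 1

attrition = {k: leavers / total for k, (leavers, total) in counts.items()}
```

A large gap between the two rates (here 2/3 with overtime versus 1/5 without, by construction) is what would flag a variable as worth modelling further.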
11. BFS-Capstone-Project (R, 1 star)

CredX is a leading credit card provider that receives thousands of credit card applications every year, but in the past few years it has experienced an increase in credit loss. The CEO believes that the best strategy to mitigate credit risk is to ‘acquire the right customers’. In this project, you will help CredX identify the right customers using predictive models: using past data on the bank's applicants, you need to determine the factors affecting credit risk, create strategies to mitigate the acquisition risk, and assess the financial benefit of your project.
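One standard screening step in credit-scorecard work of this kind is ranking candidate variables by Information Value (IV), computed from Weight of Evidence per bin. Whether this repo uses IV is an assumption; the good/bad counts below are made up for illustration.

```python
import math

# Hypothetical good/bad applicant counts per bin of one candidate variable.
bins = [
    {"good": 400, "bad": 10},
    {"good": 350, "bad": 40},
    {"good": 250, "bad": 50},
]

total_good = sum(b["good"] for b in bins)
total_bad = sum(b["bad"] for b in bins)

iv = 0.0
for b in bins:
    pct_good = b["good"] / total_good
    pct_bad = b["bad"] / total_bad
    woe = math.log(pct_good / pct_bad)   # weight of evidence for this bin
    iv += (pct_good - pct_bad) * woe     # this bin's information value term

# Common rule of thumb: IV above roughly 0.3 marks a strong predictor.
```

Variables with high IV would be kept for the predictive model; low-IV variables would be dropped before modelling the acquisition risk.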
12. Data-Ingestion-and-Processing-HIVE (1 star)

Problem statement: The New York City Taxi & Limousine Commission (TLC) has provided data on trips made by taxis in New York City. The detailed trip-level data is more than just a vast list of taxi pick-up and drop-off coordinates: the records include fields capturing pick-up and drop-off dates/times, pick-up and drop-off locations (coordinates of the starting and ending points), trip distances, itemized fares, rate types, payment types, driver-reported passenger counts, and more. You can download the data dictionary below.
13. MNIST-digit-Recognition (R, 1 star)

A classic problem in the field of pattern recognition is handwritten digit recognition. Suppose you have an image of a digit submitted by a user via a scanner, a tablet or another digital device. The goal is to develop a model that can correctly identify the digit (0-9) written in the image.
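The task above, mapping a pixel vector to a digit label, can be illustrated with a toy nearest-neighbour classifier. Real MNIST images are 28 × 28 pixels; the 4-pixel "images" below are invented, and the actual repository trains a proper model in R.

```python
# Tiny training set: (flattened pixel vector, digit label).
train = [
    ([0.9, 0.9, 0.1, 0.1], 0),
    ([0.1, 0.1, 0.9, 0.9], 1),
    ([0.9, 0.1, 0.9, 0.1], 7),
]

def predict(pixels):
    """Return the label of the closest training image (squared distance)."""
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(train, key=lambda t: dist2(t[0], pixels))[1]
```

A noisy copy of a training image still lands on the right label, which is the whole idea: classification by similarity in pixel space.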
14. Uber-Supply-Demand-Gap (R, 1 star)

The aim of the analysis is to identify the root cause of the problem (i.e. cancellations and non-availability of cars) and recommend ways to improve the situation. As a result of your analysis, you should be able to present to the client the root cause(s) of the problem(s), possible hypotheses, and recommended ways to improve them.
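The central quantity in an analysis like this is the supply-demand gap per time slot: total requests minus completed trips. The request records and status strings below are hypothetical stand-ins for the Uber request log, and the repo itself does this in R.

```python
from collections import Counter

# Hypothetical ride requests with a time slot and a final status.
requests = [
    {"slot": "morning", "status": "Trip Completed"},
    {"slot": "morning", "status": "Cancelled"},
    {"slot": "morning", "status": "No Cars Available"},
    {"slot": "evening", "status": "Trip Completed"},
    {"slot": "evening", "status": "No Cars Available"},
    {"slot": "evening", "status": "No Cars Available"},
]

demand = Counter(r["slot"] for r in requests)   # all requests per slot
supply = Counter(r["slot"] for r in requests
                 if r["status"] == "Trip Completed")  # fulfilled per slot

# Gap = unserved requests (cancellations + no cars available) per slot.
gap = {slot: demand[slot] - supply[slot] for slot in demand}
```

The slots with the largest gap are where the root-cause investigation (driver cancellations versus car availability) would focus.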
15. NYC-Parking-Tickets-An-Exploratory-Analysis (R, 1 star)

Problem statement: Big data analytics allows you to analyse data at scale, and it has applications in almost every industry in the world. Let's consider an unconventional application that you wouldn't ordinarily encounter. New York City is a thriving metropolis, and, just like most other metros of that size, one of the biggest problems its citizens face is parking. The classic combination of a huge number of cars and a cramped geography is the exact recipe that leads to a huge number of parking tickets. In an attempt to analyse this phenomenon scientifically, the NYC Police Department has collected data on parking tickets; of these, the data files from 2014 to 2017 are publicly available on Kaggle. We will perform some exploratory analysis on this data. Spark allows us to analyse the full files at high speed, as opposed to taking a series of random samples that approximate the population. For the scope of this analysis, we wish to compare phenomena related to parking tickets over three different years: 2015, 2016 and 2017. All the analysis steps mentioned below should be done for the 3 different years, and each metric derived should be compared across the 3 years.
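The "derive each metric per year, then compare across years" pattern described above looks like this in miniature. The ticket rows and violation names are invented; the real analysis runs equivalent aggregations with Spark over the full Kaggle files.

```python
from collections import Counter

# Hypothetical parking-ticket rows (year, violation type).
tickets = [
    {"year": 2015, "violation": "No Parking"},
    {"year": 2015, "violation": "No Parking"},
    {"year": 2016, "violation": "Expired Meter"},
    {"year": 2016, "violation": "No Parking"},
    {"year": 2016, "violation": "No Parking"},
    {"year": 2017, "violation": "No Parking"},
]

# Metric 1: total tickets issued per year.
per_year = Counter(t["year"] for t in tickets)

# Metric 2: most common violation type, computed separately for each year.
top_violation = {
    year: Counter(t["violation"] for t in tickets
                  if t["year"] == year).most_common(1)[0][0]
    for year in (2015, 2016, 2017)
}
```

In the actual project, each such metric is computed with Spark for 2015, 2016 and 2017 and then compared across the three years, exactly as the per-year dictionaries do here.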