Twitter-Sentiment-Analysis
This is a natural language processing problem in which sentiment analysis is done by classifying tweets as positive or negative, using machine learning models for classification together with text mining, text analysis, data analysis, and data visualization.
Statistics-for-Data-Science
Learning statistics is one of the most important steps toward getting into data science and machine learning. Statistics helps us understand data much better and explains how the data behaves with respect to certain factors. Its elements include probability, distributions, descriptive analysis, inferential analysis, comparative analysis, the chi-square test, the t-test, the z-test, A/B testing, etc.
Fraud-Detection-in-Online-Transactions
Detecting fraud in online transactions using anomaly detection techniques such as oversampling and undersampling. Because the ratio of frauds is less than 0.00005, simply applying a classification algorithm may result in overfitting.
Drugs-Recommendation-using-Reviews
Analyzing drug descriptions, conditions, and reviews, and then recommending drugs for each health condition of a patient using deep learning models.
Data-Visualizations
Data visualization is emerging as one of the most essential skills in almost all IT and non-IT sectors and jobs. Visualizations support wiser decisions, which can help a business make bigger profits and understand the root causes and behavior of the people and customers associated with it. In this repository I discuss line plots, bar plots, scatter plots, and pie charts in depth, as well as scientific plots, 3D plots, animated plots, and interactive plots for visualizing business problems of any complexity.
HR-Analytics
Analyzing a company's HR criteria, how it promotes its employees, and how it keeps a balance between them, using data analytics, data visualizations, and machine learning models for classification.
Insurance-Claim-Prediction
In this data set we predict the insurance claim for each user. Machine learning algorithms for regression analysis are used, and data visualizations support the analysis.
Churn-Modelling-Dataset
Predicting which customers are going to churn from the organization by looking at some of the important attributes and applying machine learning and deep learning.
FIFA-2019-Analysis
This project is based on the FIFA World Cup 2019 and analyzes the performance and efficiency of teams, players, countries, and other related things using data analysis and data visualizations.
Big-Mart-Sales-Prediction
Using machine learning algorithms for regression analysis to predict the sales pattern, supported by data analysis and data visualizations.
Wine-Quality-Predictions
Predicting the quality of red wine using machine learning algorithms for regression analysis, data visualizations, and data analysis.
SECOM-Detecting-Defected-Items
Anomaly detection for finding defective manufactured semiconductors. In this classification problem the defective chips are far fewer than the good chips, so we have to apply either oversampling or undersampling.
Students-Performance-Analytics
Evaluating students' performance using feature engineering, feature extraction, data manipulation, data analysis, and data visualization, and finally applying classification algorithms from machine learning to separate students with different grades.
Basic-Python-Course
The course includes all the basic concepts of Python. Python is currently one of the most popular programming languages in the world because of its wide range of applications across domains and technologies. Python is a common first choice for data science, machine learning, deep learning, computer vision, blockchain, etc.
Facebook-Social-Network-Analysis
Social network analysis (SNA) is the process of investigating social structures through the use of networks and graph theory.[1] It characterizes networked structures in terms of nodes (individual actors, people, or things within the network) and the ties, edges, or links (relationships or interactions) that connect them. Examples of social structures commonly visualized through social network analysis include social media networks,[2] memes spread,[3] information circulation,[4] friendship and acquaintance networks, business networks, social networks, collaboration graphs, kinship, disease transmission, and sexual relationships.[5][6] These networks are often visualized through sociograms in which nodes are represented as points and ties are represented as lines. These visualizations provide a means of qualitatively assessing networks by varying the visual representation of their nodes and edges to reflect attributes of interest.
Passenger-Prediction-Using-Time-Series-Analysis
I have used time series analysis to predict the behavior and patterns of passengers at a bus stop. The data visualizations include time-series plots.
Weed-Detection
This problem is based on an image data set of different types of weeds, with the goal of detecting them in crops and fields. I have used a deep learning model, a CNN (convolutional neural network), with dropout, batch normalization, learning-rate reduction on plateau, early stopping, and transposed convolutional neural networks.
Stock-Market-Predictions
Predicting stock market opening values using a recurrent neural network (RNN), a deep learning model designed for sequential data.
Education-Process-Mining
Mining students' performance from their results in final and intermediate exams using machine learning.
Predict-Future-Sales
This is from a Kaggle competition in which we have to predict future sales using machine learning or deep learning. It is an advanced regression problem that also requires statistics and time series analysis. Recurrent neural networks, a deep learning model, are well suited to this problem.
Credit-Card-Fraud-Detection
This project is based on anomaly detection and uses a self-organizing map (SOM), an unsupervised learning method, to find the patterns followed by fraudsters.
Online-Shoppers-Purchasing-Intention
In this data set we have to perform classification or clustering to predict the purchasing intention of online customers. The data set was formed so that each session would belong to a different user in a 1-year period to avoid any tendency to a specific campaign, special day, user profile, or period.
Online-Retail-Transactions-of-UK
Analyzing online transactions in the UK and the countries that purchase from it, and analyzing their reviews using NLP and machine learning.
World-Food-Production
Comparing the top food and feed producers around the globe and looking for interesting answers, solutions, patterns, hints, and warnings through data analysis and data visualization using machine learning.
Ads-Optimization
Selecting the best ads using reinforcement learning algorithms such as Thompson Sampling and Upper Confidence Bound.
Black-Friday-Regression-Analysis
Predicting prices for products to be sold on Black Friday in the US using regression analysis, feature engineering, feature selection, feature extraction, data analysis, and data visualizations.
Google-Job-Skills
An exploratory analysis of the kinds of jobs and job locations offered by Google and YouTube. We also look into some specific details that are important for getting hired by YouTube and Google.
Numpy-and-Pandas
NumPy and Pandas are among the most important building blocks for getting started in data science, analytics, machine learning, business intelligence, and business analytics. This tutorial helps beginners learn the core concepts of NumPy and Pandas and get started with machine learning and data science.
Restaurant-Reviews-Analysis
Using natural language processing, with bag of words for feature extraction, for sentiment analysis of customers who visited the restaurant, and finally using a classification algorithm to separate positive and negative sentiments.
Loan-Default-Prediction
L&T Financial Services & Analytics Vidhya presents ‘DataScience FinHack’, where I predicted whether a customer will default on the first EMI payment using different machine learning algorithms.
Boston-House-Price-Predictions
The most basic data set available for practicing regression analysis and exploring the most basic concepts of machine learning.
MNIST-Using-K-means
Recognizing the MNIST digits is one of the easiest problems in data science. Here I have used a CSV file that contains the pixel values of the digits from 0 to 9, and we have to classify the digits accordingly. I have used the K-Means clustering algorithm.
MNIST-Dataset
Recognizing the digits 0-9 using their pixel values as attributes, with a deep learning model to classify the digits.
Market-Basket-Analysis
Using the Apriori algorithm to do market basket analysis of customers' purchasing behaviour. It can suggest what a customer is likely to buy next by looking at the products he or she is buying.
Heart-UCI-Dataset
Analyzing the features that lead to heart disease and visualizing the models' performance and important features using eli5, shap, and pdp.
Abalone-Age-Prediction
Predicting an abalone's age using different attributes and machine learning algorithms for regression analysis.
Clustering-of-Mall-Customers
Clustering analysis performed on the customers of a mall based on common attributes such as salary, buying habits, age, and purchasing power, using machine learning algorithms.
Fraud-Detection-in-Insurace-Claims
This is a very important data science case study, because detecting fraud, analyzing fraudsters' behaviour, and finding the reasons behind it is one of the prime responsibilities of a data scientist. This area comes under anomaly detection.
Cervical-Cancer-Prediction
In this data set we have to predict which patients are most likely to suffer from cervical cancer, using machine learning algorithms for classification, along with visualizations and analysis.
Amazon-Alexa-Reviews
Using natural language processing, data visualizations, and machine learning classification algorithms.
Loan-Prediction
Predicting whether a person who has applied for a bank loan will get the loan approved, using classification algorithms in machine learning and some common and useful attributes.
Social-Networks-Ads
One of the most basic data sets for learning and implementing some of the easiest and most basic machine learning algorithms and visualizations.
Graduate-Admissions-Analysis
Analyzing the factors that determine whether graduates get admitted abroad, and visualizing some of the most intriguing patterns using data analysis, data visualizations, and machine learning.
Avito-Demand-Prediction-Challenge
This is a regression competition held by Kaggle. It is based on the Avito data set, whose size is 123 GB and which can be accessed from Kaggle. I have done data pre-processing, feature engineering, feature extraction, data visualization, machine learning, stacking, and boosting.
Car_Evaluation
Evaluating a car based on some popular attributes, which could help people who do not have enough knowledge about cars make a decision when purchasing one.
Employee-Reviews
This project contains data visualization, EDA, and machine learning modelling for checking sentiments.
Dutch-Energy
This is a heavy data set consisting of many sub-parts. It is based on electricity and gas usage in the Netherlands. It can be approached with different machine learning and deep learning methods.
Breast-Cancer-Wisconsin
This data set is used to classify cells as benign or malignant using the descriptions of the cells given as columnar attributes. Visualizations and analysis are included for support.
Seeds-Dataset
One of the most basic repositories for understanding the basics and applying machine learning algorithms.
Percentage-of-Women-in-Bacelor-s-Degree
Analyzing the growth over the years in the share of women across the different departments and fields of bachelor's degrees offered in the USA, using data analysis and data visualizations.
Sanfrancisco-Crime-Dataset
This data set contains information about crime in San Francisco. We analyze the data and visualize it using folium maps for geographical understanding, which is also called geospatial mapping. This problem is the final assignment for Coursera and IBM's Data Visualization course.
Zoo-Dataset
One of the most basic repositories for understanding the basics and applying machine learning algorithms.
Titanic-Passenger-Survival-Prediction
Using classification techniques, data preprocessing, feature engineering, feature extraction, and classification algorithms from machine learning to predict which passengers survived the sinking of the Titanic.
WHO-Suicide-Statistics
Suicide statistics for all the countries in the world, using machine learning algorithms, data analysis, and data visualizations to find interesting patterns, solutions, and clues about suicides.
Pakistan-Suicide-Bombing-Dataset
Analyzing suicide bombing patterns and exploring some of the most tangled questions through good visualizations, with the help of machine learning and data science.
Predicting_Money_Spent_at_Resort
This is from an Analytics Vidhya hackathon sponsored by Club Mahindra. It is a regression problem in which accuracy matters the most, measured by the RMSE score. I used techniques such as stacking, ensembling, and boosting, along with transformations such as Box-Cox to reduce the skewness of the data.
Christiano-Ronaldo---Goal-Prediction-Top-40-
This is a problem I got during the ZS Data Science Challenge, an Interview Bit hiring challenge, where I secured 40th rank out of 10,000 students across India. The data set requires intensive cleaning and processing. I performed classification using a random forest classifier and tuned its hyperparameters to improve accuracy. I got very satisfactory accuracy from the model on both the training and testing sets.
Coursera-Reviews-Analysis
This is a natural language processing problem in which we have to determine the sentiments of the users who reviewed the course and then classify the reviews as positive or negative.
Spam-Detection
This is a basic natural language processing project based on sentiment analysis. There are two types of mail, spam and ham, and we have to classify them using machine learning and NLP algorithms to detect the spam.
Advanced-House-Price-Prediction
This is an advanced regression problem that requires advanced techniques in feature engineering, feature selection and extraction, modelling, model evaluation, and statistics.
Iris-Dataset
My first machine learning repository. It is based on classifying three different classes: setosa, virginica, and versicolor.
CareerCon-Robots-Need-Help
This is a data science and machine learning competition hosted by Kaggle in which we have to perform multi-class classification on the labels. A good amount of feature engineering, data preprocessing, data visualization, and modelling is done on the data set to get good accuracy using random forest and XGBoost classifiers.
Don-t-Overfit
This is from a Kaggle competition in which the training data set is very small and the testing data set is very large, and we have to avoid or reduce overfitting, one of the most common problems in predictive analytics.
Predicting-the-Trends-of-Qaulity-Oriented-Jobs
Aerial-Cactus-Identification
This is an image processing challenge in which we have to identify aerial cactus using deep learning techniques. I have used the fastai library to make this work easier.
Bitcoin-Price-Prediction
Movie-Recommendation-System
In this movie data set I have used my basic knowledge of recommendation systems. I have used the concept of a content-based recommendation system to recommend, given that a user watched a particular movie, which other movies he or she is likely to like or watch.
Black-Friday
This time I am doing it in R. Let's see the results. The solution includes EDA (exploratory data analysis), data visualizations, modelling with machine learning models such as XGBoost and AdaBoost, and checking performance with the RMSE metric to compare the results.
Text-Clustering
This is a very different task: here I cluster 200 different texts related to games and sports into 2 or more clusters. We can also use a Zipf plot to determine how many useful clusters can be formed.
Predicting-Tariff-Rates
Genetic-Algorithm-for-solving-an-Equation
I have used genetic algorithms for solving equations. This is done using Python.
Career-Village
Applying analytics, machine learning, and deep learning to answer some of the most intriguing questions from the company.
Analyzing-Crimes-in-Indian-States
Anime-Recommendations
V2-Plant-Seedlings-Classification
It has 12 classes for twelve different types of crops. The data set is a zip file containing twelve folders, one per plant, with their pictures. It is an image classification problem, which can be solved with deep learning models such as CNNs (convolutional neural networks).
Web-Logging-Data
This is a data set related to web logging, with attributes such as hit rate, visit date, exit rate, bounce rate, no. of imp. pages, etc. Many data mining techniques can be applied to extract better information from it. I have applied clustering and classification and also created a report, since model explanation is very important for real-life problems.
Predicting-the-Likelihood-of-E-Signing-a-Loan-Based-on-Financial-History
This is a simple starter data science problem solved using machine learning concepts such as classification and clustering. In this data set we have to predict the likelihood of a loan being e-signed based on financial history.
Udacity-Bike-Share-Data-Analysis
This was literally the first project I ever made. It is from Udacity's Python Nanodegree program. The data set consists of bike sharing data for big cities in the United States. As it was my first project, I have done a very basic solution for it.
Europe-Political-Data-Analysis
This is Europe's political data, which consists of information about life expectancy, pollution, population, unemployment, work hours, weather, trust in police, trust in legal authorities, income, GDP, leisure satisfaction, trust in politics, environment satisfaction, low savings, and crime for all the countries in Europe. I am going to compare these political situations and the sentiments of people living in different parts of Europe using data analytics and data visualization.
Immigrants-all-over-the-World
In this data set we try to visualize different aspects of immigration to Canada and all over the world. I have tried to make effective, ad hoc visualizations to answer some of the intriguing questions, using advanced visualization techniques.
Directing-Customers-through-App-Behaviour-Analysis
This is a business case problem used in data science consulting and engineering. Data science helps us make important business decisions that lead to the development of the organization, and helps avoid disasters caused by gut-feeling decisions. In this data set I have predicted the behaviour of customers from their app usage in order to formulate different policies and rules for different sets of customers.
Board-Game-Review-Prediction
Reviews can make or break a product; as a result, many companies take drastic measures to ensure that their product receives good reviews. When it comes to board games, reviews and word-of-mouth are everything. In this project, we will be using a linear regression model to predict the average review a board game will receive based on characteristics such as minimum and maximum number of players, playing time, complexity, etc. Before we get started, we will need to clone a GitH
C-Basic-Tutorials
This is a repository of basic C++ programs for placements.
Feature-Matching-Using-SIFT-and-SURF
Crops-and-Productions-Analysis
Steel-Defect-Detection
Tourist-Places-and-Visitors-Analysis
Analyzing-the-Dimensions-of-Poverty
The data set is taken from Kaggle and describes multidimensional poverty measures around the globe. In this analysis I have used Pandas, seaborn, and ipywidgets for the end-to-end analysis of the subject.
Data-Visualization-Coursera-Assignments
This repository was made for Coursera assignments and tutorials, and includes many interesting plots such as waffle charts, folium charts, choropleth charts, etc.
Understanding-PCA
Principal component analysis (PCA) is one of the most popular dimensionality reduction algorithms used in machine learning, and it is an unsupervised learning method. It is also used for feature extraction, where most of the information in all the existing attributes is captured in just 3-4 attributes using the concepts of eigenvalues and eigenvectors.
Fashion-Class-Classification
Fashion-MNIST is a dataset of Zalando's article images, consisting of a training set of 60,000 examples and a test set of 10,000 examples. Each example is a 28x28 grayscale image, associated with a label from 10 classes. Zalando intends Fashion-MNIST to serve as a direct drop-in replacement for the original MNIST dataset for benchmarking machine learning algorithms. It shares the same image size and structure of training and testing splits.
Object-Detection
Data-Structure-Algorithms
This is a repository for basic data structures and algorithms. Algorithms such as bubble sort and merge sort are covered.
Digit-Recognition-
Power-BI
I am making this repository to post all my projects done using Power BI. Power BI is an analytics tool that helps to modify data and visualize data in an interactive and easy way.
Introduction-to-HighCharter-Visualizations
High Charter is a premium package available for the R programming language. It is an expensive, paid package and cannot be used for commercial or government purposes without payment. What makes it special is the custom design of the plots and the endless options for different plots. There are more than 100 different types of plots available in High Charter. It also supports Markdown.
Employee-Attrition-Rate
This data set consists of information about employees. There are attributes such as education level, experience level, age, salary, gender, department, degree, ratings, work ethics, current company working experience, job level, job role, attrition rate, employee id, employee satisfaction, etc., which can be used to make important decisions for the company.
FiveThirtyEight-Comics-Marvel-and-DC-
A comparison between Marvel and DC in terms of their characters' popularity, gender, hair color, eye color, character alignment, appearances, launch day, names, etc. I have used Seaborn, matplotlib, networkx, and plotly to create interactive plots.
Text-Classification
This is a project assignment in which I learned to group different texts using clustering techniques. Both natural language processing and clustering are used. I have used K-means clustering to implement the solution.
Image-Super-Resolution
Image super resolution is one of the most intriguing projects in deep learning, and it is done with a deep learning architecture called the Super-Resolution Convolutional Neural Network (SRCNN). Using image super resolution we can convert low-resolution images into high-resolution ones, which can be really helpful in domains where clarity and high definition are required.
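As a rough illustration of the SRCNN architecture named in the Image-Super-Resolution entry, the sketch below assumes the commonly cited 9-1-5 layout: three convolution layers with 64 and 32 filters, applied without padding to a single-channel image that has already been upscaled. The repository's actual model may differ, and the names `SRCNN_LAYERS`, `conv_weights`, `valid_output_size`, and `describe` are hypothetical. The code only computes layer output sizes and weight counts; it does not train or run a network.

```python
# Assumed 9-1-5 SRCNN layout (not taken from the repository).
# Each entry: (layer name, kernel size, input channels, output channels)
SRCNN_LAYERS = [
    ("patch_extraction", 9, 1, 64),
    ("nonlinear_mapping", 1, 64, 32),
    ("reconstruction", 5, 32, 1),
]


def conv_weights(kernel, c_in, c_out):
    """Number of weights in a square convolution layer, excluding biases."""
    return kernel * kernel * c_in * c_out


def valid_output_size(size, kernel):
    """Side length after an unpadded, stride-1 convolution."""
    return size - kernel + 1


def describe(input_size):
    """Return per-layer (name, output side length, weights) and the total weights."""
    size = input_size
    total = 0
    rows = []
    for name, kernel, c_in, c_out in SRCNN_LAYERS:
        size = valid_output_size(size, kernel)
        weights = conv_weights(kernel, c_in, c_out)
        total += weights
        rows.append((name, size, weights))
    return rows, total
```

Calling `describe(33)` walks a 33x33 patch through the three layers. Under these assumptions the patch shrinks at each unpadded layer, and nearly all of the weights sit in the first two layers, which is why this kind of network stays small compared with most image classifiers.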