Discover MMuttalib1326/Handling-Class-Imbalance Open Source project

Stars
1
Language
Jupyter Notebook
Created over 1 year ago
Updated about 1 year ago

MMuttalib1326

Olympic-Games-Data-Analysis

An analysis of Olympic Games data using Python. The modern Olympic Games, or Olympics, are leading international sports events featuring summer and winter competitions in which thousands of athletes from around the world participate. The Olympic Games are considered the world’s foremost sports competition, with more than 200 nations participating. The total number of events in the Olympics is 339 in 33 sports, and every event has winners, so a large amount of data is generated. This project analyzes that data with Python. Modules used: Pandas, for analyzing the data; NumPy, a general-purpose array-processing package; Matplotlib, a plotting library that builds on NumPy; seaborn, for statistical graphics plotting in Python.

Jupyter Notebook

K-Fold-Cross-Validation

What is K-fold in cross-validation? K-fold cross-validation splits the dataset into K folds and uses them to evaluate how well the model performs on new data. K is the number of groups the data sample is split into. For example, with k = 5 the procedure is called 5-fold cross-validation.

Jupyter Notebook
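
The fold splitting described above can be sketched in plain Python. This is a minimal illustration, not the repository's code; the function name `k_fold_indices` and the sample sizes are assumptions made for the example.

```python
def k_fold_indices(n_samples, k):
    """Split range(n_samples) into k (train, test) index pairs.

    Hypothetical helper for illustration; no shuffling is applied.
    """
    indices = list(range(n_samples))
    # Spread any remainder so fold sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    folds, start = [], 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        folds.append((train, test))
        start += size
    return folds


folds = k_fold_indices(n_samples=10, k=5)
for train, test in folds:
    print("test:", test, "train:", train)
```

Each sample lands in exactly one test fold, so every data point is used for evaluation once and for training k - 1 times.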

NYC-Taxi-Trip-Duration-Prediction

The task is to build a model that predicts the total ride duration of taxi trips in New York City. The primary dataset is the one released by the NYC Taxi and Limousine Commission, which includes pickup time, geo-coordinates, number of passengers, and many other variables.

Jupyter Notebook

Punjabi-word-Detection

L1-and-L2-Regularization-Lasso-Ridge-Regression

Jupyter Notebook

.Punjabi-word-Detection-Yolov7-

Jupyter Notebook

Logistic-Regression-Practical-Implementation

Jupyter Notebook

-Machine-Learning-Intern-Hacklab-Solutions-

Jupyter Notebook

pandas_dataframes

This repository describes fundamental methods with the pandas Python library for data science.

Jupyter Notebook

bias---variance-tradeoff

In statistics and machine learning, the bias–variance tradeoff is the property of a model that the variance of the parameter estimated across samples can be reduced by increasing the bias in the estimated parameters.

Jupyter Notebook

Car-Price-Prediction

In this project, we learn linear regression and the practical challenges of implementing it for a business problem.

Jupyter Notebook

IPL-Win-Probability-Predictor-Project

Jupyter Notebook

Machine-Learning-Projects

We are required to model the price of cars with the available independent variables. It will be used by the management to understand how exactly the prices vary with the independent variables. They can accordingly manipulate the design of the cars, the business strategy etc. to meet certain price levels.

Jupyter Notebook

Industrial-Equipments-Detection-Yolov8-on-Custom-Dataset-and-deploy-it-on-Hugging-Face

The objective of this project is to build an accurate and efficient computer vision model capable of detecting industrial equipment in images.

Jupyter Notebook

1st-and-Future---Player-Contact-Detection-Detect-Player-Contacts-from-Sensor-and-Video-Data

The goal of this competition is to detect external contact experienced by players during an NFL football game. You will use video and player tracking data to identify moments with contact to help improve player safety.

Jupyter Notebook

Stacking-and-blending

What is stacking and blending? Most commonly, blending describes the specific application of stacking where the meta-model is trained on the predictions made by base models on a hold-out validation dataset. In this context, stacking is reserved for a meta-model that is trained on out-of-fold predictions during a cross-validation procedure.

Jupyter Notebook

K-mean

What is meant by K means clustering? K-means clustering aims to partition data into k clusters in a way that data points in the same cluster are similar and data points in the different clusters are farther apart. Similarity of two points is determined by the distance between them. There are many methods to measure the distance.

Jupyter Notebook
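
As a rough sketch of the idea, the loop below runs k-means on one-dimensional points, using absolute difference as the distance. The function name `k_means_1d`, the sample points, and the starting centroids are all assumptions chosen for illustration.

```python
def k_means_1d(points, centroids, n_iterations=10):
    """Alternate between assigning points and recomputing centroids."""
    for _ in range(n_iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            # Assign each point to the nearest centroid.
            nearest = min(range(len(centroids)),
                          key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Move each centroid to the mean of its cluster;
        # keep the old centroid if a cluster ends up empty.
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, clusters


points = [1.0, 1.5, 2.0, 9.0, 10.0, 11.0]
centroids, clusters = k_means_1d(points, centroids=[0.0, 5.0])
print(centroids, clusters)
```

A different distance measure can be swapped in by changing the `key` function in the assignment step.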

Ensembles-of-Decision-Trees-Implementation-

Ensemble learning helps improve machine learning results by combining several models. This approach allows the production of better predictive performance compared to a single model.

Jupyter Notebook

Model-explainability

Explainability is how to take an ML model and explain the behavior in human terms. With complex models (for example, black boxes), you cannot fully understand how and why the inner mechanics impact the prediction.

Jupyter Notebook

Quora-Question-Pairs-Similarity

Quora Question Pairs Similarity problem. In this project, I deal with the task of pairing up duplicate questions from Quora. More formally, the problem statement is: identify which questions asked on Quora are duplicates of questions that have already been asked. This could be useful for instantly providing answers to questions that have already been answered. We are tasked with predicting whether a pair of questions are duplicates or not.

Jupyter Notebook

Gradient-boosting

What is gradient boosting regression in machine learning? Gradient boosting regression calculates the difference between the current prediction and the known correct target value. This difference is called the residual. Gradient boosting regression then trains a weak model that maps features to that residual.

Jupyter Notebook
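
The residual-fitting loop can be sketched with one-split regression stumps on a single feature. This is a simplified illustration under stated assumptions: the helper names `fit_stump` and `gradient_boost`, the learning rate, and the toy data are hypothetical, and squared-error loss is assumed.

```python
def fit_stump(xs, residuals):
    """Fit a one-split regression stump to the residuals (least squares)."""
    best = None
    # Candidate thresholds exclude the largest x so both sides are non-empty.
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean


def gradient_boost(xs, ys, n_rounds=20, learning_rate=0.5):
    """Start from the mean, then repeatedly fit a stump to the residuals."""
    base = sum(ys) / len(ys)
    stumps = []
    predictions = [base] * len(xs)
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, predictions)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        predictions = [p + learning_rate * stump(x)
                       for p, x in zip(predictions, xs)]
    return lambda x: base + learning_rate * sum(s(x) for s in stumps)


xs = [1, 2, 3, 4, 5, 6]
ys = [1.0, 1.2, 3.1, 3.0, 5.2, 5.1]
model = gradient_boost(xs, ys)

mean_y = sum(ys) / len(ys)
baseline_mse = sum((y - mean_y) ** 2 for y in ys) / len(ys)
boosted_mse = sum((y - model(x)) ** 2 for x, y in zip(xs, ys)) / len(ys)
print(baseline_mse, boosted_mse)
```

Each round shrinks the training error a little, which is why many weak models combined can fit the data well.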

AdaBoost-Algorithm

What is the AdaBoost algorithm? AdaBoost, also called Adaptive Boosting, is a machine learning technique used as an ensemble method. The most common base learner used with AdaBoost is a decision tree with one level, that is, a decision tree with only one split. These trees are also called decision stumps.

Jupyter Notebook

Topic-Modeling

Topic modeling is a machine learning technique that automatically analyzes text data to determine cluster words for a set of documents. This is known as 'unsupervised' machine learning because it doesn't require a predefined list of tags or training data that's been previously classified by humans

Jupyter Notebook

Logistic-regression-implementation

Logistic regression estimates the probability of an event occurring, such as voted or didn't vote, based on a given dataset of independent variables. Since the outcome is a probability, the dependent variable is bounded between 0 and 1.

Jupyter Notebook
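
A minimal sketch of how a logistic model turns a linear score into a probability, using the sigmoid function. The weights, bias, and feature values below are made-up numbers for illustration only.

```python
import math


def sigmoid(z):
    """Map any real number to a value between 0 and 1 (numerically stable)."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_probability(features, weights, bias):
    """Probability of the positive class for one sample."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)


# Hypothetical coefficients, not fitted to any real data.
p = predict_probability(features=[2.0, 1.0], weights=[0.8, -0.4], bias=-0.5)
print(p)
```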

L1-and-L2-Regularization

What is L1 and L2 regularization? L1 Regularization, also called a lasso regression, adds the “absolute value of magnitude” of the coefficient as a penalty term to the loss function. L2 Regularization, also called a ridge regression, adds the “squared magnitude” of the coefficient as the penalty term to the loss function.

Jupyter Notebook
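
The two penalty terms can be written out directly. In this sketch, `lam` is an assumed name for the regularization strength and the coefficient values are arbitrary.

```python
def l1_penalty(coefficients, lam):
    """Lasso penalty: lam times the sum of absolute coefficient values."""
    return lam * sum(abs(c) for c in coefficients)


def l2_penalty(coefficients, lam):
    """Ridge penalty: lam times the sum of squared coefficient values."""
    return lam * sum(c * c for c in coefficients)


coefficients = [0.5, -2.0, 0.0, 3.0]
print(l1_penalty(coefficients, lam=0.1))
print(l2_penalty(coefficients, lam=0.1))
```

Either term is added to the model's loss, so larger coefficients cost more; the squared penalty grows faster for large coefficients than the absolute-value penalty does.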

Linear-Regression-implementation-

The linear regression algorithm models a linear relationship between a dependent variable (y) and one or more independent variables (x), hence the name linear regression. Because the relationship is linear, the model describes how the value of the dependent variable changes with the value of the independent variable.

Jupyter Notebook

Binary-Classification

This is the assignment solution for a data science role at a company. I attempted a binary classification problem on the given data, including feature selection and training (with validation), and presented the predictions.

Jupyter Notebook

K-Nearest-Neighbors

The k-nearest neighbors algorithm, also known as KNN or k-NN, is a non-parametric, supervised learning classifier, which uses proximity to make classifications or predictions about the grouping of an individual data point.

Jupyter Notebook
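
The proximity-based voting can be sketched as follows. The function name `knn_predict`, the Euclidean distance choice, and the toy points and labels are assumptions for this example.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    neighbors = sorted(
        zip(train_points, train_labels),
        key=lambda pair: math.dist(pair[0], query),  # Euclidean distance
    )[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
labels = ["a", "a", "a", "b", "b", "b"]
print(knn_predict(points, labels, query=(2, 2)))
print(knn_predict(points, labels, query=(7, 8)))
```

There is no training step beyond storing the data, which is what makes the method non-parametric.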

Naive-Bayes-classifier

What is naive in the Naive Bayes classifier? Naive Bayes is called naive because it assumes that the input variables are independent of one another given the class. This is a strong assumption and unrealistic for real data; however, the technique is very effective on a large range of complex problems.

Jupyter Notebook

Decision-tree-implementation

A decision tree is a non-parametric supervised learning algorithm, which is utilized for both classification and regression tasks. It has a hierarchical, tree structure, which consists of a root node, branches, internal nodes and leaf nodes.

Jupyter Notebook

Support-Vector-Machines

SVM or Support Vector Machine is a linear model for classification and regression problems. It can solve linear and non-linear problems and work well for many practical problems. The idea of SVM is simple: The algorithm creates a line or a hyperplane which separates the data into classes.

Jupyter Notebook

JalanTechnologiesAssignment

Python

face-detection

Mini Project(Python)

Python

Human-Voice-to-Text-and-Deploy-it-on-Hugging-Face

This project focuses on developing a Human-Voice-to-Text system using speech recognition technology and deploying it on the Hugging Face platform

Python

Logistic-Regression-.

Jupyter Notebook

Data-Visualization-Dashboard-Blackcofer-

test123

Kaggle-Repository

portfolio

TensorGo-Technologies

Jupyter Notebook

Implementation-Of-Tree

Insertion, inorder, preorder, and level-order traversal, searching, deletion, etc.

Python

Coronavirus-Tweet-Sentiment-Analysis

This challenge asks you to build a classification model to predict the sentiment of COVID-19 tweets.

Jupyter Notebook

General-Modelling-Technique

Jupyter Notebook

Create-LinkedList-in-python

Python programming

Industrial-Equipments-Detection-Yolov8

Datasets of Industrial Equipments

Text-Summarization

Jupyter Notebook

portfolio-test

HTML

assignment-Leadzen.ai

JavaScript

Queue

Implementation of queue

Python

Logistic-Regression-

Types-of-gradient-descent

Jupyter Notebook

Taiyo-Machine-Learning-NLP-Modeling-

Jupyter Notebook

Netflix-Movies-and-TV-Shows-Clustering

In this project, we worked on a text clustering problem wherein we had to classify/group the Netflix shows into certain clusters such that the shows within a cluster are similar to each other and the shows in different clusters are dissimilar to each other.

Jupyter Notebook

Elastic-net-regression

Elastic net is a penalized linear regression model that includes both the L1 and L2 penalties during training. Using the terminology from “The Elements of Statistical Learning,” a hyperparameter “alpha” is provided to assign how much weight is given to each of the L1 and L2 penalties.

Jupyter Notebook

Stack-using-list-maxsize-

Python

Stack-using-linkedlist

Python

-Loan-Status-Prediction-using-Machine-Learning-with-Python

Machine Learning Project

Jupyter Notebook

Introduction-to-NLP

Natural Language Processing (NLP) is a subfield of artificial intelligence (AI). It helps machines process and understand the human language so that they can automatically perform repetitive tasks.

Jupyter Notebook

LinkedList2

Write code to remove duplicates from an unsorted linked list.

Python
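
One possible approach to this exercise is a single pass with a set of values already seen. This is a sketch, not the repository's solution; the `Node` class and the helper names are assumptions.

```python
class Node:
    """Singly linked list node (hypothetical minimal definition)."""

    def __init__(self, value, next=None):
        self.value = value
        self.next = next


def remove_duplicates(head):
    """Drop later occurrences of repeated values, keeping the first."""
    seen = set()
    previous, current = None, head
    while current is not None:
        if current.value in seen:
            previous.next = current.next  # unlink the duplicate node
        else:
            seen.add(current.value)
            previous = current
        current = current.next
    return head


def from_list(values):
    """Build a linked list from a Python list."""
    head = None
    for value in reversed(values):
        head = Node(value, head)
    return head


def to_list(head):
    """Collect linked list values into a Python list."""
    values = []
    while head is not None:
        values.append(head.value)
        head = head.next
    return values


print(to_list(remove_duplicates(from_list([3, 1, 3, 2, 1]))))
```

The set makes the pass linear in the list length at the cost of extra memory; comparing every pair of nodes would avoid the set but take quadratic time.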

Anomaly-detection

Anomaly detection is the process of identifying unexpected items or events in data sets that differ from the norm. It is often applied to unlabeled data, which is known as unsupervised anomaly detection.

Jupyter Notebook

Principal-component-analysis

Principal component analysis, or PCA, is a dimensionality reduction method that is often used to reduce the dimensionality of large data sets, by transforming a large set of variables into a smaller one that still contains most of the information in the large set.

Jupyter Notebook

Neural-Network

A neural network is a series of algorithms that endeavors to recognize underlying relationships in a set of data through a process that mimics the way the human brain operates. In this sense, neural networks refer to systems of neurons, either organic or artificial in nature.

Jupyter Notebook

Polynomial-regression

What is polynomial regression in machine learning? Polynomial regression, like linear regression, uses the relationship between the variables x and y to find the best way to draw a line through the data points.

Jupyter Notebook

Gradient-descent

Gradient descent is an optimization algorithm that is commonly used to train machine learning models and neural networks. Training data helps these models learn over time, and the cost function within gradient descent acts as a barometer, gauging the model's accuracy with each iteration of parameter updates.

HTML
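
The update rule can be illustrated on a one-variable cost function. The cost f(x) = (x - 3)^2, the learning rate, and the step count below are assumptions picked for the example.

```python
def gradient_descent(gradient, start, learning_rate=0.1, n_steps=100):
    """Repeatedly step against the gradient to reduce the cost."""
    x = start
    for _ in range(n_steps):
        x -= learning_rate * gradient(x)
    return x


# Cost f(x) = (x - 3)^2 has gradient 2 * (x - 3) and its minimum at x = 3.
minimum = gradient_descent(lambda x: 2 * (x - 3), start=0.0)
print(minimum)
```

A learning rate that is too large can overshoot the minimum, while one that is too small makes progress slow.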

lasso-regression

What is lasso regression used for? The goal of lasso regression is to obtain the subset of predictors that minimizes prediction error for a quantitative response variable. The lasso does this by imposing a constraint on the model parameters that causes regression coefficients for some variables to shrink toward zero.

Jupyter Notebook

Time-series

A time series is a data set that tracks a sample over time. In particular, a time series allows one to see what factors influence certain variables from period to period. Time series analysis can be useful to see how a given asset, security, or economic variable changes over time.

Jupyter Notebook

Time-Series-Krish-Naik-

Jupyter Notebook

Hierarchical-Clustering

What is meant by hierarchical clustering? Hierarchical clustering is a popular method for grouping objects. It creates groups so that objects within a group are similar to each other and different from objects in other groups. Clusters are visually represented in a hierarchical tree called a dendrogram.

Jupyter Notebook

k-means-clustering

What is k-means clustering used for? The K-means clustering algorithm is used to find groups which have not been explicitly labeled in the data. This can be used to confirm business assumptions about what types of groups exist or to identify unknown groups in complex data sets.

Jupyter Notebook

classification-matrix

The classification matrix is a standard tool for evaluation of statistical models and is sometimes referred to as a confusion matrix. A classification matrix is an important tool for assessing the results of prediction because it makes it easy to understand and account for the effects of wrong predictions

Jupyter Notebook
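
For a binary problem, the matrix is just a count of each (actual, predicted) pair. The sketch below uses made-up labels, and `confusion_matrix` here is a hypothetical helper, not a library function.

```python
from collections import Counter


def confusion_matrix(actual, predicted):
    """Count how often each (actual, predicted) label pair occurs."""
    return Counter(zip(actual, predicted))


actual = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 0, 1, 1, 0]
matrix = confusion_matrix(actual, predicted)

true_positives = matrix[(1, 1)]
true_negatives = matrix[(0, 0)]
false_positives = matrix[(0, 1)]
false_negatives = matrix[(1, 0)]
accuracy = (true_positives + true_negatives) / len(actual)
print(matrix, accuracy)
```

Separating false positives from false negatives shows which kind of wrong prediction the model makes, which a single accuracy figure hides.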

Working-with-CSV-files

CSV (Comma Separated Values) is a simple file format used to store tabular data, such as a spreadsheet or database. A CSV file stores tabular data (numbers and text) in plain text. Each line of the file is a data record. Each record consists of one or more fields, separated by commas. The use of the comma as a field separator is the source of the name for this file format. For working with CSV files in Python, there is a built-in module called csv.

Jupyter Notebook
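
A short example of reading records with the standard-library `csv` module. The in-memory text and its column names are invented for illustration; a real file would be opened with `open(path, newline="")` instead of `io.StringIO`.

```python
import csv
import io

# Stand-in for a file on disk; the header row names the fields.
text = "name,score\nalice,90\nbob,85\n"

rows = list(csv.DictReader(io.StringIO(text)))
for row in rows:
    # Every field is read as a string, so convert numbers explicitly.
    print(row["name"], int(row["score"]))
```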

.Movie-Recommendaton-System-using-Machine-Learning

Built a content-based movie recommender system using cosine similarity, where the recommendations are based on item metadata (for example, movies, products, or songs). The idea is that when a user likes an item, similar items are then recommended.

Jupyter Notebook
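
Cosine similarity between metadata vectors can be sketched as below. The titles and the three-element feature vectors are hypothetical; a real system would derive the vectors from the item metadata.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat an all-zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def most_similar(title, vectors):
    """Return the other title whose vector is closest to `title`'s."""
    others = [t for t in vectors if t != title]
    return max(others,
               key=lambda t: cosine_similarity(vectors[title], vectors[t]))


# Made-up metadata vectors (for example, genre flags).
vectors = {
    "action_1": [1, 0, 1],
    "action_2": [1, 0, 0],
    "romance_1": [0, 1, 0],
}
print(most_similar("action_1", vectors))
```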

Logistic-regression

Jupyter Notebook

regularised-linear-models

Regularization is a technique in machine learning that tries to achieve the generalization of the model. It means that our model works well not only with training or test data, but also with the data it'll receive in the future

Jupyter Notebook

random-forest

Random forests or random decision forests is an ensemble learning method for classification, regression and other tasks that operates by constructing a multitude of decision trees at training time. For classification tasks, the output of the random forest is the class selected by most trees.

Jupyter Notebook

Recommender-Systems---Collaborative-Filtering

Collaborative filtering makes automatic predictions (filtering) about the interests of a user by collecting preferences or taste information from many users (collaborating).

Jupyter Notebook
