• Stars: 1
• Language: Jupyter Notebook
• Created over 3 years ago
• Updated over 3 years ago

Repository Details

Multi-Class Text Classification (Model Comparison & Selection)

More Repositories

1

Document-Forgery-Detection

Document-Forgery-Detection
Python
9
star
2

Speech-to-Text

Flask application that takes speech as input and returns text as output
HTML
3
star
3

AdverAttack-on-Text

Adversarial attacks on text data using the nlpaug library
Jupyter Notebook
3
star
4

COVID19-World-Analysis

Jupyter Notebook
3
star
5

Titanic-ML-Disaster-Prediction

Titanic Data Science Solutions. This notebook is a solution to Titanic: Machine Learning from Disaster.
Workflow stages. The competition solution workflow goes through seven stages described in the Data Science Solutions book: (1) Question or problem definition. (2) Acquire training and testing data. (3) Wrangle, prepare, and cleanse the data. (4) Analyze, identify patterns, and explore the data. (5) Model, predict, and solve the problem. (6) Visualize, report, and present the problem-solving steps and final solution. (7) Submit the results.
Problem statement. Competition sites like Kaggle define the problem to solve or the questions to ask, and provide datasets for training a data science model and for testing the model's results against a test dataset. The question or problem definition for the Titanic survival competition is described on Kaggle.
Workflow goals. The data science solutions workflow solves for seven major goals.
Classifying. We may want to classify or categorize our samples. We may also want to understand the implications or correlation of different classes with our solution goal.
Correlating. One can approach the problem based on the features available in the training dataset. Which features contribute significantly to our solution goal? Statistically speaking, is there a correlation between a feature and the solution goal? As the feature values change, does the solution state change as well, and vice versa? This can be tested for both numerical and categorical features in the given dataset. We may also want to determine correlations among features other than survival for subsequent goals and workflow stages. Correlating certain features may help in creating, completing, or correcting features.
Converting. The modeling stage requires prepared data. Depending on the choice of model algorithm, all features may need to be converted to numerical equivalents, for instance by converting text categorical values to numeric values.
Completing. Data preparation may also require us to estimate any missing values within a feature. Model algorithms may work best when there are no missing values.
Correcting. We may also analyze the given training dataset for errors or possibly inaccurate values within features, and try to correct these values or exclude the samples containing the errors. One way to do this is to detect outliers among our samples or features. We may also discard a feature completely if it does not contribute to the analysis or may significantly skew the results.
Creating. Can we create new features based on an existing feature or a set of features, such that the new feature follows the correlation, conversion, and completeness goals?
Charting. How do we select the right visualization plots and charts depending on the nature of the data and the solution goals?
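The converting, completing, and correcting goals described above can be illustrated with a minimal sketch in plain Python. This is not the notebook's own code: the records, the field names (sex, age, fare), the helper names, and the choices of median imputation and an interquartile-range outlier rule are all assumptions made for illustration.

```python
from statistics import median, quantiles

# Hypothetical toy records; the field names resemble common Titanic columns,
# but the values are made up for illustration.
passengers = [
    {"sex": "male", "age": 22.0, "fare": 7.25},
    {"sex": "female", "age": 38.0, "fare": 71.28},
    {"sex": "female", "age": None, "fare": 7.92},
    {"sex": "male", "age": 35.0, "fare": 8.05},
    {"sex": "male", "age": 54.0, "fare": 512.33},
    {"sex": "female", "age": 27.0, "fare": 11.13},
]

SEX_CODES = {"male": 0, "female": 1}


def convert(rows):
    """Converting: map the text category 'sex' to a numeric code."""
    return [{**r, "sex": SEX_CODES[r["sex"]]} for r in rows]


def complete(rows, field):
    """Completing: fill missing values of `field` with the median of the known values."""
    fill = median(r[field] for r in rows if r[field] is not None)
    return [{**r, field: fill if r[field] is None else r[field]} for r in rows]


def correct(rows, field, k=1.5):
    """Correcting: drop rows whose `field` lies outside the k * IQR fences."""
    q1, _, q3 = quantiles([r[field] for r in rows], n=4)
    low, high = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return [r for r in rows if low <= r[field] <= high]


cleaned = correct(complete(convert(passengers), "age"), "fare")
```

With this toy data, the missing age is filled with the median of the known ages and the single very large fare is treated as an outlier and dropped. Whether to impute with a median or to discard outliers at all is a modeling decision that depends on the dataset.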
Jupyter Notebook
3
star
6

Data-augmentation-for-text

Sentiment Analysis and Data Augmentation
Jupyter Notebook
2
star
7

Covid19_Analysis

Jupyter Notebook
2
star
8

Data-Cleaning

Basic data cleaning tasks required on scraped or raw datasets
Jupyter Notebook
2
star
9

PYTHON-DATA-MINING

minor project data
Python
2
star
10

TopicModeling_gensim

Topic modeling of a headline dataset using Gensim. LDA is used to evaluate performance.
Jupyter Notebook
2
star
11

Frequency-Distribution

Frequency distribution of words in a text file, in Python, using Counter and Yellowbrick
Jupyter Notebook
2
star
12

Sentence_Similarity

Uniqueness and analysis of sentences (log loss is used for performance evaluation)
Jupyter Notebook
2
star
13

FineTune-Phi-2-LLM-using-PEFT-QLora

Fine-tuning Large Language Models (LLMs) is a crucial step in adapting these powerful models to specific tasks or domains. In this seminar code tutorial, we will explore how to perform fine-tuning using QLoRA (Quantized LoRA), a memory-efficient iteration of LoRA (Low-Rank Adaptation), for parameter-efficient fine-tuning.
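To make the "parameter-efficient" claim concrete, here is a rough back-of-the-envelope sketch, not code from the tutorial. It assumes a single square weight matrix with an illustrative hidden size of 2560 and a LoRA rank of 8; both numbers, and the function names, are assumptions. LoRA freezes the original matrix and trains two small factors, B (d_out x r) and A (r x d_in), while QLoRA additionally stores the frozen weights in 4-bit precision.

```python
def full_finetune_params(d_out, d_in):
    """Trainable parameters when the whole weight matrix is updated."""
    return d_out * d_in


def lora_params(d_out, d_in, rank):
    """Trainable parameters for LoRA factors B (d_out x rank) and A (rank x d_in)."""
    return rank * (d_out + d_in)


# Illustrative sizes (assumptions, not taken from the tutorial).
d_out = d_in = 2560
rank = 8

full = full_finetune_params(d_out, d_in)
lora = lora_params(d_out, d_in, rank)
ratio = lora / full

# Rough storage for the frozen matrix: 16-bit weights take 2 bytes each,
# 4-bit quantized weights take 0.5 bytes each (quantization metadata ignored).
frozen_bytes_fp16 = full * 2
frozen_bytes_4bit = full // 2

print(f"full: {full:,}  lora: {lora:,}  ratio: {ratio:.5f}")
print(f"frozen fp16: {frozen_bytes_fp16:,} B  frozen 4-bit: {frozen_bytes_4bit:,} B")
```

Under these assumed sizes, the LoRA factors hold well under 1% of the parameters of the full matrix, which is where the memory savings during training come from.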
Jupyter Notebook
2
star
14

Remove-floats-from-String

Removing floats from string of a text file
Jupyter Notebook
2
star
15

shivitg.github.io

Personal website: shivitg.github.io
CSS
1
star
16

Scraping_Website

Jupyter Notebook
1
star
17

Terminology_Extraction

Terminology Extraction NLP
Jupyter Notebook
1
star
18

Mysql_Python

Creating a database and storing information in MySQL using Python
Jupyter Notebook
1
star
19

Signature_Detection_Analysis

Authentication of handwritten signatures using digital image processing and neural networks.
Jupyter Notebook
1
star
20

NLP-through-NLTK

Graphing Live Twitter Sentiment Analysis with NLTK.
Python
1
star
21

Movie-Genre-Classification

A natural language processing approach to classifying movies into genres based on their Wikipedia text.
Jupyter Notebook
1
star
22

SEMANTIC-ANALYSIS

Semantic analysis of tweets using BoW and TF-IDF feature extraction; Logistic Regression is used for prediction
Jupyter Notebook
1
star
23

Predict_News

Jupyter Notebook
1
star
24

ScantaXPPLM

Web Demo for Personality Engine using Uber AI | PPLM
Python
1
star
25

AirBnb-NYC-Listing

New York City Airbnb Open Data | Performing Data Wrangling, Analysis, Visualization, Regression, Classification, Hypothesis Testing
Jupyter Notebook
1
star
26

Paraphrasing

Paraphrasing using Google Trans
Python
1
star
27

WEB-CRAWLER

Web scraping of a news website and performing EDA on it
1
star