Document-Forgery-Detection
Document-Forgery-Detection

Speech-to-Text
Flask application that takes speech as input and returns text as output

AdverAttack-on-Text
Adversarial attack on text data using the nlpaug library

COVID19-World-Analysis

Titanic-ML-Disaster-Prediction
Titanic Data Science Solutions

This notebook is the solution to the Kaggle competition Titanic: Machine Learning from Disaster.

Workflow stages

The competition solution workflow goes through seven stages described in the Data Science Solutions book.

1. Question or problem definition.
2. Acquire training and testing data.
3. Wrangle, prepare, cleanse the data.
4. Analyze, identify patterns, and explore the data.
5. Model, predict, and solve the problem.
6. Visualize, report, and present the problem-solving steps and final solution.
7. Submit the results.

Problem Statement

Competition sites like Kaggle define the problem to solve or questions to ask while providing the datasets for training your data science model and testing the model results against a test dataset. The question or problem definition for the Titanic Survival competition is described on Kaggle.

Workflow goals

The data science solutions workflow solves for seven major goals.

Classifying. We may want to classify or categorize our samples. We may also want to understand the implications or correlation of different classes with our solution goal.

Correlating. One can approach the problem based on the available features within the training dataset. Which features within the dataset contribute significantly to our solution goal? Statistically speaking, is there a correlation between a feature and the solution goal? As the feature values change, does the solution state change as well, and vice versa? This can be tested for both numerical and categorical features in the given dataset. We may also want to determine correlation among features other than survival for subsequent goals and workflow stages. Correlating certain features may help in creating, completing, or correcting features.

Converting. For the modeling stage, one needs to prepare the data. Depending on the choice of model algorithm, one may require all features to be converted to numerical equivalent values, for instance by converting text categorical values to numeric values.

Completing. Data preparation may also require us to estimate any missing values within a feature. Model algorithms may work best when there are no missing values.

Correcting. We may also analyze the given training dataset for errors or possibly inaccurate values within features and try to correct these values or exclude the samples containing the errors. One way to do this is to detect any outliers among our samples or features. We may also completely discard a feature if it is not contributing to the analysis or may significantly skew the results.

Creating. Can we create new features based on an existing feature or a set of features, such that the new feature follows the correlation, conversion, and completeness goals?

Charting. How to select the right visualization plots and charts depending on the nature of the data and the solution goals.

Covid19_Analysis

Data-Cleaning
Basic data cleaning tasks required on scraped or raw datasets

PYTHON-DATA-MINING
Minor project data

TopicModeling_gensim
Topic modeling of a headline dataset using Gensim. LDA is used to evaluate performance.

Frequency-Distribution
Frequency distribution of words in a text file, using Counter and Yellowbrick in Python

Sentence_Similarity
Uniqueness and analysis of sentences (log loss is used for performance evaluation)

FineTune-Phi-2-LLM-using-PEFT-QLora
Fine-tuning Large Language Models (LLMs) is a crucial step in adapting these powerful models to specific tasks or domains. In this seminar code tutorial, we will explore how to perform fine-tuning using QLoRA (Quantized LoRA), a memory-efficient iteration of LoRA (Low-Rank Adaptation), for parameter-efficient fine-tuning.

Remove-floats-from-String
Removing floats from strings in a text file

shivitg.github.io
Personal website: shivitg.github.io

Scraping_Website

Terminology_Extraction
Terminology Extraction NLP

Mysql_Python
Creating a database and storing information in MySQL using Python

Signature_Detection_Analysis
Authentication of handwritten signatures using digital image processing and neural networks.

Multi-text-classification
Multi-Class Text Classification (Model Comparison & Selection)

NLP-through-NLTK
Graphing Live Twitter Sentiment Analysis with NLTK.

Movie-Genre-Classification
A natural language processing based approach to classify movies into genres based on their Wikipedia text.

SEMANTIC-ANALYSIS
Semantic analysis of tweets using bag-of-words (BoW) and TF-IDF feature extraction; logistic regression is used for prediction

Predict_News
ScantaXPPLM
Web demo for Personality Engine using Uber AI | PPLM

AirBnb-NYC-Listing
New York City Airbnb Open Data | Performing data wrangling, analysis, visualization, regression, classification, and hypothesis testing

Paraphrasing
Paraphrasing using Google Trans

WEB-CRAWLER
Web scraping of a news website and performing EDA on it