Rohit Verma (@deadskull7)
  • Stars 495
  • Global Rank 57,055 (Top 2%)
  • Followers 165
  • Following 13
  • Registered about 7 years ago
  • Most used languages: MATLAB 3.7%, HTML 3.7%
  • Location 🇮🇳 India
  • Country Total Rank 1,359
  • Country Ranking: MATLAB 1,700

Top repositories

1. Pneumonia-Diagnosis-using-XRays-96-percent-Recall

Best score on Kaggle so far, better than the best score by a Kaggle team member. The project diagnoses pneumonia from X-ray images of a person's lungs using a self-built convolutional neural network and transfer learning via InceptionV3. The images were over 1,000 pixels per dimension and the dataset, tagged as large, took up more than 1 GB. My work includes a self-built network repeatedly tuned towards some of the best hyperparameters, using a variety of Keras utilities such as callbacks for learning-rate scheduling and checkpointing. Augmenting the image data could have improved the model further, but the Kaggle kernel ran short of RAM. Special care was taken with metrics beyond accuracy: precision, recall, and F1 score derived from the confusion matrix. The other part gives a brief introduction to transfer learning via InceptionV3: after loading the InceptionV3 weights, the network was tuned entirely rather than partially, reaching the highest accuracy achieved on Kaggle to date and an even higher precision than before. (A minimal fine-tuning sketch follows after this entry.)

Jupyter Notebook · 97 stars
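
A minimal sketch of the full InceptionV3 fine-tuning plus the Keras callbacks the description mentions. The input size, optimizer, learning rate, and the `train_gen`/`val_gen` generator names are assumptions, not taken from the notebook:

```python
# A hedged sketch of full InceptionV3 fine-tuning with Keras callbacks.
# Input size, optimizer, learning rate, and generators are assumptions.
from tensorflow.keras.applications import InceptionV3
from tensorflow.keras import layers, models, optimizers
from tensorflow.keras.callbacks import ReduceLROnPlateau, ModelCheckpoint

base = InceptionV3(weights="imagenet", include_top=False,
                   input_shape=(299, 299, 3), pooling="avg")
base.trainable = True  # tune the whole network, not just the new head

model = models.Sequential([
    base,
    layers.Dense(1, activation="sigmoid"),  # pneumonia vs. normal
])
model.compile(optimizer=optimizers.Adam(1e-4),
              loss="binary_crossentropy", metrics=["accuracy"])

callbacks = [
    ReduceLROnPlateau(monitor="val_loss", factor=0.3, patience=2),
    ModelCheckpoint("best_model.h5", save_best_only=True),
]
# model.fit(train_gen, validation_data=val_gen, epochs=10, callbacks=callbacks)
```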

2. Human-Activity-Recognition-with-Neural-Network-using-Gyroscopic-and-Accelerometer-variables

The validation accuracy is the best on Kaggle. An artificial neural network with a validation accuracy of 97.98% and a precision of 95% was trained to recognise, from a cellphone attached at the waist, the type of activity the user is performing. The dataset's description goes like this: the sensor signals (accelerometer and gyroscope) were pre-processed by applying noise filters and then sampled in fixed-width sliding windows of 2.56 s with 50% overlap (128 readings per window). The sensor acceleration signal, which has gravitational and body-motion components, was separated into body acceleration and gravity using a Butterworth low-pass filter. The gravitational force is assumed to have only low-frequency components, so a filter with a 0.3 Hz cutoff frequency was used. (A sketch of this filtering step follows after this entry.)

Jupyter Notebook · 79 stars
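
The gravity/body-acceleration split described above can be sketched with scipy's Butterworth filter. The 50 Hz sampling rate matches the UCI HAR dataset's documentation; the signal here is a synthetic stand-in:

```python
# Separate gravity from body acceleration with a Butterworth low-pass
# filter (0.3 Hz cutoff), as in the dataset's preprocessing description.
import numpy as np
from scipy.signal import butter, filtfilt

fs = 50.0          # sampling rate in Hz (UCI HAR uses 50 Hz)
cutoff = 0.3       # cutoff frequency in Hz
b, a = butter(N=3, Wn=cutoff / (fs / 2), btype="low")

acc = np.random.randn(1024)      # stand-in for one accelerometer axis
gravity = filtfilt(b, a, acc)    # low-frequency gravitational component
body_acc = acc - gravity         # remaining body-motion component
```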

3. One-Stop-for-COVID-19-Infection-and-Lung-Segmentation-plus-Classification

✋🏼🛑 This one-stop project is a complete COVID-19 detection package comprising three tasks: • Task 1 → COVID-19 Classification • Task 2 → COVID-19 Infection Segmentation • Task 3 → Lung Segmentation

Jupyter Notebook · 68 stars

4. New-York-Stock-Exchange-Predictions-RNN-LSTM

Best score on Kaggle so far: a mean squared error of 0.00032 after repeated tuning. Used stacked GRU + LSTM layers, with the architecture, learning rate, and batch size optimized for best model performance. The graphs are self-explanatory once you open the notebook. (A minimal stacked-recurrent sketch follows after this entry.)

Jupyter Notebook · 57 stars
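
A minimal sketch of a stacked GRU + LSTM regressor of the kind described; the layer widths, the 60-step lookback window, and the optimizer are assumptions:

```python
# Stacked GRU + LSTM regression model for windowed price sequences.
from tensorflow.keras import layers, models

window, n_features = 60, 1  # assumed lookback window of 60 timesteps
model = models.Sequential([
    layers.GRU(64, return_sequences=True, input_shape=(window, n_features)),
    layers.LSTM(32),                      # final recurrent layer
    layers.Dense(1),                      # next-step price estimate
])
model.compile(optimizer="adam", loss="mse")
model.summary()
```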

5. Agricultural-Price-Prediction-and-Visualization-on-Android-App

In Agriculture Price Monitoring, I have used data provided by the open government site data.gov.in, which updates market prices daily.

Working interface details: the user is given the choice to see current market prices in two ways, market-wise or commodity-wise.

Market-wise: the user provides a state, district, and market name and selects the market-wise button, and is then shown the prices of all commodities present in that market in graphical format, so the rates can be analysed on one scale. This feature mostly helps a regular buyer decide which commodity to buy. The data can also be downloaded in tabular (CSV) format for accurate analysis.

Commodity-wise: the user provides a state, district, and commodity name and selects the commodity-wise button, and is then shown the prices of that commodity across all markets in the region in graphical format, so the cheapest rate can be found. This feature mostly helps wholesale buyers. The same CSV download is offered here as well.

On the first activity the user is also given a forecasting option, which can be used to forecast the wholesale prices of various commodities in a later year; regression techniques on time-series data are used to predict future prices. Select the type of item and click the link for future predictions. There are three Java files: Forecasts, DisplayGraphs, and DisplayGraphs2. Please change the localhost "server_name" at testing time, as the server name changes each time a new server is made.

Things used: the pandas, numpy, scikit-learn, seaborn, and matplotlib libraries. The dataset is thoroughly analysed in my .ipynb file using the functions available in pandas, not just built-in functions but also many user-defined ones to keep the workflow smooth. Various graphs (pointplot, heat map, barplot, kdeplot, distplot, pairplot, stripplot, jointplot, regplot, etc.) are produced and deployed on the Android app as well. To integrate the Android app with the machine-learning outputs, Flask is used to host my laptop as the server; a separate server.py file handles all the necessary client requests and server responses, and the npm package ngrok is used for tunnelling and hosting (a minimal Flask sketch follows after this entry). A different .ipynb file performs the time-series predictions using regression algorithms and, when given a request, sends the prediction CSV along with the graph to the Android app.

Jupyter Notebook · 41 stars
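
A minimal sketch of the Flask pattern described above, serving a prediction CSV back to the app. The route, the placeholder forecast, and the file name are hypothetical, not the project's actual server.py:

```python
# Minimal Flask server returning a forecast as CSV, in the spirit of
# the server.py described above. Route and file names are hypothetical.
import io
import pandas as pd
from flask import Flask, send_file

app = Flask(__name__)

@app.route("/forecast/<commodity>")
def forecast(commodity):
    # Stand-in for the real regression-based forecaster.
    df = pd.DataFrame({"year": [2024, 2025], "price": [101.5, 107.2]})
    buf = io.BytesIO(df.to_csv(index=False).encode())
    return send_file(buf, mimetype="text/csv",
                     download_name=f"{commodity}_forecast.csv")

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)  # exposed via ngrok in the project
```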

6. White-Blood-Cells-Classification

An important problem in blood diagnostics is classifying different types of blood cells. Various samples of dead WBCs have been used to identify their nuclearity (mononuclear or polynuclear) and to classify them using convolutional neural networks. The WBCs in the dataset are lymphocytes, monocytes, neutrophils, eosinophils, and basophils. I obtained an accuracy of 98.6% on the validation set.

Jupyter Notebook · 16 stars

7. Rossmann-Store-Sales-Predictions

The Kaggle top performer (a Grandmaster) had a score of 0.10021; I had a self-validation score of 0.10874 and a public score of 0.12516. Rossmann operates over 3,000 drug stores in 7 European countries. Rossmann store managers are currently tasked with predicting their daily sales up to six weeks in advance. Store sales are influenced by many factors, including promotions, competition, school and state holidays, seasonality, and locality; with thousands of individual managers predicting sales based on their unique circumstances, the accuracy of results can vary widely. The task is to predict 6 weeks of daily sales for 1,115 stores located across Germany. (A sketch of the competition metric follows after this entry.)

Jupyter Notebook · 14 stars
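
For context, the scores quoted above are on the competition's root mean square percentage error (RMSPE) metric, which skips days with zero sales; a small sketch:

```python
# Root mean square percentage error, the Rossmann competition metric.
# Rows with zero actual sales are excluded, matching the competition rules.
import numpy as np

def rmspe(y_true, y_pred):
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    mask = y_true != 0
    return np.sqrt(np.mean(((y_true[mask] - y_pred[mask]) / y_true[mask]) ** 2))

print(rmspe([100, 0, 200], [110, 5, 190]))  # ≈ 0.079
```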

8. Cats-Vs-Dogs-CNN-using-Keras-

The training set consisted of 25,000 images, of which 5,000 were held out as validation data. A separate test folder held 12,500 images whose labels were predicted with the trained model. My work includes preprocessing for the model, data augmentation to prevent overfitting, Keras callbacks to reduce the learning rate in time, and various CNN architecture trials with different layers and hyperparameters for the best fit and learning curve with respect to epochs. I reached a validation accuracy of 87.15% without using any pretrained ImageNet models; VGG-16 gave around 89% validation accuracy. (An augmentation sketch follows after this entry.)

Jupyter Notebook · 13 stars
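
A minimal sketch of the Keras data augmentation described; the specific transform ranges and the directory layout are assumptions:

```python
# Keras on-the-fly augmentation to fight overfitting on cats vs. dogs.
from tensorflow.keras.preprocessing.image import ImageDataGenerator

train_gen = ImageDataGenerator(
    rescale=1.0 / 255,
    rotation_range=20,        # small random rotations
    width_shift_range=0.1,
    height_shift_range=0.1,
    horizontal_flip=True,     # cats and dogs are left/right symmetric
    zoom_range=0.2,
).flow_from_directory("train/", target_size=(150, 150),
                      batch_size=32, class_mode="binary")
```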

9. Gender-Recognition-by-Voice-0.97004-Accuracy-

The voices of different people are measured on 20 acoustic properties, including mean frequency, standard deviation, kurtosis, skew, mode frequency, modulation index, fundamental frequency, etc. My work demonstrates which properties are most probable in female versus male voices, and studies the attributes important for voice recognition and their varied concentration in each gender using inferences drawn from various regression plots, pair plots, scatter plots, etc. The dataset is standardized (normalized) prior to training for better performance. Different models are tried, and their accuracy curves are plotted to understand how the parameters vary with accuracy. The parameters were tuned using repeated piecewise grid search to keep the computation time manageable. Support vector machines received the most attention through to the end and gave a cross-validated accuracy of 97.004%; an XGBoost classifier further gave a train/test-split accuracy of 99.36%. (A grid-search sketch follows after this entry.)

Jupyter Notebook · 11 stars
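
A minimal sketch of the piecewise grid search over an SVM described above, read here as a coarse search followed by a refined one around the best values; the parameter grids are assumptions:

```python
# Coarse-then-fine ("piecewise") grid search for an RBF SVM.
from sklearn.svm import SVC
from sklearn.model_selection import GridSearchCV
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

pipe = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
coarse = GridSearchCV(pipe, {"svc__C": [0.1, 1, 10, 100],
                             "svc__gamma": [1e-3, 1e-2, 1e-1]}, cv=5)
# coarse.fit(X, y)  # X, y: the 20 acoustic features and gender labels
# Then refine around coarse.best_params_ with a narrower grid:
# fine = GridSearchCV(pipe, {"svc__C": [5, 10, 20],
#                            "svc__gamma": [5e-3, 1e-2, 2e-2]}, cv=5)
```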

10. Mortality-Prediction-in-ICU-using-ANN-89-percent

The data used for the challenge consist of records from 12,000 ICU stays; stays of less than 48 hours were excluded. Up to 42 variables were recorded at least once during the first 48 hours after admission to the ICU, e.g. cholesterol, troponin I, pH, and bilirubin. Feature selection received a great deal of attention: not all 42 variables were used for training, but rather the best set of features selected after trying numerous sensible combinations, to maximize model performance and avoid overfitting. Metrics like precision and recall were used alongside accuracy, with a brief explanation of why precision is more significant than recall here. The model architecture was tuned repeatedly and tested with different hyperparameters, reaching an accuracy of 89% and a precision of 80%. (A metrics sketch follows after this entry.)

Jupyter Notebook · 10 stars
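
A small sketch of the confusion-matrix metrics discussed above, with placeholder label vectors:

```python
# Precision and recall straight from the confusion matrix.
from sklearn.metrics import confusion_matrix

y_true = [0, 1, 1, 0, 1, 0, 1]   # placeholder ground truth (1 = died in ICU)
y_pred = [0, 1, 0, 0, 1, 1, 1]   # placeholder model predictions

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print("precision:", tp / (tp + fp))  # how many predicted deaths were real
print("recall:   ", tp / (tp + fn))  # how many real deaths were caught
```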

11. MNIST-digit-recognition-and-classification-using-CNN-with-Keras-99.70

The training dataset consists of 42,000 rows of 784 pixel values each, representing 42,000 images of size 28 × 28 showing digits from 0 to 9. I used convolutional neural networks, built with Keras, to train the model and made predictions on the 28,000 images of the test dataset, achieving 99.321% validation accuracy in just 10 epochs. ImageDataGenerator was also tuned to promote generalization and avoid overfitting.

Jupyter Notebook · 10 stars

12. Face-Emotion-Classification-for-dementia-patients

The product being developed is a mobile application for the Android operating system. It is an emotion and pain assessment tool and can be incorporated on other platforms that satisfy the minimum system requirements. The application allows doctors to select or capture an image of the patient to be assessed. The image is then uploaded to the server and given to a convolutional neural network model to process; the model is trained to generate a score for each possible emotion. A severity algorithm then works on the generated scores, and the result is sent back to the app.

Jupyter Notebook · 9 stars

13. Pokemon-Analysis-with-Visualization

Used the seaborn and matplotlib libraries to visualize the data in depth. Histograms, bar plots, violin plots, pie charts, heat maps, box plots, strip plots, swarm plots, and more are used to analyse the different categorical variables.

Jupyter Notebook · 8 stars

14. Mercedes-Benz-Challenge-78th-Place-Solution-Private-LB-0.55282-Top-2-Percent-

I used various methods and techniques to reach this place on the private leaderboard; honestly, most of it is an art. Considerable feature engineering, transformations, redundancy, duplicate features, a feature count of 378 with so few rows that a model could easily overfit, inconsistent categories between the training and test sets, and much more. This dataset was something real to work on, and on top of that the features were anonymous, which made engineering them a whole different kind of challenge. Still, after making around 45 different versions of my script, I was able to reach the top 2%.

Jupyter Notebook · 7 stars

15. Object-Recognition-with-Convolutional-Neural-Networks-using-Keras

The CIFAR-10 dataset of 60,000 images is used to compare two CNN models with different numbers of layers and different parameters and hyperparameters (epochs, batch size, etc.), and the resulting validation accuracies and losses are compared. One model is more complex than the other and produces better results.

Jupyter Notebook · 6 stars

16. House-Prices-Predictions-with-81-features

The dataset consists of 81 features per house (lot properties, garage properties, basement properties, year built, ...), covering almost every minute detail of each house along with its sale price. The kernel walks through the missing-value imputations in a very detailed and explorative manner. Since there are many features (both categorical and numerical), the inter-relationship of each feature with the others is difficult and cumbersome to analyse. One feature's missing values (apart from SalePrice) are imputed by modelling and prediction as a demo, and the rest using cross-tabulations and a localized mean, mode, or median. Finally the matrix is normalized and values are predicted, with cross-validation and root-mean-square error. (A group-wise imputation sketch follows after this entry.)

Jupyter Notebook · 6 stars
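
A small sketch of the "localized median" style of imputation described above, filling a numeric feature with the median of comparable houses; the column names follow the Kaggle House Prices data dictionary, as an assumption:

```python
# Impute LotFrontage with the median LotFrontage of the same neighborhood,
# a "localized median" in the spirit of the kernel's imputation strategy.
import pandas as pd

df = pd.read_csv("train.csv")  # assumed Kaggle House Prices training file
df["LotFrontage"] = df.groupby("Neighborhood")["LotFrontage"].transform(
    lambda s: s.fillna(s.median()))
```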

17. Basic-ML-Algos-from-scratch

K-Nearest Neighbours, Logistic Regression, Linear Regression, Naive Bayes, and K-Means Clustering, each implemented from scratch. (A minimal KNN sketch follows after this entry.)

Jupyter Notebook · 5 stars
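
A minimal from-scratch k-nearest-neighbours classifier in the spirit of this repo (Euclidean distance plus majority vote); this is an illustrative sketch, not the repo's actual code:

```python
# k-nearest neighbours from scratch: Euclidean distance + majority vote.
import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, X_test, k=3):
    preds = []
    for x in np.asarray(X_test, float):
        dists = np.linalg.norm(np.asarray(X_train, float) - x, axis=1)
        nearest = np.asarray(y_train)[np.argsort(dists)[:k]]
        preds.append(Counter(nearest).most_common(1)[0][0])
    return np.array(preds)

X = [[0, 0], [0, 1], [5, 5], [6, 5]]
y = [0, 0, 1, 1]
print(knn_predict(X, y, [[0.2, 0.1], [5.5, 5.0]]))  # -> [0 1]
```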

18. Titanic-survival-analysis-and-predictions

Given data on the people travelling on the Titanic, the analysis is done with the matplotlib and seaborn libraries along with pandas manipulation. A machine-learning model, chosen after comparison, is then trained on the cleaned and encoded data to obtain maximum accuracy, and finally the survival of a person is predicted with the trained model.

Jupyter Notebook · 5 stars

19. Statistical-Distributions-with-Examples

Statistical distributions with examples: normal distribution, Poisson distribution, binomial distribution, measures of spread, quantiles, regression plot, time-series plot, heat map, KDE plot, statistical inference, median absolute deviation (MAD), point estimates, skewness, confidence intervals, sampling distributions and the Central Limit Theorem, margin of error, and statistical hypothesis testing: the t-test, t-critical values, one-sample t-test, two-sample t-test, and Type I & II errors. (A t-test sketch follows after this entry.)

Jupyter Notebook · 4 stars
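
A small sketch of the one-sample t-test covered in the notebooks, using scipy; the sample and the hypothesized mean are made up for illustration:

```python
# One-sample t-test: does the sample mean differ from a hypothesized mean?
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
sample = rng.normal(loc=5.3, scale=1.0, size=40)  # illustrative sample

t_stat, p_value = stats.ttest_1samp(sample, popmean=5.0)
print(f"t = {t_stat:.3f}, p = {p_value:.3f}")
# Reject H0 (mean = 5.0) at the 5% level if p < 0.05.
```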

20. Driver-Drowsiness-Detection

Driver drowsiness detection using OpenCV, Python, and Jupyter notebooks. The system continually detects your eyes through a webcam and raises an alert message (which can also take the form of an alarm) when a set threshold is reached. (A detection sketch follows after this entry.)

Jupyter Notebook · 4 stars
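
A minimal sketch of the webcam loop described, using OpenCV's bundled Haar cascades; the 20-frame threshold is an assumption, and the original may well detect eyes differently:

```python
# Alert when no eyes are detected for several consecutive webcam frames.
import cv2

eye_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_eye.xml")
cap = cv2.VideoCapture(0)
closed_frames, THRESHOLD = 0, 20  # assumed threshold in frames

while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    eyes = eye_cascade.detectMultiScale(gray, 1.3, 5)
    closed_frames = 0 if len(eyes) > 0 else closed_frames + 1
    if closed_frames >= THRESHOLD:
        cv2.putText(frame, "ALERT: drowsiness detected!", (30, 40),
                    cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 0, 255), 2)
    cv2.imshow("drowsiness", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break
cap.release()
cv2.destroyAllWindows()
```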

21. My-Experiments-with-Neural-Networks-from-Scratch

Here I have dealt with a simple neural network in order to work closely with the basics, mainly gradient descent, the feed-forward algorithm, and the back-propagation algorithm, doing things at the fundamental level to better understand neural networks. My work also includes varying hyperparameters such as the number of epochs, the learning rate, and the number of neurons in the dense (hidden) layer, and studying their effect on the cost function and the speed of convergence. The change in error with learning rate and epochs for different numbers of hidden neurons was also studied. (A tiny from-scratch network follows after this entry.)

Jupyter Notebook · 4 stars
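
A tiny from-scratch network in the spirit of these experiments: one sigmoid hidden layer trained on XOR with plain gradient descent, with illustrative sizes and learning rate:

```python
# One-hidden-layer network trained on XOR: feed-forward + backprop by hand.
import numpy as np

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], float)
y = np.array([[0], [1], [1], [0]], float)

W1, b1 = rng.normal(size=(2, 4)), np.zeros(4)   # 4 hidden neurons
W2, b2 = rng.normal(size=(4, 1)), np.zeros(1)
sigmoid = lambda z: 1 / (1 + np.exp(-z))
lr = 1.0                                        # illustrative learning rate

for epoch in range(5000):
    h = sigmoid(X @ W1 + b1)                      # feed-forward
    out = sigmoid(h @ W2 + b2)
    d_out = (out - y) * out * (1 - out)           # backprop: output layer
    d_h = (d_out @ W2.T) * h * (1 - h)            # backprop: hidden layer
    W2 -= lr * h.T @ d_out;  b2 -= lr * d_out.sum(0)
    W1 -= lr * X.T @ d_h;    b1 -= lr * d_h.sum(0)

print(out.round(3).ravel())  # approaches [0, 1, 1, 0]
```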

22. World-from-the-Eyes-of-CNN

My work includes the study of how an image is transformed at the various steps of a convolutional neural network as it passes through the Conv2D, max-pool, and activation layers, how the CNN perceives it, and which part of the image the CNN focuses on in order to assign the image to a class. I have also studied Grad-CAMs a bit, using VGG16 weights up to the last fully connected layer, applied them to those images, and studied the activated feature maps. You could run the same study with other ImageNet models like ResNet, InceptionV3, etc. I have tested the Grad-CAMs on a photo of mine and a portrait of a lady. (A Grad-CAM sketch follows after this entry.)

Jupyter Notebook · 4 stars
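
A compact Grad-CAM sketch for VGG16 with TensorFlow; `block5_conv3` is VGG16's last convolutional layer, and the image is a random stand-in:

```python
# Grad-CAM: weight the last conv layer's feature maps by the gradients of
# the top class score, then collapse them into a coarse location heatmap.
import numpy as np
import tensorflow as tf
from tensorflow.keras.applications import VGG16

model = VGG16(weights="imagenet")
grad_model = tf.keras.Model(
    model.input, [model.get_layer("block5_conv3").output, model.output])

img = np.random.rand(1, 224, 224, 3).astype("float32")  # stand-in image

with tf.GradientTape() as tape:
    conv_out, preds = grad_model(img)
    class_score = preds[:, tf.argmax(preds[0])]

grads = tape.gradient(class_score, conv_out)      # d(score) / d(feature maps)
weights = tf.reduce_mean(grads, axis=(1, 2))      # global-average the grads
cam = tf.nn.relu(tf.reduce_sum(conv_out * weights[:, None, None, :], axis=-1))
heatmap = (cam / tf.reduce_max(cam)).numpy()[0]   # 14x14, upsample to overlay
```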

23. Movie-Recommendations-using-Collaborative-Filtering

My work covers movie recommendations using collaborative filtering, with a lowest mean squared error of 0.8254. I also experimented with embedding matrices of different sizes and with feature extraction. Having varied the embedding-layer size, batch size, and learning rate at length, I concluded that for this dataset an embedding size of 30 was the best among 30, 50, and 64, and that the mean squared error was better when the layer was kept small. This may be because the data is not complex enough to warrant extracting 50 or 64 features; going that far might have led the model to memorize, and thus overfit, the data. (An embedding-model sketch follows after this entry.)

Jupyter Notebook · 4 stars
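
A minimal sketch of embedding-based collaborative filtering as described: user and movie embeddings of size 30 (the size the description settles on) combined by a dot product to predict a rating; the user and movie counts are placeholders:

```python
# Dot-product collaborative filtering with 30-dimensional embeddings.
from tensorflow.keras import layers, Model

n_users, n_movies, dim = 1000, 1700, 30   # placeholder dataset sizes

u_in, m_in = layers.Input(shape=(1,)), layers.Input(shape=(1,))
u_vec = layers.Flatten()(layers.Embedding(n_users, dim)(u_in))
m_vec = layers.Flatten()(layers.Embedding(n_movies, dim)(m_in))
rating = layers.Dot(axes=1)([u_vec, m_vec])   # predicted rating score

model = Model([u_in, m_in], rating)
model.compile(optimizer="adam", loss="mse")
# model.fit([user_ids, movie_ids], ratings, batch_size=64, epochs=5)
```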

24. Deep-Learning-Coursera

Deep learning assignments submitted while taking Andrew Ng's Deep Learning course on Coursera.

Jupyter Notebook · 3 stars

25. batman-BLOG

A Django-based blog with full database handling of users and every utility function a user would need: registration; login (including via Facebook and Google); create, read, update, delete (CRUD); profiles; subscriptions; email sending (webmailer); file uploads and cropping; PDF rendering; password management; search; queries against the existing database; and exclusive permissions for selected users.

HTML · 2 stars

26. Coursera-Applied-Data-Science-with-Python

Repository for the Coursera specialization Applied Data Science with Python by the University of Michigan.

Jupyter Notebook · 1 star

27. Coursera-ML

Machine learning assignments submitted while taking Andrew Ng's Machine Learning course on Coursera.

MATLAB · 1 star