  • Stars: 4
  • Rank: 3,304,323 (Top 66%)
  • Language: Jupyter Notebook
  • Created over 5 years ago
  • Updated over 5 years ago

Repository Details

MNIST is a simple computer vision dataset. It consists of 28x28 pixel images of handwritten digits. Every MNIST data point, every image, can be thought of as an array of numbers describing how dark each pixel is. Since each image has 28 by 28 pixels, we get a 28x28 array. We can flatten each array into a 28 × 28 = 784-dimensional vector. Each component of the vector is a value between zero and one describing the intensity of the pixel. Thus, we generally think of MNIST as being a collection of 784-dimensional vectors.

Not all vectors in this 784-dimensional space are MNIST digits. Typical points in this space are very different! To get a sense of what a typical point looks like, we can randomly pick a few points and examine them. In a random point (a random 28x28 image) each pixel is randomly black, white, or some shade of gray. The result is that random points look like noise. Images like MNIST digits are very rare. While the MNIST data points are embedded in 784-dimensional space, they live in a very small subspace. With some slightly harder arguments, we can see that they occupy a lower-dimensional subspace.

People have lots of theories about what sort of lower-dimensional structure MNIST, and similar data, have. One popular theory among machine learning researchers is the manifold hypothesis: MNIST is a low-dimensional manifold, sweeping and curving through its high-dimensional embedding space. Another hypothesis, more associated with topological data analysis, is that data like MNIST consists of blobs with tentacle-like protrusions sticking out into the surrounding space. But no one really knows, so let's explore!
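The flattening step and the "random point" experiment described above can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the notebook's actual code: the array `image` is a hypothetical stand-in filled with random intensities (a real digit would be loaded from the dataset), and the seed and the number of sampled points are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # arbitrary seed, chosen for reproducibility

# Hypothetical stand-in for one MNIST image: a 28x28 array of pixel
# intensities in [0, 1). A real digit would come from the dataset.
image = rng.random((28, 28))

# Flatten the 28x28 array into a 784-dimensional vector.
vector = image.reshape(-1)
print(vector.shape)  # (784,)

# Sample a few "typical" points of the 784-dimensional space:
# every component is drawn independently and uniformly from [0, 1).
random_points = rng.random((5, 784))

# Reshape each point back to 28x28 so it can be viewed as an image.
# Plotted, these would look like noise rather than digits.
random_images = random_points.reshape(5, 28, 28)
```

Reshaping is lossless, so `vector.reshape(28, 28)` recovers the original image; the 784-dimensional view is just a different layout of the same numbers.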

More Repositories

1

Human-Activity-Recognition------UCI

Human Activity Recognition: ML models and a divide-and-conquer approach
Jupyter Notebook
9
star
2

Social-network-Graph-Link-Prediction---Facebook-Challenge

Jupyter Notebook
7
star
3

Jigsaw-Unintended-Bias-in-Toxicity-Classification

At the end of 2017, the Civil Comments platform shut down and chose to make the ~2 million public comments from its platform available in a lasting open archive so that researchers could understand and improve civility in online conversations for years to come. Jigsaw sponsored this effort and extended annotation of this data by human raters for various toxic conversational attributes. In the data supplied for this competition, the text of each comment is found in the comment_text column. Each comment in Train has a toxicity label (target), and models should predict the target toxicity for the Test data. This attribute (and all others) is a fractional value representing the fraction of human raters who believed the attribute applied to the given comment. For evaluation, test set examples with target >= 0.5 are considered to be in the positive class (toxic). The data also has several additional toxicity subtype attributes. Models do not need to predict these attributes for the competition; they are included as an additional avenue for research. The subtype attributes are: severe_toxicity, obscene, threat, insult, identity_attack, sexual_explicit. Additionally, a subset of comments has been labelled with a variety of identity attributes, representing the identities that are mentioned in the comment. Only identities with more than 500 examples in the test set (combined public and private) are included in the evaluation calculation; in the original competition description, these identities are shown in bold. The columns corresponding to identity attributes are: male, female, transgender, other_gender, heterosexual, homosexual_gay_or_lesbian, bisexual, other_sexual_orientation, christian, jewish, muslim, hindu, buddhist, atheist, other_religion, black, white, asian, latino, other_race_or_ethnicity, physical_disability, intellectual_or_learning_disability, psychiatric_or_mental_illness, other_disability. Note that the data contains different comments that can have the exact same text. Different comments that have the same text may have been labeled with different targets or subgroups.
Jupyter Notebook
7
star
4

Personalized_cancer_Diagnosis

Source: https://www.kaggle.com/c/msk-redefining-cancer-treatment/data
Jupyter Notebook
6
star
5

DeepLearning.ai_Assigments

Complete assignments for Andrew Ng's course
Jupyter Notebook
6
star
6

Amazon-fashion-discovery-engine-Content-Based-recommendation-

Jupyter Notebook
5
star
7

Image-Augmentation-using-Keras

Jupyter Notebook
4
star
8

Advance-Machine-Learning-Coursera

Jupyter Notebook
4
star
9

3-D-Animation-Cube-using-Html5-CSS3

CSS
4
star
10

LPU-FOP-KUJ09-

This repo contains all of the code for the practice sessions taken by me in the KUJ09 batch of LPU
Java
4
star
11

-Stack-Overflow-Tag-Prediction

Jupyter Notebook
4
star
12

WNS-Analytics-Wizard-2019

Jupyter Notebook
4
star
13

PCA-on-Boston-House-price-Data-Set

Jupyter Notebook
4
star
14

Semantic-Text-Similarity

Jupyter Notebook
4
star
15

Amazon-Future-Engineering-May-Batch-Java

Java
4
star
16

DenseNet-on-CIFAR-10-

This repository implements the DenseNet architecture from scratch on the CIFAR-10 dataset
Jupyter Notebook
4
star
17

Mask_RCNN_

run_obj contains live object detection using Mask R-CNN
Python
4
star
18

Donor_choose-Various-Models-

DonorsChoose.org receives hundreds of thousands of project proposals each year for classroom projects in need of funding. Right now, a large number of volunteers is needed to manually screen each submission before it is approved to be posted on the DonorsChoose.org website. Next year, DonorsChoose.org expects to receive close to 500,000 project proposals. As a result, there are three main problems they need to solve: (1) how to scale current manual processes and resources to screen 500,000 projects so that they can be posted as quickly and as efficiently as possible; (2) how to increase the consistency of project vetting across different volunteers to improve the experience for teachers; and (3) how to focus volunteer time on the applications that need the most assistance. The goal of the competition is to predict whether or not a DonorsChoose.org project proposal submitted by a teacher will be approved, using the text of project descriptions as well as additional metadata about the project, teacher, and school. DonorsChoose.org can then use this information to identify projects most likely to need further review before approval.
Jupyter Notebook
4
star
19

Quora_question_pair_similarity

Quora is a place to gain and share knowledge about anything. It's a platform to ask questions and connect with people who contribute unique insights and quality answers. This empowers people to learn from each other and to better understand the world. Over 100 million people visit Quora every month, so it's no surprise that many people ask similarly worded questions. Multiple questions with the same intent can cause seekers to spend more time finding the best answer to their question, and make writers feel they need to answer multiple versions of the same question. Quora values canonical questions because they provide a better experience to active seekers and writers, and offer more value to both of these groups in the long term. (Credits: Kaggle.) Problem statement: Identify which questions asked on Quora are duplicates of questions that have already been asked. This could be useful to instantly provide answers to questions that have already been answered. We are tasked with predicting whether a pair of questions are duplicates or not.
Jupyter Notebook
4
star
20

Pancreatic-cancer-Analysis-using-Gene-Expression-data

This repository contains an analysis of pancreatic cancer data stored in GCT format, using PCA plots and the GSVA algorithm
Jupyter Notebook
3
star
21

Data-Visualization-using-TSNE

For data visualization, one of the best dimensionality reduction algorithms is t-SNE. Here I use the Amazon Fine Food Reviews dataset and apply t-SNE to text features by converting them into vectors using BoW, TF-IDF, Avg-W2V, and TF-IDF-W2V
Jupyter Notebook
3
star
22

WorkSpace

WorkSpace is a desktop application, similar to MS Excel, built on the Electron framework
JavaScript
2
star
23

Receipts-dates-extractor

Python
2
star
24

Mercari-price-suggestion-challenge

Jupyter Notebook
2
star
25

AmExpert-2019

Jupyter Notebook
2
star
26

RoboFriends

Meet these RoboFriends in this React web app
JavaScript
2
star
27

Smart-Brain-API

This contains the back-end codebase for the Smart Brain App
JavaScript
1
star
28

AFE-May-Batch

This repo contains all code for AFE-May-Grp6
1
star
29

github-slideshow

A robot powered training repository 🤖
HTML
1
star
30

Taxi-demand-prediction-in-New-York-City

Jupyter Notebook
1
star
31

s2wconversion

This repository contains a library that converts spoken English like "Triple A" to written English like AAA
Python
1
star
32

TensorFlow-in-Practice-Assignments

This repo contains all assignments and workbooks from the TensorFlow in Practice specialization course
Jupyter Notebook
1
star
33

Smart-Brain-App

This app detects faces in your pictures :)
JavaScript
1
star