  • Stars: 1
  • Language: Jupyter Notebook
  • Created over 4 years ago
  • Updated over 4 years ago

Repository Details

Case Study 2.2: Gender Wage Gap
Instructor: Victor Chernozhukov
Activity Type: Optional Case Study
Description: Estimate the difference in predicted wages between men and women with the same job characteristics.
Why this Case Study? Participants can pose an economic question and investigate it using a linear regression model.

Self-Help Package Contents: The video that covers this case study is given in Module 2, Segment 1.6.

Self-help-package.zip:
  • Codebook.txt: contains the description of worker job-relevant characteristics.
  • pay.discrimination.Rdata: the CPS (2012) data on wages and job-relevant worker characteristics, such as experience (exp), gender, and education.
  • Regression1.6.CaseStudy.R: estimates the gender wage gap, i.e., the difference in predicted wages between men and women with the same job-relevant characteristics. The gap is estimated in two steps: (1) residualizing the outcome (wages) and the covariate of interest (gender) by taking residuals from the corresponding regressions on worker characteristics, and (2) regressing residualized wages on residualized gender. Both linear and quadratic specifications are tried at the residualizing step.
  • Regression.1.6.pdf: the set of slides that describes the estimation technique and presents the results.
  • .Rhistory
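
The two-step estimate described for Regression1.6.CaseStudy.R can be illustrated with a minimal Python sketch. This is not the package's R script: the function names (`residualize`, `partialled_out_gap`) are hypothetical, the data are made up rather than taken from the CPS sample, and a single control (experience) stands in for the full set of worker characteristics.

```python
def residualize(y, x):
    """Return residuals from an OLS regression of y on an intercept and x."""
    n = len(y)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [yi - intercept - slope * xi for xi, yi in zip(x, y)]


def partialled_out_gap(wage, female, exper):
    """Two-step (partialling-out) estimate of the wage gap."""
    # Step 1: remove the part of wages and of gender explained by experience.
    wage_res = residualize(wage, exper)
    female_res = residualize(female, exper)
    # Step 2: regress residualized wages on residualized gender. No intercept
    # is needed because both residual series have mean zero.
    num = sum(f * w for f, w in zip(female_res, wage_res))
    den = sum(f * f for f in female_res)
    return num / den


# Made-up toy data (assumption): wages are built with a gap of -3.
exper = [1, 2, 3, 4, 5, 6, 7, 8]
female = [0, 1, 0, 1, 1, 0, 0, 1]
wage = [10 + 2 * e - 3 * f for e, f in zip(exper, female)]

gap = partialled_out_gap(wage, female, exper)
print(round(gap, 6))
```

Because the toy wages are generated with a gap of -3 and no noise, the estimate recovers that value up to floating-point error. With real data, the same number would come out of a single regression of wages on gender and the controls together.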

More Repositories

1

MITXpro-DSx-PCA---Identifying-Faces

DO IT YOURSELF Case Study 1.2.1: PCA - Identifying Faces Instructor: Stefanie Jegelka Activity Type: Optional Case Study Description: Classifying and identifying human faces. Why this Case Study? Build your own implementation of an image classification algorithm that helps classify new photos of humans! This can help you understand how it is possible for Facebook to suggest, very accurately, who to tag in a given photo with people's photos. Self-Help Documentation: In this document, we walk through some helpful tips to get you started with building your own application for classifying faces in photo images using Principal Component Analysis (PCA). In this tutorial, we provide examples and some pseudo-code for the following programming environment: Matlab. Download Self-Help Documentation Download Pictures DataSet Time Required: The time required to do this activity varies depending on your experience in the required programming background. We suggest planning somewhere between 1 & 3 hours. Remember, this is an optional activity for participants looking for hands-on experience.
Jupyter Notebook
4
star
2

MITXpro-DSx-Genetic-Code

Do It Yourself Case Study 1.1.1: Genetic Codes Instructor: Tamara Broderick Activity Type: Optional Case Study Description: Using K-means to help figure out that DNA is composed of 3-letter words. Self-Help Documentation: From this document, you will learn how data visualization can help in genomic sequence analysis and start with a fragment of genetic text of a bacterial genome and analyze its structure. Download Self-Help Documentation Time Required: The time required to do this activity varies depending on your experience in the required programming background. We suggest planning somewhere between 1 & 3 hours. Remember, this is an optional activity for participants looking for hands-on experience. Have questions? Feel free to discuss the case study with other participants in the Discussion Forum under Module 1 - Case Studies Section.
HTML
4
star
3

Spectral-Clustering---Grouping-News-Stories

Case Study 1.3.2: Spectral Clustering - Grouping News Stories Instructor: Stefanie Jegelka Activity Type: Optional Case Study Description: Auto-clustering News stories. Why this Case Study? Build your own clustering for news stories on the web similar to how you see Google News organize news stories by auto-generated topics/groupings! Self-Help Documentation: In this document, we walk through some helpful tips to get you started with building your own application for automating the clustering of news stories using Spectral Clustering. In this tutorial, we provide examples and some pseudo-code for the following programming environment: Python. Download Self-Help Documentation Time Required: The time required to do this activity varies depending on your experience in the required programming background. We suggest planning somewhere between 1 & 3 hours. Remember, this is an optional activity for participants looking for hands-on experience. Have questions? Feel free to discuss the case study with other participants in the Discussion Forum under Module 2 - Case Studies Section.
Jupyter Notebook
3
star
4

MIT-xPRO-DSxCase-Study-5.2.1---Kalman-Constant-Velocity

Case Study 5.2.1: Kalman Constant Velocity 2D Instructor: Guy Bresler Activity Type: Optional Case Study Description: Generate data, build the model for the motion dynamics, perform the Kalman Filtering algorithm. Kalman Filtering: Tracking the 2D Position of an Object when moving with Constant Velocity Why this Case Study? Build your own implementation of the Kalman Filter on a simple example which can help you understand the building blocks of the GPS system. Self-Help Documentation: In this document, we walk through some helpful tips to get you started with tracking the state of an object moving with a constant velocity using Kalman Filtering when we have noisy measurements of its velocity in 2 dimensions. In this tutorial, we provide examples and some pseudo-code for the following programming environment: Python. Download Self-Help Documentation Kalman-Constant-Velocity-Case-Study-Package.zip Time Required: The time required to do this activity varies depending on your experience in the required programming background. We suggest planning somewhere between 1 & 3 hours. Remember, this is an optional activity for participants looking for hands-on experience.
Jupyter Notebook
1
star
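
For the Kalman constant-velocity case study listed above, the predict/update cycle can be sketched in plain Python. This is an illustrative sketch, not the code from the case-study package. It assumes that noisy position readings are observed, that the x and y axes have independent noise (so each axis can be filtered separately), and that the process noise is a simple diagonal term. The names `kalman_cv_axis` and `track_2d` are hypothetical.

```python
def kalman_cv_axis(measurements, dt=1.0, meas_var=1.0, process_var=0.0):
    """Track [position, velocity] along one axis from position readings."""
    pos, vel = 0.0, 0.0                               # initial state guess
    p00, p01, p10, p11 = 1000.0, 0.0, 0.0, 1000.0     # large initial uncertainty
    for z in measurements:
        # Predict: x <- F x and P <- F P F^T + Q, with F = [[1, dt], [0, 1]].
        pos = pos + dt * vel
        p00, p01, p10, p11 = (
            p00 + dt * (p01 + p10) + dt * dt * p11 + process_var,
            p01 + dt * p11,
            p10 + dt * p11,
            p11 + process_var,
        )
        # Update with H = [1, 0]: only the position is measured (assumption).
        s = p00 + meas_var
        k0, k1 = p00 / s, p10 / s
        innovation = z - pos
        pos += k0 * innovation
        vel += k1 * innovation
        p00, p01, p10, p11 = (
            (1 - k0) * p00,
            (1 - k0) * p01,
            p10 - k1 * p00,
            p11 - k1 * p01,
        )
    return pos, vel


def track_2d(points, dt=1.0):
    """Filter the x and y coordinates independently."""
    x, vx = kalman_cv_axis([p[0] for p in points], dt)
    y, vy = kalman_cv_axis([p[1] for p in points], dt)
    return (x, y), (vx, vy)


# Made-up, noise-free track moving at velocity (2, -1) per time step.
points = [(2.0 * k, -1.0 * k) for k in range(1, 51)]
position, velocity = track_2d(points)
print(position, velocity)
```

On this noise-free toy track the velocity estimate settles close to (2, -1). Real measurements would be noisy, and `meas_var` and `process_var` would need tuning to the sensor and the motion.
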
5

MIT-xPRO-DSxCase-Study-4.3---Products

Case Study 4.3: Products Recommender Instructor: Devavrat Shah Activity Type: Optional Case Study Description: Building your own recommendation system for products on an e-commerce website, like the examples discussed for Amazon.com. Make new Product Recommendations Why this Case Study? By following some simple steps you can develop your own version of a recommendation engine which forms the basis of several e-commerce websites like Amazon.com. This will help you develop an appreciation for the sorts of innovations that underpin the rise and ubiquity of online retailers and e-commerce portals. You can apply these tools to new retail ventures of your choice. Self-Help Documentation: In this document, we walk through some helpful tips to get you started with building your own Recommendation engine based on the case studies discussed in the Recommendation systems module. In this tutorial, we provide examples and some pseudo-code for the following programming environments: R, Python. Download Self-Help Documentation Time Required: The time required to do this activity varies depending on your experience in the required programming background. We suggest planning somewhere between 1 & 3 hours. Remember, this is an optional activity for participants looking for hands-on experience.
Jupyter Notebook
1
star
6

MIT-xPRO-DSxCase-Study-2.4-Predicting-Wages-2

Case Study 2.4: Predicting Wages II Instructor: Victor Chernozhukov Activity Type: Optional Case Study Description: Predict wages using several machine learning methods and data splitting. Why this Case Study? Participants can apply several machine learning methods to any prediction problem. They can also obtain out-of-sample predictions to assess their models. Self-Help Package Contents: The video that covers this case study is given in Module 2, Segment 3.3. Self-Help-Package.zip Codebook.txt contains the names of the variables and a brief description. wage2015.Rdata: The dataset contains the variables used in the prediction analysis. Regression3.3.CaseStudy.R: performs a prediction analysis where the weekly wage is predicted using several demographic and job-related characteristics. Several machine learning methods are used for prediction and their performance is compared based on out-of-sample mean squared error and R-squared. The final part of the program aggregates the predictions obtained by each machine learning method. Regression.3.3.pdf is the set of slides that describes the estimation technique and presents the results. .Rhistory Time Required: The time required to do this activity varies depending on your experience in the required programming background. We suggest planning somewhere between 1 & 3 hours. Remember, this is an optional activity for participants looking for hands-on experience.
Jupyter Notebook
1
star
7

MIT-xPRO-DSxCase-Study-2.1-Predicting-Wage-1

Case Study 2.1: Predicting Wage I Instructor: Victor Chernozhukov Activity Type: Optional Case Study Description: Predict wages using various characteristics of workers and assess predictive performance. Why this Case Study? Prediction is increasingly important in the age of big data. Participants can apply a simple model from this class and assess the prediction performance of their model. Self-Help Package Contents: The video that covers this case study is given in Module 2, Segment 1.4. Self-Help-Package.zip Codebook.rtf contains the description of worker job-relevant characteristics. pay.discrimination.Rdata: the CPS (2012) data on wages and job-relevant worker characteristics, such as experience, gender, education. Regression1.4.CaseStudy.R predicts expected wage given worker characteristics using a linear model with linear and quadratic specifications. In addition, it evaluates the performance of the predictor by R-squared and mean squared error, with and without sample splitting. Regression.1.4.pdf is the set of slides describing the wage prediction model. .Rhistory Time Required: The time required to do this activity varies depending on your experience in the required programming background. We suggest planning somewhere between 1 & 3 hours. Remember, this is an optional activity for participants looking for hands-on experience.
Jupyter Notebook
1
star
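
The evaluation "with and without sample splitting" mentioned for Case Study 2.1 above can be sketched as follows. This is a minimal illustration in Python, not the package's R script. The helper names are hypothetical, the data are made up, a single predictor stands in for the worker characteristics, and the split simply takes the first `n_train` rows (a real analysis would normally split at random).

```python
def fit_line(x, y):
    """OLS fit of y on an intercept and x; returns (intercept, slope)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    slope = (sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
             / sum((a - mean_x) ** 2 for a in x))
    return mean_y - slope * mean_x, slope


def mse_and_r2(x, y, intercept, slope):
    """Mean squared error and R-squared of the fitted line on (x, y)."""
    n = len(y)
    mean_y = sum(y) / n
    sse = sum((b - intercept - slope * a) ** 2 for a, b in zip(x, y))
    sst = sum((b - mean_y) ** 2 for b in y)
    return sse / n, 1.0 - sse / sst


def split_sample_scores(x, y, n_train):
    """Fit on the first n_train rows, then score in-sample and out-of-sample."""
    intercept, slope = fit_line(x[:n_train], y[:n_train])
    in_sample = mse_and_r2(x[:n_train], y[:n_train], intercept, slope)
    out_of_sample = mse_and_r2(x[n_train:], y[n_train:], intercept, slope)
    return in_sample, out_of_sample


# Made-up toy data (assumption): years of education and hourly wage.
educ = [8, 10, 12, 12, 14, 16, 16, 18, 11, 13, 15, 17]
wage = [9.5, 12.4, 14.1, 15.2, 17.3, 19.6, 20.9, 22.8, 13.0, 16.4, 18.2, 21.9]

in_sample, out_of_sample = split_sample_scores(educ, wage, n_train=8)
print("in-sample (MSE, R2):", in_sample)
print("out-of-sample (MSE, R2):", out_of_sample)
```

Comparing the two pairs of scores shows how much of the in-sample fit carries over to data the model has not seen.
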
8

MITXpro-DSx-LDA

Case Study 1.1.2: Finding Themes in Project Descriptions Instructor: Tamara Broderick Activity Type: Optional Case Study Description: Using Latent Dirichlet Allocation to discover topics in a corpus of text. Finding Themes in Project Descriptions - LDA Analysis. Self-Help Documentation: In this document, we walk through some tips to help you with doing your own analysis on MIT EECS faculty data using stochastic variational inference on LDA. We provide some examples for the following programming environment: Python. Download Self-Help Documentation Time Required: The time required to do this activity varies depending on your experience in the required programming background. We suggest planning somewhere between 1 & 3 hours. Remember, this is an optional activity for participants looking for hands-on experience. Have questions? Feel free to discuss this case study with other participants in the Discussion Forum under Module 1 - Case Studies Section.
HTML
1
star
9

MIT-xPRO-DSxCase-Study-2.3-Do-Poor-Countries-Grow-Faster-than-Rich-Countries-

Case Study 2.3: Do Poor Countries Grow Faster than Rich Countries? Instructor: Victor Chernozhukov Activity Type: Optional Case Study Description: Answer the question "Do poor countries grow faster than rich countries?" by using a high-dimensional dataset. Why this Case Study? Participants are equipped with tools that can handle high-dimensional datasets. They can apply these tools to any high-dimensional dataset. Self-Help Package Contents: The video that covers this case study is given in Module 2, Segment 2.4. Self-Help-Package.zip Codebook.txt contains the names of the variables and a brief description. growth.Rdata: The dataset contains the variables used in the regression. Regression 2.4.CaseStudy.R: looks at how the growth rates of different countries' economies relate to the initial wealth level in each country, controlling for several country-specific characteristics. This relationship is estimated in two ways. In the first analysis, a simple linear regression model is used. In the second analysis, control variables are partialled out using the Lasso method and then residuals of the dependent variable are regressed on residuals of the independent variable. Regression.2.4.pdf is the set of slides that describes the estimation technique and presents the results. .Rapp.history .Rhistory
Jupyter Notebook
1
star
10

MIT-xPRO-DSx-DECISION-BOUNDARY-AND-DEEP-NETWORK

Case Study 3.2: Alpha Go Instructor: Ankur Moitra Activity Type: Optional Case Study Description: Playing with one- or two-layer perceptrons to get a feeling for their decision boundaries. Decision boundary of a deep neural network Why this Case Study? The case study shows the advantage of adding nonlinearity and using deeper neural networks over shallow ones. Self-Help Documentation: This document discusses the following. Layered perceptrons are mysterious. What types of patterns can deep neural networks recognize? And in what ways are they more powerful than just a single perceptron? In this case study, we will explore these questions in two dimensions, so that we can visualize them easily. What happens in higher dimensions is much more complex, and an area of active research. Download Self-Help Documentation Time Required: The time required to do this activity varies depending on your experience in the required programming background. We suggest planning somewhere between 1 & 3 hours. Remember, this is an optional activity for participants looking for hands-on experience.
Jupyter Notebook
1
star
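
The question raised in the decision-boundary case study above, namely how a layered network is more powerful than a single perceptron, has a classic two-dimensional illustration: XOR. No single perceptron can separate XOR with one line, but two layers can. The sketch below uses hand-picked weights chosen for illustration (an assumption, not taken from the self-help document), and the function names are hypothetical.

```python
def perceptron(weights, bias, inputs):
    """Step-activation perceptron: 1 if the weighted sum is positive, else 0."""
    total = bias + sum(w * x for w, x in zip(weights, inputs))
    return 1 if total > 0 else 0


def two_layer_xor(x1, x2):
    """XOR built as AND(OR(x1, x2), NAND(x1, x2)) from three perceptrons."""
    h_or = perceptron([1, 1], -0.5, [x1, x2])       # fires unless both are 0
    h_nand = perceptron([-1, -1], 1.5, [x1, x2])    # fires unless both are 1
    return perceptron([1, 1], -1.5, [h_or, h_nand])  # fires only if both fire


grid = [(0, 0), (0, 1), (1, 0), (1, 1)]
outputs = [two_layer_xor(a, b) for a, b in grid]
print(outputs)  # prints [0, 1, 1, 0]
```

Each hidden perceptron draws one line in the plane. The output unit combines the two half-planes into a band, which is a decision region that no single line can produce.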