• Stars
    123
  • Rank 290,145 (Top 6 %)
  • Language
    Jupyter Notebook
  • License
    MIT License
  • Created over 7 years ago
  • Updated about 3 years ago

Repository Details

Course notes for MSDS501, computational boot camp, at the University of San Francisco

MSDS501 Computational Data Science Bootcamp

This 5-week computational bootcamp is part of the MS in Data Science program at the University of San Francisco and is specifically designed as an introduction to data science programming for those who are not yet skilled programmers.

Writing software is about problem solving, computer languages, algorithms, data structures, libraries, tools, and computing devices. In this bootcamp, I'm hoping to teach you how to approach programming, review the key elements of Python, teach you some of the core libraries, give you an introduction to the command line, and finally introduce you to cloud computing. You will go over algorithms and data structures in more detail in the data acquisition, machine learning, and data structures courses.

Class details

INSTRUCTOR. Terence Parr. I’m a professor in the computer science and data science program departments and was founding director of the MS in Analytics program at USF (which became the MS data science program). Please call me Terence or Professor (“Terry” is not ok).

SPATIAL COORDINATES:

  • Class is held at 101 Howard in 1st floor classroom 155-156. Lectures are recorded via Zoom for remote, asynchronous students in different time zones.
  • Exams will likely be online via Canvas at 3:30pm with another opening at midnight for those in different time zones.
  • My office is room 525 @ 101 Howard

TEMPORAL COORDINATES. Wed July 7 to Wed Aug 11.

  • Lectures: Mon/Wed 10AM - 11:50AM

INSTRUCTION FORMAT. Class runs for 1:50 hours, 2 days/week. Instructor-student interaction during lecture is encouraged and we'll mix in mini-exercises / labs during class. All programming will be done in the Python 3 programming language, unless otherwise specified.

TARDINESS. Please be on time for class. It is a big distraction if you come in late.

LAPTOP POLICY. My policy is that all student laptops must be closed during class unless we are doing a lab or I specifically ask you to follow along as I type into my computer. All materials for the course are available in this repository, which reduces your need to take notes considerably.

Student evaluation

Artifact                            | Grade Weight | Due date
Image processing                    | 16%          | Wed, July 21
Word similarity and relationships   | 12%          | Wed, July 28
Exploration of Enron email data set | 16%          | Sat, August 14 @ 11:59PM CA time
Test 1                              | 12%          | 3:30-4:15pm Wed, July 14
Test 2                              | 12%          | 3:30-4:30pm Mon, July 26
Test 3                              | 12%          | 3:30-4:30pm Mon, August 2
Final exam                          | 20%          | 10:00-11:30am Wed, August 11 (last day of class)
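
To see how the weights above combine into a single course score, here is a minimal sketch in Python. The weights are copied from the evaluation table; the function name `course_score`, the 0-100 per-artifact scale, and the sample student are assumptions made for illustration and are not part of the official grading procedure.

```python
# Weights copied from the evaluation table (percent of the final grade).
WEIGHTS = {
    "Image processing": 16,
    "Word similarity and relationships": 12,
    "Exploration of Enron email data set": 16,
    "Test 1": 12,
    "Test 2": 12,
    "Test 3": 12,
    "Final exam": 20,
}


def course_score(scores):
    """Return the weighted course score (0-100).

    `scores` maps artifact name to a score in 0-100 (assumed scale).
    Artifacts missing from `scores` count as 0.
    """
    assert sum(WEIGHTS.values()) == 100  # the table's weights cover the whole grade
    return sum(w * scores.get(name, 0) for name, w in WEIGHTS.items()) / 100


# Hypothetical student: two working projects, one late project (0%), mixed exams.
example = {
    "Image processing": 100,
    "Word similarity and relationships": 100,
    "Exploration of Enron email data set": 0,
    "Test 1": 80,
    "Test 2": 90,
    "Test 3": 70,
    "Final exam": 85,
}
print(course_score(example))  # 73.8
```

Under these assumptions, a single late project worth 16% lowers the score by up to 16 points.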

Tests 1-3 and the final exam are online, with separate time slots for international and domestic students; each student will have one attempt at each exam. Exams are proctored by Honorlock (see below). The exams are available electronically for a fairly long period so that students in local and other time zones can participate at a reasonable hour. You are strictly forbidden from discussing exam contents with your fellow students until after the final exam deadline. (Remember that helping other students in this way is a violation of the honor code and potentially reduces your own score.)

All projects will be graded with the specific input or tests given in the project description, so you understand precisely what is expected of your program. Consequently, projects will be graded in binary fashion: They either work or they do not. The only exception is when your program does not run on my machine because of some cross-platform issue. This is typically because a student has hardcoded some file name or directory into their program. In that case, I will take off at least 10%, instead of giving you a 0. Please go to GitHub and verify that the website has the proper files and that those files look correct for your solution before the deadline. That is what I will download for testing.
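
One common way to avoid the hardcoded-path problem described above is to take the input file from the command line instead of embedding a machine-specific path in the program. The following is a minimal sketch under that assumption; the helper names `input_path` and `count_lines` are hypothetical and do not come from any project description.

```python
import sys
from pathlib import Path

# Fragile: this path exists only on one person's machine.
# DATA = "/Users/alice/msds501/data/input.txt"


def input_path(argv):
    """Return the input file named on the command line as a Path.

    Exits with a usage message if no file name was given.
    """
    if len(argv) < 2:
        raise SystemExit(f"usage: python {argv[0]} <input-file>")
    return Path(argv[1]).expanduser()


def count_lines(path):
    """Count the lines in a text file (stands in for the real project work)."""
    with open(path, encoding="utf-8") as f:
        return sum(1 for _ in f)


def main(argv):
    print(count_lines(input_path(argv)))
```

A script built this way would end by calling `main(sys.argv)` and be run as `python myscript.py data/input.txt`, so the same code works unchanged on any machine and operating system.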

Each project has a hard deadline and only those projects working correctly before the deadline get credit (100%). Late projects are given 0%. My grading script pulls from GitHub at the deadline. All projects are due at the start of class on the date indicated, unless otherwise specified.

No partial credit. Students are sometimes frustrated about not getting partial credit for solutions they labored on that do not actually work. Unfortunately, "almost working" just never counts in a job situation because nonfunctional solutions have no value. We are not writing essays in English that have some value even if they are not superb. When it comes to software, there is no fair way to assign such partial credit, other than a generic 30% or whatever for effort. The only way to determine what is wrong with your project is for me to fix and/or complete the project. That is just not possible for 90 students. Even if that were possible, there is no way to fairly assign partial credit between students. A few incorrect but critical characters can mean the difference between perfection and absolute failure. If it takes a student 20 hours to find that problem, is that worth more or less partial credit than another project that is half-complete but could be finished in five hours? To compensate, I try to test multiple pieces of the functionality in an effort to approximate partial credit.

I reserve the right to change projects until the day they are assigned.

Grading standards

I consider an A grade to be above and beyond what most students have achieved. A B grade is an average grade for a student or what you could call "competence" in a business setting. A C grade means that you either did not or could not put forth the effort to achieve competence. Below C implies you did very little work or had great difficulty with the class compared to other students.

Honorlock

All tests use Honorlock via Canvas and have strict time limits. You will be unable to do anything other than take the test; no access to the Internet, etc. A proctor will monitor you during exams to ensure you do not communicate with anyone else during the test. Generally speaking, Honorlock will record all your web, computer, and personal activities (e.g., looking at your phone) during the quiz. It will flag suspicious behavior for my review and will save the recordings for 6 months in case I need to go back and check them.

Please see the "How to use" page for students. Either I or another instructor will launch a practice quiz on Canvas during the first week of class to ensure everything is set up properly.

  • Google Chrome and a webcam are required. At the beginning of the quiz, you will be able to add the Chrome extension for Honorlock, then follow the instructions to share your screen and record your quiz.
  • You might be asked to change settings on your computer while doing this. You can change the setting and come back to the quiz. This change should only be expected once.
  • If you are showing us the side view of your face we don’t know if you’ve got an earbud in your other ear. This is not allowed.
  • Make sure you are facing the camera; otherwise, Honorlock will shut down the system and force you to restart.
  • Make sure that you are not looking down and to the right as if you are looking at notes or using your phone. Honorlock will flag this as cheating.
  • You must not start and stop your browser; Honorlock will flag this as cheating.
  • You must not use other applications or visit non-Canvas-quiz URLs during the exam unless the exam indicates this is permitted.
  • Do not have your phone visible, as the proctor will stop the quiz.

Side notes:

  • Start the quiz with a single Chrome window and single tab in that window.
  • When the "share screen button" is grey, you can still click it and it will work.
  • Honorlock flags activities other than the allowed ones: for example, when you are accessing a website other than Canvas or looking at your phone. I will evaluate these cases and make a judgment myself. I will reach out to you when necessary. If you have followed the guidelines, you don’t need to worry.
  • If you have an Honorlock software issue during the test, you must take a picture of the screen with your phone or iPad and notify me immediately via private Slack message to timestamp the situation, including the picture and the reason why you cannot proceed. Please contact the tech support shown on the screen to resolve the issue (they are very quick). I will check the Honorlock recording and the timestamp of your pictures when grading.
  • Privacy statement from Honorlock, just in case you are worried about privacy. Since access to Honorlock is very limited, and you are expected to only work on the quiz during the proctoring time, the data that Honorlock records is very limited too. The data storage and sharing agreement do not carry a higher risk than your regular school activities (Zoom, email, Canvas, ...).

Syllabus

Timeline

Notes and notebooks supporting lectures

Administrivia

ACADEMIC HONESTY. You must abide by the copyright laws of the United States and academic honesty policies of USF. You may not copy code from other current or previous students. All suspicious activity will be investigated and, if warranted, passed to the Dean of Sciences for action. Copying answers or code from other students or sources during a test, exam, or for a project is a violation of the university’s honor code and will be treated as such. Plagiarism consists of copying material from any source and passing off that material as your own original work. Plagiarism is plagiarism: it does not matter if the source being copied is on the Internet, from a book or textbook, or from tests or problem sets written up by other students. Giving code or showing code to another student is also considered a violation.

The golden rule: You must never represent another person’s work as your own.

If you ever have questions about what constitutes plagiarism, cheating, or academic dishonesty in my course, please feel free to ask me.

All students are expected to know and adhere to the University's Honor Code.

Note: Leaving your laptop unattended is a common means for another student to take your work. It is your responsibility to guard your work. Do not leave your printouts lying around or in the trash. All persons with common code are likely to be considered at fault.

USF policies and legal declarations

Students with Disabilities

If you are a student with a disability or disabling condition, or if you think you may have a disability, please contact USF Student Disability Services (SDS) for information about accommodations. Students should contact SDS at the beginning of the semester. Accommodations are not retroactive.

Illnesses and Emergencies

If you fall ill or have an emergency (personal or otherwise) that significantly affects your ability to complete a project or take an exam, you must notify the instructor before the task or artifact is due. Do not simply skip an exam or an assignment and say you were sick after the fact. Always make arrangements with the instructor beforehand, rather than declaring illness or emergency later. Accommodations are not retroactive. Illness and emergency related situations must be disclosed to both the instructor and program director in writing. Illness-related issues must be accompanied by a doctor’s note.

Behavioral Expectations

All students are expected to behave in accordance with the Student Conduct Code and other University policies.

Counseling and Psychological Services (CAPS)

CAPS provides confidential, free counseling to student members of our community.

Confidentiality, Mandatory Reporting, and Sexual Assault

For information and resources regarding sexual misconduct or assault, visit the Title IX coordinator or USF's Callisto website.

More Repositories

  1. dtreeviz (Jupyter Notebook, 2,921 stars): A python library for decision tree visualization and model interpretation.
  2. lolviz (Jupyter Notebook, 823 stars): A simple Python data-structure visualization tool for lists of lists, lists, dictionaries; primarily for use in Jupyter notebooks / presentations
  3. tensor-sensor (Jupyter Notebook, 746 stars): The goal of this library is to generate more helpful exception messages for matrix algebra expressions for numpy, pytorch, jax, tensorflow, keras, fastai.
  4. random-forest-importances (Jupyter Notebook, 596 stars): Code to compute permutation and drop-column importances in Python scikit-learn models
  5. bookish (Java, 449 stars): A tool that translates augmented markdown into HTML or latex
  6. msds621 (Jupyter Notebook, 346 stars): Course notes for MSDS621 at Univ of San Francisco, introduction to machine learning
  7. simple-virtual-machine (Java, 207 stars): A simple VM for a talk on building VMs
  8. simple-virtual-machine-C (C, 136 stars): Same as simple-virtual-machine but in C
  9. msds692 (HTML, 125 stars): MSAN692 Data Acquisition
  10. cs652 (Java, 112 stars): University of San Francisco CS652 -- Programming Languages
  11. fundamentals-of-deep-learning (Jupyter Notebook, 73 stars): Course notes and notebooks to teach the fundamentals of how deep learning works; uses PyTorch.
  12. msds689 (Jupyter Notebook, 64 stars): Course syllabus, notes, projects for USF's MSDS689
  13. stratx (TeX, 62 stars): stratx is a library for A Stratification Approach to Partial Dependence for Codependent Variables
  14. ml-articles (Jupyter Notebook, 61 stars): Articles on machine learning
  15. cs601 (Java, 54 stars): USF CS601 lecture notes and sample code
  16. msds593 (Jupyter Notebook, 25 stars): MSDS593 -- Exploratory data analysis (EDA) at the University of San Francisco
  17. website-explained.ai (Jupyter Notebook, 23 stars): The website content for explained.ai
  18. msan501-old (TeX, 21 stars): USF MSAN501 lecture notes and sample code
  19. mini-markdown (Java, 20 stars): Parser for small subset of markdown
  20. cs345 (19 stars): CS345 Programming Languages at University of San Francisco
  21. AniML-java (Java, 9 stars): A Java implementation of random forest machine learning algorithm / classifier
  22. website-mlbook (HTML, 8 stars): Public repo to host website for public releases of mlbook html
  23. bash-git-prompt (Python, 8 stars): My own variation on the bash git prompt
  24. autodx (TeX, 7 stars): Simple automatic differentiation via operator overloading for educational purposes
  25. data-acquisition (HTML, 7 stars): Data acquisition certificate (part of http://www.sfdatainstitute.org, course number CAS-DI-DAPY-001).
  26. parrtlib (Java, 6 stars): Parrt's Java library with useful functions
  27. gmdh (Python, 5 stars): Experiment with GMDH polynomial computation-graph nodes
  28. msan501-starterkit (Python, 5 stars): A starter kit with tests and skeleton code for the computational analytics boot camp, MSAN501, at the University of San Francisco.
  29. bild (Python, 5 stars): A simple build utility written in Python, though I'll use it to build Java projects.
  30. c_unit (C, 4 stars): A C unit testing rig in the spirit of junit.
  31. sample-jetbrains-plugin (Java, 4 stars): A sample jetbrains plugin that uses ANTLR for lexing/parsing.
  32. java-neural-net (Java, 4 stars): A simple neural network in java using particle swarm optimization.
  33. playdl (Jupyter Notebook, 3 stars): Playing with deep learning
  34. antlr4-demo-simple-lang (Java, 3 stars): Simple language grammar and listener for talk demos
  35. hash-duo (C++, 3 stars): Explore building a hash table with two different hash functions that balances chain length
  36. selfnet (Jupyter Notebook, 2 stars): Playing with self-organizing deep learning neural networks
  37. pltvid (Jupyter Notebook, 2 stars): A simple library to capture multiple matplotlib plots as a movie.
  38. gpu-test (C, 2 stars): A test of OpenCL use on OS X, XCode. Simple vector squaring.
  39. learn-git (1 star)
  40. gradle-antlr-plugin (1 star): The Official Gradle ANTLR plugin
  41. cs601-webmail-skeleton (Java, 1 star): Some goodies to help start the CS601 webmail project
  42. cs601-webmail-st-skeleton (Java, 1 star): StringTemplate-based version of webmail skeleton
  43. inclass (1 star)
  44. foobar (1 star)
  45. website-book.explained.ai (HTML, 1 star)
  46. demo (Java, 1 star): test for class
  47. website-faculty-parrt (HTML, 1 star): My faculty web page
  48. annotation-processor (Java, 1 star)