Code for the book "Julia for Data Analysis"

Julia for Data Analysis

This repository contains the source code for the book "Julia for Data Analysis", written by Bogumił Kamiński and published by Manning Publications Co.

Additional teaching materials

  • in the /exercises folder, for each book chapter, you can find 10 additional exercises with solutions (they are meant for self-study and are not discussed in the book);
  • in the /lectures folder, for each book chapter, you can find a Jupyter Notebook file with the code from that chapter (note that the code is slightly adjusted compared with the code in the .jl files in the root folder so that it runs in Jupyter Notebook).

Setting up your environment

General instructions

To prepare the Julia environment before working with the materials presented in the book, perform the following setup steps:

  • download and install Julia; all the code was tested under Julia 1.7 (under newer versions of Julia the code will work, but you might get warning messages when loading packages because their versions are pinned in this repository);
  • make sure you can start Julia by running the julia command in your terminal;
  • download this repository to a local folder on your computer;
  • start Julia in the folder containing the downloaded material using the command julia --project; the folder must contain the Project.toml and Manifest.toml files prepared for this book, which allow Julia to automatically set up the project environment needed to work with the material (a more detailed explanation of what these files do and why they are required is given in appendix A of the book);
  • press ], write instantiate and press Enter (this ensures that Julia properly configures the working environment for the code from the book); in some cases running the resolve command might also be required;
  • press Backspace, write exit() and press Enter; Julia should now exit, and everything is set up to work with the materials presented in the book.
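
The interactive steps above can also be scripted. The following is a minimal sketch, not part of the repository: the helper names check_project_dir and setup_book_env are hypothetical. It assumes julia is already on your PATH and uses Pkg.instantiate(), the function form of the instantiate command in package manager mode.

```shell
# Hypothetical helper (not in the repository): succeeds only when the
# given folder contains both environment files required by the book.
check_project_dir() {
    local dir="${1:-.}"
    [ -f "$dir/Project.toml" ] && [ -f "$dir/Manifest.toml" ]
}

# Hypothetical helper (not in the repository): non-interactive equivalent of
# starting julia --project, pressing ], writing instantiate, and exiting.
setup_book_env() {
    local dir="${1:-.}"
    if ! check_project_dir "$dir"; then
        echo "Project.toml or Manifest.toml is missing in $dir" >&2
        return 1
    fi
    julia --project="$dir" -e 'using Pkg; Pkg.instantiate()'
}
```

Calling setup_book_env /path/to/downloaded/repository would then prepare the environment in one step; if resolving is also needed, Pkg.resolve() can be run the same way.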

Additional instructions on how to manage your Julia installation are given in appendix A of the book. In particular, I explain there how to correctly configure your environment for:

  • integration with Python using the PyCall.jl package;
  • integration with R using the RCall.jl package;
  • installation of Plots.jl (which by default uses the GR Framework, which requires installing extra dependencies at the operating system level under Linux).

If you use Visual Studio Code with the Julia extension, open the folder with the materials contained in this repository (you can open it using the Folder/Open Folder... menu option). Then, if you run the Start Julia REPL command (e.g. under Windows its keyboard shortcut is Alt-J Alt-O), the proper project environment will be activated automatically (the Julia extension will use the Project.toml and Manifest.toml files that are present in this folder).

Note for Linux users

Installing Julia under Linux requires that you choose the folder into which you extract the precompiled binaries you have downloaded. Assuming that you extracted Julia into, for example, the /opt folder, the simplest way to make sure that your system can find the julia executable is to add its location to your PATH environment variable. A standard way to do this is to edit your ~/.bashrc (or ~/.bash_profile) file and add the following line:

export PATH="$PATH:/opt/julia-1.7.2/bin"

(this assumes that you have downloaded Julia 1.7.2 and extracted it to the /opt folder).
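
If you prefer not to edit the file by hand, the same change can be sketched as a small shell function. The name add_julia_to_path is hypothetical and used only for illustration; the function appends the export line to a startup file only when that exact line is not already present, so running it twice does not duplicate the entry.

```shell
# Hypothetical helper (for illustration only): appends the PATH export line
# to the given startup file unless the exact same line is already there.
add_julia_to_path() {
    local julia_bin="$1"   # e.g. /opt/julia-1.7.2/bin
    local rc_file="$2"     # e.g. "$HOME/.bashrc"
    local line="export PATH=\"\$PATH:$julia_bin\""
    # -x matches whole lines only, -F treats the pattern as a fixed string.
    if ! grep -qxF -- "$line" "$rc_file" 2>/dev/null; then
        printf '%s\n' "$line" >> "$rc_file"
    fi
}
```

For example, add_julia_to_path /opt/julia-1.7.2/bin "$HOME/.bashrc" followed by opening a new terminal should make the julia command available.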

Dev Containers

The /.devcontainer folder contains configuration files for Dev Containers.

Organization of the code

The code for each chapter is stored in a file named chXX.jl, where XX is the chapter number. The exceptions are:

  • chapter 14, where a separate ch14_server.jl file is present along with ch14.jl (the reason is that in this chapter we create a web service, and ch14_server.jl contains the server-side code that should be run in a separate Julia process);
  • appendix A, where the file name used is appA.txt because it also contains instructions other than Julia code (in particular package manager mode instructions).
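
The naming convention can be illustrated with a short sketch. The helper chapter_files is hypothetical, and it assumes that XX is the chapter number zero-padded to two digits, which the text above does not state explicitly.

```shell
# Hypothetical helper (not in the repository): prints the code file(s)
# that belong to a given chapter.
# Assumption: XX is the chapter number zero-padded to two digits.
chapter_files() {
    local n="$1"   # chapter number as a plain integer, e.g. 3 or 14
    printf 'ch%02d.jl\n' "$n"
    # Chapter 14 additionally keeps the server-side code in its own file.
    if [ "$n" -eq 14 ]; then
        printf 'ch14_server.jl\n'
    fi
}
```

Under that assumption, chapter_files 3 prints a single file name, while chapter_files 14 prints two.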

Solutions to the exercises presented in appendix B of the book are stored in the appB.jl file. These solutions assume that they are executed in the same Julia session as the code from the chapter where the question was posed (so that the appropriate variables and functions are defined and the appropriate packages are loaded).

Running the example code

To work with the code from a given chapter:

  • it is recommended to use a machine with at least 8GB of RAM when working with the examples in this book (some examples require more RAM, which is clearly indicated in the book);
  • start a fresh Julia session using the julia --project command in the folder containing the downloaded material (or alternatively use Visual Studio Code to activate the appropriate project environment automatically);
  • execute the commands sequentially as they appear in the file; the code was prepared so that you do not need to restart Julia when working with material from a single chapter, unless the instructions explicitly tell you to restart Julia (some of the code requires this); when you move to a new chapter, start a new Julia session;
  • before each piece of code there is a comment that lets you locate the relevant part of the book where it is used; a blank line between consecutive code sections means that in the book these sections are separated by text explaining what the code does.
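
To check the RAM recommendation on Linux before starting a session, something like the following sketch can be used. The helper mem_total_gib is hypothetical; it assumes the Linux /proc/meminfo layout, in which the MemTotal line reports the installed memory in kB, and it rounds to the nearest whole GiB because the reported total is usually slightly below the nominal size.

```shell
# Hypothetical helper (for illustration only): prints the total RAM in GiB,
# rounded to the nearest integer.
# Assumption: the file follows the Linux /proc/meminfo layout.
mem_total_gib() {
    local meminfo="${1:-/proc/meminfo}"
    # 1 GiB = 1048576 kB; adding 0.5 makes %d round to the nearest integer.
    awk '/^MemTotal:/ { printf "%d\n", $2 / 1048576 + 0.5 }' "$meminfo"
}
```

A possible use is [ "$(mem_total_gib)" -ge 8 ] || echo "less than 8GB of RAM detected" before running julia --project.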

Accompanying materials

There are the following videos that feature material related to this book:

Data used in the book

For your convenience I additionally stored data files that we use in this book. They are respectively:

Citation

Plain text (Chicago style):

Kamiński, Bogumił. 2023. Julia for Data Analysis. Manning.

BibTeX:

@book{Kaminski2023,
  title     = "Julia for Data Analysis",
  author    = "Kamiński, Bogumił",
  year      = 2023,
  publisher = "Manning",
  address   = "Shelter Island, NY"
}

Errata

You can find errata for the book in this file.
