  • Stars: 345
  • Rank 122,285 (Top 3 %)
  • License: MIT License
  • Created almost 6 years ago
  • Updated over 1 year ago

Repository Details

Inventory of all the educational content that I share on spatial data analytics, geostatistics and machine learning. I hope these resources are helpful, Prof. Michael Pyrcz

Prof. Michael J. Pyrcz, @GeostatsGuy, Resources

Howdy Folks, I'm Michael Pyrcz, an Associate Professor at The University of Texas at Austin. I teach and conduct research on data analytics, geostatistics and machine learning. I'm appointed in the Hildebrand Department of Petroleum and Geosystems Engineering, the Jackson School of Geosciences and the Bureau of Economic Geology. I'm also a principal investigator in the College of Natural Sciences Energy Analytics Freshman Research Initiative and Inventors' Program and a core faculty member in the Machine Learning Laboratory in Computer Sciences, all at The University of Texas at Austin.

I feel that the role of professor is a role of service, so I post all my lectures and supporting content online resulting in evergreen content that outlasts the semester and reaches beyond campus. I hope this content supports:

  • my students for ongoing learning content long after they finish my courses
  • working professionals facing the digital transformation and interested in learning new skills
  • potential students by breaking down barriers and making our university a welcoming place for all who are interested in learning

Here's an inventory of the online resources that I have made to help people learn about spatial data analytics, geostatistics and machine learning. I produced them to support my students during my courses, and I thought they would remain useful to those students after the class ends (an evergreen resource), as well as to other students and working professionals interested in these topics.

I hear every day from students, working professionals and potential students who benefit from these products!

Michael Pyrcz, Associate Professor, University of Texas at Austin

Novel Data Analytics, Geostatistics and Machine Learning Subsurface Solutions

With over 17 years of experience in spatial, subsurface data analytics consulting, research and development, and leadership, Michael has returned to academia driven by his passion for teaching and enthusiasm for enhancing engineers' and (geo)scientists' impact in spatial, subsurface resource development.

For more about me, my research group (15 PhDs), my consortium (DiReCT), my publications, my background, my education startup, etc., check out these links:

Twitter | GitHub | Website | GoogleScholar | Book | YouTube | LinkedIn | DiReCT Website | DiReCT GitHub | daytum

About Michael

Want to learn more about my story, my publications and other contributions to open source? Check these out:

  1. My story of how I got started in engineering and ended up as a professor at The University of Texas at Austin: My Story

  2. My research, approach to research and views on building an inclusive and diverse team: My Research

  3. Nothing is possible without awesome graduate students: My Students

  4. I've written a bit, here are the books: My Books

  5. My peer-reviewed publications: My Papers

  6. My other contributions: My Other Contributions

  7. I wrote an open source Python package for spatial data analytics and geostatistics. Much of it is a translation of GSLIB (Deutsch and Journel, 1998) from the original Fortran to Python for 2D geostatistical methods. I did this to support my students in my Spatial Data Analytics and Geostatistics courses. Check it out and consider contributing and becoming a coauthor at GeostatsPy on the PyPI Repository and GitHub.

     • NOTE: GeostatsPy relies on the Numba package for code acceleration, and at the time of writing Numba had not been updated for Python >= 3.9, so please use Python < 3.9 with GeostatsPy.

  8. I do quite a bit on social media, here's why I do it: My Social Media Efforts.

  9. Check out my TEDx talk on 'A Professor's Secret Weapon': TED Talk

  10. Check out my Twitter feed for resources, ideas and positivity most days, where I'm the GeostatsGuy: Twitter.

  11. I post a lot of code, demonstration workflows and course material to support anyone that wants to learn: My GitHub

  12. I partnered with Prof. John Foster (UT Austin) and Bazean, a technology-enabled energy investment firm, to start the energy-focussed data science education company, daytum. We are currently offering short courses in Energy Data Science.

Online Resources on Spatial Data Analytics, Geostatistics and Machine Learning

Recorded Lectures

I record all my university lectures and post them on YouTube. You are welcome to join my classes!

  1. Introduction - Howdy, I'm Michael

  2. YouTube Channel GeostatsGuy Lectures

  3. Introduction to Data Analytics, Geostatistics and Machine Learning Undergraduate Lectures (Lec00-Lec21)

  4. Subsurface Modeling Graduate Course (Lec00 - Lec22)

  5. Subsurface Machine Learning Graduate Course (Lec00 - Lec18)

  6. Data Science Basics in Python (Chapter I - III)

  7. Open Source Spatial Data Analytics in Python with GeostatsPy

  8. My TED Talk, A Professor’s Secret Weapon

  9. Introduction to Spatial Continuity

  10. Tutorial: Open Source Spatial Data Analytics in Python with GeostatsPy

  11. Geostatistical Workflows for Unconventional Reservoirs

  12. Geostatistical Workflows for Unconventional Reservoirs at BEG

  13. What Does a Geoscientist Need to Know About Geostatistics? And Why It Would Be Helpful?

  14. Center for Petroleum and Geosystems Engineering Webinar - Big Data Analytics for Petroleum Engineering: Hype or Panacea?

  15. Michael's Unsolicited Advice and Ideas for a Successful and Happy Career in Our Industry

  16. My interview on AAPG's Digging Deeper podcast with the awesome host Vern Stefanic.

GeostatsPy Python Package Workflows

I wrote a Python package called GeostatsPy for spatial data analytics and geostatistics. Here's a set of demonstration workflows in Python Jupyter Notebooks for many of the fundamental workflow steps, from data preparation and statistical inference to spatial prediction with uncertainty. They go along with the recorded lectures from my courses on my YouTube channel.

Here are the workflows:

  1. GeostatsPy: Reimplementation of GSLIB in Python
  2. Data Distributions with GeostatsPy
  3. Feature Ranking with GeostatsPy
  4. Volume Variance Relations with GeostatsPy
  5. Confidence Intervals and Hypothesis Testing with GeostatsPy
  6. Monte Carlo Simulation with GeostatsPy
  7. Bootstrap with GeostatsPy
  8. Data Distributions
  9. Data Distribution Transformations with GeostatsPy
  10. Declustering with GeostatsPy
  11. Ensemble Declustering with GeostatsPy
  12. Inverse Distance Interpolation with GeostatsPy
  13. Indicator Kriging with GeostatsPy
  14. Kriging with GeostatsPy
  15. Multivariate Analysis with GeostatsPy
  16. Overfitting Models with GeostatsPy
  17. Plotting Spatial Data with GeostatsPy
  18. Directional Spatial Continuity with GeostatsPy
  19. Spatial Updating with GeostatsPy
  20. Spatial Trend Modeling with GeostatsPy
  21. Multivariate Feature Ranking with GeostatsPy
  22. Variogram Calculation with GeostatsPy
  23. Variogram Modeling with GeostatsPy
  24. Spatial Bootstrap with GeostatsPy
  25. Spatial Simulation with GeostatsPy
  26. Spatial Indicator Simulation with GeostatsPy
  27. Spatial Simulation Post-processing with GeostatsPy

Interactive Python Workflows to Support Education

I think interactive workflows are excellent tools to support education. For data analytics and machine learning, turning a dial and watching a system or machine change is a great method to gain intuition and experience. I started to put together interactive workflows with ipywidgets and matplotlib. Check them out here:

  1. General Bootstrap
  2. Parametric Distributions
  3. Monte Carlo Simulation
  4. Bootstrap Colored Balls in a Cowboy Hat
  5. Norms
  6. Optimization
  7. Overfit
  8. DIY Central Limit Theorem
  9. Confidence Interval by Bootstrap and Analytical
  10. Sivia's Bayesian Coin
  11. Spurious Correlation
  12. Correlation Coefficient
  13. LASSO Regression
  14. Principal Components Analysis
  15. Ridge Regression
  16. Simple Kriging
  17. String Effect
  18. Stochastic Simulation
  19. Uncertainty with Spatial Aggregation
  20. Kriging String Effect
  21. Uncertainty Model Checking
  22. Variogram Calculation
  23. Variogram Modeling
  24. Combined Variogram Calculation and Modeling
  25. Spectral Clustering
  26. Artificial Neural Networks
  27. Checking Uncertainty Models
  28. Shapley Values

Resources on Statistics and Probability

  1. Probability Theory – my undergraduate lecture
  2. Statistics – undergraduate lecture
  3. Marginal, Joint & Conditional Probability – slides

Parametric Distributions

Parametric distributions are fundamental to inferential and predictive workflows in statistics and data analytics. Sometimes they are required by theory and often they result from nature. Many students struggle with them so I made simple demonstrations in Microsoft Excel that cover how to make them from scratch and how to work with them:

  1. How to make them in Excel
  2. Poisson distribution in Excel
  3. Gaussian transform in Excel and Python
  4. Log normal distribution in Excel
  5. Interactive parametric distributions in Python
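
For readers who prefer code to a spreadsheet, the Gaussian transform in item 3 can be sketched with the Python standard library. This is a minimal sketch and not the linked demonstration: the function name, the plotting position `(rank - 0.5) / n` and the permeability values are assumptions made for illustration, and ties and declustering weights are ignored.

```python
from statistics import NormalDist


def normal_score_transform(values):
    """Map values to standard normal scores by matching cumulative probabilities."""
    n = len(values)
    standard_normal = NormalDist(mu=0.0, sigma=1.0)
    # Rank the data from smallest to largest (ties keep their input order).
    order = sorted(range(n), key=lambda i: values[i])
    scores = [0.0] * n
    for rank, i in enumerate(order, start=1):
        p = (rank - 0.5) / n  # assumed plotting position, keeps p inside (0, 1)
        scores[i] = standard_normal.inv_cdf(p)
    return scores


permeability = [12.0, 250.0, 3.5, 48.0, 1100.0, 95.0, 20.0]  # made-up, skewed sample
scores = normal_score_transform(permeability)
```

The transform only uses the rank of each value, so the strongly skewed input becomes a symmetric set of scores while the ordering of the data is preserved.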

Hypothesis Testing

Hypothesis testing is all about recognizing the difference that makes a difference. These tests protect us from the belief in small numbers and from our bias to see patterns in random phenomena.

  1. Difference in means in Excel and in Python
  2. Difference in variances in Excel and in Python
  3. Difference in distributions in Excel
  4. Interactive hypothesis testing in Python
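
To make the "difference that makes a difference" idea concrete, here is a minimal sketch of a test for a difference in means. Note the swap: it uses a permutation test rather than the analytical tests in the linked demonstrations, because a permutation test needs only the standard library. The function name, the seed, the add-one correction and the two porosity samples are assumptions made for illustration.

```python
import random
from statistics import fmean


def permutation_test_means(sample_a, sample_b, n_permutations=5000, seed=73073):
    """Two-sided permutation test for a difference in means; returns (difference, p-value)."""
    rng = random.Random(seed)
    observed = fmean(sample_a) - fmean(sample_b)
    pooled = list(sample_a) + list(sample_b)
    n_a = len(sample_a)
    count = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # relabel the pooled data at random
        diff = fmean(pooled[:n_a]) - fmean(pooled[n_a:])
        if abs(diff) >= abs(observed):
            count += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    p_value = (count + 1) / (n_permutations + 1)
    return observed, p_value


# Made-up porosity samples (fraction) from two hypothetical wells.
well_a = [0.12, 0.14, 0.13, 0.15, 0.16, 0.14]
well_b = [0.19, 0.21, 0.18, 0.22, 0.20, 0.21]
difference, p_value = permutation_test_means(well_a, well_b)
```

A small p-value says that random relabeling of the pooled data rarely produces a difference as large as the observed one, which is the protection against seeing a pattern in noise that the paragraph above describes.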

Demos of Bayesian Statistics

Bayesian approaches are powerful. They integrate prior belief with new observations and provide explicit uncertainty models and more intuitive credible intervals for uncertainty in model parameters. Here are some accessible demonstrations to get you started thinking like a Bayesian statistician.

  1. The Coin Problem from Sivia (1996) in Excel
  2. Bayesian updating with Gaussian in Excel
  3. Probability given a positive test in Excel
  4. Sivia's Bayesian Coin in Interactive Python
  5. Bayesian Regression in Python
  6. Naive Bayes Regression and Classification in Python
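
The coin problem in items 1 and 4 lends itself to a short sketch of Bayesian updating on a grid. This is an illustration under stated assumptions, not the linked workbook or notebook: it assumes a uniform prior on the coin bias, an equal-tailed credible interval, and made-up data (7 heads in 10 tosses). The function names are hypothetical.

```python
def coin_posterior(n_heads, n_tosses, n_grid=1001):
    """Grid posterior for the coin bias with a uniform prior on [0, 1]."""
    grid = [i / (n_grid - 1) for i in range(n_grid)]
    n_tails = n_tosses - n_heads
    # The prior is constant, so the posterior is proportional to the likelihood.
    unnormalized = [h**n_heads * (1.0 - h) ** n_tails for h in grid]
    total = sum(unnormalized)
    return grid, [u / total for u in unnormalized]


def credible_interval(grid, posterior, level=0.95):
    """Equal-tailed credible interval read from the cumulative posterior."""
    tail = (1.0 - level) / 2.0
    cumulative, lower, upper = 0.0, grid[0], grid[-1]
    found_lower = False
    for h, p in zip(grid, posterior):
        cumulative += p
        if not found_lower and cumulative >= tail:
            lower, found_lower = h, True
        if cumulative >= 1.0 - tail:
            upper = h
            break
    return lower, upper


grid, posterior = coin_posterior(n_heads=7, n_tosses=10)
most_probable = grid[max(range(len(grid)), key=posterior.__getitem__)]
posterior_mean = sum(h * p for h, p in zip(grid, posterior))
low, high = credible_interval(grid, posterior)
```

Rerunning `coin_posterior` with more tosses at the same proportion of heads narrows the interval, which is the updating behavior the demonstrations explore.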

Other

  1. Bootstrap in Excel, in Python and in R
  2. Spatial Bootstrap in Python
  3. Linear regression in Excel and in R
  4. Loss functions in Excel
  5. Multivariate Analysis

Heterogeneity

Our subsurface systems are heterogeneous and heterogeneity matters in many subsurface prediction problems. Here are some accessible demonstrations to help you get started quantifying heterogeneity.

  1. Making an example well in Excel
  2. Lorenz coefficient in Excel
  3. Hurst coefficient in R
  4. Ripley Cross K in R
  5. Ripley K-function in Python
  6. Lorenz coefficient in Python
  7. Lorenz coefficient functions in Python
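
As one way to quantify heterogeneity, the Lorenz coefficient from the list above can be sketched in a few lines. This is a minimal sketch and not the linked functions: it assumes the common layer-based definition (layers sorted by permeability over porosity, then twice the area between the flow-capacity versus storage-capacity curve and the diagonal), nonzero porosity, and made-up layer values. The function name is hypothetical.

```python
def lorenz_coefficient(permeability, porosity, thickness=None):
    """Lorenz coefficient from layer permeability, porosity and optional thickness."""
    n = len(permeability)
    if thickness is None:
        thickness = [1.0] * n  # assume equal-thickness layers
    # Order layers from most to least conductive per unit of storage.
    layers = sorted(
        zip(permeability, porosity, thickness),
        key=lambda layer: layer[0] / layer[1],
        reverse=True,
    )
    total_flow = sum(k * h for k, _, h in layers)
    total_storage = sum(phi * h for _, phi, h in layers)
    flow, storage, area = 0.0, 0.0, 0.0
    for k, phi, h in layers:
        next_flow = flow + k * h / total_flow
        next_storage = storage + phi * h / total_storage
        # Trapezoid under the flow-capacity versus storage-capacity curve.
        area += 0.5 * (flow + next_flow) * (next_storage - storage)
        flow, storage = next_flow, next_storage
    return 2.0 * (area - 0.5)


# Made-up layer properties: permeability in mD, porosity as a fraction.
lc_heterogeneous = lorenz_coefficient([500.0, 120.0, 15.0, 2.0], [0.22, 0.18, 0.12, 0.08])
lc_homogeneous = lorenz_coefficient([100.0] * 4, [0.20] * 4)
```

Identical layers plot on the diagonal and give a coefficient of zero, while a few high-permeability layers carrying most of the flow push the value toward one.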

Machine Learning

I have a new Subsurface Machine Learning Course that builds from fundamental probability to artificial neural networks. The recorded lectures are available here:

You are welcome to follow along! The demonstration workflows from the lectures are here:

  1. Feature Imputation in Python
  2. Feature Ranking in Python
  3. Feature Transformations in Python
  4. Feature Uncertainty in Python
  5. Dimensional Reduction in Python and in R
  6. Clustering in Python
  7. Principal Components Analysis in Python
  8. Multidimensional Scaling and Random Projection in Python
  9. Linear Regression in Python
  10. Ridge Regression in Python
  11. LASSO Regression in Python
  12. Isotonic Regression in Python
  13. Bayesian Regression in Python
  14. Polynomial Regression in Python
  15. Naive Bayes Regression and Classification in Python
  16. Time Series Analysis
  17. k Nearest Neighbour
  18. Decision tree in Python, Python Advanced and in R
  19. Gradient Boosting in Python and Advanced Gradient Boosting in Python
  20. Support Vector Machines in Python
  21. Neural Networks in Python
  22. Convolution Operators in Python
  23. Convolutional Neural Networks in Python
  24. Convolutional Neural Networks Classifier in Python
  25. Generative Adversarial Networks in Python
  26. Conditional Generative Adversarial Network in Python
  27. Course Conclusion
  28. scikit learn Overview

Geostatistics

  1. GeostatsPy: Reimplementation of GSLIB in Python
  2. Introduction to Data Analytics, Geostatistics and Machine Learning Undergraduate Lectures (Lec00-Lec21)
  3. What Does a Geoscientist Need to Know About Geostatistics? And Why It Would Be Helpful? and PPT
  4. Exercises, hands-on and demonstrations PPT Inventory
  5. Functions that reimplement or call GSLIB exes in Python
  6. Demo of the functions in Python
  7. Declustering in Python and with PyGSLIB Package
  8. Declustering and Debiasing in Excel
  9. Variogram calculation in Excel and in R
  10. Full variogram Calculation and Modeling in Excel and in PyGSLIB Package

Supplemental Slides

  1. Facies criteria in PPT
  2. Value of quantification in PPT
  3. Stationarity in PPT
  4. Uncertainty in PPT
  5. Suggested books in PPT
  6. Simple kriging in Excel and in R
  7. Uncertainty Away from Data in Excel
  8. Convolution methods in Python
  9. LU Simulation in Python
  10. Sequential Gaussian simulation in Excel and in R
  11. Truncated Gaussian simulation in Excel
  12. Spatial uncertainty in Excel
  13. Volume-variance relations in Excel
  14. Working with realizations in R
  15. Lecture on value in industry in PPT

I hope these resources are useful.

Want to Work Together?

I hope that this is helpful to those that want to learn more about subsurface modeling, data analytics and machine learning. Students and working professionals are welcome to participate.

  • Want to invite me to visit your company for training, mentoring, project review, workflow design and consulting? I'd be happy to drop by and work with you!

  • Interested in partnering, supporting my graduate student research or my Subsurface Data Analytics and Machine Learning consortium (co-PIs including Profs. Foster, Torres-Verdin and van Oort)? My research combines data analytics, stochastic modeling and machine learning theory with practice to develop novel methods and workflows to add value. We are solving challenging subsurface problems!

  • I can be reached at [email protected].

I'm always happy to discuss,

Michael

Michael Pyrcz, Ph.D., P.Eng. Associate Professor The Hildebrand Department of Petroleum and Geosystems Engineering, Bureau of Economic Geology, The Jackson School of Geosciences, The University of Texas at Austin

More Resources Available at: Twitter | GitHub | Website | GoogleScholar | Book | YouTube | LinkedIn

More Repositories

  1. PythonNumericalDemos (Jupyter Notebook, 1,325 stars): Well-documented Python demonstrations for spatial data analytics, geostatistics and machine learning to support my courses.
  2. GeostatsPy (Jupyter Notebook, 442 stars): GeostatsPy Python package for spatial data analytics and geostatistics. Mostly a reimplementation of GSLIB, Geostatistical Library (Deutsch and Journel, 1992) in Python. Geostatistics in a Python package. I hope this resource is helpful, Prof. Michael Pyrcz
  3. GeostatsPy_Intro_Course (Jupyter Notebook, 138 stars): Introduction to spatial data analytics and machine learning with GeostatsPy Python package
  4. ExcelNumericalDemos (95 stars): A set of numerical demonstrations in Excel to assist with teaching / learning concepts in probability, statistics, spatial data analytics and geostatistics. I hope these resources are helpful, Prof. Michael Pyrcz
  5. 2DayCourse (82 stars): My 2-day short course on spatial data analytics and geostatistics. I hope these resources are helpful, Prof. Michael Pyrcz
  6. GeoDataSets (71 stars): Synthetic datasets for geoscience (geo)statistical modeling
  7. MachineLearningCourse (Jupyter Notebook, 61 stars): My graduate level machine learning course, including student machine learning projects.
  8. PGE383_SubsurfaceModeling (Jupyter Notebook, 32 stars): Graduate course on subsurface modeling
  9. geostatsr (HTML, 32 stars): Geostatistical utilities and tutorial in R. For the tutorials I have included Rmarkdown html files.
  10. GeostatsGuy (28 stars): Information about me.
  11. MachineLearning_StudentProjects (Jupyter Notebook, 24 stars): My graduate students complete Machine Learning projects that they have agreed to share.
  12. Machine_Learning (Jupyter Notebook, 21 stars): 1 Day Machine Learning Course
  13. LectureExercises (R, 19 stars): The exercises from my Introduction to Geostatistics available on YouTube on the GeostatsGuy Lectures Channel.
  14. 5DayGeostats_DataAnalytics (Jupyter Notebook, 19 stars): 5-day course on Geostatistics, Data Analytics and Machine Learning
  15. MultivariateModeling (Jupyter Notebook, 16 stars): Short course on multivariate modeling
  16. MLTrainingImages (Python, 15 stars): Machine learning training images.
  17. InteractivePython (Jupyter Notebook, 15 stars)
  18. GeostatsLectures (15 stars): (Geo)statistical course materials released for anyone to use (.pdf format). Enjoy! I'm happy to discuss.
  19. GeostatsMachineLearning_Course (14 stars)
  20. DataAnalytics_Geostatistics (Jupyter Notebook, 12 stars): 2 Day short course on spatial data analytics, geostatistics and machine learning.
  21. 2DayCourse_Exercises (Jupyter Notebook, 12 stars)
  22. SubsurfaceMachineLearning (10 stars): Short course on subsurface data analytics and machine learning.
  23. Geostats_ML_2Day (10 stars): Two day course on geostats and machine learning
  24. GeostatsPy_Course_2 (Jupyter Notebook, 8 stars): Course on the GeostatsPy Python geostatistics package covering uncertainty modeling with declustering and simulation.
  25. RandomTools (Jupyter Notebook, 8 stars): Random tools to support decision making in like
  26. Undergraduate_Research (Jupyter Notebook, 6 stars): Undergraduate research projects.
  27. GSLIB_MacOS (6 stars): Executables for GSLIB on Mac OS
  28. EnergyAI_2021_Hackathon (Jupyter Notebook, 6 stars)
  29. GeostatsPyDemos (6 stars): Well-documented demonstrations of the GeostatsPy package for geostatistics and spatial data analytics.
  30. PGE379_SubsurfaceMachineLearning (4 stars): Course in subsurface machine learning.
  31. Heterogeneity_Course (4 stars)
  32. DIRECT (4 stars): Digital Reservoir Characterization Technology Consortium, UT Austin
  33. interactive_geostatr (Jupyter Notebook, 4 stars): A collection of interactive geostatistical tutorials in Jupyter Notebooks / Binder.
  34. GSLIB_Windows (2 stars): Static builds of GSLIB for Windows to solve issues with missing DLL files.
  35. RepeatableResearch (Jupyter Notebook, 2 stars): Workflows for my published papers for repeatability.
  36. DataScience_Interactive_Python (Jupyter Notebook, 2 stars): Python interactive dashboards for learning data science
  37. GSLIBTools (Fortran, 1 star): FORTRAN tools to assist with building geostatistical workflows.
  38. MachineLearningDemos (Jupyter Notebook, 1 star): Well-documented demonstration Python Jupyter workflows for many common machine learning workflows
  39. MachineLearningDemos_Book (Jupyter Notebook, 1 star): Applied Machine Learning in Python e-Book