Geophysics for Data Scientists
Introduction
Deep Learning will play an increasingly important role, not only in society in general but also in the geosciences. Deep Learning falls under the broader heading of Machine Learning / Artificial Intelligence. In this domain the word “Algorithms” is often used to indicate that computer algorithms are used to obtain results. “Big Data” is also mentioned, indicating that these algorithms need a large amount of training data to produce useful results.
Many scientists use the phrase “Let the data speak for itself” when referring to Deep Learning, meaning that hidden or latent relationships between observations and the classes or values of (desired) outcomes can be derived using these algorithms. Examples can be found in seismic processing (first-arrival picking) and interpretation (facies prediction). Often we have to resort to statistical relationships, and this is where Deep Learning enters the game. From a set of labelled data (called instances in DL terminology) we can derive a linear or nonlinear relationship (a model in DL terminology) that predicts the label or value of new instances (supervised learning). Sometimes it is already useful if an algorithm can define separate groupings or clusters, which then still need to be interpreted (unsupervised learning). More sophisticated still is semi-supervised learning: labelled and unlabelled data are clustered together, and the unlabelled data receive the label of the dominant class present in their cluster.
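The semi-supervised idea described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the course's own exercise: the two attributes, the class names "A" and "B", and the helper functions kmeans and propagate_labels are all hypothetical, and a deliberately simple k-means (initialised at the first k points) stands in for whatever clustering method would be used in practice.

```python
from collections import Counter
from math import dist


def kmeans(points, k, n_iter=20):
    """Plain k-means; centroids start at the first k points (deterministic)."""
    centroids = [tuple(p) for p in points[:k]]
    assignment = [0] * len(points)
    for _ in range(n_iter):
        # Assign every point to its nearest centroid.
        assignment = [
            min(range(k), key=lambda c: dist(p, centroids[c])) for p in points
        ]
        # Move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return assignment


def propagate_labels(points, labels, k):
    """Give each unlabelled point (None) the dominant label of its cluster."""
    assignment = kmeans(points, k)
    result = list(labels)
    for c in range(k):
        known = [lab for lab, a in zip(labels, assignment)
                 if a == c and lab is not None]
        if not known:
            continue  # a cluster without any labelled member stays unlabelled
        dominant = Counter(known).most_common(1)[0][0]
        for i, a in enumerate(assignment):
            if a == c and result[i] is None:
                result[i] = dominant
    return result


# Hypothetical two-attribute instances; None marks an unlabelled instance.
points = [(2.0, 2.1), (3.5, 2.6), (2.1, 2.15), (3.4, 2.55),
          (2.05, 2.05), (3.6, 2.65), (1.95, 2.2), (3.45, 2.5)]
labels = ["A", "B", None, None, "A", None, None, "B"]

completed = propagate_labels(points, labels, k=2)
print(completed)
```

With these well-separated toy clusters every unlabelled instance inherits the label of the labelled instances it is grouped with. On real data the result depends strongly on the number of clusters and on how the attributes are scaled.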

Domain Experts and Data Scientists
In discussions at the EAGE Digital 2024 conference, it was emphasized that not only must the Subject Matter Experts (SMEs) become familiar with the terminology and methods used by the Data Scientists, but the Data Scientists must also understand what geology and geophysics are about. That doesn’t mean they need to know the ins and outs of these subjects, but they should at least know the terminology and the overall context for which they need to provide the Machine / Deep Learning tools. This course is therefore a first step in providing the necessary geophysical background.

The Course
As it is assumed that the Data Scientists are familiar with mathematics and statistics, the course will include advanced geophysical subjects. A general overview of seismic and non-seismic acquisition, processing and interpretation will be followed by various uses of Machine / Deep Learning for geophysical applications. We will predict lithology, pore fluids and facies in order to learn the Deep Learning workflows and algorithms needed in geophysics. Use will be made of open-source software: TensorFlow and Keras. PowerPoint presentations and videos will introduce various aspects, but the emphasis is on computer-based exercises. The exercises deal with pre-conditioning the datasets and applying several methods to classify / cluster the data: Multilayer Perceptron, Support Vector Machines, Nearest Neighbour, AdaBoost and Decision Trees. Non-linear regression is used to predict porosity. Use will also be made of Google Colab and Scikit-Learn. Google Colab runs in the cloud and allows the use of a GPU. It is “the way” to learn to use a whole range of open-source Deep Learning algorithms for geophysical applications. The course consists of many exercises, as I am a strong believer in the principle: tell me and I will forget, show me and I might remember, involve me (through exercises) and I will truly learn.
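To give a flavour of what "pre-conditioning followed by classification" means, here is a minimal plain-Python sketch. It is a stand-in for, not a copy of, the Scikit-Learn estimators named above: the attribute values, the class names and the helper functions standardise and nearest_neighbour are hypothetical, and a single-neighbour rule is used only because it is the simplest of the listed methods to write out.

```python
from math import dist
from statistics import mean, pstdev


def standardise(rows):
    """Z-score each column; return the scaled rows and (mean, std) per column."""
    columns = list(zip(*rows))
    stats = [(mean(col), pstdev(col)) for col in columns]
    scaled = [tuple((v - m) / s for v, (m, s) in zip(row, stats)) for row in rows]
    return scaled, stats


def nearest_neighbour(train_x, train_y, query):
    """Return the label of the training instance closest to the query."""
    best = min(range(len(train_x)), key=lambda i: dist(train_x[i], query))
    return train_y[best]


# Hypothetical labelled instances with two attributes on very different scales.
train_rows = [(2000.0, 0.30), (2100.0, 0.28), (3500.0, 0.10), (3600.0, 0.08)]
train_labels = ["class_A", "class_A", "class_B", "class_B"]

# Pre-conditioning: without scaling, the first attribute would dominate the distance.
scaled_rows, stats = standardise(train_rows)

# A new instance must be scaled with the statistics of the training data.
new_sample = (3400.0, 0.12)
scaled_query = tuple((v - m) / s for v, (m, s) in zip(new_sample, stats))

predicted = nearest_neighbour(scaled_rows, train_labels, scaled_query)
print(predicted)
```

The design point is that the scaling parameters are estimated once on the training data and then reused for new instances. The same pattern applies whichever classifier is placed after the pre-conditioning step.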

Learning
At the end of the course, participants will have a clear idea of what goes on in geophysics and how Artificial Intelligence will impact the future of the geosciences. Interactive quizzes using “Mentimeter” are used to reinforce the learning.

Intended Audience
Data Scientists who will be cooperating with geoscientists to develop AI methods for the exploration and development of hydrocarbons or mineral resources. Applications to geothermal energy and CO2 storage are also discussed.

Pre-requisites
A good understanding of mathematics and statistics and, to some degree, of physics. Go to Quiz 0 to test yourself.
Note: The course can be adapted to comply with the needs of participants.