Geophysical curiosity 39
Robust Estimation of Primaries by Sparse Inversion (EPSI) avoids these difficulties by solving directly for the multiple-free wavefield. It seeks the sparsest surface-free Green’s function in a way that resembles a basis pursuit optimization. In addition, it can find solutions in arbitrary transform domains, preferably ones with a sparser representation of the data (for example, curvelet transform in the spatial directions and wavelet transform in the time direction). It can also handle simultaneous-source data.
As the Green’s function is expected to be impulsive, the method uses the sparsity of the solution wavefield as the regularization parameter. By reformulating EPSI under an optimization framework, the “Pareto analysis” of the Basis Pursuit Denoising optimization can be used.
Basis Pursuit Denoising (BPDN) refers to mathematical optimization. Exact solutions to basis pursuit denoising are often the best computationally tractable approximation of an underdetermined system of equations. Basis pursuit denoising has potential applications in statistics (Lasso method of regularization), image compression and compressed sensing.
Lasso (Least Absolute Shrinkage and Selection Operator) in statistics and Machine learning is a regression analysis method that performs both variable selection and regularization to enhance the prediction accuracy and interpretability of the resulting statistical model.
Lasso was originally formulated for linear regression models.
Pareto analysis is a formal technique useful where many possible courses of action are competing for attention. In essence, the problem-solver estimates the benefit delivered by each action, then selects a number of the most effective actions that deliver a total benefit reasonably close to the maximal possible one.
Hence, it helps to identify the most important causes that need to be addressed to resolve the problem. Once the predominant causes are identified, then tools like the “Ishikawa diagram” or “Fish-bone Analysis” can be used to identify the root causes of the problem. While it is common to refer to pareto as "80/20" rule, under the assumption that, in all situations, 20% of causes determine 80% of problems, this ratio is merely a convenient rule of thumb and is not nor should it be considered an immutable law of nature. The application of the Pareto analysis in risk management allows management to focus on those risks that have the most impact on the project.
Ref: Lin & Herrmann, Robust estimation of primaries by sparse inversion via L1 minimization. Geophysics, 2013, V78, N3, R133–R150.
Geophysical curiosity 38
The core idea of GAN is based on the "indirect" training through the discriminator, which itself is also being updated dynamically. This basically means that the generator is not trained to minimize the distance to a specific image, but rather to fool the discriminator. This enables the “model” (ML terminology) to learn in an unsupervised manner.
The generative network generates candidates while the discriminative network evaluates them. The contest operates in terms of data distributions. Typically, the generative network learns to map from a latent space (containing the distributions of the features of the real data) to a data display of interest, while the discriminative network distinguishes candidates produced by the generator from the true data. The generative network's training objective is to increase the error rate of the discriminative network (i.e., "fool" the discriminator network by producing novel candidates that the discriminator thinks are not synthesized, but are part of the true data set).
A known dataset serves as the initial training data for the discriminator. Training it involves presenting it with samples from the training dataset, until it achieves acceptable accuracy. The generator trains based on whether it succeeds in fooling the discriminator. Typically, the generator is seeded with randomized input that is sampled from a predefined latent space (e.g., a multivariate distribution). Thereafter, candidates synthesized by the generator are evaluated by the discriminator. Independent backpropagation procedures are applied to both networks so that the generator produces better samples, while the discriminator becomes more skilled at flagging synthetic (fake) samples.
An example of a GAN network with a Generator and a Discriminator
“Real” and “Fake” data, with fake data generated by the Generator network can be presented to the Discriminator, which must decide what the real class (0-6) of the sample is or whether it is fake (7). The Generator samples randomly from a latent space which contains the distributions related to the real samples, construct a seismic display, and presents it to the Discriminator. If the Discriminator decides it is fake, it will feed that back to the Generator. In the iterations both networks try to improve their performance: The Generator to construct seismic that fools the Discriminator, and the Discriminator to discriminate real from fake. After many epochs, a set quality criterium will be reached defining the final Generator and Discriminator. Note over- and under-fitting should be avoided.
1.The generator tries to create samples that come from a targeted yet unknown probability distribution and the discriminator to distinguish those samples produced by the generator from the real samples.
2. The generator G(z, g) extracts a random choice from the latent space (of random values) and maps it into the desired data space x.
3. The discriminator D(x, d) outputs the probability that the data x comes from a real facies.
4. g and d are the model parameters defining each neural network, which need to be learned by backpropagation.
5. When a real sample from the training set is fed regardless of being labelled or unlabelled, into the discriminator, it will be assigned a high probability of real facies classes and a low probability of fake classes, whereas it is the opposite for samples generated by the generator.
Below is shown an example comparing a stand-alone CNN with a GAN. It shows the seismic facies distribution before and after applying 3D CNN and 3D GAN with 10 and 50 labelled samples per class (10 makes up a very limited training data set). The displays lead to the following conclusions: 1) CNN as well as GAN result in a much better separation between the classes/facies, 2) the larger the training data set the better the performance (% accuracy) and 3) GAN performs better than 3D CNN in case of a limited learning set.
What happens after training & validation? If the aim is the increase the Learning dataset, then the Generator can be used to create additional samples. If the purpose is to classify the rest of the seismic cube, then we discard the Generator and use the trained Discriminator.
Ref: Liu et al., Geophysics 2020, V85, N4
Geophysical curiosity 37
Finally, I found a publication (Liu et al., Geophysics 2020, V85, N4) which showed me several nice examples, using a display called t-SNE (See below for a description). The display has the same characteristics a Self-Organised-Map (SOM) in that instances or samples that are closely located in the multidimensional attribute space are mapped close together in a lower dimensional (2D or 3D) space.
Below is an example showing the facies distribution in 2D for seismic before and after CNN. The CNN “attributes” show a better separation between the different facies.
Geophysical curiosity 36
Let us look at the way the energy travels from shot to receiver. On the left we see the sensitivity kernel for a single frequency. By interference we have the fundamental and the higher order “Fresnel zones” along which the energy travels, by constructive interference. On the right we see that for a broadband signal the higher order Fresnel zones have disappeared by destructive interference. But we do not have a pencil sharp ray-path, rather a fat ray or Gaussian beam.
Another example shows turning waves going from shot to different receiver locations, shown as “fat rays”. Again, the ray area/volume shows which part of the subsurface influences the wave propagation and thus can be updated using the difference between modelled and observed wave phenomenon, using inversion minimising a loss function.
Let us look at another geophysical phenomenon: Full Waveform Inversion or FWI for short, where we model the complete wavefield and compare the modelled wavefield with the observed one to update the model parameters. What does the sensitivity kernel
For a homogeneous overburden above a reflector at 4 km depth the sensitivity kernel is shown below.
We observe basically 3 parts. An ellipse, a banana shape and so-called rabbit ears. The ellipse is related to scattering from the interface at 4 km depth, the banana shape deals with turning waves and the rabbit ears relate to to the reflection from the interface. The rabbit ears are usually an order of magnitude smaller in amplitude as they are scaled by the reflection coefficient. As seen in figure d above, they deal with the low spatial wavenumbers of the velocity and density above the reflector, or in other words with the background model.
Let us say we want only to update the background velocity model. Clearly, we should keep all parts related to the low spatial wavenumbers. The corresponding filtered sensitivity kernel is shown below.
We clearly have suppressed the part related
to the specular (single point) reflection/scattering, although not completely as can be seen just above the reflector.
This suppression is obtained by applying so-called dynamic (time-varying) weights to the velocity sensitivity kernel.
So, by suppressing the high wavenumber parts not related to the background velocity model we will enhance the sensitivity to the low wavenumber components in the response. Using that response, the background velocity can be updated by minimizing the difference between the observed and the model background velocity.
This is also called “separation of scales”.
Ref: Martin et al., FB, March 2021
Martin et al., EAGE 2017, Th B1 09
Ramos-Martinez et al., EAGE 2016, Th SRS2 03
Geophysical curiosity 35
R( ) can be expressed as ½ EI/EI=½ ln(EI), this results in:
EI( )=Vp(1+tan² )Vs(-8Ksin² ) (1-4Ksin² )
Neglecting the third term in the linear approximation results in:
EI( )=Vp(1+sin² )Vs(-8Ksin² ) (1-4Ksin² )
So, replacing tan2 by sin2 in the EI expression is equivalent to ignoring the third term in the linear approximation.
Note that for K=0.25, EI(90°)=(VP/VS)2 in the last expression.
Ref: Conolly, The Leading Edge 1999, V18, N4, P438-452.
Another simplification, based on neglecting the third term is given in
Veeken & Silva, First Break 2004, V22, N5, P47-70.
Geophysical curiosity 34
Given a plane interface between two anisotropic TVI media. That means in each medium we have different vertical and horizontal velocities: V1v, V1h, V2v and V2h .
Question: Which velocities are used to describe Snell’s law?
Answer: Ref: James, First Break 2020, V38, N12
Based on the above figure, the following relationships can be derived:
AD=ABsin 2 and BC=ABsin 1 ,
tAD=AD/V2h=ABsin 2/V2h and tBC=BC/V1h=ABsin 1/V1h
as tAD=tBC it follows that ABsin
This is Snell’s law, but involves the horizontal velocities.
A generalisation showing that Snell’s law is valid for any anisotropic medium.
Note that for a horizontal interface the projections equal the horizontal velocities, otherwise they are the velocities parallel to the interface at the reflection/transmission point.
Geophysical curiosity 33
Performing the wavefield separation above the sea floor results in an
- Up-going wavefield containing the primary reflections, refractions and all multiples other than receiver-side ghosts and a
- Down-going wave field containing the direct arrivals plus the receiver-side ghosts of each upgoing arrival.
Another use of separating the wavefields is to deconvolve the upgoing wavefield by the down-going wave field. This has the following advantages:
- Fully data-driven attenuation of free-surface multiples without acquiring adaptive subtraction
- Mitigating the effects of time- and space-varying water column velocity changes (4D)
Deconvolves the 3D source signature in a directionally consistent manner.
Geophysical curiosity 32
Physics Informed Neural Networks (PINN)
A disadvantage of the “traditional” Neural Networks is that the resulting model might not satisfy the physics of the problem considered. Luckily, there are 2 ways of overcoming this issue, using as an example the eikonal equation.
One way is to use synthetic data that has been generated using physics. For example, if the travel-time data from a source to multiple receiver points (x, y, z) are calculated using the eikonal equation, then the relationship between a velocity model and the travel-times satisfies the physics of wave propagation. If that (training) data is used to build a machine learning model, then that model will satisfy the underlying physics. The Neural network, we will use, consists of 3 input nodes (x, y, z) and one output node (Travel time) and will be trained using many receiver locations. This model can then be used to calculate the travel times for other locations in the same model or even for calculating travel times in other velocity-depth models (Transfer Learning).
The other way is to include the eikonal equation in the loss function. That means that the loss function now consists of an eikonal expression, a data difference between observed and estimated travel time and a regularization term. The resulting ML model will (to a very large degree) satisfy the wave propagation physics. Note that in this case the data even does not need to be labelled, that means the data does not need to contain the true travel times.
An interesting application is to build a ML model for determining the source location for observed arrival times. Using distributed source points in a velocity-depth model the algorithm can be trained to find the source points from the observed times. Applying this algorithm to a new micro-seismic event, the location of this event can then be predicted (in no time, as the application of a ML model needs very little compute time).
Geophysical curiosity 31
Vector Magnetic Data
In Airborne Gravity we use the three components of the gravity field (Full Tensor Gravity) measurements extensively. In Airborne Magnetics the low precision of the aircraft attitude hindered the use of multi-component magnetic data. With a novel airborne vector magnetic system (Xie et al, TLE 2020, V39, N8) the three components can be recorded with sufficient high Signal-to-Noise ratio to allow accurate calculation of geomagnetic inclination and declination.
In test flights the new system has a noise level of 2.32 nT. The error of repeat flights was 4.86, 6.08 and 2.80 nT for the northward, eastward and vertical components.
A significant step forward.
Geophysical curiosity 30
A joint distribution of two random variables A & B is defined as P(A∩B). If the two random variables were independent than P(A∩B)=P(A)P(B). However, when the variables are not independent, the joint distribution needs to be generalized to incorporate the precedent that we either observe A with knowledge of B or vice versa. Bayes’s theorem guarantees that the joint distribution is symmetric in relation to both situations, that is:
P(A)P(B|A)= P(A∩B)= P(B)P(A|B)
Bayes’s rule is seldom introduced in this way. Ref Jones, FB219, Issue 9
Geophysical curiosity 29
The wave equation is derived by combining constitutive and dynamical equations. The constitutive equation defines the stress-strain relationship and the dynamical equation shows the relation between force and acceleration. The most-simple constitutive equation is Hooke’s law ( =C. , with C the stiffness matrix, or =S. , with S the compliance matrix, S=C-1). All the medium properties can be defined in the stiffness matrix, thus also all kinds of anisotropy: Polar (5), Horizontal (5), Orthorhombic (9), Monoclinic(12) and Triclinic (21) anisotropy, where the numbers indicate the number of independent matrix components. 5 for horizontal layering, 9 for horizontal layering with vertical fractures, 12 for horizontal layering and slanting fractures and 21 for the most general case. So, the stiffness matrix makes the wave equation complicated. In addition, it is whether we consider the acoustic, visco-elastic or poro-elastic case which further complicates the wave-equation.
The poro-elastic model is the least known. It is maybe not so relevant for seismic wave propagation, as the frequencies are low. But in the kHz range it can not be ignored. The difference is because at low frequencies the pore fluids moved together with the rock, whereas at high frequencies the pore fluid tend to stay stationary, that means move relatively in the opposite direction. Now here comes the important part. In a poro-elastic medium there exists a fast P-wave (fluid moves with the rock) and a slow P-wave (fluid moves in opposite direction). So, in principle, if we could measure the slow wave, we might be able to conclude something about the permeability, namely the lower the permeability the less the pore fluid can flow in the opposite direction. Hence, maybe one day…
Geophysical curiosity 28
Although I only became aware recently, Gramian constraints have been used for many years now. What are Gramian constraints and where are they applied?
Gramian constraints are used in inversion as one of the terms in the loss function. Inversion tries to find the model parameters that minimize a loss function. In a loss function several terms can be present: difference between the observed and the modelled data, difference between starting and latest updated model, smoothness of model and Gramian constraints.
Gramian constraints use the determinant of a matrix in which the diagonal elements are autocorrelations of the model parameters and the off-diagonal elements the cross correlations between the parameters. These parameters are normalised (weighted), which is especially important if they have different magnitudes as in the case of joint inversion of different data sets (seismic, gravity, magnetic, EM)
In the minimization of the loss function, parameter values are searched for which among others aims for a minimum of the determinant, which means tries to maximize the off-diagonal elements. This means it tries to increase the cross-correlations between the parameters. It will only be able to do so within the restrictions: other terms in the loss function shouldn’t start increasing more than the decrease in the Gramian term and it will only increase the off-diagonal terms if the data indicate cross-correlations exist. Hence, the use of the expression “It enhances the cross-correlation (if present)”.
Note that the Gramian can be based on the parameters, but also on attributes derived from the parameters (e.g. log(parameter)). It is useful to linearize the relationship as cross-correlations relate to linear relations only.
When also spatial derivatives of model parameters are used, we can incorporate coupling of structures of different properties (cross-gradient minimization).
Geophysical curiosity 27
Jan. 31th 2020
Ray or wavefront angle for AVA?
The question is, should we be using the ray or the wavefront angle of incidence on the interface when we apply the Zoeppritz equations. In case of a homogeneous subsurface, rays (high frequency approximation) are perpendicular to the spherical wave fronts. This is even so for inhomogeneous media, although the wave fronts will no longer be spherical.
For anisotropic media we distinguish a difference between a ray and wave front normal direction, and also between the wavefront normal velocity (phase velocity) and energy propagation velocity (ray velocity).
As the wavefront can be considered the result of constructive interference of plane waves, it follows that the angle of incidence of the wavefront normal on the interface should be used in the Zoeppritz calculations.
However, when we plot AVA, we usually plot the reflection amplitude as function of the angle of departure from the surface, hence the ray angle again. See the display below from the 2002 Disk course by Leon Thomsen.
Geophysical curiosity 26
Reading the article “CSEM acquisition methods in a multi-physics context” in the First Break of 2019, Volume 37, Issue 11, the following question, which has been bothering me for some time, came up: An EM field has coupled electrical and magnetic
components. The information in both is the same. So, we can measure either E or H and will receive the same information on the subsurface. If so, why not use H as the (x,y,z) receiver instruments are much smaller and operationally easier to
use. Is the signal-to-noise ration the issue?
I was pleased to receive the following answer from Lucy MacGregor: You’re absolutely right. The signal contains both E and H fields and they contain much the same information. There have been various modelling studies that have demonstrated that electric fields are generally used in preference to magnetic fields precisely because of signal to noise ratio. Magnetic fields are much more susceptible to motional noise. Overcoming this was the reason Steve Constable started using huge concrete weights. He needed to keep the instruments still to record magnetic fields for MT. For this reason, it would be hard to put magnetic field sensors in a cable, all you would see is cable motion. Now, of course E fields are also susceptible to cable motion, but you can get over this by using a relatively long dipole to average it out (which is what PGS did and that worked well).
In addition, I like to mention that in a later article in the same issue, vertical pyramidal structures were used to stabilise the vertical long E dipoles placed on the seafloor.
Geophysical curiosity 25
Jan. 10th 2020
Non-seismic geophysics gets on average little attention in the industry. Although seismic is the main method in many applications in the energy industry, non-seismic methods play a more dominant role in geothermal energy, pollution mitigation and certainly with respect to mineral exploration.
In the forthcoming energy transition, rare-earth elements are essential. Having them available will be an important (political) item in the future. Therefore, non-seismic geophysical methods will become more important. An interesting example can be found on the following website:
Geophysical curiosity 24
An interesting new development from TesserACT.
Nyquist’s theorem has been the guiding force behind a relentless striving to achieve ever more regular sampling geometries. Over the last few years, “compressive sensing” has challenged this orthodoxy. In lay man’s terms, compressive sensing offers the promise of stretching Nyquist a little in order to achieve better data quality (or lower cost) by removing the artefacts created by regular grids:
Conventional acquisition: shots & receivers Randomized acquisition: shots & randomized receivers
Conventional acquisition: shots & receivers Randomized
acquisition: randomized shots & receivers
Although it has been known for a long time that the Nyquist restriction could be relaxed by randomization, it is only due to more accurate & efficient positioning that the application became operationally feasible.
Geophysical curiosity 23
To work with the hottest topic of today, that is Machine learning, one needs to familiarize oneself with new terms. Often it is a matter of semantics, using a different name for a well-known item. This unfortunately acts as a barrier. Examples are feature for attribute, instance for case, etc. It starts already with the name Machine Learning. Years ago, I learned about Pattern Recognition, then later it was called Artificial Intelligence. Now it is Machine Learning or Deep Learning. Only a matter of getting used to.
Geophysical curiosity 22
April 24th 2019
Steerable total variation regularization
In signal processing, total variation regularization (TV), is based on the principle that signals with excessive and possibly spurious detail have high total variation, that is, the integral of the absolute gradient of the signal is high. According to this principle, reducing the total variation of the signal, subject to staying close to the original signal, removes unwanted detail whilst preserving important details such as sharp contrasts.
This noise removal technique has advantages over simple techniques such as linear smoothing or median filtering (although also known to be edge preserving), which reduce noise but at the same time smooth away edges to a greater or lesser degree. By contrast, total variation regularization is remarkably effective at simultaneously preserving edges whilst smoothing away noise in “flat” regions, even at low signal-to-noise ratios.
Total Variation is used in FWI to regularize the output of the non-linear, ill-posed inversion.
A new development in TV is steerable total variation regularization. It allows steering the regularization of the FWI solution towards weak or strong smoothing based on prior geological information.
Hold-on, doesn’t that lead to a very pre-conceived result. That could indeed be the case when not done carefully.
However, there is a way in which it can be done by including that prior information in the regularization parameter rather than in the misfit term as done usually. A spatially variant regularization parameter based on prior information where sharp contrasts are expected can be used. It will first reconstruct the high contrasts (salt bodies) followed by filling in the milder contrasts (sediments) at later stages.
Ref: Qui et al, SEG2016, P1174-1177
Geophysical curiosity 21
April 24th 2019
A most curious metric
In Full-Waveform Inversion the subsurface model is updated by minimizing the difference between synthetic/modelled and observed data. This is usually done by a sample-by-sample comparison (trace-by-trace or globally) using the L2-norm. The advantage of the L2-norm is that it is an easily understood and fast comparison, the disadvantage is that the model update is prone to errors due to poorly updating the low wavenumber trends, also mentioned as cycle skipping (more than half-wavelength shifts). Several ways of overcoming this issue are available. The most common one is to use a multi-stage approach, in which we start with the lowest available frequency (longest wavelength=largest time difference for cycle skipping) and then using increasing frequencies (Hz) in subsequent stages (2-4, 2-6, 2-8, 2-10, etc).
Now recently, at least for geophysics, a new difference measure has been used with great success, the quadratic Wasserstein metric W 2. This metric appears to me as curious. It namely computes the cost of rearranging one distribution (pile of sand) into another (different pile of sand) with a quadratic cost function (cost depends for each grain on the distance it needs to be moved/transported). This explains the term “optimal transport” which is often mentioned.
The advantage of using the W2 metric is that the cost function is much more convex (a wider basin of attraction), mitigating cycle skipping and hence less sensitive to errors in the starting model.
However, some extra data pre-processing is needed to apply this method. The synthetic/modelled and observed data need to be represented as distributions, which means non-negative and sum-up to unity. This can easily be achieved.
Another interesting approach is not to use the samples themselves, but first map the data to another more “suitable” domain, for example the Radon domain.
And then there is another “unexpected” improvement by not using the samples themselves, but by using the matching filter between observed and modelled data. If the two are equal, the matching filter would be a delta function. Hence, the model update is done by minimizing the Wasserstein distance between the matching filter and a delta function (for stability a delta function with a standard deviation of 0.001 is used). And apparently, this cost function allows updating the subsurface model.
Great, isn’t it.
Ref: Yang et al., Geophysics, 2018, V83, N1, P.R43-62
Geophysical curiosity 20
How to explain matching pursuit?
I will use the above plot from Wikipedia to explain matching pursuit. Given a time signal, we like to decompose it into a minimum number of components. The number of possible components, contained in a dictionary, is large, not to say unlimited. So, I start by matching the signal first with the most important, best matching component, then I take the next one and I keep on doing this, I pursue to get a better match, till I am satisfied with the match, which means the signal has been sufficiently approximated using “only” N components.
In the example above, the axes are time vs frequency, the components are cosine functions in the frequency range 0-60 Hz and the results show that time signal can be well represented by the centres of the ellipses.
Geophysical curiosity 19
April 2th 2019
Inversion of seismic data
The simplest definition of seismic data inversion I came across is:
“Going from interface to Formation information”
Geophysical curiosity 18
Do shear waves exist in an acoustic medium?
This is a curious question as it is well known that shear waves do not exist in an acoustic medium, because that is the “definition” of an acoustic medium? In an acoustic medium equals 0, hence V2S= , which means shear waves cannot propagate and hence cannot exist.
But another definition of an “acoustic” medium is also used (Alkhalifah, Geophysics, V63, N2, P623-631), namely defining it by V S0=0, that is the vertical shear wave velocity equals 0. This definition of acoustic is used to simplify the equations for P-wave propagation. However, it defines the medium to be truly acoustic for an isotropic medium only. In case of VTI anisotropy the story is different. Namely, in that case a shear wave exists as demonstrated by the (phase velocity) expression below: Grechka et al, Geophysics Vol 69, No.2, P576-582.
Fig.1: Wavefronts at 0.5 s generated by a pressure source
in VTI medium with VP0=2 km/s, VS0=0 and =0.4, =0.0
From the formula we see that except for the vertical ( =0°, 180°) and horizontal (+/-90°) directions a SV wave exists in case (the VTI anisotropy is not “elliptical”). Wavefront modelling (Fig.1) indeed shows that a curious shear wavefront is generated in accordance with expression (1).
Hence, defining the medium to be “acoustic” by putting the vertical shear wave velocity (VS0) to zero simplifies the P-wave equations, but do not make the medium truly acoustic. For anisotropic media, shear waves, are still excited in the medium. For modern high-resolution seismic modelling, the “noise” generated is not negligible.
Thus, putting VS0=0 doesn’t make the medium truly acoustic in case of anisotropy.
P.S. To be fair to Alkhalifah he called it an acoustic approximation.
Geophysical curiosity 17
Feb. 20th 2019
Cloud computing in Geophysics
Cloud computing is hardly used in geophysical processing. Whether that is due to security issues or the amount of data involved is not clear to me. Interestingly enough a recent publication by Savoretti et al, from Equinor in the first Break, V36, N10 gives some insight in the approach.
It is well known that cloud computing provides access to a scalable platform, servers, databases and storage options over the internet. The provider maintains and owns the network-connected hardware for these services.
It offers increased flexibility in access to compute power and data sets are easily transferred to the cloud during project work and removed afterwards. In cluster terms an allocated compute node participates in running workloads via a queue system, which keeps track of what can be done on which node most efficiently.
What I didn’t know is how it is commercially arranged. It appears that nodes can either be ordered permanently or from a “spot market” for lower prices. However, spot market nodes can be taken away suddenly in case there is a higher bidder. The work done so far on that node will then be lost.
To minimize the effect, the workflow has been adjusted to better tolerate loss of nodes in spot-market environment. Since then better use can be made of the cheaper spot-market nodes by automatic recovery and more fully automated mode ordering.
Geophysical curiosity 16
Dec. 24th 2018
Machine Learning, also called Artificial Intelligence or Algorithms, is today’s hot topic. It is also associated with “Big Data”, indicating that its use can mainly be found in case an overwhelming amount of data need to be explored.
Is Machine Learning also useful in geophysics? Before answering this question, I like to pose another question: Will Machine learning replace the Physics in Geophysics? By that I mean do we still need formulae and equations to explore for hydrocarbons, salt/fresh water, polluted/unpolluted subsurface, etc? To answer this question let’s look at “traditional” versus “new” geophysics.
In the traditional geophysics we try to interpret observations / data by applying two main methods: forward modelling and inversion. In forward modelling we assume a (best guess) model of the subsurface and use equations, like the visco-acoustic wave equation for seismic, to derive related observations (synthetic data). By comparing these synthetic data with the real observations, we derive an update of the assumed model using inversion. The success depends heavily on the accuracy (adequacy) of the forward modelling and the ability of solving the ill-posed inversion. So, it is essential to fully understand the physics of the problem.
In Machine Learning it is different. An understanding of the physics is useful, but not essential. It is now a matter of having a sufficiently large learning / labelled data set which (hopefully) contains a statistical relationship between the observations and the related subsurface. This relationship, which is nothing else than a “correlation”, will be derived by a clever algorithm that contains parameters, which are determined by tuning the algorithm using the learning (labelled) data set. This tuning will determine the parameter values resulting in a predictive capability of the algorithm with an appropriate accuracy. Among the numerous kinds of algorithms, the most popular ones are called Neural Nets.
Neural Nets consist of an input layer, hidden layers and an output layer, each containing nodes.
An input node is fed with a seismic sample or an attribute /feature from a point in the 3D data set. A node in a hidden layer would receive weighted inputs from all input nodes and depending on the sum provide input to the nodes in the output layer. Each node in
the output layer represents a class, say lithology or fluid saturation of a reservoir interval. The sum of each output node would indicate whether the input belongs to its class. In the learning phase the weights will be determined. Once determined,
the network would be able to predict the class /character of new input data (undrilled locations). It can also be used in an unsupervised way to determine clusters of data, which then still need to be interpreted. In general, more hidden layers
are needed to be able to handle more difficult problems.
Now, if you like the physics of geophysics, that is formulae and equations give you a warm feeling as they are an unambiguous, short and clear language describing physical processes, then you will have a tendency to give preference to traditional geophysics, but if you need practical results to earn an income (say predictions of fluid saturations in carbonates), then you will have to rely on the benefits of applying machine learning in addition to an understanding of the underlying physics.
So, the answer to the question: “Which one will be the winner?” will be in my opinion: Neither of the two, it will be a combination of both.
Geophysical curiosity 15
Up to now I didn’t know of the existence of quaternions and that they can also be used in geophysics. What is a quaternion?
A complex number is defined as a+ib, a quaternion as q=a+bi+cj+dk, hence it is a hypercomplex number.
It allows a Fourier domain representation of multi-component seismic data. For example, so-called “Projection Onto Convex Sets” (POCS) reconstructs all three seismic components at once.
Applications are in image disparity methods (edge detection), interpolation, time-lapse analysis, separation of up- and down-going waves (VSP) and calculation of attributes.
However, so far I have encountered very few “real” applications.
Geophysical curiosity 14
Migration or Wavefield reconstruction versus Imaging
It is important to make a distinction between wavefield reconstruction and imaging.
Migration is a two-step process: 1) wavefield reconstruction and 2) imaging.
Wavefield reconstruction is the determination of the wavefield over a volume of interest. Imaging involves the making of a picture of the geometrical distribution of the reflecting surfaces within the medium.
This is best illustrated by the fact that we perform a downward continuation, forward in time, of the wavefield generated by a source and a downward continuation, but now backward in time, of the recorded wavefield, followed by the application of an imaging condition to obtain the reflection point in space and its reflection strength.
Again, we have various methods for wavefield continuation, the more sophisticated the more accurate (acoustic, elastic, visco-elastic) and various so-called imaging conditions (cross-correlation, deconvolution, least-squares deconvolution).
Geophysical curiosity 13
Question: Is seismic a hologram?
An optical hologram is not an image; it is a recording of a wavefield. Remember how a hologram is made: A single frequency laser light is split into two parts. One part is mirrored directly onto the recording plate; the second one is reflected / diffracted by the object. What is recorded on the plate is the interference pattern between the two.
When we beam the laser onto the “developed” plate and look through we see the object in 3D. Looking from different directions we see the object from these different directions. Even more fascinating is that when we break off a part of the plate, we can still see the total object from all directions, although the smaller the remaining piece the poorer the quality of the image.
Remember that in seismic we can consider the subsurface to consist of a collection of diffraction or scattering points. This would imply that a single shot recorded by a single geophone would record diffraction energy from any point of the subsurface. Hence, the single record contains information about the entire subsurface and it would be possible, in principle to reconstruct/image the complete subsurface. This is like the concept that a single point on a hologram is enough to observe the object in 3D. But again, we would prefer a larger piece of the plate, i.e. more observations, to obtain a better-quality image.
If we would apply holographic seismic imaging, we could do with a much sparser and therefore cheaper survey.
“Extended resolution: Neidell is right”, E.A. Robinson, TLE 2018V37, N1
‘holistic migration”, E.A. Robinson, TLE 1998, V17, N3
Added statements by Öz Yilmaz
“Spatial under sampling can be compensated by temporal oversampling”
“Temporal oversampling enables the extension of the spectral bandwidth beyond the Nyquist frequency”
Geophysical curiosity 12
What is the difference between updating a velocity model in the data or in the image domain? Are we using the same data?
Updating of velocities means that we have an estimated velocity-depth model and look for diagnostics and ways which will allow improving the velocities.
We have observed the arrival times at all receivers for all shots. We know the shot and receiver locations at the surface and a rough/first estimate of the velocity model, which will be represented by a grid of cells, with for each cell a velocity . Then we apply raytracing from each shot to the receivers. That means that the ray will go through a subset of all cells. For each ray we can calculate the arrival time by adding all the calculated from the pathlength in each cell divided by the cell velocity . We will find a difference between the calculated/modelled and the observed travel time. This difference could come from anywhere along the ray. So, we simply divide the difference over the total ray path, leading to a small velocity update in each cell traversed. Having many shot-receiver combinations, means that (hopefully) each cell will be traversed by many rays. Each traverse will give a small contribution to the update. If they agree the sum will be a significant update, if they disagree the sum will be “zero”, that means no update. That’s all there is to it in the data domain. So we don’t do any migration/imaging.
We migrate the recorded data using the first estimate of the velocity-depth model. The results are Common Image Gathers (CIG). They will show residual move-out (diagnostic), which will be used to update the velocity model.
Although in both cases we use the same data, the advantage of using the image domain is that the signal-to-noise ratio will usually be much better. The disadvantage is that it needs more computer time.
Geophysical curiosity 11
What happens with reflections beyond the critical angle?
Let’s look at AVA, that means the reflection coefficient with angle of incidence (offset). This can for a flat interface between two homogeneous layers be calculated exactly using the Zoeppritz equation, but also approximated quite accurately to first order in the velocity and density contrasts using an expression formulated in terms of horizontal slowness p (Aki & Richard, Quantitative Seismology, eq. 5.44):
Aki & Richards mention that the linearized expression is only valid when none of the angles are near 90°. When the P-wave velocity below the interface is larger than above, the angle of transmission
However, the expression can also be given in terms of the average angle
but now the transmitted angle can become complex:
As a consequence, the reflection coefficient becomes complex. This means that the amplitudes (trace samples) become complex.
This makes us realize that for AVA, we calculate at an angle of incidence the reflection amplitude for each sample. That means looking at a sequence of samples, forming a wavelet, the same wavelet, but scaled with the reflection coefficient, will be returned. But only in case the reflection coefficient is real. If complex, all samples will have the same phase change and the reflected wavelet will have a different appearance. As the phase change is angle (offset) dependent, the wavelet change will be angle (offset) dependent, complicating the analysis somewhat.
Ref: Downton & Ursenbach, Geophysics, 2006, V71, N4.
Geophysical curiosity 10
Systematically organised fractures or cracks play an important role in seismic wave propagation and in fluid flow of reservoirs. If of a size significantly below seismic wavelength their effect on seismic wave propagation can be described by anisotropy.
For a VTI medium we estimate the Thomsen weak anisotropy parameters from travel time information of P and PS or S wave.
For a HTI medium, caused by the presence of a single set of cracks, we estimate the Thomsen-type parameters (v), (v), (v), (v) from either the azimuthal NMO velocities or AVA gradients of P and PS waves or from shear wave splitting (birefringence), in which case the fracture infill (brine, oil, gas, dry) has no influence. Note that for P or PS waves the infill has an influence.
There are several crack models, based on isolated infinite fractures with linear slip boundary conditions (Schoenberg), isolated penny-shaped cracks (Hudson) and hydraulically connected cracks and matrix pore space (Thomsen). Fortunately, seismic data are not sensitive to the shape of the fractures.
However, seismic is sensitive to pore fill & saturation and whether the pore spaces are hydraulically connected, which depends on the pore throat micro-geometry and the fluid properties. At seismic (low) frequencies, connection (pressure equilibrium) between fracture and medium pore space can be assumed. “Squirt flow” occurs at frequencies between sonic and ultrasonic bands. In that case Gassmann theory should be replaced by Biot theory. At high frequencies (ultrasonic) no fluid interaction occurs between fracture and matrix pore space. At very high frequencies Rayleigh scattering becomes important. Note that at intermediate frequencies, dispersion will occur due to the fluid flow between fracture and matrix pore space.
An empty crack is compliant because of its 2D shape, equant (non-flat) pores are stiffer. If a crack is isolated or connected to other cracks of similar shape and orientation than a pore fluid will stiffen the crack(s). If connected with equant matrix pore space, the fluid will “squirt” into the stiffer matrix pore space. Hence, the pores effect the compliance, that is the anisotropy.
There several ways of describing the presence of cracks or fractures. One way is to describe the influence of fractures on the wave propagation in terms of the parameters (v), (v), (v), (v) (penny shaped cracks). The other way is to consider the fracture properties or excess fracture compliance parameters, thus the additional compliances due to fractures. Two sets are used. One uses KN (excess normal fracture compliance) and KT (excess tangential compliance). The other set uses similar, but dimensionless and normalised parameters, N (normal weakness) and T (tangential weakness) together with the Lamé parameters and (linear slip theory). T gives estimate of the crack density and T / N is sensitive to fluid saturation. Obviously, the sets are related.
Crack density can be derived from the seismic observations of P, PS and/or AVA gradients (azimuths parallel/strike or normal to the fractures) or from shear wave splitting.
The main difference, except for shear waves, is between pore space filled with fluids (brine, oil) or filled with gas (dry). The dry-crack model is close to elliptical ( (v)= (v)), for fluid-filled cracks (v) is close to the crack density for typical background V s/Vp ratios (around 0.5). In the ( (v), (v)) plane all possible pairs of (v), and (v) are confined to a narrow area.
An important aspect is the frequency of the seismic wave. Low frequent (seismic) means that there is enough time for the pressure to equalize in fractures and matrix (equant) porosity, so the entire pore volume is involved. For moderately high frequencies (ultra-sonic) only the fracture volume is involved.
A complete characterization of fractures can be based on wide-azimuth 3D P-wave data or a combination of P and PS data or S data.
Geophysical curiosity 9
I thought I knew a reasonable number of statistical distributions, until I went to Wikipedia (https://en.wikipedia.org/wiki/List_of_probability_distributions)
and realised I only knew a few. (So, it is true that you don’t know what you don’t know).
More interesting, however, is that I recently came across a distribution called logistic distribution. What the name means I am lost, as logistics refers to moving goods from one place to another. However, I found it a very useful distribution for geophysics, as it can handle limited interval distributions of geophysical parameters of different magnitudes. Again, a description can be found in Wikipedia ( https://en.wikipedia.org/wiki/Logistic_distribution)
An interesting application can be found in an article by J. Eidsvik, D. Bhattacharjya and T. Mukerji: Value of information of seismic amplitude and CSEM resistivity. (Geophysics, 2008, Vol73, N4)
Geophysical Curiosity 8
The local angle domain. (LAD, Koren & Ravve, Geophysics, 2011, V76, N01).
When we consider fractures, we look at the source-receiver azimuth as an important parameter. Using either the NMO velocity as a function of azimuth (low resolution) or the AVA as a function of azimuth (high resolution) we determine the orientation of the fracture system(s). But is this correct?
It might be, but not always. Namely, what is important is the local azimuth of the “incident ray” and the “reflected ray” at the reflection point, because they are directly influenced by the local fracture orientation at both sides of the interface. So, in case the “local azimuth” is significantly different from the surface source-receiver azimuth, it is important to consider it.
But how often does this occur? I think seldom, but yes in cases of complicated geology and stress fields. Now those occur often in places where significant hydrocarbons can be found and where drilling is very expensive: deep water with reservoirs below salt or basalt.
So, there are cases when considering the local angle domain (LAD) is a must for obtaining a correct image of the subsurface.
Geophysical Curiosity 7
A P-wave reflected by an interface will change in amplitude according to the reflection coefficient. Hence, when the acoustic impedance (ρvp) increases across the boundary, a positive amplitude (meaning the wavefront starts with a compression)
will remain a positive amplitude upon reflection.
But how about a S-wave? What will happen to the amplitude when reflected by an interface where the shear impedance (ρvs) increases?
The first thought would be that the reflection coefficient will be positive, so the amplitude will keep the same polarity.
But, no, no, that is not the case. How confusing. Why can that be?
The answer lies in the fact that we use in the industry a moving reference frame.
So, for P waves, the positive direction is downwards on the way down and upwards on the way up.
What does this mean for S waves, say SH? It means on the way down the positive shear direction is to the right and on the way up the positive shear is of course also to the right. But if the, say SH wave front start with a movement to the right and the shear impedance increases across the interface, on the way up the movement will be in the same direction. However, now that direction is negative due to the choice of the frame we are using. So, the reflection coefficient is negative for an increase in shear impedance.
I hope it is no longer confusing.
Geophysical Curiosity 6In Seismic processing often, the word “Stacking” is mentioned. Traditionally, stacking referred to applying an NMO correction to offset traces and adding them to form a stack trace, with improved signal-to-noise ratio.
But do we still do stacking? Yes, we still do stacking, but in a different way. The stacking we do takes place in the imaging part of processing. Namely, in a modern processing scheme the data is prepared for imaging and after each offset is imaged, then the results are stacked. This can be most easily seen in case image gathers are built and displayed. After inspection and possibly re-imaging, these are the traces that are stacked.
Geophysical Curiosity 5
We are all aware of gravity, especially when you fall of your bike after too many drinks. Especially then it looks that gravity is a powerful force, although strictly speaking it is an acceleration. Interesting enough, gravity is the weakest force
in nature. How can that be? You would agree when you realize that it needs the whole earth, that is a mass of 6 1024 kg, to make me feel it. Also, according to Newton, if I meet another person, I should feel the “attraction” between
us, but I don’t. This shows how weak gravity is.
Geophysical Curiosity 4“How confusing can you make it?” Take the description of anisotropy. The confusion is mainly due to using misleading symbols and different notations for the same quantities. Two examples: 1) using the symbol C for incompressibility and 2) the use of (2) in one set of publications and 1 in other publications, both describing the same anisotropy parameter for propagation in the (x1,x3) plane. So, when reading publications make sure you are aware of the definitions.
In describing wave propagation, the elastic and anisotropy parameters occur in the so-called stress-strain relationship which needs to be combined with Newton’s first law of motion to obtain the wave equation. In principle (“worst case”: triclinic symmetry) there are 21 different (C ij) parameters needed to describe the medium. Luckily in practice we can get away with fewer parameters and in the “best case” with only 2. For almost all practical applications we have the following cases: orthorhombic (7 Cij parameters), Transverse Isotropic (5 Cij parameters) and Isotropic (2 Cij parameters). Orthorhombic describes a finely layered medium with vertical fractures, Transverse Isotropic describes either fine layering or a set of vertical fractures only, whereas Isotropic describes a medium without any sub-seismic scale layering nor fractures.
For weak anisotropy, we use the following parameters:
Orthorhombic: Κ, , (1), (2), (1), (2) , (3), (1), (2),
Transverse Isotropy: VTI: (Κ, , , , ), HTI: (K, , (v), (v), (v))
Unhappily, all rocks should be considered triclinic, but in reality, we can “ignore” that. We wouldn’t be able to derive the needed 21 parameters from seismic observations anyway. So, let’s forget it and restrict ourselves to at most orthorhombic or tilted orthorhombic in case of dipping structures.
One quantity we can observe quite well with modern wide- or multi-azimuth acquisition and that is the orientation of fracture sets. Namely, the NMO velocity of waves travelling perpendicular to fractures is lower than waves travelling parallel to fractures. In addition, if S-waves can be observed (multi-component land or OBC acquisition) shear-wave splitting also indicates the orientation of fractures..
Geophysical Curiosity 3
“Keep it simple, you stupid” This slogan is of course based on the well-known political slogan “It’s the economy, you stupid” and here refers to “why introducing anisotropy parameters” in the description of wave propagation, given that
a full description of wave propagation is already complicated enough. The reason is that in general we don’t honour the real Earth when we try to model geophysical observations. In 1992, I gave an “oration” when I became Professor at the University
of Utrecht, called “Do we do justice to the Earth” with a wink to my wife’s occupation as a judge. It must be clear, that the title refers to all the approximation we make to make geophysical problems tractable. If we want to model exactly
the seismic response of real rocks, that means including small-scale inhomogeneities, we need complex mathematical algorithms and still can only do the modelling in detail for specific, well-described cases. And even then, you might wonder
whether in practice we know or even can use such detail.
It is exactly for that reason that effective media are used. In these media, the detail is “summarised/replaced” by “simple” anisotropy parameters. It might appear “making it more complicated”, but after having familiarised yourself with it, you realize that it makes the description of wave propagation more realistic and especially more applicable. Nice examples are shown in the AVA exercise in the “Quantitative Reservoir Characterization” course.
Geophysical Curiosity 2
We know that seismic energy can be propagated in the earth in two ways: P-waves and S-waves. Now in an isotropic elastic medium the P-wave particle displacement is in-line with the propagation direction and in case of an S-wave the particle displacement
is perpendicular to the propagation direction. This particle displacement can be in any direction as long as it is perpendicular to the propagation direction.
In anisotropic media, the situation is quite different.
For P-waves the particle displacement is deviating slightly from the propagation direction (ray direction), as the displacement is normal to the wave-front. The ray vector is namely no longer perpendicular to the wave-front.
For S-waves it is even more complicated. It was noticed in field experiments that the vertical S-wave reflection time from a horizontal interface differed depending on the polarization direction. Hence, the velocity did depend on the polarization. Therefore, it was concluded that two S-wave modes exist, one with the particle displacement parallel (S-fast), the other with the particle displacement perpendicular (S-slow) to fracture orientation (or fine layering). In addition, the S-wave can, and that is quite odd, only propagate with these two polarizations. So, if a source generates an S-wave with polarization in between, this S-wave will not propagate at all, but will split into the two modes discussed before and each mode will propagate with its own velocity. This is called “shear-wave splitting” or “birefringence”. And again, the polarizations are not exactly perpendicular to the propagation direction. Instead, they are perpendicular to the P-wave polarization, which as mentioned before, is neither in the propagation direction.
Geophysical Curiosity 1
When you look at the formula for the P-wave velocity, namely
Please feel free to comment or add points of view and indicate whether you would like it to be mentioned on the website or not