TFA: Time-Frequency Analysis#
An introduction to the TFA toolbox. This gives a basic demonstration of a wavelet analysis applied to the MAGx_LR data (1 Hz magnetic field). For more details, see the following pages/notebooks.
import datetime as dt
from swarmpal.io import create_paldata, PalDataItem
from swarmpal.toolboxes import tfa
Fetching data#
`create_paldata` and `PalDataItem.from_vires` are used together to pull the desired data from VirES.

- `collection="SW_OPER_MAGA_LR_1B"` selects the MAGx_LR dataset from Swarm Alpha
- `measurements=["B_NEC"]` selects the magnetic vector, B_NEC, which we will analyse
- We provide a model choice with `models=["Model='CHAOS-Core'+'CHAOS-Static'"]`. This can be changed to e.g. `models=["Model=CHAOS"]`, which provides the full CHAOS model, including the magnetosphere. The string provided here is a VirES-compatible model string which specifies the chosen model (see viresclient for more information)
- The start and end times select the period we want to analyse
- The `pad_times` option supplies an extra padding of 3 hours of data before and after the analysis window, which prevents edge effects in the wavelet analysis

Running this code creates a `DataTree` object in the `data` variable. This contains all the data we will use in our analysis.
data = create_paldata(
PalDataItem.from_vires(
collection="SW_OPER_MAGA_LR_1B",
measurements=["B_NEC"],
models=["Model='CHAOS-Core'+'CHAOS-Static'"],
auxiliaries=["QDLat", "MLT"],
start_time=dt.datetime(2015, 3, 14),
end_time=dt.datetime(2015, 3, 14, 3, 59, 59),
pad_times=(dt.timedelta(hours=3), dt.timedelta(hours=3)),
server_url="https://vires.services/ows",
options=dict(asynchronous=False, show_progress=False),
)
)
print(data)
DataTree('paldata', parent=None)
└── DataTree('SW_OPER_MAGA_LR_1B')
Dimensions: (Timestamp: 35999, NEC: 3)
Coordinates:
* Timestamp (Timestamp) datetime64[ns] 2015-03-13T21:00:00 ... 2015-03-1...
* NEC (NEC) <U1 'N' 'E' 'C'
Data variables:
Spacecraft (Timestamp) object 'A' 'A' 'A' 'A' 'A' ... 'A' 'A' 'A' 'A' 'A'
QDLat (Timestamp) float64 -9.028 -8.964 -8.9 ... 27.26 27.18 27.11
Latitude (Timestamp) float64 1.825 1.889 1.953 ... 34.16 34.1 34.03
B_NEC (Timestamp, NEC) float64 2.175e+04 -4.166e+03 ... 2.572e+04
Longitude (Timestamp) float64 -15.47 -15.47 -15.47 ... 12.1 12.1 12.1
Radius (Timestamp) float64 6.835e+06 6.835e+06 ... 6.83e+06 6.83e+06
B_NEC_Model (Timestamp, NEC) float64 2.177e+04 -4.168e+03 ... 2.571e+04
MLT (Timestamp) float64 19.82 19.82 19.82 ... 7.813 7.813 7.813
Attributes:
Sources: ['CHAOS-7_static.shc', 'SW_OPER_MAGA_LR_1B_20150313T0000...
MagneticModels: ["Model = 'CHAOS-Core'(max_degree=20,min_degree=1) + 'CH...
AppliedFilters: []
PAL_meta: {"analysis_window": ["2015-03-14T00:00:00", "2015-03-14T...
Applying processes#
Now we will apply some processes to our data. These are all available within the `tfa` submodule which we imported earlier.
TFA: Preprocess#
First we preprocess the data. To do this we use `Preprocess` and, using `.set_config`, supply a configuration to control how it should behave. This creates a function that we can apply to the data; we will call it `p1` since it is the first process we will apply.

(Strictly speaking, `p1` is a special object rather than a regular Python function, but it has been made callable so that we can use it like `p1(data)` to apply it to `data`.)

Only some of the available options are used here; refer to the other demonstrations for more detail.
p1 = tfa.processes.Preprocess()
p1.set_config(
# Identifies the data we are working with
# (this dataset must be available in the DataTree above)
dataset="SW_OPER_MAGA_LR_1B",
# Selects the variable we want to process
# B_NEC_res_Model is a special variable that is added to the data
# when we also specify remove_model=True
active_variable="B_NEC_res_Model",
# Selects which component (0, 1, 2) to process
# With the NEC vectors, 2 will refer to the "C" (centre / downward) component
active_component=2,
# Identifies the sampling rate (in Hz) of the data
sampling_rate=1,
# Removes the magnetic model from the data
# i.e. gets the data-model residual / perturbation
remove_model=True,
)
Now that we have a process `p1`, we can apply it to the data:
p1(data);
The preprocessing plays a number of roles, but most importantly it sets a new array, `TFA_Variable`, in the data, which will be used in the following steps.
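Conceptually, this configuration boils down to a data-model subtraction followed by a component selection. A minimal numpy sketch with made-up values (not the toolbox's actual implementation):

```python
import numpy as np

# Toy stand-ins for B_NEC and B_NEC_Model (shape: time x NEC); values are illustrative
B_NEC = np.array([[21750.0, -4166.0, 10.0],
                  [21760.0, -4170.0, 12.0]])
B_NEC_Model = np.array([[21770.0, -4168.0, 5.0],
                        [21772.0, -4171.0, 6.0]])

# remove_model=True: take the data-model residual
B_NEC_res_Model = B_NEC - B_NEC_Model

# active_component=2: select the "C" (centre / downward) component
tfa_variable = B_NEC_res_Model[:, 2]
print(tfa_variable)  # residual C component, in nT
```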
This data can be inspected directly using tools available through xarray:
data["SW_OPER_MAGA_LR_1B"]["TFA_Variable"]
<xarray.DataArray 'TFA_Variable' (TFA_Time: 35999)>
array([ 4.47622169,  4.39655361,  4.28856185, ..., 10.55364824, 10.51208973,
       10.509436  ])
Coordinates:
  * TFA_Time  (TFA_Time) datetime64[ns] 2015-03-13T21:00:00 ... 2015-03-14T06...
Attributes:
    units:        nT
    description:  Magnetic field vector data-model residual, NEC frame
data["SW_OPER_MAGA_LR_1B"]["TFA_Variable"].plot.line(x="TFA_Time");
The TFA toolbox also provides a function to inspect the data, `tfa.plotting.time_series`:
tfa.plotting.time_series(data);
We can see an unphysical spike in the data. In this case we could have removed this bad data in the data request step (by filtering according to flag values), but here we will demonstrate the functionality of the TFA toolbox in removing this spike.
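As a hypothetical alternative, the request in the fetching step could be narrowed with a VirES-style filter so that flagged samples never enter the analysis. The `filters` argument and the exact flag name and threshold shown here are assumptions to check against the viresclient/SwarmPal documentation, not a confirmed API:

```python
# Hypothetical extra request arguments for PalDataItem.from_vires:
# the `filters` entry is a VirES-style filter expression; the flag
# name "Flags_B" and the threshold 255 are placeholders.
request_kwargs = dict(
    collection="SW_OPER_MAGA_LR_1B",
    measurements=["B_NEC"],
    filters=["Flags_B != 255"],  # assumption: drop samples flagged as invalid
)
print(request_kwargs["filters"])
```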
TFA: Clean#
Just as with `Preprocess`, there is `Clean`, which provides a data cleaning function to remove outliers.
p2 = tfa.processes.Clean()
p2.set_config(
window_size=300,
method="iqr",
multiplier=1,
)
p2(data)
tfa.plotting.time_series(data);
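As a rough illustration of what an IQR ("interquartile range") method does, here is a minimal sketch on synthetic data. The toolbox applies this within a rolling window of `window_size` samples and its details will differ; this only shows the core idea of flagging samples more than `multiplier` interquartile ranges outside the quartiles:

```python
import numpy as np

rng = np.random.default_rng(42)
x = rng.normal(0.0, 1.0, 300)
x[150] = 50.0  # inject an unphysical spike

# Flag samples beyond the IQR fences
q1, q3 = np.percentile(x, [25, 75])
iqr = q3 - q1
multiplier = 1
outliers = (x < q1 - multiplier * iqr) | (x > q3 + multiplier * iqr)

# Replace flagged samples by interpolating from their neighbours
x_clean = x.copy()
x_clean[outliers] = np.interp(
    np.flatnonzero(outliers), np.flatnonzero(~outliers), x[~outliers]
)
print(outliers.sum(), x_clean[150])
```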
TFA: Filter#
Next we use `Filter`, which applies a high-pass filter to the data.
p3 = tfa.processes.Filter()
p3.set_config(
cutoff_frequency=20 / 1000,
)
p3(data)
tfa.plotting.time_series(data);
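To see what a 20 mHz (20/1000 Hz) cutoff achieves, here is a conceptual sketch on synthetic data, using a crude running-mean subtraction as the high-pass. The toolbox's actual filter design may well differ; the point is only that variations slower than the cutoff period are removed while faster oscillations survive:

```python
import numpy as np

fs = 1.0            # sampling rate, Hz (as in MAGx_LR)
cutoff = 20 / 1000  # 20 mHz

t = np.arange(0, 2000, 1 / fs)
slow = np.sin(2 * np.pi * 0.002 * t)  # 2 mHz: well below the cutoff
fast = np.sin(2 * np.pi * 0.05 * t)   # 50 mHz: above the cutoff
x = slow + fast

# Crude high-pass: subtract a running mean over ~one cutoff period
win = int(fs / cutoff)
trend = np.convolve(x, np.ones(win) / win, mode="same")
x_hp = x - trend

# The slow component is strongly attenuated; the fast one is mostly kept
print(np.std(x_hp), np.std(fast))
```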
TFA: Wavelet#
Now we get to the most interesting part: applying a wavelet analysis.
p4 = tfa.processes.Wavelet()
p4.set_config(
min_frequency=20 / 1000,
max_frequency=100 / 1000,
dj=0.1,
)
p4(data);
The results of the wavelet analysis are stored as a spectrum in a new `wavelet_power` array:
data["SW_OPER_MAGA_LR_1B"]["wavelet_power"]
<xarray.DataArray 'wavelet_power' (scale: 24, TFA_Time: 35999)>
array([[0.00064316, 0.00066654, 0.00067728, ..., 0.00051327, 0.00056429,
        0.00060842],
       [0.00064358, 0.00061374, 0.00057177, ..., 0.00066236, 0.00066698,
        0.00066114],
       [0.00067738, 0.00062488, 0.00056653, ..., 0.00079024, 0.00076054,
        0.00072284],
       ...,
       [0.02291506, 0.02339542, 0.02387444, ..., 0.02147165, 0.02195263,
        0.02243394],
       [0.02148644, 0.02193917, 0.02238947, ..., 0.02011903, 0.02057585,
        0.02103183],
       [0.0167584 , 0.01711765, 0.01747478, ..., 0.01567197, 0.01603515,
        0.01639743]])
Coordinates:
  * TFA_Time  (TFA_Time) datetime64[ns] 2015-03-13T21:00:00 ... 2015-03-14T06...
  * scale     (scale) float64 10.0 10.72 11.49 12.31 ... 40.0 42.87 45.95 49.25
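The `scale` coordinate above follows from the config: scales run from the shortest period, `1/max_frequency` (10 s), to the longest, `1/min_frequency` (50 s), log-spaced in steps of `dj` octaves. Assuming the standard `s0 * 2**(j*dj)` spacing used by Torrence & Compo-style wavelet codes (the toolbox may compute these slightly differently), we can reproduce the 24 scales:

```python
import numpy as np

min_frequency = 20 / 1000   # Hz
max_frequency = 100 / 1000  # Hz
dj = 0.1                    # spacing between scales, in octaves

s0 = 1 / max_frequency                       # smallest scale: 10 s
smax = 1 / min_frequency                     # largest scale: 50 s
n_scales = int(np.log2(smax / s0) / dj) + 1  # number of log-spaced steps
scales = s0 * 2 ** (dj * np.arange(n_scales))
print(n_scales, scales[0], round(scales[-1], 2))
```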
And we can view it with `tfa.plotting.spectrum`:
tfa.plotting.spectrum(data)
(<Figure size 640x480 with 2 Axes>,
<AxesSubplot: xlabel='Time', ylabel='Frequency [mHz]'>)
Plotting#
We showed basic usage of `tfa.plotting.time_series` and `tfa.plotting.spectrum` above. These have some extra options to control their behaviour. For example:

- Adding extra x axes with `extra_x`
- Selecting a specific time window with `tlims`
tfa.plotting.time_series(
data,
extra_x=("QDLat", "MLT", "Latitude"),
tlims=("2015-03-14T03:00:00", "2015-03-14T03:30:00"),
);
There is also a quicklook plot that combines the time series, spectrum, and wave power (from `tfa.plotting.wave_index`) into one figure:
tfa.plotting.quicklook(
data,
tlims=("2015-03-14T03:00:00", "2015-03-14T03:30:00"),
extra_x=("QDLat", "MLT", "Latitude"),
);