swarmpal.toolboxes.tfa.tfalib
=============================

.. py:module:: swarmpal.toolboxes.tfa.tfalib

.. autoapi-nested-parse::

   # INSERT ESA PROJECT BLOCK #

   @author: constantinos@noa.gr



Attributes
----------

.. autoapisummary::

   swarmpal.toolboxes.tfa.tfalib.R_E


Functions
---------

.. autoapisummary::

   swarmpal.toolboxes.tfa.tfalib.constant_cadence
   swarmpal.toolboxes.tfa.tfalib.moving_mean_and_stdev
   swarmpal.toolboxes.tfa.tfalib.moving_q25_and_q75
   swarmpal.toolboxes.tfa.tfalib.outliers
   swarmpal.toolboxes.tfa.tfalib.filter
   swarmpal.toolboxes.tfa.tfalib.morlet_wave
   swarmpal.toolboxes.tfa.tfalib.wavelet_scales
   swarmpal.toolboxes.tfa.tfalib.wavelet_transform
   swarmpal.toolboxes.tfa.tfalib.wavelet_normalize
   swarmpal.toolboxes.tfa.tfalib.magn
   swarmpal.toolboxes.tfa.tfalib.mfa


Module Contents
---------------

.. py:data:: R_E
   :value: 6371.2


.. py:function:: constant_cadence(t_obs, x_obs, sampling_rate, interp=False)

   Set data points to a new time-series with constant sampling time.

   Even though data are supposed to be provided at constant time steps,
   e.g. as times t = 1, 2, 3, ... etc, they are often not, as they can
   be given instead at times t = 1.01, 2.03, 2.99, etc or data points
   along with their time stampts might be missing entirely, or even be
   duplicated one or more times. This function corrects these errors,
   by producing a new array with timestamps at exactly the requested
   cadence and either moves the existing values to their new proper
   place, or interpolates the new values at the new time stamps based
   on the old ones, depending on the value of the `interp` parameter.

   `t_obs` is a one-dimensional array with the  time in seconds
   `x_obs` is a one or two-dimensioanl array with real values
   `sampling_rate` is a real number, given in Hz

   Note: Gaps in the data will NOT be filled, even when `interp` is True.


   :param t_obs: A 1-D array of real values.
   :type t_obs: (N,) array_like
   :param x_obs: A N-D array of real values. The length of `x_obs` along the first
                 axis must be equal to the length of `t_obs`.
   :type x_obs: (...,N,...) array_like
   :param sampling_rate: The sampling rate of the output series in Hz (eg 2 means two
                         measurements per second, i.e. a time step of 0.5 sec between each).
   :type sampling_rate: float
   :param interp: If False the function will move the existing data values to their
                  new time stamps. If True, it will interpolate the values at the
                  new time stamps based on the original values, using a linear
                  interpolation scheme.
   :type interp: bool, optional

   :returns: * **t_rec** (*(N,) array_like*) -- A 1-D array of the new time values, set at constant cadence.
             * **x_rec** (*(...,N,...) array_like*) -- A N-D array of real values, with the values of `x_obs` set at constant
               cadence.
             * **nn_mask** (*(...,N,...) array_like, bool*) -- A N-D array of boolean values

   .. rubric:: Examples

   >>> import numpy as np
   >>> import tfalib
   >>> import matplotlib.pyplot as plt
   >>> N = 10 # No Points to generate
   >>> FS = 50 # sampling rate in Hz
   >>> # create time vector
   >>> t = np.arange(N) * (1/FS)
   >>> # add some small deviations in time, just to complicate things
   >>> n = 0.1 * np.random.randn(N) * (1/FS)
   >>> t = 12.81 + t + n
   >>> # produce x values from a simple linear relation
   >>> x = 10 + 0.01 * t
   >>> # remove some values at random
   >>> inds_to_rem = np.random.permutation(np.arange(1,N-1))[:int(N/4)]
   >>> t_obs = np.delete(t, inds_to_rem)
   >>> x_obs = np.delete(x, inds_to_rem)
   >>> (t_rec, x_rec, nn) = tfalib.constant_cadence(t_obs, x_obs, FS, False)
   >>> (t_int, x_int, nn) = tfalib.constant_cadence(t_obs, x_obs, FS, True)
   >>> plt.figure(3)
   >>> plt.plot(t_obs, x_obs, '--xk', t_rec, x_rec, 'or', t_int, x_int, '+b')
   >>> plt.legend(('Original Points', 'Moved', 'Interpolated'))
   >>> plt.grid(True)
   >>> plt.show()


.. py:function:: moving_mean_and_stdev(x, window_size, unbiased_std=True)

   Calculate moving average and moving st.dev

   :param x: A 1-D or 2-D array of real values. If 2-D then each column is being
             treated separately.
   :type x: (...,N,...) array_like
   :param window_size: The size of the rolling window (in number of points)
   :type window_size: int
   :param unbiased_std: If True the unbiased estimator of the standard deviation will be used,
                        i.e dividing by N-1. If False the standard deviation will be computed
                        by dividing by N.
   :type unbiased_std: bool, optional

   :returns: * **moving_mean** (*(...,N,...) array_like*) -- Moving mean, the same size as `x`.
             * **moving_stdev** (*(...,N,...) array_like*) -- Moving standard deviation, the same size as `x`.

   .. rubric:: Examples

   >>> import numpy as np
   >>> import matplotlib.pyplot as plt
   >>> import tfalib
   >>> N = 1000
   >>> t = np.linspace(0, 2*np.pi, N)
   >>> x = np.sin(2*np.pi*t/np.pi) + 0.1*np.random.randn(N)
   >>> m, s = tfalib.moving_mean_and_stdev(x, 50)
   >>> plt.plot(t, x, 'xk', t, m, '-b', t, m + 3*s, '-r', t, m - 3*s, '-r')
   >>> plt.show()


.. py:function:: moving_q25_and_q75(x, window_size)

   Calculate moving 25th and 75th percentiles

   The difference between these two percentiles is called the inter-quartile
   range (iqr) and can be used for outlier detection, i.e. accept only points
   that lie within the region from q25 - 1.5*iqr up to q75 + 1.5*iqr and
   discard the rest.

   NOTE: It is recommended to use an odd integer number for window_size

   :param x: A 1-D or 2-D array of real values. If 2-D then each column is being
             treated separately.
   :type x: (...,N,...) array_like
   :param window_size: The size of the rolling window (in number of points)
   :type window_size: int

   :returns: * **moving_q25** (*(...,N,...) array_like*) -- Moving 25th percentile, the same size as `x`.
             * **moving_q75** (*(...,N,...) array_like*) -- Moving 75th percentile, the same size as `x`.

   .. rubric:: Examples

   >>> import numpy as np
   >>> import matplotlib.pyplot as plt
   >>> import tfalib
   >>> N = 1000
   >>> t = np.linspace(0, 2*np.pi, N)
   >>> x = np.sin(2*np.pi*t/np.pi) + 0.1*np.random.randn(N)
   >>> q25, q75 = tfalib.moving_q25_and_q75(x, 50)
   >>> iqr = q75 - q25
   >>> plt.plot(t, x, 'xk', t, q25 - 1.5*iqr, '-r', t, q75 + 1.5*iqr, '-r')
   >>> plt.show()


.. py:function:: outliers(x, window_size, method='iqr', multiplier=np.nan)

   Find statistical outliers in data

   This uses a moving window to identify outliers, based on how larger or
   smaller data points are from their neighbours within the window. Two
   methods are used:

   `normal`: Assumes Gaussian distribution. Calculates the meand and st.dev.
   inside a window of length `window_size` and flags as outliers points that
   lie below/above the window mean +/- M times that st.dev, with M being
   defined by the `multiplier` parameter.

   `iqr`: As above, but using the quartiles q25 and q75 and the inter-quartile
   range iqr, to define the zone of acceptable measurements. Outliers will lie
   below q25 - M*iqr or above q75 + M*iqr, with M being the `multiplier`
   parameter.

   `multiplier` can be either a single float or a list of two numbers, in
   which case, the first will be used to define the lower limit and the second
   the upper one. If you want to search only for e.g. high outliers, then set
   the first element of `multiplier` as numpy.Inf so that it will include all
   values.



   :param x: A 1-D or 2-D array of real values. If 2-D then each column is being
             treated separately.
   :type x: (...,N,...) array_like
   :param window_size: The size of the rolling window (in number of points)
   :type window_size: int
   :param method: Can be either 'normal' or 'iqr' and signifies the method used
   :type method: string
   :param multiplier: The number that indicates the spread of the zone of accepted values
   :type multiplier: float or list (of two floats)

   :returns: **outlier_inds** -- Boolean array, the same size as `x` with True where an outlier has been
             detected and False elsewhere.
   :rtype: (...,N,...) array_like

   .. rubric:: Examples

   >>> import numpy as np
   >>> import matplotlib.pyplot as plt
   >>> import tfalib
   >>> N = 1000
   >>> t = np.linspace(0, 2*np.pi, N)
   >>> x = np.sin(2*np.pi*t/np.pi) + np.random.randn(N)
   >>> M = 100 # number of outliers
   >>> A = 5   # intensity of outliers
   >>> spkinds = np.random.permutation(N)[0:M]
   >>> x[spkinds[0:int(np.floor(M/2))]] += A # first half to be increased
   >>> x[spkinds[int(np.ceil(M/2)):]] -= A # second half to be decreased
   >>> outlier_inds = tfalib.outliers(x, 25, method = 'iqr', multiplier = 1.5)
   >>> plt.plot(t, x, 'xk', t[outlier_inds], x[outlier_inds], 'or')
   >>> plt.show()


.. py:function:: filter(x, sampling_rate, cutoff)

   High-pass filter the data

   This is just a wrapper of the Chebysev Type II filter of SciPy. The way it
   works is that the lowpass filtered version of the series is being produced,
   by means of cheby2() and then it is subtracted from the data series, so
   that the high-pass component remains.


   :param x: A 1-D or 2-D array of real values. If 2-D then each column is being
             treated separately.
   :type x: (...,N,...) array_like
   :param sampling_rate: The sampling rate of the data, i.e. the reciprocal of the time step
   :type sampling_rate: float
   :param cutoff: The cutoff frequency that the filter will use. Sinusoidal waveforms
                  with frequencies below this cutoff will be reduced in amplitude
                  (ideally to zero, but frequencies close to the cutoff will be less
                  affected), while those with frequencies above this cutoff will remain
                  unchanged.
   :type cutoff: float

   :returns: **filtered** -- Array, the same size as `x` with the result of the filtering process.
   :rtype: (...,N,...) array_like

   .. rubric:: Examples

   >>> import numpy as np
   >>> import tfalib
   >>> import matplotlib.pyplot as plt
   >>> T = np.arange(0, 3600, 0.5)
   >>> Y = np.sin(2*np.pi*T/500)*(np.exp(-(T/1000)**2)) + np.sin(2*np.pi*T/250)*(np.exp(-((T-np.max(T))/1000)**2))
   >>> F = tfalib.filter(Y, 2, 3/1000)
   >>> plt.figure(1)
   >>> plt.plot(T, Y, color=[.5,.5,.5], linewidth=5)
   >>> plt.plot(T, F, '-r')
   >>> plt.legend(('Original Series', 'High-Pass Filtered'))
   >>> plt.grid(True)
   >>> plt.show()


.. py:function:: morlet_wave(N=600, scale=1, dx=0.01, omega=6.203607835633639, roll=True, norm=True)

   Generate a morlet wave-function to be used with the wavelet_tranform()

   This generates the comlex-conjugate, scaled and time-reversed form of the
   Morlet wavelet, so that it can be immediately used in the wavelet transform
   function.


   :param N: Number of points to generate
   :type N: integer
   :param scale: The scale, i.e. period of the generated waveform
   :type scale: float
   :param dx: The time step of the data, i.e. the reciprocal of the sampling rate
   :type dx: float
   :param omega: The omega_zero parameter of the Morlet function. The default value is
                 6.2036 which is the value for which the wavelet scales directly
                 correspond to the Fourier periods
   :type omega: float
   :param roll:    If `False`, the signal is generated as is, centered at zero. If `True`,
                   it is translated so zero becomes the first element in the time series
                   and the part of the wavelet that corresponds to negative x values is
                   folded back at the end of the series. Use `False` to plot and see the
                   wavelet, but `True` to use it with the wavelet transform!
                norm: boolean
                    If `True` the function is normalized by multiplication with the factor
                    sqrt(dx/scale), so that its sum of squares is 1 and sum of squares of
                    FFT coefficients is N. Use `True` with the wavelet transform!
   :type roll: boolean

   :returns: * **wavelet** (*(N,) array_like*) -- 1-D array with the complex values of the Morlet wavelet
             * **x** (*(N,) array_like*) -- 1-D array with the `x` values that correspond to the wavelet (use for
               plotting only, otherwise ignore)
             * **wavelet_norm_factor** (*float*) -- A number to be used for the normalization of the result of the wavelet
               transform.

   .. rubric:: Examples

   >>> import numpy as np
   >>> from scipy.fft import fft
   >>> import tfalib
   >>> import matplotlib.pyplot as plt
   >>> N_wave = 600
   >>> s_wave = 50
   >>> dx_wave = .5
   >>> m, m_x, m_norm = tfalib.morlet_wave(N_wave, s_wave, dx_wave, roll=False, norm=True)
   >>> plt.figure()
   >>> plt.plot(m_x, np.real(m), '-b', m_x, np.imag(m), '-r')
   >>> plt.grid(True)
   >>> plt.show()
   >>> # Test wavelet function's properties
   >>> print('Wavelet Integral = %f + i %f (should be zero)'%(np.trapz(np.real(m), dx=dx_wave), np.trapz(np.imag(m), dx=dx_wave)))
   >>> print('Sum of squares = %f (should be 1)'%np.sum(np.abs(m)**2))
   >>> print('Sum of squares of FFT = %f (should be N)'%np.sum(np.abs(fft(m, norm='backward'))**2))


.. py:function:: wavelet_scales(minScale, maxScale, dj)

.. py:function:: wavelet_transform(x, dx, minScale, maxScale, wavelet_function=morlet_wave, dj=0.1)

   Apply the wavelet transform on time series data.


   :param x: Input time series
   :type x: (N,) Array like
   :param dx: The time step of the data, i.e. the reciprocal of the sampling rate
   :type dx: float
   :param wavelet_function: The wavelet mother function to use in the transform
   :type wavelet_function: function
   :param minScale: The smallest scale to use for the wavelet transform
   :type minScale: float
   :param maxScale: The largest scale to use for the wavelet transform
   :type maxScale: float
   :param dj: The step size to use for generating the scales that will be used for
              the wavelet transform. Scales are generated using the form:
                  scales = minScale * np.power(2, np.arange(0, M, dj))
              with M being given by np.log2(maxScale/minScale)+dj, ensuring that the
              maximum scale will be equal to maxScale
   :type dj: float

   :returns: * **wave_mat** (*(M,N) array_like*) -- 2-D array with the complex values of the wavelet transform. Each row is
               a different scale and each column a different moment in time
             * **scales** (*(M,) array_like*) -- 1-D array with the values of the scales that were used for the wavelet
               transform

   .. rubric:: Examples

   >>> import numpy as np
   >>> import matplotlib.pyplot as plt
   >>> import tfalib
   >>> fs = 8
   >>> T = np.arange(0, 10000, 1/fs);
   >>> N = len(T)
   >>> dj=0.1
   >>> W, scales = tfalib.wavelet_transform(X, 1/fs, tfalib.morlet_wave, 2, 1000, dj)
   >>> Wsq = np.abs(W)**2
   >>> log2scales = np.log2(scales)
   >>> plt.figure()
   >>> plt.imshow(Wsq[91:0:-1,:], aspect='auto',
   >>>            extent=[T[0], T[-1], log2scales[0], log2scales[-1]])
   >>> plt.yticks(np.arange(log2scales[0],log2scales[-1]+dj),
   >>>            labels=2**np.arange(log2scales[0],log2scales[-1]+dj))


.. py:function:: wavelet_normalize(wave_sq_matrix, scales, dx, dj, wavelet_norm_factor)

   Apply a normalization to the squared magnitude of the output of the wavelt
   transform so that its results are compatible with the FFT.


   :param wave_sq_matrix: The square of the magnitude of the output of the wavelet transform
   :type wave_sq_matrix: (M,N) Array like
   :param scales: 1-D array with the values of the scales that were used for the wavelet
                  transform
   :type scales: (M,) array_like
   :param dx: The time step of the data, i.e. the reciprocal of the sampling rate
   :type dx: float
   :param dj: The step size to use for generating the scales that will be used for
              the wavelet transform. Scales are generated using the form:
                  scales = minScale * np.power(2, np.arange(0, M, dj))
              with M being given by np.log2(maxScale/minScale)+dj, ensuring that the
              maximum scale will be equal to maxScale
   :type dj: float
   :param wavelet_norm_factor: The wavelet-specific normalization factor that needs to be applied
   :type wavelet_norm_factor: float

   :returns: **normalized_wave_sq_matrix** -- The normalized square of the magnitude of the output of the wavelet
             transform
   :rtype: (M,N) Array like


.. py:function:: magn(X)

   Return the row-wise magnitude of elements in 2D array 'X' as a single-column array.


.. py:function:: mfa(B_NEC, B_MEAN_NEC, R_NEC=None)

