Taylor_computations

Hydrological_model_validator.Processing.Taylor_computations.build_all_points(data_dict: Dict[str | int, Dict[int, List[ndarray | list]]]) -> Tuple[DataFrame, List[str | int]]

Build a DataFrame of normalized Taylor statistics points for all months and years, including reference points per month.

Parameters:

data_dict (dict) –

Dictionary containing model and satellite data, structured as:

    {
        'model_key': {year: [monthly data arrays/lists]},
        'satellite_key': {year: [monthly data arrays/lists]},
    }

Years can be strings or integers; months are 0-indexed.

Returns:

  • DataFrame with columns ['sdev', 'crmsd', 'ccoef', 'month', 'year'] containing normalized Taylor statistics for each year and month, plus monthly reference points.

  • List of years found in the satellite data.

Return type:

tuple of (pandas.DataFrame, list)

Raises:

KeyError – If expected model or satellite keys are missing in data_dict.

Notes

  • Months with invalid or zero reference standard deviation are skipped.

  • The reference point per month has sdev=1, crmsd=0, ccoef=1, labeled year='Ref'.
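The reference values in the notes above follow from comparing the satellite series with itself and normalizing by its own standard deviation. The sketch below illustrates this with made-up values; the population standard deviation (ddof=0) and the exact form of the skip guard are assumptions, not the library's confirmed behavior.

```python
import numpy as np

# Hypothetical satellite values for one month (illustrative data only).
sat = np.array([0.2, 0.5, 0.9, 1.4, 0.7])

# Population standard deviation (ddof=0); this choice is an assumption.
std_ref = np.std(sat)

# Guard mirroring the first note: a month is usable only if the
# reference standard deviation is finite and strictly positive.
month_is_usable = bool(np.isfinite(std_ref) and std_ref > 0)

if month_is_usable:
    anomalies = sat - sat.mean()
    ref_point = {
        # The series divided by its own spread gives 1.
        "sdev": float(np.std(sat) / std_ref),
        # Centered difference of a series with itself is 0.
        "crmsd": float(np.sqrt(np.mean((anomalies - anomalies) ** 2)) / std_ref),
        # A series is perfectly correlated with itself (up to rounding).
        "ccoef": float(np.corrcoef(sat, sat)[0, 1]),
        "year": "Ref",
    }
```

A constant series would give `std_ref == 0`, which is the case the skip rule is there to catch.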

Example

>>> import numpy as np
>>> data_dict = {
...     'model': {
...         2000: [np.array([...]), np.array([...]), ...],
...         2001: [np.array([...]), np.array([...]), ...],
...     },
...     'satellite': {
...         2000: [np.array([...]), np.array([...]), ...],
...         2001: [np.array([...]), np.array([...]), ...],
...     }
... }
>>> df, years = build_all_points(data_dict)
>>> df.head()
   sdev  crmsd  ccoef  month  year
0  1.00   0.00   1.00      0   Ref
1  0.85   0.12   0.95      0  2000
2  0.88   0.10   0.96      0  2001
...

Hydrological_model_validator.Processing.Taylor_computations.compute_norm_taylor_stats(mod_vals: ndarray, sat_vals: ndarray, std_ref: float) -> Dict[str, float] | None

Compute normalized Taylor statistics for a given pair of model and satellite data arrays.

Parameters:
  • mod_vals (np.ndarray) – Array of model data values.

  • sat_vals (np.ndarray) – Array of satellite data values.

  • std_ref (float) – Reference standard deviation to normalize the statistics.

Returns:

Dictionary containing:

  • 'sdev' – Normalized model standard deviation

  • 'crmsd' – Normalized centered root-mean-square difference

  • 'ccoef' – Correlation coefficient

Returns None if there are no valid overlapping values.

Return type:

dict or None

Raises:

ValueError – If std_ref is not a positive number.

Example

>>> import numpy as np
>>> mod = np.array([1.0, 2.0, 3.0])
>>> sat = np.array([1.1, 2.1, 3.1])
>>> std_ref = 0.5
>>> stats = compute_norm_taylor_stats(mod, sat, std_ref)
>>> stats.keys()
dict_keys(['sdev', 'crmsd', 'ccoef'])
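
One way the behavior documented above could be realized is sketched below. It is an illustrative stand-in rather than the library's implementation: the name `norm_taylor_stats_sketch`, the use of `np.isfinite` to define "valid overlapping values", and the population standard deviation (ddof=0) are all assumptions.

```python
import numpy as np

def norm_taylor_stats_sketch(mod_vals, sat_vals, std_ref):
    """Hypothetical stand-in for the documented function (assumptions noted above)."""
    if not np.isfinite(std_ref) or std_ref <= 0:
        raise ValueError("std_ref must be a positive number")

    mod = np.asarray(mod_vals, dtype=float).ravel()
    sat = np.asarray(sat_vals, dtype=float).ravel()

    # Keep only positions where both series hold a finite value.
    valid = np.isfinite(mod) & np.isfinite(sat)
    if not valid.any():
        return None
    mod, sat = mod[valid], sat[valid]

    # Spread and centered difference are scaled by the reference spread;
    # the correlation coefficient is already dimensionless.
    sdev = np.std(mod) / std_ref
    crmsd = np.sqrt(np.mean(((mod - mod.mean()) - (sat - sat.mean())) ** 2)) / std_ref
    ccoef = np.corrcoef(mod, sat)[0, 1]
    return {"sdev": float(sdev), "crmsd": float(crmsd), "ccoef": float(ccoef)}

# The NaN in the model series removes the second pair before any statistic is computed.
stats = norm_taylor_stats_sketch(
    np.array([1.0, np.nan, 2.0, 3.0]),
    np.array([1.1, 5.0, 2.1, 3.1]),
    std_ref=0.5,
)

# No position is finite in both arrays, so the sketch returns None.
no_overlap = norm_taylor_stats_sketch(
    np.array([np.nan, 1.0]), np.array([2.0, np.nan]), std_ref=0.5
)
```

The sketch assumes both arrays have the same shape; callers should check for `None` before indexing the result.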

Hydrological_model_validator.Processing.Taylor_computations.compute_std_reference(sat_data_by_year: Dict[int | str, List[ndarray | list]], years: List[str | int], month_index: int) -> float

Compute the reference standard deviation for satellite data of a specific month across multiple years.

Parameters:
  • sat_data_by_year (dict) – Dictionary keyed by year (int or str), with each value being a list of monthly data arrays.

  • years (list) – List of years (int or str) to include in the computation.

  • month_index (int) – Index of the month (0 = January). Can be any valid index that exists in the dataset.

Returns:

Standard deviation of concatenated satellite values for the specified month across all given years.

Return type:

float

Raises:

ValueError –

  • If month_index is not a non-negative integer.

  • If no valid data is found for the specified month across the selected years.

  • If any matched monthly array is empty.

Example

>>> import numpy as np
>>> sat_data_by_year = {
...     2000: [np.random.rand(10) for _ in range(6)],  # up to June
...     2001: [np.random.rand(10) for _ in range(6)]
... }
>>> std = compute_std_reference(sat_data_by_year, [2000, 2001], 2)  # March
>>> isinstance(std, float)
True
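
The pooling described above (one month, concatenated across years) might look like the following sketch. The helper name `std_reference_sketch`, the use of `np.nanstd` to ignore NaN values, and the choice to silently skip years that lack the requested month are assumptions made for illustration.

```python
import numpy as np

def std_reference_sketch(sat_data_by_year, years, month_index):
    """Hypothetical stand-in: pooled standard deviation of one month across years."""
    if not isinstance(month_index, int) or month_index < 0:
        raise ValueError("month_index must be a non-negative integer")

    chunks = []
    for year in years:
        months = sat_data_by_year.get(year, [])
        if month_index >= len(months):
            continue  # this year has no data for the requested month
        arr = np.asarray(months[month_index], dtype=float).ravel()
        if arr.size == 0:
            raise ValueError(f"empty array for year {year}, month {month_index}")
        chunks.append(arr)

    if not chunks:
        raise ValueError("no valid data found for the specified month")

    # One standard deviation over the pooled values, not an average of per-year values.
    return float(np.nanstd(np.concatenate(chunks)))

sat = {
    2000: [np.array([1.0, 2.0]), np.array([3.0, 4.0])],
    2001: [np.array([3.0, 4.0]), np.array([5.0, 7.0])],
}
january_std = std_reference_sketch(sat, [2000, 2001], 0)
```

Pooling before taking the standard deviation means year-to-year shifts in the monthly mean also contribute to the reference spread.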

Hydrological_model_validator.Processing.Taylor_computations.compute_taylor_stat_tuple(mod_values: ndarray, sat_values: ndarray, label: str) -> Tuple[str, float, float, float]

Compute Taylor statistics (standard deviation, centered RMSD, and correlation coefficient) for a given pair of model and satellite data arrays.

Parameters:
  • mod_values (np.ndarray) – Array of model data values.

  • sat_values (np.ndarray) – Array of satellite data values.

  • label (str) – Identifier associated with these data (e.g., year or month string).

Returns:

A tuple containing:

  • label (str) – The input label

  • model standard deviation (float)

  • centered RMSD (float)

  • correlation coefficient (float)

Return type:

Tuple[str, float, float, float]

Raises:

ValueError – If the input arrays are empty, or if no finite data pairs exist between the model and satellite arrays.

Example

>>> import numpy as np
>>> mod = np.array([1.1, 2.0, 3.2])
>>> sat = np.array([1.0, 2.1, 3.0])
>>> compute_taylor_stat_tuple(mod, sat, '2001')
('2001', ..., ..., ...)
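
The three statistics are not independent: with population standard deviations (ddof=0) they satisfy the law-of-cosines relation crmsd² = σ_mod² + σ_sat² − 2·σ_mod·σ_sat·ccoef, which is what lets a Taylor diagram show all three at once. The sketch below checks that relation numerically on the example arrays; it uses plain NumPy calls and assumes ddof=0, which may differ from the library's choice.

```python
import numpy as np

mod = np.array([1.1, 2.0, 3.2])
sat = np.array([1.0, 2.1, 3.0])

# Population standard deviations (ddof=0), assumed here.
sigma_mod = np.std(mod)
sigma_sat = np.std(sat)

# Pearson correlation between the two series.
ccoef = np.corrcoef(mod, sat)[0, 1]

# Centered RMSD: RMS difference after removing each series' own mean.
crmsd = np.sqrt(np.mean(((mod - mod.mean()) - (sat - sat.mean())) ** 2))

# The same quantity rebuilt from the other three statistics.
crmsd_from_identity = np.sqrt(
    sigma_mod**2 + sigma_sat**2 - 2.0 * sigma_mod * sigma_sat * ccoef
)

identity_holds = bool(np.isclose(crmsd, crmsd_from_identity))
```

A check like this is a cheap sanity test on any returned tuple, provided the same ddof convention is used throughout.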

Hydrological_model_validator.Processing.Taylor_computations.compute_yearly_taylor_stats(data_dict: Dict[str | int, Dict[int, List[ndarray | list]]]) -> Tuple[List[Tuple[str, float, float, float]], float]

Compute Taylor statistics for each year using model and satellite data from the data dictionary. Also computes the global standard deviation of all satellite data.

Parameters:

data_dict (dict) –

Dictionary containing model and satellite data organized by year and month. Expected structure:

    {
        'model_key': {year: [monthly data arrays/lists]},
        'satellite_key': {year: [monthly data arrays/lists]},
    }

Returns:

  • yearly_stats (list of tuples) – List of (year, sdev, crmsd, ccoef) tuples representing Taylor statistics for each year.

  • std_ref (float) – Global satellite standard deviation across all years and months, used as a normalization reference.

Return type:

tuple

Raises:
  • KeyError – If expected model or satellite keys are missing in data_dict.

  • ValueError – If global satellite standard deviation is zero or NaN (indicating invalid data).

Example

>>> yearly_stats, std_ref = compute_yearly_taylor_stats(data_dict)
>>> for year, sdev, crmsd, ccoef in yearly_stats:
...     print(f"{year}: sdev={sdev:.2f}, crmsd={crmsd:.2f}, ccoef={ccoef:.2f}")
...
2000: sdev=0.85, crmsd=0.12, ccoef=0.95
2001: sdev=0.88, crmsd=0.10, ccoef=0.96
...
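
As a rough illustration of how the two return values fit together, the sketch below computes a global satellite standard deviation and uses it to scale a list of yearly tuples. Both helper names are hypothetical, the statistics in `raw_stats` are invented for the example, and whether the library returns tuples that are already normalized is not stated in the documentation above, so the scaling step is an assumption.

```python
import numpy as np

def global_satellite_std(sat_by_year):
    """Hypothetical helper: std over every satellite value in every month and year."""
    arrays = [
        np.asarray(month, dtype=float).ravel()
        for months in sat_by_year.values()
        for month in months
    ]
    std_ref = float(np.nanstd(np.concatenate(arrays)))
    if not np.isfinite(std_ref) or std_ref == 0.0:
        raise ValueError("global satellite standard deviation is zero or NaN")
    return std_ref

def normalize_yearly_stats(yearly_stats, std_ref):
    """Hypothetical helper: scale sdev and crmsd by std_ref; ccoef is left unchanged."""
    return [
        (label, sdev / std_ref, crmsd / std_ref, ccoef)
        for label, sdev, crmsd, ccoef in yearly_stats
    ]

sat_by_year = {2000: [np.array([1.0, 2.0]), np.array([3.0, 4.0])]}
std_ref = global_satellite_std(sat_by_year)

# Invented, unnormalized statistics for a single year.
raw_stats = [("2000", 2.0, 1.0, 0.9)]
scaled = normalize_yearly_stats(raw_stats, 2.0)
```

Using a single global reference keeps all years on a common scale, so their points can be compared on one diagram.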