Taylor_computations
- Hydrological_model_validator.Processing.Taylor_computations.build_all_points(data_dict: Dict[str | int, Dict[int, List[ndarray | list]]]) → Tuple[DataFrame, List[str | int]]
Build a DataFrame of normalized Taylor statistics points for all months and years, including reference points per month.
- Parameters:
data_dict (dict) – Dictionary containing model and satellite data, structured as:
{
    'model_key': {year: [monthly data arrays/lists]},
    'satellite_key': {year: [monthly data arrays/lists]}
}
Years can be strings or integers; months are indexed from 0.
- Returns:
DataFrame with columns ['sdev', 'crmsd', 'ccoef', 'month', 'year'] containing normalized Taylor statistics for each year and month, plus monthly reference points.
List of years found in the satellite data.
- Return type:
tuple of (pandas.DataFrame, list)
- Raises:
KeyError – If expected model or satellite keys are missing in data_dict.
Notes
Months with invalid or zero reference standard deviation are skipped.
The reference point for each month has sdev=1, crmsd=0, ccoef=1, and is labeled year='Ref'.
Example
>>> data_dict = {
...     'model': {
...         2000: [np.array([...]), np.array([...]), ...],
...         2001: [np.array([...]), np.array([...]), ...],
...     },
...     'satellite': {
...         2000: [np.array([...]), np.array([...]), ...],
...         2001: [np.array([...]), np.array([...]), ...],
...     }
... }
>>> df, years = build_all_points(data_dict)
>>> df.head()
   sdev  crmsd  ccoef  month  year
0  1.00   0.00   1.00      0   Ref
1  0.85   0.12   0.95      0  2000
2  0.88   0.10   0.96      0  2001
...
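The row layout described above (one reference row per month, followed by one row per year) can be sketched with NumPy alone. This is a minimal illustration under stated assumptions, not the package's implementation: the helper name `all_points_sketch`, the default keys `"model"` and `"satellite"`, the per-month reference computed with `np.nanstd`, and the use of the population standard deviation (ddof=0) are all hypothetical choices.

```python
import numpy as np

def all_points_sketch(data_dict, model_key="model", sat_key="satellite"):
    """Hypothetical helper: build Taylor-point rows as a list of dicts."""
    model, sat = data_dict[model_key], data_dict[sat_key]  # KeyError if a key is missing
    years = list(sat)
    n_months = max(len(sat[y]) for y in years)
    rows = []
    for month in range(n_months):
        # Collect (year, model, satellite) arrays for years that contain this month.
        pairs = [
            (y,
             np.asarray(model[y][month], dtype=float).ravel(),
             np.asarray(sat[y][month], dtype=float).ravel())
            for y in years if month < len(sat[y])
        ]
        # Assumed reference: std of all satellite values for this month.
        std_ref = np.nanstd(np.concatenate([s for _, _, s in pairs]))
        if not np.isfinite(std_ref) or std_ref == 0:
            continue  # skip months with an invalid or zero reference std
        rows.append({"sdev": 1.0, "crmsd": 0.0, "ccoef": 1.0,
                     "month": month, "year": "Ref"})
        for y, m, s in pairs:
            ok = np.isfinite(m) & np.isfinite(s)
            if ok.sum() < 2:
                continue  # not enough overlapping values for a correlation
            m, s = m[ok], s[ok]
            crmsd = np.sqrt(np.mean(((m - m.mean()) - (s - s.mean())) ** 2))
            rows.append({"sdev": m.std() / std_ref,
                         "crmsd": crmsd / std_ref,
                         "ccoef": np.corrcoef(m, s)[0, 1],
                         "month": month, "year": y})
    return rows, years

demo = {
    "model": {2000: [np.array([1.0, 2.0, 3.0]), np.array([5.0, 5.0, 5.0])],
              2001: [np.array([2.0, 3.0, 4.0]), np.array([5.0, 5.0, 5.0])]},
    "satellite": {2000: [np.array([1.0, 2.0, 3.0]), np.array([5.0, 5.0, 5.0])],
                  2001: [np.array([2.0, 3.0, 4.0]), np.array([5.0, 5.0, 5.0])]},
}
rows, years = all_points_sketch(demo)
```

In the demo data the second month has a constant satellite signal, so its reference standard deviation is zero and the month contributes no rows. Passing `rows` to `pandas.DataFrame` would give a table with the documented columns.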
- Hydrological_model_validator.Processing.Taylor_computations.compute_norm_taylor_stats(mod_vals: ndarray, sat_vals: ndarray, std_ref: float) → Dict[str, float] | None
Compute normalized Taylor statistics for a given pair of model and satellite data arrays.
- Parameters:
mod_vals (np.ndarray) – Array of model data values.
sat_vals (np.ndarray) – Array of satellite data values.
std_ref (float) – Reference standard deviation to normalize the statistics.
- Returns:
Dictionary containing:
  - 'sdev': Normalized model standard deviation
  - 'crmsd': Normalized centered root-mean-square difference
  - 'ccoef': Correlation coefficient
Returns None if there are no valid overlapping values.
- Return type:
dict or None
- Raises:
ValueError – If std_ref is not a positive number.
Example
>>> mod = np.array([1.0, 2.0, 3.0])
>>> sat = np.array([1.1, 2.1, 3.1])
>>> std_ref = 0.5
>>> stats = compute_norm_taylor_stats(mod, sat, std_ref)
>>> stats.keys()
dict_keys(['sdev', 'crmsd', 'ccoef'])
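To make the three returned quantities concrete, the following sketch computes them directly with NumPy for the same inputs as the example above. It is an illustration under assumptions, not the package's code: `norm_taylor_stats_sketch` is a hypothetical name, "valid overlapping values" is interpreted as positions where both arrays are finite, and the population standard deviation (ddof=0) is assumed.

```python
import numpy as np

def norm_taylor_stats_sketch(mod_vals, sat_vals, std_ref):
    """Hypothetical helper: normalized Taylor statistics with NumPy only."""
    if not np.isfinite(std_ref) or std_ref <= 0:
        raise ValueError("std_ref must be a positive number")
    mod = np.asarray(mod_vals, dtype=float).ravel()
    sat = np.asarray(sat_vals, dtype=float).ravel()
    # Assumption: a pair is valid when both values are finite.
    valid = np.isfinite(mod) & np.isfinite(sat)
    if not valid.any():
        return None
    mod, sat = mod[valid], sat[valid]
    # Centered RMSD: RMS difference after removing each series' mean.
    crmsd = np.sqrt(np.mean(((mod - mod.mean()) - (sat - sat.mean())) ** 2))
    ccoef = np.corrcoef(mod, sat)[0, 1]
    # Only sdev and crmsd are divided by std_ref; ccoef is already unitless.
    return {"sdev": mod.std() / std_ref,
            "crmsd": crmsd / std_ref,
            "ccoef": ccoef}

stats = norm_taylor_stats_sketch(
    np.array([1.0, 2.0, 3.0]), np.array([1.1, 2.1, 3.1]), std_ref=0.5
)
no_overlap = norm_taylor_stats_sketch(np.array([np.nan]), np.array([1.0]), 1.0)
```

Because the two series differ only by a constant offset, the centered RMSD is zero up to rounding and the correlation is one; only the normalized standard deviation depends on `std_ref`.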
- Hydrological_model_validator.Processing.Taylor_computations.compute_std_reference(sat_data_by_year: Dict[int | str, List[ndarray | list]], years: List[str | int], month_index: int) → float
Compute the reference standard deviation for satellite data of a specific month across multiple years.
- Parameters:
sat_data_by_year (dict) – Dictionary keyed by year (int or str), with each value being a list of monthly data arrays.
years (list) – List of years (int or str) to include in the computation.
month_index (int) – Index of the month (0 = January). Can be any valid index that exists in the dataset.
- Returns:
Standard deviation of concatenated satellite values for the specified month across all given years.
- Return type:
float
- Raises:
ValueError – If month_index is not a non-negative integer, if no valid data is found for the specified month across the selected years, or if any matched monthly array is empty.
Example
>>> sat_data_by_year = {
...     2000: [np.random.rand(10) for _ in range(6)],  # up to June
...     2001: [np.random.rand(10) for _ in range(6)]
... }
>>> std = compute_std_reference(sat_data_by_year, [2000, 2001], 2)  # March
>>> isinstance(std, float)
True
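The idea of pooling one month across several years before taking a single standard deviation can be sketched as follows. The helper name `std_reference_sketch`, the choice to skip years that do not reach the requested month, and the use of `np.nanstd` with ddof=0 are assumptions made for illustration; the package may handle these cases differently.

```python
import numpy as np

def std_reference_sketch(sat_data_by_year, years, month_index):
    """Hypothetical helper: std of one month's satellite data pooled over years."""
    if not isinstance(month_index, int) or month_index < 0:
        raise ValueError("month_index must be a non-negative integer")
    chunks = []
    for year in years:
        months = sat_data_by_year.get(year, [])
        if month_index >= len(months):
            continue  # assumption: years lacking this month are skipped
        values = np.asarray(months[month_index], dtype=float).ravel()
        if values.size == 0:
            raise ValueError(f"empty array for year {year}, month {month_index}")
        chunks.append(values)
    if not chunks:
        raise ValueError("no valid data for the requested month")
    # Concatenate first, then take one std over the pooled values.
    return float(np.nanstd(np.concatenate(chunks)))

sat_demo = {
    2000: [np.array([1.0, 2.0]), np.array([0.0, 2.0])],
    2001: [np.array([3.0, 4.0]), np.array([4.0, 6.0])],
}
std_feb = std_reference_sketch(sat_demo, [2000, 2001], 1)  # pools [0, 2, 4, 6]
```

Pooling before computing the standard deviation means that year-to-year shifts in the monthly mean contribute to the reference spread, which differs from averaging per-year standard deviations.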
- Hydrological_model_validator.Processing.Taylor_computations.compute_taylor_stat_tuple(mod_values: ndarray, sat_values: ndarray, label: str) → Tuple[str, float, float, float]
Compute Taylor statistics (standard deviation, centered RMSD, and correlation coefficient) for a given pair of model and satellite data arrays.
- Parameters:
mod_values (np.ndarray) – Array of model data values.
sat_values (np.ndarray) – Array of satellite data values.
label (str) – Identifier associated with these data (e.g., year or month string).
- Returns:
A tuple containing:
  - label (str): The input label
  - model standard deviation (float)
  - centered RMSD (float)
  - correlation coefficient (float)
- Return type:
Tuple[str, float, float, float]
- Raises:
ValueError – If the input arrays are empty, or if no finite data pairs exist between the model and satellite arrays.
Example
>>> mod = np.array([1.1, 2.0, 3.2])
>>> sat = np.array([1.0, 2.1, 3.0])
>>> compute_taylor_stat_tuple(mod, sat, '2001')
('2001', ..., ..., ...)
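The three statistics in the returned tuple are tied together by the relation that underlies a Taylor diagram: crmsd² = σ_mod² + σ_sat² − 2·σ_mod·σ_sat·ccoef. The sketch below computes an unnormalized tuple and checks that relation numerically. `taylor_stat_tuple_sketch` is a hypothetical stand-in, and the finite-pair masking and population standard deviation (ddof=0) are assumptions rather than confirmed details of the package.

```python
import numpy as np

def taylor_stat_tuple_sketch(mod_values, sat_values, label):
    """Hypothetical helper: (label, sdev, crmsd, ccoef) without normalization."""
    mod = np.asarray(mod_values, dtype=float).ravel()
    sat = np.asarray(sat_values, dtype=float).ravel()
    if mod.size == 0 or sat.size == 0:
        raise ValueError("input arrays must not be empty")
    ok = np.isfinite(mod) & np.isfinite(sat)
    if not ok.any():
        raise ValueError("no finite model/satellite pairs")
    mod, sat = mod[ok], sat[ok]
    crmsd = np.sqrt(np.mean(((mod - mod.mean()) - (sat - sat.mean())) ** 2))
    ccoef = np.corrcoef(mod, sat)[0, 1]
    return label, float(mod.std()), float(crmsd), float(ccoef)

# Synthetic data: a "model" that tracks the "satellite" with added noise.
rng = np.random.default_rng(0)
sat = rng.normal(size=200)
mod = 0.8 * sat + 0.3 * rng.normal(size=200)

label, sdev, crmsd, ccoef = taylor_stat_tuple_sketch(mod, sat, "2001")
sat_std = sat.std()
# Right-hand side of the Taylor relation (law of cosines on the diagram).
identity_rhs = sdev**2 + sat_std**2 - 2.0 * sdev * sat_std * ccoef
```

The relation holds exactly only when the standard deviations and the centered RMSD use the same normalization (here, division by N), which is one reason to keep ddof consistent across all three statistics.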
- Hydrological_model_validator.Processing.Taylor_computations.compute_yearly_taylor_stats(data_dict: Dict[str | int, Dict[int, List[ndarray | list]]]) → Tuple[List[Tuple[str, float, float, float]], float]
Compute Taylor statistics for each year using model and satellite data from the data dictionary. Also computes the global standard deviation of all satellite data.
- Parameters:
data_dict (dict) – Dictionary containing model and satellite data organized by year and month. Expected structure:
{
    'model_key': {year: [monthly data arrays/lists]},
    'satellite_key': {year: [monthly data arrays/lists]}
}
- Returns:
yearly_stats (list of tuples) – List of (year, sdev, crmsd, ccoef) tuples representing Taylor statistics for each year.
std_ref (float) – Global satellite standard deviation across all years and months, used as a normalization reference.
- Return type:
tuple
- Raises:
KeyError – If expected model or satellite keys are missing in data_dict.
ValueError – If global satellite standard deviation is zero or NaN (indicating invalid data).
Example
>>> yearly_stats, std_ref = compute_yearly_taylor_stats(data_dict)
>>> for year, sdev, crmsd, ccoef in yearly_stats:
...     print(f"{year}: sdev={sdev:.2f}, crmsd={crmsd:.2f}, ccoef={ccoef:.2f}")
...
2000: sdev=0.85, crmsd=0.12, ccoef=0.95
2001: sdev=0.88, crmsd=0.10, ccoef=0.96
...
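One plausible way to obtain per-year statistics and a global reference is to concatenate all monthly arrays of a year before computing the statistics, and to pool every satellite value for the reference. The sketch below follows that reading. It is an assumption, not the documented algorithm: the name `yearly_stats_sketch`, the default keys, the unnormalized per-year values, and the use of `np.nanstd` are all hypothetical.

```python
import numpy as np

def yearly_stats_sketch(data_dict, model_key="model", sat_key="satellite"):
    """Hypothetical helper: per-year Taylor tuples plus a global satellite std."""
    model, sat = data_dict[model_key], data_dict[sat_key]  # KeyError if a key is missing
    yearly, all_sat = [], []
    for year in sat:
        # Flatten and join the year's monthly arrays into one series each.
        m = np.concatenate([np.ravel(a) for a in model[year]]).astype(float)
        s = np.concatenate([np.ravel(a) for a in sat[year]]).astype(float)
        all_sat.append(s)
        ok = np.isfinite(m) & np.isfinite(s)
        m_ok, s_ok = m[ok], s[ok]
        crmsd = np.sqrt(np.mean(((m_ok - m_ok.mean()) - (s_ok - s_ok.mean())) ** 2))
        ccoef = np.corrcoef(m_ok, s_ok)[0, 1]
        yearly.append((str(year), float(m_ok.std()), float(crmsd), float(ccoef)))
    # Global reference: std of every satellite value across years and months.
    std_ref = float(np.nanstd(np.concatenate(all_sat)))
    if not np.isfinite(std_ref) or std_ref == 0:
        raise ValueError("global satellite standard deviation is zero or NaN")
    return yearly, std_ref

demo = {
    "satellite": {2000: [np.array([1.0, 2.0]), np.array([3.0, 4.0])],
                  2001: [np.array([2.0, 4.0]), np.array([6.0, 8.0])]},
    "model": {2000: [np.array([2.0, 3.0]), np.array([4.0, 5.0])],
              2001: [np.array([3.0, 5.0]), np.array([7.0, 9.0])]},
}
yearly_stats, std_ref = yearly_stats_sketch(demo)
```

In the demo the model equals the satellite data plus a constant bias of 1, so each year's centered RMSD is zero and its correlation is one; a constant bias is invisible to Taylor statistics because both series are centered first.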