climate_ref_core.diagnostics
#
Diagnostic interface
AbstractDiagnostic
#
Bases: Protocol
Interface for the calculation of a diagnostic.
This is a very high-level interface to provide maximum scope for the diagnostic packages to have differing assumptions about how they work. The configuration and output of the diagnostic should follow the Earth System Metrics and Diagnostics Standards formats as much as possible.
A diagnostic can be executed multiple times,
each time targeting a different group of input data.
The groups are determined by grouping the data catalog according to the group_by field
in the DataRequirement object, using one or more metadata fields.
Each group must conform with a set of constraints,
to ensure that the correct data is available to run the diagnostic.
Each group will then be processed as a separate execution of the diagnostic.
See climate_ref_example.example.ExampleDiagnostic for an example implementation.
Source code in packages/climate-ref-core/src/climate_ref_core/diagnostics.py
data_requirements
instance-attribute
#
Description of the required datasets for the current diagnostic
This information is used to filter the data catalog down to the model and/or observation datasets that are required by the diagnostic.
A diagnostic may specify either a single set of requirements (i.e. a list of DataRequirement objects),
or multiple sets of requirements (i.e. a list of lists of DataRequirement objects).
Each of these sets of requirements will be processed separately which is effectively an OR operation
across the sets of requirements.
Any modifications to the input data will trigger a new diagnostic calculation.
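The two accepted shapes can be illustrated with a small normalisation step. This is only a sketch: `Requirement` is a hypothetical stand-in for DataRequirement, and `normalise_requirement_sets` is an invented helper, not part of climate_ref_core.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Requirement:
    """Hypothetical stand-in for DataRequirement; only the source type is modelled."""

    source_type: str


def normalise_requirement_sets(requirements):
    """Return a tuple of requirement sets, whichever of the two shapes was given."""
    if not requirements:
        return ()
    # A flat collection of requirements is treated as one single set.
    if all(isinstance(item, Requirement) for item in requirements):
        return (tuple(requirements),)
    # Otherwise every item is itself a set; each set is processed separately (OR).
    return tuple(tuple(group) for group in requirements)


single = [Requirement("CMIP6"), Requirement("CMIP7")]
multiple = [[Requirement("CMIP6")], [Requirement("CMIP7")]]

single_sets = normalise_requirement_sets(single)
multiple_sets = normalise_requirement_sets(multiple)
```

Here `single_sets` holds one set containing both requirements, while `multiple_sets` holds two independent sets that would each be processed on their own.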
facets
instance-attribute
#
Facets that are used to describe the values produced by this metric.
These facets represent the dimensions that can be used to uniquely identify a metric value. Each metric value should have a unique set of keys for the dimension (this isn't checked). A faceted search can then be performed on these facets.
These facets must be present in the controlled vocabulary otherwise a KeyError exception
is raised.
name
instance-attribute
#
Name of the diagnostic being run
This should be unique for a given provider, but multiple providers can implement the same diagnostic.
provider
instance-attribute
#
The provider that provides the diagnostic.
series
instance-attribute
#
Definition of the series that are produced by the diagnostic.
slug
instance-attribute
#
Unique identifier for the diagnostic.
Defaults to the name of the diagnostic in lowercase with spaces replaced by hyphens.
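The default described above amounts to a two-step string transformation. The helper name `default_slug` and the example diagnostic name are hypothetical; this is a sketch of the stated rule rather than the library's own code.

```python
def default_slug(name: str) -> str:
    """Lowercase the name and replace spaces with hyphens."""
    return name.lower().replace(" ", "-")


slug = default_slug("Global Mean Timeseries")
```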
build_execution_result(definition)
#
Build the result from running the diagnostic on the given configuration.
This can be replayed later to build the result from the output execution.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| definition | ExecutionDefinition | The configuration to run the diagnostic on. | required |
Returns:
| Type | Description |
|---|---|
| ExecutionResult | The result of running the diagnostic. |
Source code in packages/climate-ref-core/src/climate_ref_core/diagnostics.py
execute(definition)
#
Execute the diagnostic on the given configuration.
The implementation of this method is left to the diagnostic providers. The results should be written to the output directory of the execution definition. These are later used to build the output bundle and the diagnostic bundle.
This may occur in a separate process (or python environment in the case of a CommandLineDiagnostic).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| definition | ExecutionDefinition | The configuration to run the diagnostic on. | required |
Source code in packages/climate-ref-core/src/climate_ref_core/diagnostics.py
CommandLineDiagnostic
#
Bases: Diagnostic
Diagnostic that can be run from the command line.
Source code in packages/climate-ref-core/src/climate_ref_core/diagnostics.py
build_cmd(definition)
#
Build the command to run the diagnostic on the given configuration.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| definition | ExecutionDefinition | The configuration to run the diagnostic on. | required |
Returns:
| Type | Description |
|---|---|
| Iterable[str] | A command that can be run with :func: |
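As an illustration of returning a command as an iterable of strings, the sketch below builds an argument list and runs it. It assumes the command is handed to `subprocess.run`, which this page does not confirm, and the `build_cmd` shown here is a hypothetical free function that takes an output directory rather than an ExecutionDefinition.

```python
import subprocess
import sys
from pathlib import Path


def build_cmd(output_directory: Path) -> list[str]:
    """Hypothetical command builder: one string per command-line argument."""
    script = "import sys; print('writing to ' + sys.argv[1])"
    return [sys.executable, "-c", script, str(output_directory)]


cmd = build_cmd(Path("output") / "example")
# Passing a list avoids shell quoting issues; check=True raises on a non-zero exit.
completed = subprocess.run(cmd, check=True, capture_output=True, text=True)
```

Keeping each argument as its own string means paths containing spaces need no extra escaping.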
Source code in packages/climate-ref-core/src/climate_ref_core/diagnostics.py
execute(definition)
#
Run the diagnostic on the given configuration.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| definition | ExecutionDefinition | The configuration to run the diagnostic on. | required |
Returns:
| Type | Description |
|---|---|
| None | The result of running the diagnostic. |
Source code in packages/climate-ref-core/src/climate_ref_core/diagnostics.py
DataRequirement
#
Definition of the input datasets that a diagnostic requires to run.
This is used to create groups of datasets. Each group will result in an execution of the diagnostic and defines the input data for that execution.
The data catalog is filtered according to the filters field,
then grouped according to the group_by field,
and then each group is checked that it satisfies the constraints.
Each such group will be processed as a separate execution of the diagnostic.
Source code in packages/climate-ref-core/src/climate_ref_core/diagnostics.py
constraints = field(factory=tuple)
class-attribute
instance-attribute
#
Constraints that must be satisfied when executing a given diagnostic run
All of the constraints must be satisfied for a given group to be run. Each constraint is applied iteratively to the group's datasets, reducing the set at each step. This is effectively an AND operation.
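The iterative behaviour can be sketched by modelling each constraint as a function that takes a list of datasets and returns a (possibly smaller) list. Both `require_facet_value` and `apply_constraints` are hypothetical names, and treating an empty result as "do not run this group" is an assumption made for the example.

```python
from functools import reduce


def require_facet_value(facet, value):
    """Hypothetical constraint: reject the group unless some dataset has this value."""

    def constraint(datasets):
        present = any(dataset[facet] == value for dataset in datasets)
        return datasets if present else []

    return constraint


def apply_constraints(datasets, constraints):
    """Feed the output of each constraint into the next one."""
    return reduce(lambda remaining, constraint: constraint(remaining), constraints, datasets)


group = [{"variable_id": "tas"}, {"variable_id": "pr"}]

accepted = apply_constraints(
    group,
    [require_facet_value("variable_id", "tas"), require_facet_value("variable_id", "pr")],
)
rejected = apply_constraints(
    group,
    [require_facet_value("variable_id", "tas"), require_facet_value("variable_id", "psl")],
)
```

Because every constraint sees the output of the previous one, a single failing constraint empties the group regardless of where it appears in the sequence.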
filters
instance-attribute
#
Filters to apply to the data catalog of datasets.
This is used to reduce the set of datasets to only those that are required by the diagnostic.
Each FacetFilter contains one or more facet values that must all be satisfied for a dataset to match that filter. The overall selection keeps any dataset that matches at least one of the provided filters.
If no filters are specified, all datasets in the data catalog are used.
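The selection rule (all facets within a filter, any filter overall) can be sketched without pandas. Here a filter is represented as a plain dict mapping a facet name to its allowed values, which is an assumption for illustration and not the FacetFilter class itself; `select_datasets` is likewise a hypothetical helper.

```python
def matches(dataset, facet_filter):
    """A dataset matches a filter when every facet in the filter is satisfied."""
    return all(dataset.get(facet) in allowed for facet, allowed in facet_filter.items())


def select_datasets(catalog, filters):
    """Keep datasets matching at least one filter; no filters keeps everything."""
    if not filters:
        return list(catalog)
    return [d for d in catalog if any(matches(d, f) for f in filters)]


catalog = [
    {"variable_id": "tas", "frequency": "mon"},
    {"variable_id": "tas", "frequency": "day"},
    {"variable_id": "pr", "frequency": "mon"},
    {"variable_id": "psl", "frequency": "mon"},
]
filters = [
    {"variable_id": ("tas",), "frequency": ("mon",)},  # both facets must match
    {"variable_id": ("pr",)},  # or this filter alone may match
]

kept = select_datasets(catalog, filters)
everything = select_datasets(catalog, [])
```

The daily `tas` dataset is dropped because it fails the frequency facet of the first filter and does not match the second filter at all.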
group_by
instance-attribute
#
The fields to group the datasets by.
This group by operation is performed after the data catalog is filtered according to filters.
Each group will contain a unique combination of values from the metadata fields,
and will result in a separate execution of the diagnostic.
If group_by=None, all datasets will be processed together as a single execution.
The unique values of the group by fields are used to create a unique key for the diagnostic execution.
Changing the value of group_by may invalidate all previous diagnostic executions.
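A minimal sketch of the grouping step, using plain dicts in place of the DataFrame-based catalog. The function name `group_datasets` and the use of an empty tuple as the key when group_by is None are assumptions made for the example.

```python
def group_datasets(catalog, group_by):
    """Map each execution key to the datasets sharing its group-by values."""
    if group_by is None:
        # One execution covering every dataset; the empty key is an assumption.
        return {(): list(catalog)}
    groups = {}
    for dataset in catalog:
        key = tuple(dataset[facet] for facet in group_by)
        groups.setdefault(key, []).append(dataset)
    return groups


catalog = [
    {"source_id": "model-a", "experiment_id": "historical", "variable_id": "tas"},
    {"source_id": "model-a", "experiment_id": "historical", "variable_id": "pr"},
    {"source_id": "model-b", "experiment_id": "historical", "variable_id": "tas"},
]

by_model = group_datasets(catalog, ("source_id", "experiment_id"))
by_variable = group_datasets(catalog, ("variable_id",))
together = group_datasets(catalog, None)
```

The keys in `by_model` and `by_variable` share nothing in common, which illustrates why changing group_by can invalidate previously recorded executions.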
source_type
instance-attribute
#
Type of the source dataset (CMIP6, CMIP7, etc.)
apply_filters(data_catalog)
#
Apply filters to a DataFrame-based data catalog.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| data_catalog | DataFrame | DataFrame to filter. Each column contains a facet | required |
Returns:
| Type | Description |
|---|---|
| DataFrame | Filtered data catalog |
Source code in packages/climate-ref-core/src/climate_ref_core/diagnostics.py
Diagnostic
#
Bases: AbstractDiagnostic
Interface for the calculation of a diagnostic.
This is a very high-level interface to provide maximum scope for the diagnostic packages to have differing assumptions. The configuration and output of the diagnostic should follow the Earth System Metrics and Diagnostics Standards formats as much as possible.
A diagnostic can be executed multiple times,
each time targeting a different group of input data.
The groups are determined by grouping the data catalog according to the group_by field
in the DataRequirement object, using one or more metadata fields.
Each group must conform with a set of constraints,
to ensure that the correct data is available to run the diagnostic.
Each group will then be processed as a separate execution of the diagnostic.
See climate_ref_example.example.ExampleDiagnostic for an example implementation.
Source code in packages/climate-ref-core/src/climate_ref_core/diagnostics.py
provider
property
writable
#
The provider that provides the diagnostic.
full_slug()
#
Full slug that describes the diagnostic
This is a combination of the provider slug and the diagnostic slug.
run(definition)
#
Run the diagnostic on the given configuration.
This executes the diagnostic and builds the result from the output bundle.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| definition | ExecutionDefinition | The configuration to run the diagnostic on. | required |
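The relationship between run, execute and build_execution_result can be sketched with a stand-in class that records the order of its calls. `RecordingDiagnostic` is hypothetical, the definition is a plain string, and the returned dict is only a placeholder for an ExecutionResult.

```python
from dataclasses import dataclass, field


@dataclass
class RecordingDiagnostic:
    """Hypothetical stand-in that records the order in which its steps are called."""

    calls: list = field(default_factory=list)

    def execute(self, definition):
        # A real diagnostic would write its outputs to the definition's output directory.
        self.calls.append(("execute", definition))

    def build_execution_result(self, definition):
        # A real diagnostic would read those outputs back and build the result.
        self.calls.append(("build_execution_result", definition))
        return {"definition": definition, "successful": True}

    def run(self, definition):
        """Execute first, then build the result from what was written."""
        self.execute(definition)
        return self.build_execution_result(definition)


diagnostic = RecordingDiagnostic()
result = diagnostic.run("example-definition")
```

Keeping the two steps separate is what allows the result to be rebuilt later from existing outputs without executing the diagnostic again.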
Source code in packages/climate-ref-core/src/climate_ref_core/diagnostics.py
ExecutionDefinition
#
Definition of an execution of a diagnostic
This represents the information needed by a diagnostic to perform an execution for a specific set of datasets fulfilling the requirements.
Source code in packages/climate-ref-core/src/climate_ref_core/diagnostics.py
datasets
instance-attribute
#
Collection of datasets required for the diagnostic execution
diagnostic
instance-attribute
#
The diagnostic that is being executed
key
instance-attribute
#
The unique identifier for the datasets in the diagnostic execution group.
The key is derived from the datasets in the group using facet values. New datasets which match the same group by facet values will result in the same key.
output_directory
instance-attribute
#
Output directory to store the output of the diagnostic execution
as_relative_path(filename)
#
Get the relative path of a file in the output directory
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| filename | Path \| str | Path to a file in the output directory. If this is an absolute path, it will be converted to a relative path within the output directory. | required |
Returns:
| Type | Description |
|---|---|
| Path | Relative path to the file in the output directory |
Source code in packages/climate-ref-core/src/climate_ref_core/diagnostics.py
execution_slug()
#
output_fragment()
#
Get the relative path of the output directory to the root output directory
Returns:
| Type | Description |
|---|---|
| Path | Relative path to the output directory |
Source code in packages/climate-ref-core/src/climate_ref_core/diagnostics.py
to_output_path(filename)
#
Get the absolute path for a file in the output directory
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| filename | Path \| str \| None | Name of the file to get the full path for | required |
Returns:
| Type | Description |
|---|---|
| Path | Full path to the file in the output directory |
Source code in packages/climate-ref-core/src/climate_ref_core/diagnostics.py
ExecutionResult
#
The result of executing a diagnostic.
This execution may or may not be successful.
The content of the result follows the Earth System Metrics and Diagnostics Standards (EMDS).
Source code in packages/climate-ref-core/src/climate_ref_core/diagnostics.py
definition
instance-attribute
#
The definition of the diagnostic execution that produced this result.
metric_bundle_filename = None
class-attribute
instance-attribute
#
Filename of the diagnostic bundle file relative to the execution directory.
The contents of this file are defined by the EMDS standard.
output_bundle_filename = None
class-attribute
instance-attribute
#
Filename of the output bundle file relative to the execution directory.
The contents of this file are defined by the EMDS standard.
series_filename = None
class-attribute
instance-attribute
#
A collection of series metric values that were extracted from the execution.
These are written to a CSV file in the output directory.
successful = False
class-attribute
instance-attribute
#
Whether the diagnostic execution ran successfully.
as_relative_path(filename)
#
Get the relative path of a file in the output directory
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| filename | Path \| str | Path to a file in the output directory. If this is an absolute path, it will be converted to a relative path within the output directory. | required |
Returns:
| Type | Description |
|---|---|
| Path | Relative path to the file in the output directory |
Source code in packages/climate-ref-core/src/climate_ref_core/diagnostics.py
build_from_failure(definition)
staticmethod
#
Build a failed diagnostic result.
This is a placeholder. Additional log information should still be captured in the output bundle.
Source code in packages/climate-ref-core/src/climate_ref_core/diagnostics.py
build_from_output_bundle(definition, *, cmec_output_bundle, cmec_metric_bundle, series=tuple())
staticmethod
#
Build an ExecutionResult from a CMEC output bundle.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| definition | ExecutionDefinition | The execution definition. | required |
| cmec_output_bundle | CMECOutput \| dict[str, Any] | An output bundle in the CMEC format. | required |
| cmec_metric_bundle | CMECMetric \| dict[str, Any] | A diagnostic bundle in the CMEC format. | required |
| series | Sequence[SeriesMetricValue] | Series metric values extracted from the execution. | tuple() |
Returns:
| Type | Description |
|---|---|
| ExecutionResult | A prepared ExecutionResult object. The output bundle will be written to the output directory. |
Source code in packages/climate-ref-core/src/climate_ref_core/diagnostics.py
to_output_path(filename)
#
Get the absolute path for a file in the output directory
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| filename | str \| Path \| None | Name of the file to get the full path for. If None, the path to the output bundle will be returned. | required |
Returns:
| Type | Description |
|---|---|
| Path | Full path to the file in the output directory |
Source code in packages/climate-ref-core/src/climate_ref_core/diagnostics.py
ensure_relative_path(path, root_directory)
#
Ensure that a path is relative to a root directory
If a path is an absolute path, but not relative to the root directory, a ValueError is raised.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| path | Path \| str | The path to check | required |
| root_directory | Path | The root directory that the path should be relative to | required |
Raises:
| Type | Description |
|---|---|
| ValueError | If the path is not relative to the root directory |
Returns:
| Type | Description |
|---|---|
| | The path relative to the root directory |
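The behaviour described for ensure_relative_path can be approximated with pathlib. This is a sketch under stated assumptions, not the function's source: `ensure_relative` is a hypothetical name, relative inputs are assumed to be returned unchanged, and `PurePosixPath` is used only so the example behaves the same on every platform.

```python
from pathlib import PurePosixPath


def ensure_relative(path, root_directory: PurePosixPath) -> PurePosixPath:
    """Return path relative to root_directory, rejecting absolute paths outside it."""
    path = PurePosixPath(path)
    if not path.is_absolute():
        # Assumption: a path that is already relative is passed through as-is.
        return path
    # relative_to raises ValueError when path is not inside root_directory.
    return path.relative_to(root_directory)


root = PurePosixPath("/data/output")
inside = ensure_relative("/data/output/run-1/plot.png", root)
already_relative = ensure_relative("run-1/plot.png", root)

try:
    ensure_relative("/elsewhere/plot.png", root)
except ValueError:
    outside_rejected = True
else:
    outside_rejected = False
```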