climate_ref.models
#
Declaration of the models used by the REF.
These models are used to represent the data that is stored in the database.
Base
#
Bases: DeclarativeBase
Base class for all models
Source code in packages/climate-ref/src/climate_ref/models/base.py
Dataset
#
Bases: Base
Represents a dataset
A dataset is a collection of data files that is used as an input to the benchmarking process. Adding, removing, or updating a dataset will trigger a new diagnostic calculation.
A polymorphic association is used to capture the different types of datasets as each dataset type may have different metadata fields. This enables the use of a single table to store all datasets, but still allows for querying specific metadata fields for each dataset type.
Source code in packages/climate-ref/src/climate_ref/models/dataset.py
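The idea of one shared dataset table plus type-specific metadata can be sketched at the SQL level. This is only an illustration using the standard-library `sqlite3` module, not the SQLAlchemy models documented here: the `cmip6_metadata` table, its columns, and the example slugs are hypothetical, and only `slug` and `dataset_type` come from the attributes listed on this page.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Shared table: every dataset, whatever its type, gets one row here.
    CREATE TABLE dataset (
        id INTEGER PRIMARY KEY,
        slug TEXT UNIQUE,
        dataset_type TEXT NOT NULL
    );
    CREATE INDEX ix_dataset_dataset_type ON dataset (dataset_type);

    -- Hypothetical per-type table holding CMIP6-only metadata fields.
    CREATE TABLE cmip6_metadata (
        dataset_id INTEGER PRIMARY KEY REFERENCES dataset (id),
        source_id TEXT,
        variable_id TEXT
    );
    """
)


def add_dataset(slug: str, dataset_type: str) -> int:
    """Insert a row in the shared table and return its primary key."""
    cur = conn.execute(
        "INSERT INTO dataset (slug, dataset_type) VALUES (?, ?)",
        (slug, dataset_type),
    )
    return cur.lastrowid


# Placeholder slugs, not real dataset identifiers.
cmip6_id = add_dataset("CMIP6.example.tas", "cmip6")
conn.execute(
    "INSERT INTO cmip6_metadata VALUES (?, ?, ?)",
    (cmip6_id, "EXAMPLE-MODEL", "tas"),
)
add_dataset("obs4MIPs.example.tas", "obs4mips")

# All datasets can be listed from the single shared table ...
all_slugs = [row[0] for row in conn.execute("SELECT slug FROM dataset ORDER BY id")]

# ... while type-specific fields remain queryable through a join.
tas_slugs = [
    row[0]
    for row in conn.execute(
        "SELECT d.slug FROM dataset d "
        "JOIN cmip6_metadata m ON m.dataset_id = d.id "
        "WHERE m.variable_id = ?",
        ("tas",),
    )
]
```

The `dataset_type` column acts as the discriminator that tells a reader of the shared table which type-specific metadata to expect.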
created_at = mapped_column(server_default=func.now())
class-attribute
instance-attribute
#
When the dataset was added to the database
dataset_type = mapped_column(nullable=False, index=True)
class-attribute
instance-attribute
#
Type of dataset
finalised = mapped_column(default=True, nullable=False)
class-attribute
instance-attribute
#
Whether the complete set of metadata for the dataset has been finalised.
For CMIP6, ingestion may initially create unfinalised datasets (False) until all metadata is extracted. For other dataset types (e.g., obs4MIPs, PMP climatology), this should be True upon creation.
slug = mapped_column(unique=True)
class-attribute
instance-attribute
#
Globally unique identifier for the dataset.
In the case of CMIP6 datasets, this is the instance_id.
updated_at = mapped_column(server_default=func.now(), onupdate=func.now())
class-attribute
instance-attribute
#
When the dataset was updated.
Updating a dataset will trigger a new diagnostic calculation.
Diagnostic
#
Bases: CreatedUpdatedMixin, Base
Represents a diagnostic that can be calculated
Source code in packages/climate-ref/src/climate_ref/models/diagnostic.py
enabled = mapped_column(default=True)
class-attribute
instance-attribute
#
Whether the diagnostic is enabled or not
If a diagnostic is not enabled, it will not be used for any calculations.
name = mapped_column()
class-attribute
instance-attribute
#
Long name of the diagnostic
provider_id = mapped_column(ForeignKey('provider.id'))
class-attribute
instance-attribute
#
The provider that provides the diagnostic
slug = mapped_column()
class-attribute
instance-attribute
#
Unique identifier for the diagnostic
This will be used to reference the diagnostic in the benchmarking process
Execution
#
Bases: CreatedUpdatedMixin, Base
Represents a single execution of a diagnostic
Each execution is part of a group of executions that share similar input datasets.
An execution group might be run multiple times as new data becomes available;
each run will create an Execution.
Source code in packages/climate-ref/src/climate_ref/models/execution.py
dataset_hash = mapped_column(index=True)
class-attribute
instance-attribute
#
Hash of the datasets used to calculate the diagnostic
This is used to verify if an existing diagnostic execution has been run with the same datasets.
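How such a hash is computed is not described on this page. The sketch below is one plausible approach, assuming the hash is derived from the dataset slugs; the function name `hash_datasets` and the choice of SHA-1 are hypothetical.

```python
import hashlib


def hash_datasets(slugs: list[str]) -> str:
    """Return a hash that depends only on the set of slugs, not their order.

    Hypothetical helper: the REF's real hashing scheme may use other inputs.
    """
    digest = hashlib.sha1()
    for slug in sorted(set(slugs)):
        digest.update(slug.encode("utf-8"))
        digest.update(b"\n")  # separator so "ab" + "c" differs from "a" + "bc"
    return digest.hexdigest()


same_inputs = hash_datasets(["b", "a"]) == hash_datasets(["a", "b"])
changed_inputs = hash_datasets(["a", "b"]) != hash_datasets(["a", "b", "c"])
```

Sorting before hashing keeps the result stable regardless of the order in which datasets were collected, so only a real change in the inputs produces a different hash.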
datasets = relationship(secondary=execution_datasets)
class-attribute
instance-attribute
#
The datasets used in this execution
execution_group_id = mapped_column(ForeignKey('execution_group.id', name='fk_execution_id'), index=True)
class-attribute
instance-attribute
#
The execution group that this execution belongs to
output_fragment = mapped_column()
class-attribute
instance-attribute
#
Relative directory to store the output of the execution.
During execution this directory is relative to the temporary directory. If the diagnostic execution is successful, its outputs will be moved to the final output directory and the temporary directory will be cleaned up. This directory may contain multiple input and output files.
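The move from the temporary directory to the final output directory might look roughly like the sketch below. The function `promote_output`, the directory names, and the example fragment are all hypothetical; the REF's own implementation is not shown on this page.

```python
import shutil
import tempfile
from pathlib import Path


def promote_output(scratch_root: Path, results_root: Path, output_fragment: str) -> Path:
    """Move a finished execution from the scratch area to the results area.

    The same relative fragment is used under both roots (hypothetical helper).
    """
    src = scratch_root / output_fragment
    dest = results_root / output_fragment
    dest.parent.mkdir(parents=True, exist_ok=True)
    shutil.move(str(src), str(dest))
    # Clean up whatever is left of the temporary area.
    shutil.rmtree(scratch_root, ignore_errors=True)
    return dest


with tempfile.TemporaryDirectory() as tmp:
    scratch = Path(tmp) / "scratch"
    results = Path(tmp) / "results"
    fragment = "provider/diagnostic/abc123"  # placeholder fragment

    # Simulate a successful execution that wrote one output file.
    (scratch / fragment).mkdir(parents=True)
    (scratch / fragment / "output.json").write_text("{}")

    final_dir = promote_output(scratch, results, fragment)
    moved = (final_dir / "output.json").exists()
    scratch_left = scratch.exists()
```

Storing only the relative fragment means the same value stays valid both while the execution is running and after its outputs have been moved.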
path = mapped_column(nullable=True)
class-attribute
instance-attribute
#
Path to the output bundle
Relative to the diagnostic execution result output directory
retracted = mapped_column(default=False)
class-attribute
instance-attribute
#
Whether the diagnostic execution result has been retracted or not
This may happen if a dataset has been retracted, or if the diagnostic execution was incorrect. Rather than delete the values, they are marked as retracted. These data may still be visible in the UI, but should be marked as retracted.
successful = mapped_column(nullable=True, index=True)
class-attribute
instance-attribute
#
Whether the run was successful
mark_failed()
#
mark_successful(path)
#
Mark the diagnostic execution as successful
Source code in packages/climate-ref/src/climate_ref/models/execution.py
register_datasets(db, execution_dataset)
#
Register the datasets used in the diagnostic calculation with the execution
Source code in packages/climate-ref/src/climate_ref/models/execution.py
ExecutionGroup
#
Bases: CreatedUpdatedMixin, Base
Represents a group of executions with a shared set of input datasets.
When solving, the ExecutionGroups are derived from the available datasets,
the defined diagnostics and their data requirements. From the information in the
group an execution can be triggered, which is an actual run of a diagnostic calculation
with a specific set of input datasets.
When the ExecutionGroup is created, it is marked dirty, meaning there are no
current executions available. When an Execution has run successfully for an
ExecutionGroup, the dirty mark is removed. If new versions of the input datasets
become available after ingesting new data and solving again, the
ExecutionGroup will be marked dirty again.
The diagnostic_id and key form a unique identifier for ExecutionGroups.
Source code in packages/climate-ref/src/climate_ref/models/execution.py
diagnostic_id = mapped_column(ForeignKey('diagnostic.id'), index=True)
class-attribute
instance-attribute
#
The diagnostic that this execution group belongs to
dirty = mapped_column(default=False)
class-attribute
instance-attribute
#
Whether the execution group should be rerun
An execution group is dirty if the diagnostic or any of the input datasets has been updated since the last execution.
key = mapped_column(index=True)
class-attribute
instance-attribute
#
Key for the datasets in this Execution group.
selectors = mapped_column(default=dict)
class-attribute
instance-attribute
#
Collection of selectors that define the group
These selectors are the unique key, value pairs that were selected during the initial groupby operation. These are also used to define the dataset key.
should_run(dataset_hash)
#
Check if the diagnostic execution group needs to be executed.
The diagnostic execution group should be run if:
- the execution group is marked as dirty
- no executions have been performed ever
- the dataset hash is different from the last run
Source code in packages/climate-ref/src/climate_ref/models/execution.py
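The three conditions listed for `should_run` can be sketched with plain dataclasses. The class names `ExecutionSketch` and `ExecutionGroupSketch` are hypothetical stand-ins for the database models, and the sketch assumes the most recent execution is the last element of the list.

```python
from dataclasses import dataclass, field


@dataclass
class ExecutionSketch:
    """Stand-in for an Execution; only the dataset hash matters here."""

    dataset_hash: str


@dataclass
class ExecutionGroupSketch:
    """Stand-in for an ExecutionGroup with its executions in run order."""

    dirty: bool = False
    executions: list[ExecutionSketch] = field(default_factory=list)

    def should_run(self, dataset_hash: str) -> bool:
        if self.dirty:
            return True  # explicitly flagged for a rerun
        if not self.executions:
            return True  # nothing has ever been executed
        # Rerun only if the inputs differ from those of the last run.
        return self.executions[-1].dataset_hash != dataset_hash


group = ExecutionGroupSketch()
first_check = group.should_run("hash-1")    # no executions yet
group.executions.append(ExecutionSketch("hash-1"))
second_check = group.should_run("hash-1")   # same inputs as the last run
third_check = group.should_run("hash-2")    # inputs changed
group.dirty = True
fourth_check = group.should_run("hash-1")   # dirty flag forces a rerun
```

The checks are ordered from cheapest to most specific, and any single condition is enough to trigger a run.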
ExecutionOutput
#
Bases: DimensionMixin, CreatedUpdatedMixin, Base
An output generated as part of an execution.
This output may be a plot, data file or HTML file. These outputs are defined in the CMEC output bundle.
Outputs can be tagged with dimensions from the controlled vocabulary to enable filtering and organization.
Source code in packages/climate-ref/src/climate_ref/models/execution.py
description = mapped_column(nullable=True)
class-attribute
instance-attribute
#
Long description describing the plot
filename = mapped_column(nullable=True)
class-attribute
instance-attribute
#
Path to the output
Relative to the diagnostic execution result output directory
long_name = mapped_column(nullable=True)
class-attribute
instance-attribute
#
Human readable name describing the plot
output_type = mapped_column(index=True)
class-attribute
instance-attribute
#
Type of the output
This will determine how the output is displayed
short_name = mapped_column(nullable=True)
class-attribute
instance-attribute
#
Short key of the output
This is unique for a given result and output type
build(*, execution_id, output_type, dimensions, filename=None, short_name=None, long_name=None, description=None)
classmethod
#
Build an ExecutionOutput from dimensions and metadata
This is a helper method that validates the dimensions supplied.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `execution_id` | `int` | Execution that created the output | required |
| `output_type` | `ResultOutputType` | Type of the output | required |
| `dimensions` | `dict[str, str]` | Dimensions that describe the output | required |
| `filename` | `str \| None` | Path to the output | `None` |
| `short_name` | `str \| None` | Short key of the output | `None` |
| `long_name` | `str \| None` | Human readable name | `None` |
| `description` | `str \| None` | Long description | `None` |
Raises:
| Type | Description |
|---|---|
| `KeyError` | If an unknown dimension was supplied. Dimensions must exist in the controlled vocabulary. |
Returns:
| Type | Description |
|---|---|
| | Newly created ExecutionOutput |
Source code in packages/climate-ref/src/climate_ref/models/execution.py
MetricValue
#
Bases: DimensionMixin, CreatedUpdatedMixin, Base
Represents a single metric value
This is a base class for different types of metric values (e.g. scalar, series) which are stored in a single table using single table inheritance.
This value has a number of dimensions which are used to query the diagnostic values. These dimensions describe aspects such as the type of statistic being measured, the region of interest or the model from which the statistic is being measured.
The columns in this table are not known statically because the REF can track an arbitrary
set of dimensions depending on the controlled vocabulary that will be used.
A call to register_cv_dimensions must be made before using this class.
Source code in packages/climate-ref/src/climate_ref/models/metric_value.py
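To illustrate why the columns cannot be known statically, the sketch below adds one column per dimension at runtime and stores scalar and series rows in the same table, distinguished by `type`. It uses the standard-library `sqlite3` module rather than the SQLAlchemy models documented here; the helper `register_dimensions` and the dimension names are hypothetical and do not reflect the signature of `register_cv_dimensions`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Fixed columns shared by every kind of metric value.
conn.execute(
    "CREATE TABLE metric_value ("
    "id INTEGER PRIMARY KEY, execution_id INTEGER, type TEXT, value REAL)"
)
conn.execute("CREATE INDEX ix_metric_value_type ON metric_value (type)")


def register_dimensions(dimensions: list[str]) -> None:
    """Add one text column per dimension, skipping columns that already exist."""
    existing = {row[1] for row in conn.execute("PRAGMA table_info(metric_value)")}
    for name in dimensions:
        if not name.isidentifier():
            raise ValueError(f"Invalid dimension name: {name!r}")
        if name not in existing:
            conn.execute(f'ALTER TABLE metric_value ADD COLUMN "{name}" TEXT')
            existing.add(name)


# Hypothetical dimensions taken from some controlled vocabulary.
register_dimensions(["statistic", "region"])
register_dimensions(["statistic", "region"])  # safe to call again

conn.execute(
    "INSERT INTO metric_value (execution_id, type, value, statistic, region) "
    "VALUES (?, ?, ?, ?, ?)",
    (1, "scalar", 0.42, "rmse", "global"),
)

# Dimensions are ordinary columns, so they can be used directly in filters.
rows = conn.execute(
    "SELECT value FROM metric_value WHERE type = 'scalar' AND region = ?",
    ("global",),
).fetchall()
columns = [row[1] for row in conn.execute("PRAGMA table_info(metric_value)")]
```

Because the dimension columns only exist after registration, any query or insert that names a dimension has to run after that step, which mirrors the requirement stated above.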
type = mapped_column(index=True)
class-attribute
instance-attribute
#
Type of metric value
This value is used to determine how the metric value should be interpreted.
Provider
#
Bases: CreatedUpdatedMixin, Base
Represents a provider that can provide diagnostic calculations
Source code in packages/climate-ref/src/climate_ref/models/provider.py
name = mapped_column()
class-attribute
instance-attribute
#
Long name of the provider
slug = mapped_column(unique=True)
class-attribute
instance-attribute
#
Globally unique identifier for the provider.
version = mapped_column(nullable=False)
class-attribute
instance-attribute
#
Version of the provider.
This should map to the package version.
ScalarMetricValue
#
Bases: MetricValue
A scalar value with associated dimensions
This is a subclass of MetricValue that is used to represent a scalar value.
Source code in packages/climate-ref/src/climate_ref/models/metric_value.py
build(*, execution_id, value, dimensions, attributes)
classmethod
#
Build a MetricValue from a collection of dimensions and a value
This is a helper method that validates the dimensions supplied and provides an interface similar to climate_ref_core.metric_values.ScalarMetricValue.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `execution_id` | `int` | Execution that created the diagnostic value | required |
| `value` | `float` | The value of the diagnostic | required |
| `dimensions` | `dict[str, str]` | Dimensions that describe the diagnostic execution result | required |
| `attributes` | `dict[str, Any] \| None` | Optional additional attributes that describe the value but are not in the controlled vocabulary. | required |
Raises:
| Type | Description |
|---|---|
| `KeyError` | If an unknown dimension was supplied. Dimensions must exist in the controlled vocabulary. |
Returns:
| Type | Description |
|---|---|
| | Newly created MetricValue |
Source code in packages/climate-ref/src/climate_ref/models/metric_value.py
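The dimension check described under Raises can be sketched as follows. This is a simplified stand-in that returns a plain dictionary instead of a database object; the function `build_scalar` and the contents of `CV_DIMENSIONS` are hypothetical.

```python
from __future__ import annotations

from typing import Any

# Hypothetical controlled vocabulary; the real one is configured elsewhere.
CV_DIMENSIONS = {"statistic", "region", "source_id"}


def build_scalar(
    *,
    execution_id: int,
    value: float,
    dimensions: dict[str, str],
    attributes: dict[str, Any] | None,
) -> dict[str, Any]:
    """Validate the dimensions, then assemble a record for a scalar value."""
    unknown = sorted(set(dimensions) - CV_DIMENSIONS)
    if unknown:
        raise KeyError(f"Unknown dimension(s): {unknown}")
    return {
        "execution_id": execution_id,
        "value": float(value),
        "attributes": attributes,
        **dimensions,
    }


record = build_scalar(
    execution_id=1,
    value=0.42,
    dimensions={"statistic": "rmse", "region": "global"},
    attributes={"units": "K"},  # free-form, not checked against the vocabulary
)

try:
    build_scalar(execution_id=1, value=0.42, dimensions={"colour": "red"}, attributes=None)
    rejected = False
except KeyError:
    rejected = True
```

Keeping `attributes` separate from `dimensions` lets free-form metadata travel with the value without widening the controlled vocabulary.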
SeriesMetricValue
#
Bases: MetricValue
A 1d series with associated dimensions
This is a subclass of MetricValue that is used to represent a series. This can be used to represent time series, vertical profiles or other 1d data.
Source code in packages/climate-ref/src/climate_ref/models/metric_value.py
build(*, execution_id, values, index, index_name, dimensions, attributes)
classmethod
#
Build a database object from a series
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `execution_id` | `int` | Execution that created the diagnostic value | required |
| `values` | `list[float \| int]` | 1-d array of values | required |
| `index` | `list[float \| int \| str]` | 1-d array of index values | required |
| `index_name` | `str` | Name of the index. Used for presentation purposes | required |
| `dimensions` | `dict[str, str]` | Dimensions that describe the diagnostic execution result | required |
| `attributes` | `dict[str, Any] \| None` | Optional additional attributes that describe the value but are not in the controlled vocabulary. | required |
Raises:
| Type | Description |
|---|---|
| `KeyError` | If an unknown dimension was supplied. Dimensions must exist in the controlled vocabulary. |
| `ValueError` | If the length of values and index do not match. |
Returns:
| Type | Description |
|---|---|
| | Newly created MetricValue |
Source code in packages/climate-ref/src/climate_ref/models/metric_value.py
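The length check on `values` and `index` can be sketched in isolation. The function `build_series` below is a hypothetical stand-in that returns a plain dictionary and omits the dimension validation and database handling of the real method.

```python
from __future__ import annotations


def build_series(
    *,
    values: list[float | int],
    index: list[float | int | str],
    index_name: str,
) -> dict[str, object]:
    """Pair each value with an index entry, rejecting mismatched lengths."""
    if len(values) != len(index):
        raise ValueError(
            f"Index length ({len(index)}) must match values length ({len(values)})"
        )
    return {"index_name": index_name, "index": list(index), "values": list(values)}


# A short time series: one value per year.
series = build_series(values=[0.1, 0.2, 0.3], index=[2000, 2001, 2002], index_name="year")

try:
    build_series(values=[0.1, 0.2], index=[2000], index_name="year")
    mismatch_rejected = False
except ValueError:
    mismatch_rejected = True
```

Since `index` accepts strings as well as numbers, the same structure can hold a time series, a vertical profile, or values keyed by category labels.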
sub-packages#
| Sub-package | Description |
|---|---|
| base | |
| dataset | |
| diagnostic | |
| execution | |
| metric_value | |
| mixins | Model mixins for shared functionality |
| provider | |