CLI#
The ref command-line interface (CLI) is the primary way to interact with the Climate-REF framework.
This CLI tool is installed as part of the climate-ref package and provides commands for managing configurations, datasets, diagnostics, and more.
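For orientation, a minimal session might look like the following. The configuration directory path is a placeholder; the subcommands themselves are documented below.

```shell
# Show the available commands and the installed version
ref --help
ref --version

# Print the active configuration from a custom directory (path is illustrative)
ref --configuration-directory ~/.climate-ref config list
```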
ref#
A CLI for the Assessment Fast Track Rapid Evaluation Framework
This CLI provides a number of commands for managing and executing diagnostics.
Usage:
Options:
--configuration-directory PATH Configuration directory
-v, --verbose Set the log level to DEBUG
-q, --quiet Set the log level to WARNING
--log-level [ERROR|WARNING|DEBUG|INFO] Set the level of logging information to display [default: INFO]
--version Print the version and exit
--install-completion Install completion for the current shell.
--show-completion Show completion for the current shell, to copy it or customize the installation.
celery#
Managing remote Celery workers
This module is used to manage remote execution workers for the Climate REF project.
It is added to the ref command line interface if the climate-ref-celery package is installed.
A Celery worker should be run for each diagnostic provider.
Usage:
list-config#
List the Celery configuration
Usage:
start-worker#
Start a Celery worker for the given provider.
A Celery worker enables tasks to be executed in the background across multiple nodes. The worker registers a Celery task for each diagnostic in the provider; these tasks can be invoked by sending a task named '{package_slug}_{diagnostic_slug}'.
Providers must be registered as entry points in the package's pyproject.toml file, under the group climate-ref.providers (see import_provider for details).
Usage:
Options:
--loglevel TEXT Log level for the worker [default: info]
--provider TEXT Name of the provider to start a worker for. This argument may be supplied multiple times. If no provider is given, the worker will consume the default queue.
--package TEXT Deprecated; use --provider instead.
[EXTRA_ARGS]... Additional arguments for the worker
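As a sketch of the entry-point registration described above, a provider package's pyproject.toml might contain something like the following; the package name, module path, and attribute are all hypothetical:

```toml
# Hypothetical provider package registering itself with the REF
[project.entry-points."climate-ref.providers"]
example-provider = "climate_ref_example:provider"
```

A worker for such a provider could then be started with ref celery start-worker --provider example-provider.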
config#
View and update the REF configuration
Usage:
list#
Print the current climate_ref configuration
If a configuration directory is provided, the configuration will be loaded from that directory.
Usage:
datasets#
View and ingest input datasets
The metadata from these datasets is stored in the database so that it can be used to determine which executions are required for a given diagnostic without re-parsing the datasets.
Usage:
fetch-data#
Fetch REF-specific datasets
These datasets have been verified to have open licenses and are in the process of being added to Obs4MIPs.
Usage:
Options:
--registry TEXT Name of the data registry to use [required]
--output-directory PATH Output directory where files will be saved
--force-cleanup / --no-force-cleanup If True, remove any existing files [default: no-force-cleanup]
--symlink / --no-symlink If True, symlink files into the output directory, otherwise perform a copy [default: no-symlink]
--verify / --no-verify Verify the checksums of the fetched files [default: verify]
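An illustrative invocation (the registry name and output path are placeholders, not necessarily a registry that exists):

```shell
ref datasets fetch-data \
    --registry example-registry \
    --output-directory /data/ref-datasets \
    --symlink
```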
fetch-sample-data#
Fetch the sample data for the given version.
These data will be written into the test data directory. This operation may fail if the test data directory does not exist, as is the case for non-source-based installations.
Usage:
Options:
--force-cleanup / --no-force-cleanup If True, remove any existing files [default: no-force-cleanup]
--symlink / --no-symlink If True, symlink files into the output directory, otherwise perform a copy [default: no-symlink]
ingest#
Ingest a directory of datasets into the database
Each dataset will be loaded and validated using the specified dataset adapter. This will extract metadata from the datasets and store it in the database.
A table of the datasets will be printed to the console at the end of the operation.
Usage:
Options:
FILE_OR_DIRECTORY... [required]
--source-type [cmip6|cmip7|obs4mips|pmp-climatology] Type of source dataset [required]
--solve / --no-solve Solve for new diagnostic executions after ingestion [default: no-solve]
--dry-run / --no-dry-run Do not ingest datasets into the database [default: no-dry-run]
--n-jobs INTEGER Number of jobs to run in parallel
--skip-invalid / --no-skip-invalid Ignore (but log) any datasets that don't pass validation [default: skip-invalid]
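A sketch of a typical ingestion (the data path is a placeholder):

```shell
# Validate a directory of CMIP6 data without touching the database
ref datasets ingest --source-type cmip6 --dry-run /data/CMIP6

# Ingest it for real and immediately solve for new executions
ref datasets ingest --source-type cmip6 --solve /data/CMIP6
```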
list#
List the datasets that have been ingested
The data catalog is sorted by the date that the dataset was ingested, with the most recently ingested datasets first.
Usage:
Options:
--source-type [cmip6|cmip7|obs4mips|pmp-climatology] Type of source dataset [default: cmip6]
--column TEXT
--include-files / --no-include-files Include files in the output [default: no-include-files]
--limit INTEGER Limit the number of datasets (or files when using --include-files) to display. [default: 100]
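For example (the column names here are illustrative CMIP6 facets, not a guaranteed part of the catalog):

```shell
# Show the 20 most recently ingested obs4MIPs datasets
ref datasets list --source-type obs4mips --limit 20

# Restrict the output to selected columns
ref datasets list --source-type cmip6 --column variable_id --column source_id
```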
list-columns#
List the columns available in the data catalog for the given source type.
Usage:
Options:
--source-type [cmip6|cmip7|obs4mips|pmp-climatology] Type of source dataset [default: cmip6]
--include-files / --no-include-files Include files in the output [default: no-include-files]
executions#
View execution groups and their results
Usage:
delete-groups#
Delete execution groups matching the specified filters.
This command will delete execution groups and their associated executions. Use filters to specify which groups to delete. At least one filter must be provided to prevent accidental deletion of all groups.
Filters can be combined using AND logic across filter types and OR logic within a filter type.
Usage:
Options:
--diagnostic TEXT Filter by diagnostic slug (substring match, case-insensitive). Multiple values can be provided.
--provider TEXT Filter by provider slug (substring match, case-insensitive). Multiple values can be provided.
--filter TEXT Filter by facet key=value pairs (exact match). Multiple filters can be provided.
--successful / --not-successful Filter by successful or unsuccessful executions.
--dirty / --not-dirty Filter to include only dirty or clean execution groups. These execution groups will be re-computed on the next run.
--remove-outputs Also remove output directories from the filesystem
--force / --no-force Skip confirmation prompt [default: no-force]
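As an illustration of the filter semantics (all slugs here are hypothetical): values within a filter type are ORed, and filter types are ANDed, so the sketch below deletes groups whose diagnostic slug contains "annual-cycle" or "seasonal" AND whose provider slug contains "pmp". Previewing with list-groups first is a sensible habit:

```shell
# Preview which execution groups the filters would match
ref executions list-groups --diagnostic annual-cycle --diagnostic seasonal --provider pmp

# Delete them, including their output directories
ref executions delete-groups \
    --diagnostic annual-cycle --diagnostic seasonal \
    --provider pmp \
    --remove-outputs
```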
flag-dirty#
Flag an execution group for recomputation
Usage:
Options:
inspect#
Inspect a specific execution group by its ID
This will display the execution details, datasets, results directory, and logs if available.
Usage:
Options:
list-groups#
List the diagnostic execution groups that have been identified
The data catalog is sorted by the date that the execution group was created, with the most recently created groups first.
If the --column option is provided, only the specified columns will be displayed.
Filters can be combined using AND logic across filter types and OR logic within a filter type.
The output will be in a tabular format.
Usage:
Options:
--column TEXT Only include specified columns in the output
--limit INTEGER Limit the number of rows to display [default: 100]
--diagnostic TEXT Filter by diagnostic slug (substring match, case-insensitive). Multiple values can be provided.
--provider TEXT Filter by provider slug (substring match, case-insensitive). Multiple values can be provided.
--filter TEXT Filter by facet key=value pairs (exact match). Multiple filters can be provided.
--successful / --not-successful Filter by successful or unsuccessful executions.
--dirty / --not-dirty Filter to include only dirty or clean execution groups. These execution groups will be re-computed on the next run.
providers#
Manage the REF providers.
Usage:
create-env#
Create a conda environment containing the provider software.
If no provider is specified, all providers will be installed. If the provider is up to date or does not use a virtual environment, it will be skipped.
Usage:
Options:
list#
Print the available providers.
Usage:
solve#
Solve for executions that require recalculation
This may trigger a number of additional calculations depending on what data has been ingested since the last solve. This command will block until all executions have been solved or the timeout is reached.
Filters can be applied to limit the diagnostics and providers that are considered; see the --diagnostic and --provider options for more information.
Usage:
Options:
--dry-run / --no-dry-run Do not execute any diagnostics [default: no-dry-run]
--execute / --no-execute Solve the newly identified executions [default: execute]
--timeout INTEGER Timeout in seconds for the solve operation [default: 60]
--one-per-provider / --no-one-per-provider Limit to one execution per provider. This is useful for testing. [default: no-one-per-provider]
--one-per-diagnostic / --no-one-per-diagnostic Limit to one execution per diagnostic. This is useful for testing. [default: no-one-per-diagnostic]
--diagnostic TEXT Filter executions by diagnostic slug. Diagnostics are included if any of the filters match a case-insensitive substring of the diagnostic slug. Multiple values can be provided.
--provider TEXT Filter executions by provider slug. Providers are included if any of the filters match a case-insensitive substring of the provider slug. Multiple values can be provided.
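A couple of illustrative invocations (the provider slug is a placeholder):

```shell
# Preview what would be executed without running any diagnostics
ref solve --dry-run

# Run at most one execution per diagnostic for a single provider,
# allowing up to an hour for the solve operation
ref solve --provider example-provider --one-per-diagnostic --timeout 3600
```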