CLI#
The ref command-line interface (CLI) is the primary way to interact with the Climate-REF framework.
This CLI tool is installed as part of the climate-ref package and provides commands for managing configurations, datasets, diagnostics, and more.
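For orientation, a minimal session might look like the following. The configuration directory path is a placeholder; the subcommands themselves are documented below.

```shell
# Show the available commands and the installed version
ref --help
ref --version

# Print the active configuration from a custom directory (path is illustrative)
ref --configuration-directory ~/.climate-ref config list
```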
ref#
A CLI for the Assessment Fast Track Rapid Evaluation Framework
This CLI provides a number of commands for managing and executing diagnostics.
Usage:
Options:
--configuration-directory PATH Configuration directory
-v, --verbose Set the log level to DEBUG
-q, --quiet Set the log level to WARNING
--log-level [ERROR|WARNING|DEBUG|INFO] Set the level of logging information to display [default: INFO]
--version Print the version and exit
--install-completion Install completion for the current shell.
--show-completion Show completion for the current shell, to copy it or customize the installation.
celery#
Managing remote Celery workers
This module is used to manage remote execution workers for the Climate REF project.
It is added to the ref command line interface if the climate-ref-celery package is installed.
A Celery worker should be run for each diagnostic provider.
Usage:
list-config#
List the Celery configuration
Usage:
start-worker#
Start a Celery worker for the given provider.
A Celery worker enables tasks to be executed in the background across multiple nodes. The worker registers a Celery task for each diagnostic in the provider; these tasks can be invoked by sending a task named '{package_slug}_{diagnostic_slug}'.
Providers must be registered as entry points in the package's pyproject.toml file, under the group climate-ref.providers (see import_provider for details).
Usage:
Options:
--loglevel TEXT Log level for the worker [default: info]
--provider TEXT Name of the provider to start a worker for. This argument may be supplied multiple times. If no provider is given, the worker will consume the default queue.
--package TEXT Deprecated; use --provider instead.
[EXTRA_ARGS]... Additional arguments for the worker
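As a sketch of the entry-point registration described above, a provider package's pyproject.toml might contain something like the following; the package name, module path, and attribute are all hypothetical:

```toml
# Hypothetical provider package registering itself with the REF
[project.entry-points."climate-ref.providers"]
example-provider = "climate_ref_example:provider"
```

A worker for such a provider could then be started with ref celery start-worker --provider example-provider.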
config#
View and update the REF configuration
Usage:
list#
Print the current climate_ref configuration
If a configuration directory is provided, the configuration will be loaded from that directory.
Usage:
datasets#
View and ingest input datasets
The metadata from these datasets is stored in the database so that it can be used to determine which executions are required for a given diagnostic without re-parsing the datasets.
Usage:
fetch-data#
Fetch REF-specific datasets
These datasets have been verified to have open licenses and are in the process of being added to Obs4MIPs.
Usage:
Options:
--registry TEXT Name of the data registry to use [required]
--output-directory PATH Output directory where files will be saved
--force-cleanup / --no-force-cleanup If True, remove any existing files [default: no-force-cleanup]
--symlink / --no-symlink If True, symlink files into the output directory, otherwise perform a copy [default: no-symlink]
--verify / --no-verify Verify the checksums of the fetched files [default: verify]
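An illustrative invocation (the registry name and output path are placeholders, not necessarily a registry that exists):

```shell
ref datasets fetch-data \
    --registry example-registry \
    --output-directory /data/ref-datasets \
    --symlink
```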
fetch-sample-data#
Fetch the sample data for the given version.
These data will be written into the test data directory. This operation may fail if the test data directory does not exist, as is the case for non-source-based installations.
Usage:
Options:
--force-cleanup / --no-force-cleanup If True, remove any existing files [default: no-force-cleanup]
--symlink / --no-symlink If True, symlink files into the output directory, otherwise perform a copy [default: no-symlink]
ingest#
Ingest a directory of datasets into the database
Each dataset will be loaded and validated using the specified dataset adapter. This will extract metadata from the datasets and store it in the database.
A table of the datasets will be printed to the console at the end of the operation.
Usage:
Options:
FILE_OR_DIRECTORY... [required]
--source-type [cmip6|cmip7|obs4mips|pmp-climatology] Type of source dataset [required]
--solve / --no-solve Solve for new diagnostic executions after ingestion [default: no-solve]
--dry-run / --no-dry-run Do not ingest datasets into the database [default: no-dry-run]
--n-jobs INTEGER Number of jobs to run in parallel
--skip-invalid / --no-skip-invalid Ignore (but log) any datasets that don't pass validation [default: skip-invalid]
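A sketch of a typical ingestion (the data path is a placeholder):

```shell
# Validate a directory of CMIP6 data without touching the database
ref datasets ingest --source-type cmip6 --dry-run /data/CMIP6

# Ingest it for real and immediately solve for new executions
ref datasets ingest --source-type cmip6 --solve /data/CMIP6
```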
list#
List the datasets that have been ingested
The data catalog is sorted by the date that the dataset was ingested, with the most recently ingested datasets first.
Usage:
Options:
--source-type [cmip6|cmip7|obs4mips|pmp-climatology] Type of source dataset [default: cmip6]
--column TEXT
--include-files / --no-include-files Include files in the output [default: no-include-files]
--limit INTEGER Limit the number of datasets (or files when using --include-files) to display. [default: 100]
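For example (the column names here are illustrative CMIP6 facets, not a guaranteed part of the catalog):

```shell
# Show the 20 most recently ingested obs4MIPs datasets
ref datasets list --source-type obs4mips --limit 20

# Restrict the output to selected columns
ref datasets list --source-type cmip6 --column variable_id --column source_id
```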
list-columns#
List the columns available in the data catalog for the given source type.
Usage:
Options:
--source-type [cmip6|cmip7|obs4mips|pmp-climatology] Type of source dataset [default: cmip6]
--include-files / --no-include-files Include files in the output [default: no-include-files]
executions#
View execution groups and their results
Usage:
delete-groups#
Delete execution groups matching the specified filters.
This command will delete execution groups and their associated executions. Use filters to specify which groups to delete. At least one filter must be provided to prevent accidental deletion of all groups.
Filters can be combined using AND logic across filter types and OR logic within a filter type.
Usage:
Options:
--diagnostic TEXT Filter by diagnostic slug (substring match, case-insensitive). Multiple values can be provided.
--provider TEXT Filter by provider slug (substring match, case-insensitive). Multiple values can be provided.
--filter TEXT Filter by facet key=value pairs (exact match). Multiple filters can be provided.
--successful / --not-successful Filter by successful or unsuccessful executions.
--dirty / --not-dirty Filter to include only dirty or clean execution groups. These execution groups will be re-computed on the next run.
--remove-outputs Also remove output directories from the filesystem
--force / --no-force Skip confirmation prompt [default: no-force]
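As an illustration of the filter semantics (all slugs here are hypothetical): values within a filter type are ORed, and filter types are ANDed, so the sketch below deletes groups whose diagnostic slug contains "annual-cycle" or "seasonal" AND whose provider slug contains "pmp". Previewing with list-groups first is a sensible habit:

```shell
# Preview which execution groups the filters would match
ref executions list-groups --diagnostic annual-cycle --diagnostic seasonal --provider pmp

# Delete them, including their output directories
ref executions delete-groups \
    --diagnostic annual-cycle --diagnostic seasonal \
    --provider pmp \
    --remove-outputs
```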
flag-dirty#
Flag an execution group for recomputation
Usage:
Options:
inspect#
Inspect a specific execution group by its ID
This will display the execution details, datasets, results directory, and logs if available.
Usage:
Options:
list-groups#
List the diagnostic execution groups that have been identified
The data catalog is sorted by the date that the execution group was created, with the most recently created groups first.
If the --column option is provided, only the specified columns will be displayed.
Filters can be combined using AND logic across filter types and OR logic within a filter type.
The output will be in a tabular format.
Usage:
Options:
--column TEXT Only include specified columns in the output
--limit INTEGER Limit the number of rows to display [default: 100]
--diagnostic TEXT Filter by diagnostic slug (substring match, case-insensitive). Multiple values can be provided.
--provider TEXT Filter by provider slug (substring match, case-insensitive). Multiple values can be provided.
--filter TEXT Filter by facet key=value pairs (exact match). Multiple filters can be provided.
--successful / --not-successful Filter by successful or unsuccessful executions.
--dirty / --not-dirty Filter to include only dirty or clean execution groups. These execution groups will be re-computed on the next run.
providers#
Manage the REF providers.
Usage:
create-env#
Create a conda environment containing the provider software.
If no provider is specified, all providers will be installed. If the provider is up to date or does not use a virtual environment, it will be skipped.
Usage:
Options:
list#
Print the available providers.
Usage:
solve#
Solve for executions that require recalculation
This may trigger a number of additional calculations depending on what data has been ingested since the last solve. This command will block until all executions have been solved or the timeout is reached.
Filters can be applied to limit the diagnostics and providers that are considered; see the --diagnostic and --provider options for more information.
Usage:
Options:
--dry-run / --no-dry-run Do not execute any diagnostics [default: no-dry-run]
--execute / --no-execute Solve the newly identified executions [default: execute]
--timeout INTEGER Timeout in seconds for the solve operation [default: 60]
--one-per-provider / --no-one-per-provider Limit to one execution per provider. This is useful for testing. [default: no-one-per-provider]
--one-per-diagnostic / --no-one-per-diagnostic Limit to one execution per diagnostic. This is useful for testing. [default: no-one-per-diagnostic]
--diagnostic TEXT Filter executions by diagnostic slug. Diagnostics are included if any of the filters match a case-insensitive substring of the diagnostic slug. Multiple values can be provided.
--provider TEXT Filter executions by provider slug. Providers are included if any of the filters match a case-insensitive substring of the provider slug. Multiple values can be provided.
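A couple of illustrative invocations (the provider slug is a placeholder):

```shell
# Preview what would be executed without running any diagnostics
ref solve --dry-run

# Run at most one execution per diagnostic for a single provider,
# allowing up to an hour for the solve operation
ref solve --provider example-provider --one-per-diagnostic --timeout 3600
```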