April 2025

Features

  • Billing:

    • Items now billed for include Container runtime (inlcuding per attached GPU), data transfer to and from bucket, and data storage (including Datasets and Artifacts).

    • Billing history page now presents a summary of expenditure for the last day, month, and total by product type.

  • Self Healing

    • Burst experiments that result in strong_fail will attempt to re-launch up to a maximum of 3 re-tries.

    • Experiments implementing checkpointing using the AtomicDirectory checkpointer from cycling_utils will have the latest checkpoint available to resume from on re-launch.

Fixes

  • Artifacts:

    • Checkpoints already shipped to the Artifact store are not shipped again.

    • Checkpoint artifacts are sorted by recency with the latest at the top of the page.

Last updated