iceberg-cpp
API for removing old snapshots from a table.
#include <expire_snapshots.h>
Classes | |
| struct | ApplyResult |
Public Member Functions | |
| ExpireSnapshots & | ExpireSnapshotId (int64_t snapshot_id) |
| Expires a specific Snapshot identified by id. | |
| ExpireSnapshots & | ExpireOlderThan (int64_t timestamp_millis) |
| Expires all snapshots older than the given timestamp. | |
| ExpireSnapshots & | RetainLast (int num_snapshots) |
| Retains the most recent ancestors of the current snapshot. | |
| ExpireSnapshots & | DeleteWith (std::function< void(const std::string &)> delete_func) |
| Passes an alternative delete implementation that will be used for manifests and data files. | |
| ExpireSnapshots & | CleanupLevel (enum CleanupLevel level) |
| Configures the cleanup level for expired files. | |
| ExpireSnapshots & | CleanExpiredMetadata (bool clean) |
| Enable cleaning up unused metadata, such as partition specs, schemas, etc. | |
| Kind | kind () const final |
| Return the kind of this pending update. | |
| bool | IsRetryable () const override |
| Whether this update can be retried after a commit conflict. | |
| Result< ApplyResult > | Apply () |
| Apply the pending changes and return the results. | |
| Status | Finalize (Result< const TableMetadata * > commit_result) override |
| Finalize the expire snapshots update, cleaning up expired files. | |
Public Member Functions inherited from iceberg::PendingUpdate | |
| virtual Status | Commit () |
| Apply the pending changes and commit. | |
| PendingUpdate (const PendingUpdate &)=delete | |
| PendingUpdate & | operator= (const PendingUpdate &)=delete |
| PendingUpdate (PendingUpdate &&) noexcept=default | |
| PendingUpdate & | operator= (PendingUpdate &&) noexcept=default |
Public Member Functions inherited from iceberg::ErrorCollector | |
| ErrorCollector (ErrorCollector &&)=default | |
| ErrorCollector & | operator= (ErrorCollector &&)=default |
| ErrorCollector (const ErrorCollector &)=default | |
| ErrorCollector & | operator= (const ErrorCollector &)=default |
| template<typename... Args> | |
| auto & | AddError (this auto &self, ErrorKind kind, const std::format_string< Args... > fmt, Args &&... args) |
| Add a specific error and return reference to derived class. | |
| auto & | AddError (this auto &self, Error err) |
| Add an existing error object and return reference to derived class. | |
| auto & | AddError (this auto &self, std::unexpected< Error > err) |
| Add an unexpected result's error and return reference to derived class. | |
| bool | has_errors () const |
| Check if any errors have been collected. | |
| size_t | error_count () const |
| Get the number of errors collected. | |
| Status | CheckErrors () const |
| Check for accumulated errors and return them if any exist. | |
| void | ClearErrors () |
| Clear all accumulated errors. | |
| const std::vector< Error > & | errors () const |
| Get read-only access to all collected errors. | |
Static Public Member Functions | |
| static Result< std::shared_ptr< ExpireSnapshots > > | Make (std::shared_ptr< TransactionContext > ctx) |
Additional Inherited Members | |
Public Types inherited from iceberg::PendingUpdate | |
| enum class | Kind : uint8_t { kExpireSnapshots , kSetSnapshot , kUpdateLocation , kUpdatePartitionSpec , kUpdatePartitionStatistics , kUpdateProperties , kUpdateSchema , kUpdateSnapshot , kUpdateSnapshotReference , kUpdateSortOrder , kUpdateStatistics } |
Protected Member Functions inherited from iceberg::PendingUpdate | |
| PendingUpdate (std::shared_ptr< TransactionContext > ctx) | |
| const TableMetadata & | base () const |
Protected Attributes inherited from iceberg::PendingUpdate | |
| std::shared_ptr< TransactionContext > | ctx_ |
Protected Attributes inherited from iceberg::ErrorCollector | |
| std::vector< Error > | errors_ |
API for removing old snapshots from a table.
This API accumulates snapshot deletions and commits the new list to the table. This API does not allow deleting the current snapshot.
When committing, these changes will be applied to the latest table metadata. Commit conflicts will be resolved by applying the changes to the new latest metadata and reattempting the commit.
Manifest files that are no longer used by valid snapshots will be deleted. Data files that were deleted by snapshots that are expired will be deleted. DeleteWith() can be used to pass an alternative deletion method.
Apply() returns a list of the snapshots that will be removed.
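The conflict handling described above (re-apply the pending changes to the latest metadata, then reattempt the commit) can be illustrated with a small, self-contained sketch. Everything in it is hypothetical: `MetadataSketch`, `CatalogSketch`, `CommitExpirationWithRetry`, and the `on_attempt` hook are illustrative stand-ins and are not part of iceberg-cpp. It assumes an optimistic, version-checked commit.

```cpp
#include <algorithm>
#include <cstdint>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical stand-in for table metadata: a version plus live snapshot ids.
struct MetadataSketch {
  int version = 0;
  std::vector<int64_t> snapshot_ids;
};

// Hypothetical catalog: a commit succeeds only if it was based on the
// latest version; otherwise it reports a conflict by returning false.
class CatalogSketch {
 public:
  explicit CatalogSketch(MetadataSketch initial) : current_(std::move(initial)) {}

  MetadataSketch Load() const { return current_; }

  bool TryCommit(int base_version, MetadataSketch updated) {
    if (base_version != current_.version) return false;  // commit conflict
    updated.version = base_version + 1;
    current_ = std::move(updated);
    return true;
  }

 private:
  MetadataSketch current_;
};

// Re-applies the same pending expirations to the latest metadata on every
// attempt. `on_attempt` is an illustrative hook for simulating other writers.
inline bool CommitExpirationWithRetry(
    CatalogSketch& catalog, const std::vector<int64_t>& ids_to_expire,
    int max_attempts, const std::function<void(int)>& on_attempt = nullptr) {
  for (int attempt = 0; attempt < max_attempts; ++attempt) {
    MetadataSketch base = catalog.Load();  // always start from the latest state
    MetadataSketch updated = base;
    auto& ids = updated.snapshot_ids;
    ids.erase(std::remove_if(ids.begin(), ids.end(),
                             [&](int64_t id) {
                               return std::find(ids_to_expire.begin(),
                                                ids_to_expire.end(),
                                                id) != ids_to_expire.end();
                             }),
              ids.end());
    if (on_attempt) on_attempt(attempt);
    if (catalog.TryCommit(base.version, std::move(updated))) return true;
  }
  return false;  // gave up after max_attempts conflicts
}
```

In this sketch, a snapshot added by a concurrent writer between load and commit survives the retry, because the expiration is recomputed against the refreshed metadata rather than replayed from a stale copy.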
| Result< ExpireSnapshots::ApplyResult > iceberg::ExpireSnapshots::Apply | ( | ) |
Apply the pending changes and return the results.
| ExpireSnapshots & iceberg::ExpireSnapshots::CleanExpiredMetadata | ( | bool | clean | ) |
Enable cleaning up unused metadata, such as partition specs, schemas, etc.
| clean | Remove unused partition specs, schemas, or other metadata when true. |
| ExpireSnapshots & iceberg::ExpireSnapshots::CleanupLevel | ( | enum CleanupLevel | level | ) |
Configures the cleanup level for expired files.
This method provides fine-grained control over which files are cleaned up during snapshot expiration.
Consider CleanupLevel::kMetadataOnly when data files are shared across tables or when using procedures like add-files that may reference the same data files.
Consider CleanupLevel::kNone when data and metadata files may be more efficiently removed using a distributed framework through the actions API.
| level | The cleanup level to use for expired snapshots. |
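One way to read the cleanup levels is as a mapping from level to the categories of files that get removed. The sketch below is an assumption-laden illustration, not the library's definition: `CleanupLevelSketch`, `CleanupPlan`, and `PlanFor` are hypothetical names, the `kAll` enumerator is assumed (only `kMetadataOnly` and `kNone` are named above), and the exact mapping is inferred from the guidance in this section.

```cpp
#include <cstdint>

// Hypothetical mirror of the cleanup levels. kMetadataOnly and kNone are
// named in the documentation above; kAll is an assumed third level.
enum class CleanupLevelSketch : uint8_t { kNone, kMetadataOnly, kAll };

// Which broad categories of expired files a level would remove (assumed).
struct CleanupPlan {
  bool delete_metadata_files;  // manifests, manifest lists, and similar
  bool delete_data_files;
};

constexpr CleanupPlan PlanFor(CleanupLevelSketch level) {
  switch (level) {
    case CleanupLevelSketch::kNone:
      return {false, false};  // leave all files for an external process
    case CleanupLevelSketch::kMetadataOnly:
      return {true, false};   // data files may be shared with other tables
    case CleanupLevelSketch::kAll:
      return {true, true};
  }
  return {false, false};  // unreachable for valid enumerators
}
```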
| ExpireSnapshots & iceberg::ExpireSnapshots::DeleteWith | ( | std::function< void(const std::string &)> | delete_func | ) |
Passes an alternative delete implementation that will be used for manifests and data files.
Manifest files that are no longer used by valid snapshots will be deleted. Data files that were deleted by snapshots that are expired will be deleted.
If this method is not called, unnecessary manifests and data files will still be deleted.
| delete_func | A function that will be called to delete manifests and data files |
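As an illustration of the callback shape that DeleteWith() accepts, the following sketch builds a `std::function<void(const std::string &)>` that records paths instead of deleting them, which can serve as a dry run. `DeleteFunc` and `MakeRecordingDeleter` are hypothetical helper names introduced here; only the callback signature comes from the declaration above.

```cpp
#include <functional>
#include <string>
#include <vector>

// Same callable shape as the delete_func parameter of DeleteWith().
using DeleteFunc = std::function<void(const std::string&)>;

// Returns a callback that records each path instead of deleting the file.
// `recorded` must outlive the returned callback, since it is captured by
// reference.
inline DeleteFunc MakeRecordingDeleter(std::vector<std::string>& recorded) {
  return [&recorded](const std::string& path) { recorded.push_back(path); };
}
```

A callback built this way could then be passed as `delete_func`; whether a dry run is appropriate depends on how the caller intends to clean up the recorded files afterwards.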
| ExpireSnapshots & iceberg::ExpireSnapshots::ExpireOlderThan | ( | int64_t | timestamp_millis | ) |
Expires all snapshots older than the given timestamp.
| timestamp_millis | A 64-bit timestamp in milliseconds. |
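| ExpireSnapshots & iceberg::ExpireSnapshots::ExpireSnapshotId | ( | int64_t | snapshot_id | ) |
Expires a specific Snapshot identified by id.
| Status iceberg::ExpireSnapshots::Finalize | ( | Result< const TableMetadata * > | commit_result | ) |
override virtual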
Finalize the expire snapshots update, cleaning up expired files.
After a successful commit, this method deletes manifest files, manifest lists, data files, and statistics files that are no longer referenced by any valid snapshot. The cleanup behavior is controlled by the CleanupLevel setting.
| commit_result | The committed table metadata when the commit succeeds, or the commit error when it fails. |
Reimplemented from iceberg::PendingUpdate.
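The rule that cleanup runs only after a successful commit can be sketched as follows. This is a simplified illustration under stated assumptions: `CommitOutcomeSketch` and `FinalizeSketch` are hypothetical, the real method takes a `Result< const TableMetadata * >` and returns a `Status`, and what the real method returns on a failed commit is not stated in this page.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for the commit result: success flag plus an error
// message that is only meaningful when ok is false.
struct CommitOutcomeSketch {
  bool ok;
  std::string error_message;
};

// Invokes delete_func for each unreferenced file, but only when the commit
// succeeded. Returns the number of files handed to delete_func.
inline std::size_t FinalizeSketch(
    const CommitOutcomeSketch& outcome,
    const std::vector<std::string>& unreferenced_files,
    const std::function<void(const std::string&)>& delete_func) {
  if (!outcome.ok) return 0;  // failed commit: files may still be referenced
  std::size_t deleted = 0;
  for (const auto& path : unreferenced_files) {
    delete_func(path);
    ++deleted;
  }
  return deleted;
}
```

Skipping cleanup on failure matters because, until the new metadata is committed, the table's current metadata still references the snapshots that were going to be expired.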
| bool iceberg::ExpireSnapshots::IsRetryable | ( | ) | const |
inline override virtual
Whether this update can be retried after a commit conflict.
Implements iceberg::PendingUpdate.
| Kind iceberg::ExpireSnapshots::kind | ( | ) | const |
inline final virtual
Return the kind of this pending update.
Implements iceberg::PendingUpdate.
| ExpireSnapshots & iceberg::ExpireSnapshots::RetainLast | ( | int | num_snapshots | ) |
Retains the most recent ancestors of the current snapshot.
If a snapshot would be expired because it is older than the expiration timestamp, but is one of the num_snapshots most recent ancestors of the current state, it will be retained. This will not prevent snapshots explicitly identified by id from expiring.
This may keep more than num_snapshots ancestors if snapshots are added concurrently. This may keep fewer than num_snapshots ancestors if the current table state does not have that many.
| num_snapshots | The number of snapshots to retain. |
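The interaction between ExpireOlderThan(), RetainLast(), and ExpireSnapshotId() described above can be illustrated with a minimal, self-contained sketch. `SnapshotSketch` and `SelectExpired` are hypothetical names, and the logic is a simplification: it only walks the ancestors of the current snapshot and ignores anything else a real implementation may need to consider, such as snapshots outside that ancestry.

```cpp
#include <cstddef>
#include <cstdint>
#include <set>
#include <vector>

// Hypothetical stand-in for a snapshot: id plus commit timestamp in ms.
struct SnapshotSketch {
  int64_t id;
  int64_t timestamp_ms;
};

// `ancestors` is ordered newest first, starting with the current snapshot.
// Returns the ids that would be expired under the rules described above.
inline std::vector<int64_t> SelectExpired(
    const std::vector<SnapshotSketch>& ancestors, int64_t older_than_ms,
    int retain_last, const std::set<int64_t>& explicit_ids) {
  std::vector<int64_t> expired;
  for (std::size_t i = 0; i < ancestors.size(); ++i) {
    if (i == 0) continue;  // the current snapshot is never expired
    const SnapshotSketch& s = ancestors[i];
    const bool explicitly_expired = explicit_ids.count(s.id) > 0;
    const bool too_old = s.timestamp_ms < older_than_ms;
    const bool protected_by_retain =
        retain_last > 0 && i < static_cast<std::size_t>(retain_last);
    // Retention shields age-based expiration only, not explicit ids.
    if (explicitly_expired || (too_old && !protected_by_retain)) {
      expired.push_back(s.id);
    }
  }
  return expired;
}
```

With four ancestors, a cutoff newer than the three oldest, and `retain_last` of 2, this sketch keeps the two most recent ancestors and expires the other two; adding the second-newest id to `explicit_ids` expires it despite the retention count.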