iceberg-cpp
iceberg::ExpireSnapshots Class Reference

API for removing old snapshots from a table.

#include <expire_snapshots.h>

Inheritance diagram for iceberg::ExpireSnapshots:
iceberg::PendingUpdate iceberg::ErrorCollector

Classes

struct  ApplyResult
 

Public Member Functions

ExpireSnapshots & ExpireSnapshotId (int64_t snapshot_id)
 Expires a specific Snapshot identified by id.
 
ExpireSnapshots & ExpireOlderThan (int64_t timestamp_millis)
 Expires all snapshots older than the given timestamp.
 
ExpireSnapshots & RetainLast (int num_snapshots)
 Retains the most recent ancestors of the current snapshot.
 
ExpireSnapshots & DeleteWith (std::function< void(const std::string &)> delete_func)
 Passes an alternative delete implementation that will be used for manifests and data files.
 
ExpireSnapshots & CleanupLevel (enum CleanupLevel level)
 Configures the cleanup level for expired files.
 
ExpireSnapshots & CleanExpiredMetadata (bool clean)
 Enable cleaning up unused metadata, such as partition specs, schemas, etc.
 
Kind kind () const final
 Return the kind of this pending update.
 
bool IsRetryable () const override
 Whether this update can be retried after a commit conflict.
 
Result< ApplyResult > Apply ()
 Apply the pending changes and return the results.
 
Status Finalize (Result< const TableMetadata * > commit_result) override
 Finalize the expire snapshots update, cleaning up expired files.
 
- Public Member Functions inherited from iceberg::PendingUpdate
virtual Status Commit ()
 Apply the pending changes and commit.
 
 PendingUpdate (const PendingUpdate &)=delete
 
PendingUpdate & operator= (const PendingUpdate &)=delete
 
 PendingUpdate (PendingUpdate &&) noexcept=default
 
PendingUpdate & operator= (PendingUpdate &&) noexcept=default
 
- Public Member Functions inherited from iceberg::ErrorCollector
 ErrorCollector (ErrorCollector &&)=default
 
ErrorCollector & operator= (ErrorCollector &&)=default
 
 ErrorCollector (const ErrorCollector &)=default
 
ErrorCollector & operator= (const ErrorCollector &)=default
 
template<typename... Args>
auto & AddError (this auto &self, ErrorKind kind, const std::format_string< Args... > fmt, Args &&... args)
 Add a specific error and return reference to derived class.
 
auto & AddError (this auto &self, Error err)
 Add an existing error object and return reference to derived class.
 
auto & AddError (this auto &self, std::unexpected< Error > err)
 Add an unexpected result's error and return reference to derived class.
 
bool has_errors () const
 Check if any errors have been collected.
 
size_t error_count () const
 Get the number of errors collected.
 
Status CheckErrors () const
 Check for accumulated errors and return them if any exist.
 
void ClearErrors ()
 Clear all accumulated errors.
 
const std::vector< Error > & errors () const
 Get read-only access to all collected errors.
 

Static Public Member Functions

static Result< std::shared_ptr< ExpireSnapshots > > Make (std::shared_ptr< TransactionContext > ctx)
 

Additional Inherited Members

- Public Types inherited from iceberg::PendingUpdate
enum class  Kind : uint8_t {
  kExpireSnapshots , kSetSnapshot , kUpdateLocation , kUpdatePartitionSpec ,
  kUpdatePartitionStatistics , kUpdateProperties , kUpdateSchema , kUpdateSnapshot ,
  kUpdateSnapshotReference , kUpdateSortOrder , kUpdateStatistics
}
 
- Protected Member Functions inherited from iceberg::PendingUpdate
 PendingUpdate (std::shared_ptr< TransactionContext > ctx)
 
const TableMetadata & base () const
 
- Protected Attributes inherited from iceberg::PendingUpdate
std::shared_ptr< TransactionContext > ctx_
 
- Protected Attributes inherited from iceberg::ErrorCollector
std::vector< Error > errors_
 

Detailed Description

API for removing old snapshots from a table.

This API accumulates snapshot deletions and commits the new list to the table. This API does not allow deleting the current snapshot.

When committing, these changes will be applied to the latest table metadata. Commit conflicts will be resolved by applying the changes to the new latest metadata and reattempting the commit.
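
The conflict handling described above can be pictured as a loop that reloads the latest metadata, re-applies the change, and tries the commit again. The following is only a minimal sketch of that idea, not the iceberg-cpp implementation: `MetadataSketch`, `CommitWithRetry`, and the three hooks (`load_latest`, `apply`, `try_commit`) are hypothetical names introduced for illustration, and the attempt limit is an assumption.

```cpp
#include <cassert>
#include <functional>

// Hypothetical stand-in for table metadata: only a version counter.
struct MetadataSketch {
  int version = 0;
};

// Re-applies the change to the latest metadata on every attempt.
// All three hooks are illustrative; none of them is a library call.
bool CommitWithRetry(
    const std::function<MetadataSketch()>& load_latest,
    const std::function<MetadataSketch(const MetadataSketch&)>& apply,
    const std::function<bool(const MetadataSketch&, const MetadataSketch&)>&
        try_commit,
    int max_attempts) {
  assert(max_attempts > 0);
  for (int attempt = 0; attempt < max_attempts; ++attempt) {
    // Always start from the newest metadata so a conflicting commit is seen.
    MetadataSketch base = load_latest();
    MetadataSketch updated = apply(base);
    if (try_commit(base, updated)) {
      return true;
    }
  }
  return false;  // gave up after max_attempts conflicts
}
```

In this sketch a failed `try_commit` stands for a commit conflict; the next iteration picks up whatever metadata the competing writer produced.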

Manifest files that are no longer used by valid snapshots will be deleted. Data files that were deleted by snapshots that are expired will be deleted. DeleteWith() can be used to pass an alternative deletion method.

Apply() returns a list of the snapshots that will be removed.

Member Function Documentation

◆ Apply()

Result< ExpireSnapshots::ApplyResult > iceberg::ExpireSnapshots::Apply ( )

Apply the pending changes and return the results.

Returns
The results of changes

◆ CleanExpiredMetadata()

ExpireSnapshots & iceberg::ExpireSnapshots::CleanExpiredMetadata ( bool  clean)

Enable cleaning up unused metadata, such as partition specs, schemas, etc.

Parameters
clean  Remove unused partition specs, schemas, or other metadata when true.
Returns
Reference to this for method chaining.

◆ CleanupLevel()

ExpireSnapshots & iceberg::ExpireSnapshots::CleanupLevel ( enum CleanupLevel  level)

Configures the cleanup level for expired files.

This method provides fine-grained control over which files are cleaned up during snapshot expiration.

Consider CleanupLevel::kMetadataOnly when data files are shared across tables or when using procedures like add-files that may reference the same data files.

Consider CleanupLevel::kNone when data and metadata files may be more efficiently removed using a distributed framework through the actions API.

Parameters
level  The cleanup level to use for expired snapshots.
Returns
Reference to this for method chaining.
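
As a rough illustration of how the levels differ, the sketch below maps a level and a file category to a delete/keep decision. It is a model of the description above, not the library's code: `CleanupLevelSketch`, `FileKindSketch`, and `ShouldDelete` are hypothetical names, and `kAll` is an assumed enumerator (only `kMetadataOnly` and `kNone` are named in this documentation).

```cpp
// Stand-in enum; kAll is an assumption, the other two names come from the text.
enum class CleanupLevelSketch { kNone, kMetadataOnly, kAll };

// Hypothetical file categories used only for this illustration.
enum class FileKindSketch { kDataFile, kManifest, kManifestList };

// Returns true when a file of the given kind would be removed at this level.
constexpr bool ShouldDelete(CleanupLevelSketch level, FileKindSketch kind) {
  switch (level) {
    case CleanupLevelSketch::kNone:
      return false;  // leave every file for an external cleanup job
    case CleanupLevelSketch::kMetadataOnly:
      return kind != FileKindSketch::kDataFile;  // data files may be shared
    case CleanupLevelSketch::kAll:
      return true;
  }
  return false;
}

// Compile-time checks of the intended behavior.
static_assert(!ShouldDelete(CleanupLevelSketch::kNone, FileKindSketch::kManifest));
static_assert(!ShouldDelete(CleanupLevelSketch::kMetadataOnly,
                            FileKindSketch::kDataFile));
```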

◆ DeleteWith()

ExpireSnapshots & iceberg::ExpireSnapshots::DeleteWith ( std::function< void(const std::string &)>  delete_func)

Passes an alternative delete implementation that will be used for manifests and data files.

Manifest files that are no longer used by valid snapshots will be deleted. Data files that were deleted by snapshots that are expired will be deleted.

If this method is not called, unnecessary manifests and data files will still be deleted.

Parameters
delete_func  A function that will be called to delete manifests and data files.
Returns
Reference to this for method chaining.
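
One possible use of a custom delete function is a dry run that records the paths instead of removing them. The sketch below only builds such a callback with the signature shown above; `DeleteFunc` and `MakeRecordingDelete` are hypothetical helper names, and passing the result to DeleteWith() is assumed to work as for any other `std::function< void(const std::string &)>`.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <string>
#include <vector>

// Same callable shape as the delete_func parameter documented above.
using DeleteFunc = std::function<void(const std::string&)>;

// Builds a callback that appends each path to `sink` instead of deleting it.
// The shared_ptr keeps the sink alive for as long as the callback exists.
DeleteFunc MakeRecordingDelete(std::shared_ptr<std::vector<std::string>> sink) {
  assert(sink != nullptr);
  return [sink](const std::string& path) { sink->push_back(path); };
}
```

After the update has been finalized, the recorded paths can be inspected or handed to a separate cleanup job.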

◆ ExpireOlderThan()

ExpireSnapshots & iceberg::ExpireSnapshots::ExpireOlderThan ( int64_t  timestamp_millis)

Expires all snapshots older than the given timestamp.

Parameters
timestamp_millis  A 64-bit timestamp in milliseconds.
Returns
Reference to this for method chaining.

◆ ExpireSnapshotId()

ExpireSnapshots & iceberg::ExpireSnapshots::ExpireSnapshotId ( int64_t  snapshot_id)

Expires a specific Snapshot identified by id.

Parameters
snapshot_id  Id of the snapshot to expire.
Returns
Reference to this for method chaining.

◆ Finalize()

Status iceberg::ExpireSnapshots::Finalize ( Result< const TableMetadata * >  commit_result)
override virtual

Finalize the expire snapshots update, cleaning up expired files.

After a successful commit, this method deletes manifest files, manifest lists, data files, and statistics files that are no longer referenced by any valid snapshot. The cleanup behavior is controlled by the CleanupLevel setting.

Parameters
commit_result  The committed table metadata when the commit succeeds, or the commit error when it fails.
Returns
Status indicating success or failure

Reimplemented from iceberg::PendingUpdate.

◆ IsRetryable()

bool iceberg::ExpireSnapshots::IsRetryable ( ) const
inline override virtual

Whether this update can be retried after a commit conflict.

Implements iceberg::PendingUpdate.

◆ kind()

Kind iceberg::ExpireSnapshots::kind ( ) const
inline final virtual

Return the kind of this pending update.

Implements iceberg::PendingUpdate.

◆ RetainLast()

ExpireSnapshots & iceberg::ExpireSnapshots::RetainLast ( int  num_snapshots)

Retains the most recent ancestors of the current snapshot.

If a snapshot would be expired because it is older than the expiration timestamp, but is one of the num_snapshots most recent ancestors of the current state, it will be retained. This will not prevent snapshots explicitly identified by id from expiring.

This may keep more than num_snapshots ancestors if snapshots are added concurrently. This may keep fewer than num_snapshots ancestors if the current table state does not have that many.

Parameters
num_snapshots  The number of snapshots to retain.
Returns
Reference to this for method chaining.
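
The interaction between ExpireOlderThan(), RetainLast(), and ExpireSnapshotId() can be modeled with a small stand-alone function. This is a sketch of the rules as documented here, not the library's selection code: `SnapshotSketch` and `SelectExpired` are hypothetical names, and counting the current snapshot itself among the retained ancestors is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <optional>
#include <unordered_map>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for a snapshot; not the iceberg-cpp Snapshot type.
struct SnapshotSketch {
  int64_t id;
  std::optional<int64_t> parent_id;
  int64_t timestamp_ms;
};

// Returns the sorted ids of snapshots that would expire under the modeled rules.
std::vector<int64_t> SelectExpired(
    const std::vector<SnapshotSketch>& snapshots, int64_t current_id,
    int64_t older_than_ms, int retain_last,
    const std::unordered_set<int64_t>& explicit_ids) {
  assert(retain_last >= 0);
  std::unordered_map<int64_t, const SnapshotSketch*> by_id;
  for (const auto& s : snapshots) by_id[s.id] = &s;

  // Walk parent links from the current snapshot to find the most recent
  // ancestors (assumption: the current snapshot counts as the first one).
  std::unordered_set<int64_t> retained;
  std::optional<int64_t> cursor = current_id;
  while (cursor && static_cast<int>(retained.size()) < retain_last) {
    auto it = by_id.find(*cursor);
    if (it == by_id.end()) break;                 // history ends here
    if (!retained.insert(*cursor).second) break;  // guard against cycles
    cursor = it->second->parent_id;
  }

  std::vector<int64_t> expired;
  for (const auto& s : snapshots) {
    if (s.id == current_id) continue;  // the current snapshot is never removed
    const bool requested = explicit_ids.count(s.id) > 0;
    const bool too_old =
        s.timestamp_ms < older_than_ms && retained.count(s.id) == 0;
    if (requested || too_old) expired.push_back(s.id);
  }
  std::sort(expired.begin(), expired.end());
  return expired;
}
```

Note that an explicitly requested id expires even when it is among the retained ancestors, which mirrors the statement above that RetainLast() does not protect snapshots identified by id.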

The documentation for this class was generated from the following files: