iceberg-cpp
iceberg::TableScanBuilder< ScanType > Class Template Reference

Builder class for creating TableScan instances.

#include <table_scan.h>

Inheritance diagram for iceberg::TableScanBuilder< ScanType >: inherits from iceberg::ErrorCollector.

Public Member Functions

TableScanBuilder & Option (std::string key, std::string value)
 Set a property that overrides the table's behavior, based on the given key/value pair. Unknown properties are ignored.

TableScanBuilder & Project (std::shared_ptr< Schema > schema)
 Set the projected schema.

TableScanBuilder & CaseSensitive (bool case_sensitive)
 If data columns are selected via Select(), controls whether column names are matched against the schema case-sensitively. Default is true.

TableScanBuilder & IncludeColumnStats ()
 Request this scan to load the column stats with each data file.

TableScanBuilder & IncludeColumnStats (const std::vector< std::string > &requested_columns)
 Request this scan to load the column stats for the specific columns with each data file.

TableScanBuilder & Select (const std::vector< std::string > &column_names)
 Request this scan to read the given data columns.

TableScanBuilder & Filter (std::shared_ptr< Expression > filter)
 Set the expression to filter data.

TableScanBuilder & IgnoreResiduals ()
 Request that data filtering be applied to files but not to rows within those files.

TableScanBuilder & MinRowsRequested (int64_t num_rows)
 Request this scan to return at least the given number of rows.

TableScanBuilder & UseSnapshot (int64_t snapshot_id)
 Request this scan to use the given snapshot by ID.

TableScanBuilder & UseRef (const std::string &ref)
 Request this scan to use the given reference.

TableScanBuilder & AsOfTime (int64_t timestamp_millis)
 Request this scan to use the most recent snapshot as of the given time (in milliseconds) on the scan's branch, or on main if no branch is set.

TableScanBuilder & FromSnapshot (int64_t from_snapshot_id, bool inclusive=false)
 Instructs this scan to look for changes starting from a particular snapshot.

TableScanBuilder & FromSnapshot (const std::string &ref, bool inclusive=false)
 Instructs this scan to look for changes starting from a particular snapshot.

TableScanBuilder & ToSnapshot (int64_t to_snapshot_id)
 Instructs this scan to look for changes up to a particular snapshot (inclusive).

TableScanBuilder & ToSnapshot (const std::string &ref)
 Instructs this scan to look for changes up to a particular snapshot ref (inclusive).

TableScanBuilder & UseBranch (const std::string &branch)
 Use the specified branch.
 
Result< std::unique_ptr< ScanType > > Build ()
 Builds and returns a TableScan instance.
 
Public Member Functions inherited from iceberg::ErrorCollector
 ErrorCollector (ErrorCollector &&)=default

ErrorCollector & operator= (ErrorCollector &&)=default

 ErrorCollector (const ErrorCollector &)=default

ErrorCollector & operator= (const ErrorCollector &)=default
 
template<typename... Args>
auto & AddError (this auto &self, ErrorKind kind, const std::format_string< Args... > fmt, Args &&... args)
 Add a specific error and return reference to derived class.
 
auto & AddError (this auto &self, Error err)
 Add an existing error object and return reference to derived class.
 
auto & AddError (this auto &self, std::unexpected< Error > err)
 Add an unexpected result's error and return reference to derived class.
 
bool has_errors () const
 Check if any errors have been collected.
 
size_t error_count () const
 Get the number of errors collected.
 
Status CheckErrors () const
 Check for accumulated errors and return them if any exist.
 
void ClearErrors ()
 Clear all accumulated errors.
 
const std::vector< Error > & errors () const
 Get read-only access to all collected errors.
 

Static Public Member Functions

static Result< std::unique_ptr< TableScanBuilder< ScanType > > > Make (std::shared_ptr< TableMetadata > metadata, std::shared_ptr< FileIO > io)
 Constructs a TableScanBuilder for the given table.
 

Protected Member Functions

 TableScanBuilder (std::shared_ptr< TableMetadata > metadata, std::shared_ptr< FileIO > io)
 
Result< std::reference_wrapper< const std::shared_ptr< Schema > > > ResolveSnapshotSchema ()
 

Protected Attributes

std::shared_ptr< TableMetadata > metadata_

std::shared_ptr< FileIO > io_

internal::TableScanContext context_

std::shared_ptr< Schema > snapshot_schema_

Protected Attributes inherited from iceberg::ErrorCollector
std::vector< Error > errors_
 

Detailed Description

template<typename ScanType = DataTableScan>
class iceberg::TableScanBuilder< ScanType >

Builder class for creating TableScan instances.

Scan builder.
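
The chained-call style of this builder, in which every setter returns a reference to the builder and problems surface from Build(), can be illustrated with a stand-in. The sketch below does not use the iceberg-cpp headers: `SketchScan`, `SketchResult`, and `SketchScanBuilder` are hypothetical types invented for illustration, and recording errors in setters to report them from `Build()` is an assumption suggested by the ErrorCollector base class, not behavior confirmed by this page.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-ins; none of these types come from iceberg-cpp.
struct SketchScan {
  std::vector<std::string> columns;
  int64_t snapshot_id = -1;
};

struct SketchResult {
  std::unique_ptr<SketchScan> scan;  // set on success
  std::string error;                 // set on failure
  bool ok() const { return scan != nullptr; }
};

class SketchScanBuilder {
 public:
  // Each setter returns *this so that calls can be chained.
  SketchScanBuilder& Select(const std::vector<std::string>& column_names) {
    columns_ = column_names;
    return *this;
  }

  SketchScanBuilder& UseSnapshot(int64_t snapshot_id) {
    if (snapshot_id < 0) {
      // Assumption: record the problem and let the chain continue.
      errors_.push_back("invalid snapshot id: " + std::to_string(snapshot_id));
    } else {
      snapshot_id_ = snapshot_id;
    }
    return *this;
  }

  // Reports the first recorded error, or builds the scan.
  SketchResult Build() {
    SketchResult result;
    if (!errors_.empty()) {
      result.error = errors_.front();
      return result;
    }
    result.scan.reset(new SketchScan());
    result.scan->columns = columns_;
    result.scan->snapshot_id = snapshot_id_;
    return result;
  }

 private:
  std::vector<std::string> columns_;
  int64_t snapshot_id_ = -1;
  std::vector<std::string> errors_;
};
```

With this shape a caller writes one expression, such as `builder.Select({"id"}).UseSnapshot(42).Build()`, and inspects only the final result instead of checking every intermediate call.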

Member Function Documentation

◆ AsOfTime()

template<typename ScanType >
TableScanBuilder< ScanType > & iceberg::TableScanBuilder< ScanType >::AsOfTime ( int64_t  timestamp_millis)

Request this scan to use the most recent snapshot as of the given time (in milliseconds) on the scan's branch, or on main if no branch is set.

Parameters
timestamp_millis  a timestamp in milliseconds.
Note
InvalidArgument will be returned if the snapshot cannot be found or time travel is attempted on a tag.
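
Selecting "the most recent snapshot as of the given time" can be sketched as a scan over a snapshot history. This is a minimal illustration, not the library's implementation: `SketchSnapshot` and `SnapshotIdAsOfTime` are hypothetical names, and treating a snapshot whose timestamp equals the requested time as eligible is an assumption.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical snapshot record; the field names are illustrative only.
struct SketchSnapshot {
  int64_t snapshot_id;
  int64_t timestamp_ms;
};

// Returns the ID of the latest snapshot whose timestamp is at or before
// timestamp_millis, or -1 when every snapshot is newer than that time.
// Assumes real snapshot IDs are non-negative, so -1 can mean "not found".
inline int64_t SnapshotIdAsOfTime(const std::vector<SketchSnapshot>& history,
                                  int64_t timestamp_millis) {
  bool found = false;
  int64_t best_id = -1;
  int64_t best_ts = 0;
  for (const SketchSnapshot& s : history) {
    if (s.timestamp_ms > timestamp_millis) continue;  // too new
    if (!found || s.timestamp_ms > best_ts) {
      found = true;
      best_id = s.snapshot_id;
      best_ts = s.timestamp_ms;
    }
  }
  return best_id;
}
```

The "not found" outcome in this sketch corresponds to the case the note above describes as InvalidArgument.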

◆ Build()

template<typename ScanType >
Result< std::unique_ptr< ScanType > > iceberg::TableScanBuilder< ScanType >::Build ( )

Builds and returns a TableScan instance.

Returns
A Result containing the TableScan or an error.

◆ CaseSensitive()

template<typename ScanType >
TableScanBuilder< ScanType > & iceberg::TableScanBuilder< ScanType >::CaseSensitive ( bool  case_sensitive)

If data columns are selected via Select(), controls whether column names are matched against the schema case-sensitively. Default is true.

Parameters
case_sensitive  whether the scan is case-sensitive

◆ Filter()

template<typename ScanType >
TableScanBuilder< ScanType > & iceberg::TableScanBuilder< ScanType >::Filter ( std::shared_ptr< Expression >  filter)

Set the expression to filter data.

Parameters
filter  a filter expression

◆ FromSnapshot() [1/2]

template<typename ScanType >
requires IsIncrementalScan<ScanType>
TableScanBuilder< ScanType > & iceberg::TableScanBuilder< ScanType >::FromSnapshot ( const std::string &  ref,
bool  inclusive = false 
)

Instructs this scan to look for changes starting from a particular snapshot.

This method is only available for incremental scans. If the start snapshot is not configured, it defaults to the oldest ancestor of the end snapshot (inclusive).

Parameters
ref  the start ref name that points to a particular snapshot ID
inclusive  whether the start snapshot is inclusive, default is false
Note
InvalidArgument will be returned if the start snapshot is not an ancestor of the end snapshot.

◆ FromSnapshot() [2/2]

template<typename ScanType >
requires IsIncrementalScan<ScanType>
TableScanBuilder< ScanType > & iceberg::TableScanBuilder< ScanType >::FromSnapshot ( int64_t  from_snapshot_id,
bool  inclusive = false 
)

Instructs this scan to look for changes starting from a particular snapshot.

This method is only available for incremental scans. If the start snapshot is not configured, it defaults to the oldest ancestor of the end snapshot (inclusive).

Parameters
from_snapshot_id  the start snapshot ID
inclusive  whether the start snapshot is inclusive, default is false
Note
InvalidArgument will be returned if the start snapshot is not an ancestor of the end snapshot.
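
The ancestor requirement in the note above can be sketched as a walk along parent links from the end snapshot. The code is a hedged illustration under stated assumptions: `ParentMap` and `IsAncestorOf` are hypothetical names, the lineage is modeled as a plain map from snapshot ID to parent ID, and a snapshot is treated as its own ancestor.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>

// Hypothetical lineage table: snapshot ID -> parent snapshot ID.
// A snapshot without a parent has no entry in the map.
using ParentMap = std::unordered_map<int64_t, int64_t>;

// Walks parent links from end_id back toward the root and reports whether
// start_id is reached. The step bound guards against malformed, cyclic input.
inline bool IsAncestorOf(const ParentMap& parents, int64_t end_id,
                         int64_t start_id) {
  int64_t current = end_id;
  for (std::size_t steps = 0; steps <= parents.size(); ++steps) {
    if (current == start_id) return true;
    auto it = parents.find(current);
    if (it == parents.end()) return false;  // reached a root
    current = it->second;
  }
  return false;
}
```

A start snapshot on a sibling branch fails this check even when it is older than the end snapshot, which is the situation the InvalidArgument note describes.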

◆ IncludeColumnStats() [1/2]

template<typename ScanType >
TableScanBuilder< ScanType > & iceberg::TableScanBuilder< ScanType >::IncludeColumnStats ( )

Request this scan to load the column stats with each data file.

Column stats include: value count, null value count, lower bounds, and upper bounds.

◆ IncludeColumnStats() [2/2]

template<typename ScanType >
TableScanBuilder< ScanType > & iceberg::TableScanBuilder< ScanType >::IncludeColumnStats ( const std::vector< std::string > &  requested_columns)

Request this scan to load the column stats for the specific columns with each data file.

Column stats include: value count, null value count, lower bounds, and upper bounds.

Parameters
requested_columns  column names for which to keep the stats.

◆ Make()

template<typename ScanType >
Result< std::unique_ptr< TableScanBuilder< ScanType > > > iceberg::TableScanBuilder< ScanType >::Make ( std::shared_ptr< TableMetadata >  metadata,
std::shared_ptr< FileIO >  io 
)
static

Constructs a TableScanBuilder for the given table.

Parameters
metadata  Current table metadata.
io  FileIO instance for reading manifest files.

◆ MinRowsRequested()

template<typename ScanType >
TableScanBuilder< ScanType > & iceberg::TableScanBuilder< ScanType >::MinRowsRequested ( int64_t  num_rows)

Request this scan to return at least the given number of rows.

This is an optional hint that lets the scan avoid returning more rows than necessary. The scan may return fewer rows if it does not contain that many, or it may return more rows than requested.

Parameters
num_rows  The minimum number of rows requested
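
One way a planner could honor such a hint is to stop adding whole files once their combined row count reaches the minimum. The sketch below only illustrates why the result can exceed or fall short of the request; `FilesNeededForMinRows` is a hypothetical helper, and this page does not state that iceberg-cpp plans scans this way.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns how many leading files must be read so that their combined row
// count reaches min_rows. Returns rows_per_file.size() when the total of all
// files never reaches min_rows.
inline std::size_t FilesNeededForMinRows(
    const std::vector<int64_t>& rows_per_file, int64_t min_rows) {
  int64_t total = 0;
  std::size_t count = 0;
  while (count < rows_per_file.size() && total < min_rows) {
    total += rows_per_file[count];  // files are taken whole, never split
    ++count;
  }
  return count;
}
```

Because files are taken whole in this sketch, a request for 150 rows over files of 100 rows each reads two files (200 rows), and a request larger than the table reads everything and still falls short.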

◆ Option()

template<typename ScanType >
TableScanBuilder< ScanType > & iceberg::TableScanBuilder< ScanType >::Option ( std::string  key,
std::string  value 
)

Set a property that overrides the table's behavior, based on the given key/value pair. Unknown properties are ignored.

Parameters
key  name of the table property to be overridden
value  value to override with

◆ Project()

template<typename ScanType >
TableScanBuilder< ScanType > & iceberg::TableScanBuilder< ScanType >::Project ( std::shared_ptr< Schema >  schema)

Set the projected schema.

Parameters
schema  a projection schema

◆ Select()

template<typename ScanType >
TableScanBuilder< ScanType > & iceberg::TableScanBuilder< ScanType >::Select ( const std::vector< std::string > &  column_names)

Request this scan to read the given data columns.

This produces an expected schema that includes all fields that are either selected or used by this scan's filter expression.

Parameters
column_names  column names from the table's schema
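
The expected schema described above is the union of the selected columns and the columns the filter refers to. A minimal sketch, assuming column names are compared as exact strings: `ExpectedColumns` is a hypothetical helper, and `filter_columns` stands in for whatever fields the filter expression references (how the library extracts them is not shown on this page).

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Returns the union of explicitly selected columns and columns referenced by
// the filter, without duplicates.
inline std::set<std::string> ExpectedColumns(
    const std::vector<std::string>& selected,
    const std::vector<std::string>& filter_columns) {
  std::set<std::string> result(selected.begin(), selected.end());
  result.insert(filter_columns.begin(), filter_columns.end());
  return result;
}
```

Selecting `id` and `name` while filtering on `ts` therefore yields three columns in this sketch, since the filter needs `ts` to be read even though it was not selected.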

◆ ToSnapshot() [1/2]

template<typename ScanType >
requires IsIncrementalScan<ScanType>
TableScanBuilder< ScanType > & iceberg::TableScanBuilder< ScanType >::ToSnapshot ( const std::string &  ref)

Instructs this scan to look for changes up to a particular snapshot ref (inclusive).

This method is only available for incremental scans. If the end snapshot is not configured, it defaults to the current table snapshot (inclusive).

Parameters
ref  the end snapshot ref (inclusive)

◆ ToSnapshot() [2/2]

template<typename ScanType >
requires IsIncrementalScan<ScanType>
TableScanBuilder< ScanType > & iceberg::TableScanBuilder< ScanType >::ToSnapshot ( int64_t  to_snapshot_id)

Instructs this scan to look for changes up to a particular snapshot (inclusive).

This method is only available for incremental scans. If the end snapshot is not configured, it defaults to the current table snapshot (inclusive).

Parameters
to_snapshot_id  the end snapshot ID (inclusive)

◆ UseBranch()

template<typename ScanType >
requires IsIncrementalScan<ScanType>
TableScanBuilder< ScanType > & iceberg::TableScanBuilder< ScanType >::UseBranch ( const std::string &  branch)

Use the specified branch.

This method is only available for incremental scans.

Parameters
branch  the branch name

◆ UseRef()

template<typename ScanType >
TableScanBuilder< ScanType > & iceberg::TableScanBuilder< ScanType >::UseRef ( const std::string &  ref)

Request this scan to use the given reference.

Parameters
ref  the reference name
Note
InvalidArgument will be returned if a reference with the given name could not be found.

◆ UseSnapshot()

template<typename ScanType >
TableScanBuilder< ScanType > & iceberg::TableScanBuilder< ScanType >::UseSnapshot ( int64_t  snapshot_id)

Request this scan to use the given snapshot by ID.

Parameters
snapshot_id  a snapshot ID
Note
InvalidArgument will be returned if the snapshot cannot be found.

The documentation for this class was generated from the following files: