iceberg-cpp
Builder class for creating TableScan instances.
#include <table_scan.h>
Public Member Functions | |
| TableScanBuilder & | Option (std::string key, std::string value) |
| Override a table property for this scan using the given key/value pair. Unknown properties are ignored. | |
| TableScanBuilder & | Project (std::shared_ptr< Schema > schema) |
| Set the projected schema. | |
| TableScanBuilder & | CaseSensitive (bool case_sensitive) |
| If data columns are selected via Select(), controls whether the match to the schema will be done with case sensitivity. Default is true. | |
| TableScanBuilder & | IncludeColumnStats () |
| Request this scan to load the column stats with each data file. | |
| TableScanBuilder & | IncludeColumnStats (const std::vector< std::string > &requested_columns) |
| Request this scan to load the column stats for the specific columns with each data file. | |
| TableScanBuilder & | Select (const std::vector< std::string > &column_names) |
| Request this scan to read the given data columns. | |
| TableScanBuilder & | Filter (std::shared_ptr< Expression > filter) |
| Set the expression to filter data. | |
| TableScanBuilder & | IgnoreResiduals () |
| Request that the filter be applied to select data files, but not to rows within those files. | |
| TableScanBuilder & | MinRowsRequested (int64_t num_rows) |
| Request this scan to return at least the given number of rows. | |
| TableScanBuilder & | UseSnapshot (int64_t snapshot_id) |
| Request this scan to use the given snapshot by ID. | |
| TableScanBuilder & | UseRef (const std::string &ref) |
| Request this scan to use the given reference. | |
| TableScanBuilder & | AsOfTime (int64_t timestamp_millis) |
| Request this scan to use the most recent snapshot as of the given time in milliseconds, on the scan's branch or on main if no branch is set. | |
| TableScanBuilder & | FromSnapshot (int64_t from_snapshot_id, bool inclusive=false) |
| Instructs this scan to look for changes starting from a particular snapshot. | |
| TableScanBuilder & | FromSnapshot (const std::string &ref, bool inclusive=false) |
| Instructs this scan to look for changes starting from a particular snapshot. | |
| TableScanBuilder & | ToSnapshot (int64_t to_snapshot_id) |
| Instructs this scan to look for changes up to a particular snapshot (inclusive). | |
| TableScanBuilder & | ToSnapshot (const std::string &ref) |
| Instructs this scan to look for changes up to a particular snapshot ref (inclusive). | |
| TableScanBuilder & | UseBranch (const std::string &branch) |
| Use the specified branch. | |
| Result< std::unique_ptr< ScanType > > | Build () |
| Builds and returns a TableScan instance. | |
Public Member Functions inherited from iceberg::ErrorCollector | |
| ErrorCollector (ErrorCollector &&)=default | |
| ErrorCollector & | operator= (ErrorCollector &&)=default |
| ErrorCollector (const ErrorCollector &)=default | |
| ErrorCollector & | operator= (const ErrorCollector &)=default |
| template<typename... Args> | |
| auto & | AddError (this auto &self, ErrorKind kind, const std::format_string< Args... > fmt, Args &&... args) |
| Add a specific error and return reference to derived class. | |
| auto & | AddError (this auto &self, Error err) |
| Add an existing error object and return reference to derived class. | |
| auto & | AddError (this auto &self, std::unexpected< Error > err) |
| Add an unexpected result's error and return reference to derived class. | |
| bool | has_errors () const |
| Check if any errors have been collected. | |
| size_t | error_count () const |
| Get the number of errors collected. | |
| Status | CheckErrors () const |
| Check for accumulated errors and return them if any exist. | |
| void | ClearErrors () |
| Clear all accumulated errors. | |
| const std::vector< Error > & | errors () const |
| Get read-only access to all collected errors. | |
Static Public Member Functions | |
| static Result< std::unique_ptr< TableScanBuilder< ScanType > > > | Make (std::shared_ptr< TableMetadata > metadata, std::shared_ptr< FileIO > io) |
| Constructs a TableScanBuilder for the given table. | |
Protected Member Functions | |
| TableScanBuilder (std::shared_ptr< TableMetadata > metadata, std::shared_ptr< FileIO > io) | |
| Result< std::reference_wrapper< const std::shared_ptr< Schema > > > | ResolveSnapshotSchema () |
Protected Attributes | |
| std::shared_ptr< TableMetadata > | metadata_ |
| std::shared_ptr< FileIO > | io_ |
| internal::TableScanContext | context_ |
| std::shared_ptr< Schema > | snapshot_schema_ |
Protected Attributes inherited from iceberg::ErrorCollector | |
| std::vector< Error > | errors_ |
Builder class for creating TableScan instances.
Scan builder.
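The member list above implies a fluent call pattern: setters return a reference to the builder so they can be chained, and errors are accumulated rather than reported at each call. A minimal sketch of that pattern follows. It does not use iceberg-cpp: `MiniScan` and `MiniScanBuilder` are hypothetical stand-ins, the validation rules are arbitrary, and the assumption that Build() refuses to produce a scan once an error has been collected is not confirmed by this page.

```cpp
#include <cstdint>
#include <memory>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a configured scan; not an iceberg-cpp type.
struct MiniScan {
  std::vector<std::string> columns;
  std::optional<std::int64_t> snapshot_id;
  bool case_sensitive = true;
};

// Hypothetical stand-in for a builder that collects errors while chaining.
class MiniScanBuilder {
 public:
  // Each setter returns *this so that calls can be chained.
  MiniScanBuilder& Select(std::vector<std::string> column_names) {
    if (column_names.empty()) {
      errors_.push_back("Select: no columns given");  // arbitrary rule
    }
    columns_ = std::move(column_names);
    return *this;
  }

  MiniScanBuilder& CaseSensitive(bool case_sensitive) {
    case_sensitive_ = case_sensitive;
    return *this;
  }

  MiniScanBuilder& UseSnapshot(std::int64_t snapshot_id) {
    if (snapshot_id < 0) {
      errors_.push_back("UseSnapshot: negative snapshot ID");  // arbitrary rule
    }
    snapshot_id_ = snapshot_id;
    return *this;
  }

  // Assumed behavior: return nullptr if any setter recorded an error.
  std::unique_ptr<MiniScan> Build() const {
    if (!errors_.empty()) return nullptr;
    auto scan = std::make_unique<MiniScan>();
    scan->columns = columns_;
    scan->snapshot_id = snapshot_id_;
    scan->case_sensitive = case_sensitive_;
    return scan;
  }

  const std::vector<std::string>& errors() const { return errors_; }

 private:
  std::vector<std::string> columns_;
  std::optional<std::int64_t> snapshot_id_;
  bool case_sensitive_ = true;
  std::vector<std::string> errors_;
};
```

With this stand-in, `builder.Select({"id", "name"}).CaseSensitive(false).UseSnapshot(42).Build()` yields a configured scan, while a chain containing an invalid call yields `nullptr` and leaves the reasons in `errors()`.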
| TableScanBuilder< ScanType > & iceberg::TableScanBuilder< ScanType >::AsOfTime | ( | int64_t | timestamp_millis | ) |
Request this scan to use the most recent snapshot as of the given time in milliseconds, on the scan's branch or on main if no branch is set.
| timestamp_millis | a timestamp in milliseconds. |
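The "most recent snapshot as of the given time" rule can be illustrated with a small lookup over a snapshot history. This is a sketch only: `SnapshotEntry` and `SnapshotIdAsOf` are hypothetical names, and treating a snapshot whose timestamp equals `timestamp_millis` as eligible is an assumption about the boundary.

```cpp
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical history entry; not an iceberg-cpp type.
struct SnapshotEntry {
  std::int64_t snapshot_id;
  std::int64_t timestamp_ms;
};

// Returns the ID of the latest snapshot committed at or before
// timestamp_millis, or std::nullopt if every snapshot is newer.
std::optional<std::int64_t> SnapshotIdAsOf(
    const std::vector<SnapshotEntry>& history, std::int64_t timestamp_millis) {
  std::optional<SnapshotEntry> best;
  for (const auto& entry : history) {
    const bool eligible = entry.timestamp_ms <= timestamp_millis;
    if (eligible && (!best || entry.timestamp_ms > best->timestamp_ms)) {
      best = entry;
    }
  }
  if (!best) return std::nullopt;
  return best->snapshot_id;
}
```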
| Result< std::unique_ptr< ScanType > > iceberg::TableScanBuilder< ScanType >::Build | ( | ) |
| TableScanBuilder< ScanType > & iceberg::TableScanBuilder< ScanType >::CaseSensitive | ( | bool | case_sensitive | ) |
If data columns are selected via Select(), controls whether the match to the schema will be done with case sensitivity. Default is true.
| case_sensitive | whether the scan is case-sensitive |
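The effect of the flag on column lookup can be sketched as follows. `FindColumn` and `ToLowerAscii` are hypothetical helpers, and ASCII-only case folding is a simplification chosen for the example; the library's actual matching rules are not described on this page.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Lowercases ASCII letters only (a simplification for this sketch).
std::string ToLowerAscii(std::string s) {
  std::transform(s.begin(), s.end(), s.begin(), [](unsigned char c) {
    return static_cast<char>(std::tolower(c));
  });
  return s;
}

// Returns the index of the first schema column matching `name`,
// comparing exactly or case-insensitively depending on the flag.
std::optional<std::size_t> FindColumn(
    const std::vector<std::string>& schema_columns, const std::string& name,
    bool case_sensitive) {
  for (std::size_t i = 0; i < schema_columns.size(); ++i) {
    const bool match =
        case_sensitive
            ? schema_columns[i] == name
            : ToLowerAscii(schema_columns[i]) == ToLowerAscii(name);
    if (match) return i;
  }
  return std::nullopt;
}
```

Under these assumptions, selecting `username` against a schema column named `UserName` finds nothing with the default setting and finds the column once case sensitivity is turned off.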
| TableScanBuilder< ScanType > & iceberg::TableScanBuilder< ScanType >::Filter | ( | std::shared_ptr< Expression > | filter | ) |
Set the expression to filter data.
| filter | a filter expression |
| TableScanBuilder< ScanType > & iceberg::TableScanBuilder< ScanType >::FromSnapshot | ( | const std::string & | ref, | bool | inclusive = false | ) |
Instructs this scan to look for changes starting from a particular snapshot.
This method is only available for incremental scans. If the start snapshot is not configured, it defaults to the oldest ancestor of the end snapshot (inclusive).
| ref | the start ref name that points to a particular snapshot ID |
| inclusive | whether the start snapshot is inclusive, default is false |
| TableScanBuilder< ScanType > & iceberg::TableScanBuilder< ScanType >::FromSnapshot | ( | int64_t | from_snapshot_id, | bool | inclusive = false | ) |
Instructs this scan to look for changes starting from a particular snapshot.
This method is only available for incremental scans. If the start snapshot is not configured, it defaults to the oldest ancestor of the end snapshot (inclusive).
| from_snapshot_id | the start snapshot ID |
| inclusive | whether the start snapshot is inclusive, default is false |
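The range semantics described for FromSnapshot (exclusive start by default, inclusive end, and a fallback to the oldest ancestor when no start is configured) can be sketched by walking a snapshot lineage backwards from the end snapshot. `ParentMap` and `SnapshotsInRange` are hypothetical; the sketch assumes an acyclic lineage and simply walks to the root if the start snapshot is not an ancestor, where a real implementation would more likely report an error.

```cpp
#include <cstdint>
#include <map>
#include <optional>
#include <vector>

// Hypothetical lineage: child snapshot ID -> parent snapshot ID.
using ParentMap = std::map<std::int64_t, std::int64_t>;

// Collects snapshot IDs from to_id back towards from_id, newest first.
// The end snapshot is always included; the start snapshot is included
// only when `inclusive` is true. With no start, the walk continues to
// the oldest ancestor, which is included.
std::vector<std::int64_t> SnapshotsInRange(
    const ParentMap& parents, std::optional<std::int64_t> from_id,
    bool inclusive, std::int64_t to_id) {
  std::vector<std::int64_t> result;
  std::int64_t current = to_id;
  while (true) {
    if (from_id && current == *from_id) {
      if (inclusive) result.push_back(current);
      break;
    }
    result.push_back(current);
    auto it = parents.find(current);
    if (it == parents.end()) break;  // reached the oldest ancestor
    current = it->second;
  }
  return result;
}
```

For a lineage 1 ← 2 ← 3 ← 4, a start of 2 and an end of 4 gives {4, 3} by default and {4, 3, 2} with `inclusive` set to true.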
| TableScanBuilder< ScanType > & iceberg::TableScanBuilder< ScanType >::IncludeColumnStats | ( | ) |
Request this scan to load the column stats with each data file.
Column stats include: value count, null value count, lower bounds, and upper bounds.
| TableScanBuilder< ScanType > & iceberg::TableScanBuilder< ScanType >::IncludeColumnStats | ( | const std::vector< std::string > & | requested_columns | ) |
Request this scan to load the column stats for the specific columns with each data file.
Column stats include: value count, null value count, lower bounds, and upper bounds.
| requested_columns | column names for which to keep the stats. |
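Keeping stats only for the requested columns amounts to filtering a per-column stats map. The sketch below is illustrative only: `ColumnStats` and `KeepRequestedStats` are hypothetical, stats are keyed by column name for readability, and bounds are shown as strings; none of this reflects how the library stores stats internally.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical per-column stats with the four items listed above.
struct ColumnStats {
  std::int64_t value_count;
  std::int64_t null_value_count;
  std::string lower_bound;
  std::string upper_bound;
};

// Returns only the stats entries whose column was requested.
// Requested columns without stats are silently skipped.
std::map<std::string, ColumnStats> KeepRequestedStats(
    const std::map<std::string, ColumnStats>& all_stats,
    const std::vector<std::string>& requested_columns) {
  std::map<std::string, ColumnStats> kept;
  for (const auto& name : requested_columns) {
    auto it = all_stats.find(name);
    if (it != all_stats.end()) kept.insert(*it);
  }
  return kept;
}
```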
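| static Result< std::unique_ptr< TableScanBuilder< ScanType > > > iceberg::TableScanBuilder< ScanType >::Make | ( | std::shared_ptr< TableMetadata > | metadata, | std::shared_ptr< FileIO > | io | ) |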
Constructs a TableScanBuilder for the given table.
| metadata | Current table metadata. |
| io | FileIO instance for reading manifest files. |
| TableScanBuilder< ScanType > & iceberg::TableScanBuilder< ScanType >::MinRowsRequested | ( | int64_t | num_rows | ) |
Request this scan to return at least the given number of rows.
The value is an optional hint that lets the scan avoid returning more rows than necessary. The scan may return fewer rows if it does not contain that many, or more rows than requested.
| num_rows | The minimum number of rows requested |
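One way a planner could act on such a hint is to stop adding files once the accumulated record count reaches the requested minimum. This is a sketch of that idea, not a description of how iceberg-cpp uses the value: `DataFileInfo` and `PlanUntilMinRows` are hypothetical names.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical summary of a data file; not an iceberg-cpp type.
struct DataFileInfo {
  std::string path;
  std::int64_t record_count;
};

// Adds files in order until the running total reaches num_rows.
// The total may overshoot (whole files are added) or fall short
// (the input may not hold that many rows).
std::vector<DataFileInfo> PlanUntilMinRows(
    const std::vector<DataFileInfo>& files, std::int64_t num_rows) {
  std::vector<DataFileInfo> planned;
  std::int64_t total = 0;
  for (const auto& file : files) {
    if (total >= num_rows) break;
    planned.push_back(file);
    total += file.record_count;
  }
  return planned;
}
```

With three files of 100 rows each, a request for 150 rows plans two files (200 rows, more than requested), and a request for 1000 rows plans all three (300 rows, fewer than requested).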
| TableScanBuilder< ScanType > & iceberg::TableScanBuilder< ScanType >::Option | ( | std::string | key, | std::string | value | ) |
Override a table property for this scan using the given key/value pair. Unknown properties are ignored.
| key | name of the table property to be overridden |
| value | value to override with |
| TableScanBuilder< ScanType > & iceberg::TableScanBuilder< ScanType >::Project | ( | std::shared_ptr< Schema > | schema | ) |
Set the projected schema.
| schema | a projection schema |
| TableScanBuilder< ScanType > & iceberg::TableScanBuilder< ScanType >::Select | ( | const std::vector< std::string > & | column_names | ) |
Request this scan to read the given data columns.
This produces an expected schema that includes all fields that are either selected or used by this scan's filter expression.
| column_names | column names from the table's schema |
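The rule that the expected schema covers both selected fields and fields referenced by the filter is a set union over column names. A minimal sketch, assuming the result keeps the table schema's column order (an assumption, as are the helper name `ExpectedColumns` and the use of plain strings for columns):

```cpp
#include <set>
#include <string>
#include <vector>

// Returns the table columns that are either selected or referenced by
// the filter, in table-schema order (ordering is an assumption).
std::vector<std::string> ExpectedColumns(
    const std::vector<std::string>& table_columns,
    const std::set<std::string>& selected,
    const std::set<std::string>& filter_columns) {
  std::vector<std::string> expected;
  for (const auto& column : table_columns) {
    if (selected.count(column) > 0 || filter_columns.count(column) > 0) {
      expected.push_back(column);
    }
  }
  return expected;
}
```

Selecting only `name` while filtering on `ts` therefore yields an expected schema of `name` and `ts`, because the filter column must be read to evaluate the expression.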
| TableScanBuilder< ScanType > & iceberg::TableScanBuilder< ScanType >::ToSnapshot | ( | const std::string & | ref | ) |
Instructs this scan to look for changes up to a particular snapshot ref (inclusive).
This method is only available for incremental scans. If the end snapshot is not configured, it defaults to the current table snapshot (inclusive).
| ref | the end snapshot ref (inclusive) |
| TableScanBuilder< ScanType > & iceberg::TableScanBuilder< ScanType >::ToSnapshot | ( | int64_t | to_snapshot_id | ) |
Instructs this scan to look for changes up to a particular snapshot (inclusive).
This method is only available for incremental scans. If the end snapshot is not configured, it defaults to the current table snapshot (inclusive).
| to_snapshot_id | the end snapshot ID (inclusive) |
| TableScanBuilder< ScanType > & iceberg::TableScanBuilder< ScanType >::UseBranch | ( | const std::string & | branch | ) |
Use the specified branch.
This method is only available for incremental scans.
| branch | the branch name |
| TableScanBuilder< ScanType > & iceberg::TableScanBuilder< ScanType >::UseRef | ( | const std::string & | ref | ) |
Request this scan to use the given reference.
| ref | reference |
| TableScanBuilder< ScanType > & iceberg::TableScanBuilder< ScanType >::UseSnapshot | ( | int64_t | snapshot_id | ) |
Request this scan to use the given snapshot by ID.
| snapshot_id | a snapshot ID |