iceberg-cpp
iceberg::ManifestReaderImpl Class Reference

Read manifest entries from a manifest file.

#include <manifest_reader_internal.h>

Inherits iceberg::ManifestReader.

Public Member Functions

 ManifestReaderImpl (std::string manifest_path, std::optional< int64_t > manifest_length, std::shared_ptr< FileIO > file_io, std::shared_ptr< Schema > schema, std::shared_ptr< PartitionSpec > spec, std::unique_ptr< InheritableMetadata > inheritable_metadata, std::optional< int64_t > first_row_id)
 Construct a ManifestReaderImpl for lazy initialization.
 
Result< std::vector< ManifestEntry > > Entries () override
 Read all manifest entries in the manifest file.
 
Result< std::vector< ManifestEntry > > LiveEntries () override
 Read only live (non-deleted) manifest entries.
 
ManifestReader & Select (const std::vector< std::string > &columns) override
 Select specific columns of data file to read from the manifest entries.
 
ManifestReader & FilterPartitions (std::shared_ptr< Expression > expr) override
 Filter manifest entries by partition filter.
 
ManifestReader & FilterPartitions (std::shared_ptr< PartitionSet > partition_set) override
 Filter manifest entries to a specific set of partitions.
 
ManifestReader & FilterRows (std::shared_ptr< Expression > expr) override
 Filter manifest entries by row-level filter.
 
ManifestReader & CaseSensitive (bool case_sensitive) override
 Set case sensitivity for column name matching.
 
ManifestReader & TryDropStats () override
 Try to drop stats from returned DataFile objects.
 

Additional Inherited Members

Static Public Member Functions inherited from iceberg::ManifestReader
static bool ShouldDropStats (const std::vector< std::string > &columns)
 Determine whether stats should be dropped based on selected columns.
 
static Result< std::unique_ptr< ManifestReader > > Make (const ManifestFile &manifest, std::shared_ptr< FileIO > file_io, std::shared_ptr< Schema > schema, std::shared_ptr< PartitionSpec > spec)
 Creates a reader for a manifest file.
 
static Result< std::unique_ptr< ManifestReader > > Make (std::string_view manifest_location, std::optional< int64_t > manifest_length, std::shared_ptr< FileIO > file_io, std::shared_ptr< Schema > schema, std::shared_ptr< PartitionSpec > spec, std::unique_ptr< InheritableMetadata > inheritable_metadata, std::optional< int64_t > first_row_id=std::nullopt)
 Creates a reader for a manifest file.
 
static std::vector< std::string > WithStatsColumns (const std::vector< std::string > &columns)
 Add stats columns to the column list if needed.
 

Detailed Description

Read manifest entries from a manifest file.

This implementation supports lazy reader creation and filtering based on partition expressions, row expressions, and partition sets. It follows the Java implementation pattern.
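Because the configuration methods listed above return ManifestReader &, calls can be chained before the entries are read. The following is only a minimal sketch of that fluent style: SketchReader and Configure are hypothetical stand-ins that record settings and read nothing, and the column names are illustrative rather than taken from this page.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for the fluent part of the reader interface.
// It only records the requested configuration; it does not read any file.
class SketchReader {
 public:
  SketchReader& Select(const std::vector<std::string>& columns) {
    columns_ = columns;
    return *this;  // returning *this is what makes chaining possible
  }

  SketchReader& CaseSensitive(bool case_sensitive) {
    case_sensitive_ = case_sensitive;
    return *this;
  }

  const std::vector<std::string>& columns() const { return columns_; }
  bool case_sensitive() const { return case_sensitive_; }

 private:
  std::vector<std::string> columns_;
  bool case_sensitive_ = true;
};

// Configure a reader in one chained expression (illustrative column names).
inline SketchReader& Configure(SketchReader& reader) {
  return reader.Select({"file_path", "record_count"}).CaseSensitive(false);
}
```

With the real class, the same chain would presumably end in a call to Entries() or LiveEntries(), whose Result the caller has to check.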

Constructor & Destructor Documentation

◆ ManifestReaderImpl()

iceberg::ManifestReaderImpl::ManifestReaderImpl ( std::string  manifest_path,
std::optional< int64_t >  manifest_length,
std::shared_ptr< FileIO >  file_io,
std::shared_ptr< Schema >  schema,
std::shared_ptr< PartitionSpec >  spec,
std::unique_ptr< InheritableMetadata >  inheritable_metadata,
std::optional< int64_t >  first_row_id 
)

Construct a ManifestReaderImpl for lazy initialization.

Parameters
manifest_path         Path to the manifest file.
manifest_length       Length of the manifest file (optional).
file_io               File IO implementation.
schema                Table schema.
spec                  Partition spec.
inheritable_metadata  Metadata inherited from manifest.
first_row_id          First row ID for V3 manifests.
Note
The ManifestReader::Make() functions are expected to guarantee that the pointer parameters are non-null.
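The term "lazy initialization" suggests that the constructor only stores its arguments and that the underlying file reader is created on first use. One common way to structure this is sketched below. LazyReaderSketch, EnsureOpen, and open_count are hypothetical names, the entries are fake strings, and nothing here is taken from the actual implementation.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in, not the iceberg-cpp class: the constructor only
// stores its argument, and the "file" is opened on the first read.
class LazyReaderSketch {
 public:
  explicit LazyReaderSketch(std::string manifest_path)
      : manifest_path_(std::move(manifest_path)) {}

  // Opens the underlying reader on first use, then reuses it on later calls.
  std::vector<std::string> Entries() {
    EnsureOpen();
    return {manifest_path_ + "#0", manifest_path_ + "#1"};  // fake entries
  }

  int open_count() const { return open_count_; }

 private:
  void EnsureOpen() {
    if (opened_) return;
    opened_ = true;
    ++open_count_;  // a real reader would open the file through FileIO here
  }

  std::string manifest_path_;
  bool opened_ = false;
  int open_count_ = 0;
};
```

In this sketch, constructing the object is cheap and cannot fail on I/O; any open error would surface from the first read instead.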

Member Function Documentation

◆ CaseSensitive()

ManifestReader & iceberg::ManifestReaderImpl::CaseSensitive ( bool  case_sensitive)
override virtual

Set case sensitivity for column name matching.

Implements iceberg::ManifestReader.

◆ Entries()

Result< std::vector< ManifestEntry > > iceberg::ManifestReaderImpl::Entries ( )
override virtual

Read all manifest entries in the manifest file.

TODO(gangwu): provide a lazy-evaluated iterator interface for better performance.

Implements iceberg::ManifestReader.

◆ FilterPartitions() [1/2]

ManifestReader & iceberg::ManifestReaderImpl::FilterPartitions ( std::shared_ptr< Expression >  expr)
override virtual

Filter manifest entries by partition filter.

Note
Unlike the Java implementation, this method does not combine new expressions with existing ones. Each call replaces the previous partition filter.

Implements iceberg::ManifestReader.
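The replace-on-each-call behavior described in the note can be illustrated with a small stand-in. This is a sketch only: ReplacingFilterSketch is hypothetical, and a std::function over integer partition values takes the place of a real Expression.

```cpp
#include <functional>
#include <utility>
#include <vector>

// Hypothetical predicate type standing in for a partition Expression.
using PartitionPredicate = std::function<bool(int)>;

// Hypothetical stand-in showing "last call wins" filter semantics.
class ReplacingFilterSketch {
 public:
  ReplacingFilterSketch& FilterPartitions(PartitionPredicate pred) {
    partition_filter_ = std::move(pred);  // overwrite; no AND with the old one
    return *this;
  }

  // Keeps the partitions accepted by the current filter (all if none is set).
  std::vector<int> Apply(const std::vector<int>& partitions) const {
    std::vector<int> kept;
    for (int p : partitions) {
      if (!partition_filter_ || partition_filter_(p)) kept.push_back(p);
    }
    return kept;
  }

 private:
  PartitionPredicate partition_filter_;
};
```

Calling FilterPartitions with "greater than 2" and then with "is even" on the values 1 to 4 keeps 2 and 4 in this sketch, because only the second predicate is in effect. A combining implementation would keep only 4.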

◆ FilterPartitions() [2/2]

ManifestReader & iceberg::ManifestReaderImpl::FilterPartitions ( std::shared_ptr< PartitionSet >  partition_set)
override virtual

Filter manifest entries to a specific set of partitions.

Implements iceberg::ManifestReader.

◆ FilterRows()

ManifestReader & iceberg::ManifestReaderImpl::FilterRows ( std::shared_ptr< Expression >  expr)
override virtual

Filter manifest entries by row-level filter.

Note
Unlike the Java implementation, this method does not combine new expressions with existing ones. Each call replaces the previous row filter.

Implements iceberg::ManifestReader.
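Since each call replaces the previous row filter, a caller who needs several conditions at once would have to combine them before making a single call. The sketch below shows that idea with plain predicates. RowPredicate and AllOf are hypothetical helpers, and how Expression objects are combined in iceberg-cpp is not covered on this page.

```cpp
#include <functional>
#include <utility>

// Hypothetical predicate type standing in for a row-level Expression.
using RowPredicate = std::function<bool(int)>;

// Builds one predicate that accepts a row only if both inputs accept it,
// so that a single filter call can carry both conditions.
inline RowPredicate AllOf(RowPredicate lhs, RowPredicate rhs) {
  return [lhs = std::move(lhs), rhs = std::move(rhs)](int row) {
    return lhs(row) && rhs(row);
  };
}
```

The combined predicate would then be passed once, instead of calling the filter method twice and losing the first condition.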

◆ LiveEntries()

Result< std::vector< ManifestEntry > > iceberg::ManifestReaderImpl::LiveEntries ( )
override virtual

Read only live (non-deleted) manifest entries.

Implements iceberg::ManifestReader.
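Conceptually, "live" means every entry whose status is not deleted. A minimal sketch of that filter follows, assuming a three-state status as in the Iceberg table spec. StatusSketch, EntrySketch, and LiveOnly are hypothetical and do not mirror the real ManifestEntry type.

```cpp
#include <algorithm>
#include <iterator>
#include <vector>

// Hypothetical stand-ins for the entry status and the manifest entry.
enum class StatusSketch { kExisting, kAdded, kDeleted };

struct EntrySketch {
  StatusSketch status;
  int id;
};

// Returns the entries that are not marked deleted, preserving their order.
inline std::vector<EntrySketch> LiveOnly(const std::vector<EntrySketch>& entries) {
  std::vector<EntrySketch> live;
  std::copy_if(entries.begin(), entries.end(), std::back_inserter(live),
               [](const EntrySketch& e) {
                 return e.status != StatusSketch::kDeleted;
               });
  return live;
}
```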

◆ Select()

ManifestReader & iceberg::ManifestReaderImpl::Select ( const std::vector< std::string > &  columns)
override virtual

Select specific columns of data file to read from the manifest entries.

Note
Column names should match the names in DataFile schema. Unmatched names will be ignored.

Implements iceberg::ManifestReader.
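The rule that unmatched names are ignored can be pictured as an intersection of the requested names with the schema's names. The sketch below assumes exact, case-sensitive matching. MatchColumns is a hypothetical helper, and the real method may resolve names differently, for example when CaseSensitive(false) is set.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Returns the requested names that exist in schema_columns, in request order.
// Names with no match are silently dropped rather than reported as errors.
inline std::vector<std::string> MatchColumns(
    const std::vector<std::string>& schema_columns,
    const std::vector<std::string>& requested) {
  std::vector<std::string> matched;
  for (const auto& name : requested) {
    if (std::find(schema_columns.begin(), schema_columns.end(), name) !=
        schema_columns.end()) {
      matched.push_back(name);
    }
  }
  return matched;
}
```

A practical consequence of this behavior is that a misspelled column name produces no error, so callers may want to validate names themselves.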

◆ TryDropStats()

ManifestReader & iceberg::ManifestReaderImpl::TryDropStats ( )
override virtual

Try to drop stats from returned DataFile objects.

Implements iceberg::ManifestReader.


The documentation for this class was generated from the following files: