iceberg-cpp
iceberg::ManifestReader Class Reference (abstract)

Read manifest entries from a manifest file.

#include <manifest_reader.h>

Inheritance diagram for iceberg::ManifestReader:
iceberg::ManifestReaderImpl

Public Member Functions

virtual Result< std::vector< ManifestEntry > > Entries ()=0
 Read all manifest entries in the manifest file.
 
virtual Result< std::vector< ManifestEntry > > LiveEntries ()=0
 Read only live (non-deleted) manifest entries.
 
virtual ManifestReader & Select (const std::vector< std::string > &columns)=0
 Select specific data file columns to read from the manifest entries.
 
virtual ManifestReader & FilterPartitions (std::shared_ptr< Expression > expr)=0
 Filter manifest entries by partition filter.
 
virtual ManifestReader & FilterPartitions (std::shared_ptr< class PartitionSet > partition_set)=0
 Filter manifest entries to a specific set of partitions.
 
virtual ManifestReader & FilterRows (std::shared_ptr< Expression > expr)=0
 Filter manifest entries by row-level filter.
 
virtual ManifestReader & CaseSensitive (bool case_sensitive)=0
 Set case sensitivity for column name matching.
 
virtual ManifestReader & TryDropStats ()=0
 Try to drop stats from returned DataFile objects.
 

Static Public Member Functions

static bool ShouldDropStats (const std::vector< std::string > &columns)
 Determine whether stats should be dropped based on selected columns.
 
static Result< std::unique_ptr< ManifestReader > > Make (const ManifestFile &manifest, std::shared_ptr< FileIO > file_io, std::shared_ptr< Schema > schema, std::shared_ptr< PartitionSpec > spec)
 Creates a reader for a manifest file.
 
static Result< std::unique_ptr< ManifestReader > > Make (std::string_view manifest_location, std::optional< int64_t > manifest_length, std::shared_ptr< FileIO > file_io, std::shared_ptr< Schema > schema, std::shared_ptr< PartitionSpec > spec, std::unique_ptr< InheritableMetadata > inheritable_metadata, std::optional< int64_t > first_row_id=std::nullopt)
 Creates a reader for a manifest file.
 
static std::vector< std::string > WithStatsColumns (const std::vector< std::string > &columns)
 Add stats columns to the column list if needed.
 

Detailed Description

Read manifest entries from a manifest file.

Member Function Documentation

◆ CaseSensitive()

virtual ManifestReader & iceberg::ManifestReader::CaseSensitive ( bool  case_sensitive)
pure virtual

Set case sensitivity for column name matching.

Implemented in iceberg::ManifestReaderImpl.

◆ Entries()

virtual Result< std::vector< ManifestEntry > > iceberg::ManifestReader::Entries ( )
pure virtual

Read all manifest entries in the manifest file.

TODO(gangwu): provide a lazy-evaluated iterator interface for better performance.

Implemented in iceberg::ManifestReaderImpl.

◆ FilterPartitions() [1/2]

virtual ManifestReader & iceberg::ManifestReader::FilterPartitions ( std::shared_ptr< class PartitionSet > partition_set)
pure virtual

Filter manifest entries to a specific set of partitions.

Implemented in iceberg::ManifestReaderImpl.

◆ FilterPartitions() [2/2]

virtual ManifestReader & iceberg::ManifestReader::FilterPartitions ( std::shared_ptr< Expression > expr)
pure virtual

Filter manifest entries by partition filter.

Note
Unlike the Java implementation, this method does not combine new expressions with existing ones. Each call replaces the previous partition filter.

Implemented in iceberg::ManifestReaderImpl.
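
The replacement semantics described in the note can be illustrated with a toy builder. This is a minimal sketch, not the iceberg-cpp implementation: `ToyExpr`, `MakeExpr`, `ToyAnd`, and `ToyReader` are hypothetical stand-ins for `Expression` and `ManifestReader`, introduced here only for illustration. It shows that a second call overwrites the first, so a caller who wants both predicates applied has to combine them before passing them in.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for iceberg::Expression: just a printable predicate.
struct ToyExpr {
  std::string text;
};

// Hypothetical helper that wraps a predicate string in a shared_ptr.
std::shared_ptr<ToyExpr> MakeExpr(std::string text) {
  return std::make_shared<ToyExpr>(ToyExpr{std::move(text)});
}

// Hypothetical helper that combines two predicates into one conjunction.
std::shared_ptr<ToyExpr> ToyAnd(const std::shared_ptr<ToyExpr>& lhs,
                                const std::shared_ptr<ToyExpr>& rhs) {
  return MakeExpr("(" + lhs->text + " AND " + rhs->text + ")");
}

// Hypothetical reader that models only the documented "replace" behaviour.
class ToyReader {
 public:
  ToyReader& FilterPartitions(std::shared_ptr<ToyExpr> expr) {
    partition_filter_ = std::move(expr);  // replaces; never ANDs with the old filter
    return *this;                         // returning *this allows call chaining
  }

  std::string PartitionFilterText() const {
    return partition_filter_ ? partition_filter_->text : "true";
  }

 private:
  std::shared_ptr<ToyExpr> partition_filter_;
};

// Two chained calls: only the last filter survives.
std::string ReplacedFilter() {
  ToyReader reader;
  reader.FilterPartitions(MakeExpr("day = 1"))
      .FilterPartitions(MakeExpr("region = 'eu'"));
  return reader.PartitionFilterText();
}

// One call with a pre-combined expression: both predicates survive.
std::string CombinedFilter() {
  ToyReader reader;
  reader.FilterPartitions(ToyAnd(MakeExpr("day = 1"), MakeExpr("region = 'eu'")));
  return reader.PartitionFilterText();
}

void Example() {
  assert(ReplacedFilter() == "region = 'eu'");
  assert(CombinedFilter() == "(day = 1 AND region = 'eu')");
}
```

Code ported from the Java implementation that relies on successive filter calls accumulating would need the explicit combination step shown in `CombinedFilter()`.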

◆ FilterRows()

virtual ManifestReader & iceberg::ManifestReader::FilterRows ( std::shared_ptr< Expression > expr)
pure virtual

Filter manifest entries by row-level filter.

Note
Unlike the Java implementation, this method does not combine new expressions with existing ones. Each call replaces the previous row filter.

Implemented in iceberg::ManifestReaderImpl.

◆ LiveEntries()

virtual Result< std::vector< ManifestEntry > > iceberg::ManifestReader::LiveEntries ( )
pure virtual

Read only live (non-deleted) manifest entries.

Implemented in iceberg::ManifestReaderImpl.
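
As a rough illustration of what "live" means here: the Iceberg table spec assigns each manifest entry a status of 0 (existing), 1 (added), or 2 (deleted), and a live entry is one whose status is not deleted. The sketch below uses hypothetical `ToyStatus` and `ToyEntry` types rather than the real `ManifestEntry`, and `KeepLive` is an illustrative helper, not a library function.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Status codes as defined by the Iceberg table spec for manifest entries.
enum class ToyStatus { kExisting = 0, kAdded = 1, kDeleted = 2 };

// Hypothetical, heavily simplified stand-in for iceberg::ManifestEntry.
struct ToyEntry {
  ToyStatus status;
  std::string file_path;
};

// Keep every entry that is not marked deleted, preserving the input order.
std::vector<ToyEntry> KeepLive(const std::vector<ToyEntry>& entries) {
  std::vector<ToyEntry> live;
  for (const auto& entry : entries) {
    if (entry.status != ToyStatus::kDeleted) {
      live.push_back(entry);
    }
  }
  return live;
}

void Example() {
  const std::vector<ToyEntry> entries = {
      {ToyStatus::kAdded, "a.parquet"},
      {ToyStatus::kDeleted, "b.parquet"},
      {ToyStatus::kExisting, "c.parquet"},
  };
  const std::vector<ToyEntry> live = KeepLive(entries);
  assert(live.size() == 2);
  assert(live[0].file_path == "a.parquet");
  assert(live[1].file_path == "c.parquet");
}
```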

◆ Make() [1/2]

Result< std::unique_ptr< ManifestReader > > iceberg::ManifestReader::Make ( const ManifestFile & manifest,
std::shared_ptr< FileIO > file_io,
std::shared_ptr< Schema > schema,
std::shared_ptr< PartitionSpec > spec
)
static

Creates a reader for a manifest file.

Parameters
manifest    A ManifestFile object containing metadata about the manifest.
file_io     File IO implementation to use.
schema      Schema used to bind the partition type.
spec        Partition spec used for this manifest file.
Returns
A Result containing the reader or an error.

◆ Make() [2/2]

Result< std::unique_ptr< ManifestReader > > iceberg::ManifestReader::Make ( std::string_view manifest_location,
std::optional< int64_t > manifest_length,
std::shared_ptr< FileIO > file_io,
std::shared_ptr< Schema > schema,
std::shared_ptr< PartitionSpec > spec,
std::unique_ptr< InheritableMetadata > inheritable_metadata,
std::optional< int64_t > first_row_id = std::nullopt
)
static

Creates a reader for a manifest file.

Parameters
manifest_location       Path to the manifest file.
manifest_length         Length of the manifest file.
file_io                 File IO implementation to use.
schema                  Schema used to bind the partition type.
spec                    Partition spec used for this manifest file.
inheritable_metadata    Inheritable metadata.
first_row_id            First row ID to use for the manifest entries.
Returns
A Result containing the reader or an error.

◆ Select()

virtual ManifestReader & iceberg::ManifestReader::Select ( const std::vector< std::string > &  columns)
pure virtual

Select specific data file columns to read from the manifest entries.

Note
Column names should match the names in DataFile schema. Unmatched names will be ignored.

Implemented in iceberg::ManifestReaderImpl.
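
The "unmatched names will be ignored" rule in the note might look like the following when reduced to plain string matching. This is only a sketch under stated assumptions: `kToyDataFileColumns` is an abbreviated, assumed list of DataFile field names taken from the Iceberg table spec, `KeepKnownColumns` is a hypothetical helper, and the comparison here is case-sensitive, whereas the real reader's matching depends on `CaseSensitive()`.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Abbreviated, assumed list of DataFile field names (not the full schema).
const std::vector<std::string> kToyDataFileColumns = {
    "file_path", "file_format", "record_count", "file_size_in_bytes"};

// Return the requested names that exist in the toy schema, in request order.
// Names that do not match are silently dropped instead of raising an error.
std::vector<std::string> KeepKnownColumns(const std::vector<std::string>& requested) {
  std::vector<std::string> kept;
  for (const auto& name : requested) {
    const bool known = std::find(kToyDataFileColumns.begin(),
                                 kToyDataFileColumns.end(),
                                 name) != kToyDataFileColumns.end();
    if (known) {
      kept.push_back(name);
    }
  }
  return kept;
}

void Example() {
  const std::vector<std::string> kept =
      KeepKnownColumns({"file_path", "no_such_column", "record_count"});
  assert(kept.size() == 2);
  assert(kept[0] == "file_path");
  assert(kept[1] == "record_count");
}
```

Because a misspelled name is dropped rather than rejected, a typo in the column list yields a narrower projection than intended without any error being reported.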

◆ ShouldDropStats()

bool iceberg::ManifestReader::ShouldDropStats ( const std::vector< std::string > &  columns)
static

Determine whether stats should be dropped based on selected columns.

Returns true if the selected columns do not include any stats columns, or only include record_count (which is a primitive, not a large map).
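
The rule stated above can be sketched as a small predicate. This is an illustration, not the iceberg-cpp source: the stats column names in `kAssumedStatsColumns` are an assumption borrowed from the Iceberg Java `ManifestReader`, and `SketchShouldDropStats` is a hypothetical name. The sketch covers only the rule as documented here; the real function may handle additional cases.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <string>
#include <string_view>
#include <vector>

// Assumed stats column names (borrowed from the Java implementation);
// the list used by iceberg-cpp may differ.
constexpr std::array<std::string_view, 6> kAssumedStatsColumns = {
    "value_counts", "null_value_counts", "nan_value_counts",
    "lower_bounds", "upper_bounds",      "record_count"};

// True when no stats column is selected, or when record_count is the only
// stats column selected. Any other selected stats column keeps the stats.
bool SketchShouldDropStats(const std::vector<std::string>& columns) {
  for (const auto& column : columns) {
    const std::string_view name(column);
    const bool is_stats = std::find(kAssumedStatsColumns.begin(),
                                    kAssumedStatsColumns.end(),
                                    name) != kAssumedStatsColumns.end();
    if (is_stats && name != "record_count") {
      return false;
    }
  }
  return true;
}

void Example() {
  assert(SketchShouldDropStats({"file_path", "file_format"}));
  assert(SketchShouldDropStats({"file_path", "record_count"}));
  assert(!SketchShouldDropStats({"file_path", "lower_bounds"}));
  assert(!SketchShouldDropStats({"record_count", "value_counts"}));
}
```

The special case for `record_count` follows the reasoning in the description: it is a single primitive value, so selecting it alone is no reason to keep the potentially large per-column stats maps.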

◆ TryDropStats()

virtual ManifestReader & iceberg::ManifestReader::TryDropStats ( )
pure virtual

Try to drop stats from returned DataFile objects.

Implemented in iceberg::ManifestReaderImpl.


The documentation for this class was generated from the following files: