iceberg-cpp
iceberg::Schema Class Reference

A schema for a Table.

#include <schema.h>

Inheritance diagram for iceberg::Schema:
iceberg::StructType, iceberg::NestedType, iceberg::Type, iceberg::util::Formattable

Public Member Functions

 Schema (std::vector< SchemaField > fields, int32_t schema_id=kInitialSchemaId)
 
int32_t schema_id () const
 Get the schema ID.
 
std::string ToString () const override
 Get a user-readable string representation.
 
Result< std::optional< std::reference_wrapper< const SchemaField > > > FindFieldByName (std::string_view name, bool case_sensitive=true) const
 Recursively find the SchemaField by field name.
 
Result< std::optional< std::reference_wrapper< const SchemaField > > > FindFieldById (int32_t field_id) const
 Recursively find the SchemaField by field id.
 
Result< std::optional< std::string_view > > FindColumnNameById (int32_t field_id) const
 Returns the canonical field name for the given id.
 
Result< std::unique_ptr< StructLikeAccessor > > GetAccessorById (int32_t field_id) const
 Get the accessor to access the field by field id.
 
Result< std::unique_ptr< Schema > > Select (std::span< const std::string > names, bool case_sensitive=true) const
 Creates a projected schema from selected field names.
 
Result< std::unique_ptr< Schema > > Project (const std::unordered_set< int32_t > &field_ids) const
 Creates a projected schema from selected field IDs.
 
const std::vector< int32_t > & IdentifierFieldIds () const
 Return the field IDs of the identifier fields.
 
Result< std::vector< std::string > > IdentifierFieldNames () const
 Return the canonical field names of the identifier fields.
 
Result< int32_t > HighestFieldId () const
 Get the highest field ID in the schema.
 
bool SameSchema (const Schema &other) const
 Checks whether this schema is equivalent to another schema while ignoring the schema id.
 
Status Validate (int32_t format_version) const
 Validate the schema for a given format version.
 
- Public Member Functions inherited from iceberg::StructType
 StructType (std::vector< SchemaField > fields)
 
TypeId type_id () const override
 Get the type ID.
 
std::string ToString () const override
 Get a user-readable string representation.
 
std::span< const SchemaField > fields () const override
 Get a view of the child fields.
 
Result< std::optional< SchemaFieldConstRef > > GetFieldById (int32_t field_id) const override
 Get a field by field ID.
 
Result< std::optional< SchemaFieldConstRef > > GetFieldByIndex (int32_t index) const override
 Get a field by index.
 
Result< std::optional< SchemaFieldConstRef > > GetFieldByName (std::string_view name, bool case_sensitive) const override
 Get a field by name. Return an error Status if the field name is not unique; prefer GetFieldById or GetFieldByIndex when possible.
 
std::unique_ptr< Schema > ToSchema () const
 
virtual Result< std::optional< SchemaFieldConstRef > > GetFieldByName (std::string_view name, bool case_sensitive) const=0
 Get a field by name. Return an error Status if the field name is not unique; prefer GetFieldById or GetFieldByIndex when possible.
 
Result< std::optional< SchemaFieldConstRef > > GetFieldByName (std::string_view name) const
 Get a field by name (case-sensitive).
 
- Public Member Functions inherited from iceberg::NestedType
bool is_primitive () const override
 Is this a primitive type (may not have child fields)?
 
bool is_nested () const override
 Is this a nested type (may have child fields)?
 
Result< std::optional< SchemaFieldConstRef > > GetFieldByName (std::string_view name) const
 Get a field by name (case-sensitive).
 

Static Public Member Functions

static Result< std::unique_ptr< Schema > > Make (std::vector< SchemaField > fields, int32_t schema_id, std::vector< int32_t > identifier_field_ids)
 Create a schema.
 
static Result< std::unique_ptr< Schema > > Make (std::vector< SchemaField > fields, int32_t schema_id, const std::vector< std::string > &identifier_field_names)
 Create a schema.
 
static Status ValidateIdentifierFields (int32_t field_id, const Schema &schema, const std::unordered_map< int32_t, int32_t > &id_to_parent)
 Validate that the identifier field with the given ID is valid for the schema.
 
static const std::shared_ptr< Schema > & EmptySchema ()
 Get an empty schema.
 

Static Public Attributes

static constexpr int32_t kInitialSchemaId = 0
 
static constexpr int32_t kInitialColumnId = 0
 
static constexpr int32_t kInvalidColumnId = -1
 
static constexpr std::string_view kAllColumns = "*"
 Special value to select all columns from manifest files.
 
- Static Public Attributes inherited from iceberg::StructType
static constexpr TypeId kTypeId = TypeId::kStruct
 

Friends

bool operator== (const Schema &lhs, const Schema &rhs)
 

Additional Inherited Members

- Public Types inherited from iceberg::NestedType
using SchemaFieldConstRef = std::reference_wrapper< const SchemaField >
 
- Protected Member Functions inherited from iceberg::StructType
bool Equals (const Type &other) const override
 Compare two types for equality.
 
- Static Protected Member Functions inherited from iceberg::StructType
static Result< std::unordered_map< int32_t, SchemaFieldConstRef > > InitFieldById (const StructType &)
 
static Result< std::unordered_map< std::string_view, SchemaFieldConstRef > > InitFieldByName (const StructType &)
 
static Result< std::unordered_map< std::string, SchemaFieldConstRef > > InitFieldByLowerCaseName (const StructType &)
 
- Protected Attributes inherited from iceberg::StructType
std::vector< SchemaField > fields_
 
Lazy< InitFieldById > field_by_id_
 
Lazy< InitFieldByName > field_by_name_
 
Lazy< InitFieldByLowerCaseName > field_by_lowercase_name_
 

Detailed Description

A schema for a Table.

A schema is a list of typed columns, along with a unique integer ID. A Table may have different schemas over its lifetime due to schema evolution.

Member Function Documentation

◆ EmptySchema()

const std::shared_ptr< Schema > & iceberg::Schema::EmptySchema ( )
static

Get an empty schema.

An empty schema has no fields and a schema ID of 0.

◆ FindColumnNameById()

Result< std::optional< std::string_view > > iceberg::Schema::FindColumnNameById ( int32_t  field_id) const

Returns the canonical field name for the given id.

Parameters
field_id  The id of the field to get the canonical name for.
Returns
The canonical column name of the field with the given id, or std::nullopt if not found.

◆ FindFieldById()

Result< std::optional< std::reference_wrapper< const SchemaField > > > iceberg::Schema::FindFieldById ( int32_t  field_id) const

Recursively find the SchemaField by field id.

Parameters
field_id  The id of the field to find.
Returns
The field with the given id, or std::nullopt if not found.
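
The recursive lookup and its optional-reference return shape can be illustrated with a minimal, self-contained sketch. ToyField, ToyFieldRef, FindById and RunExample are hypothetical names invented for this example; they are not part of iceberg-cpp, and the error-carrying Result wrapper is left out.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for a schema field; not the iceberg::SchemaField API.
struct ToyField {
  int32_t id;
  std::string name;
  std::vector<ToyField> children;  // empty for primitive fields
};

using ToyFieldRef = std::reference_wrapper<const ToyField>;

// Depth-first search over the field tree. The returned reference points into
// `fields`, so it is valid only while that vector is alive and unmodified.
std::optional<ToyFieldRef> FindById(const std::vector<ToyField>& fields,
                                    int32_t field_id) {
  for (const ToyField& field : fields) {
    if (field.id == field_id) return std::cref(field);
    if (auto found = FindById(field.children, field_id)) return found;
  }
  return std::nullopt;
}

// Usage: a schema with a top-level "id" and a nested "location" struct.
void RunExample() {
  const std::vector<ToyField> schema = {
      {1, "id", {}},
      {2, "location", {{3, "lat", {}}, {4, "lon", {}}}},
  };
  auto lat = FindById(schema, 3);
  assert(lat.has_value() && lat->get().name == "lat");
  assert(!FindById(schema, 99).has_value());
}
```

Returning a reference wrapper instead of a copy keeps the lookup cheap, at the cost of tying the result's lifetime to the owning container.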

◆ FindFieldByName()

Result< std::optional< std::reference_wrapper< const SchemaField > > > iceberg::Schema::FindFieldByName ( std::string_view  name,
bool  case_sensitive = true 
) const

Recursively find the SchemaField by field name.

Short names for maps and lists are included for any name that does not conflict with a canonical name. For example, a list, 'l', of structs with field 'x' will produce short name 'l.x' in addition to canonical name 'l.element.x'. A map 'm', if its value includes a struct with field 'x', will produce short name 'm.x' in addition to canonical name 'm.value.x'.

FIXME: Currently only handles ASCII lowercase conversion; extend to support non-ASCII characters (e.g., using std::towlower or ICU).
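
The short-name rule for lists and maps can be made concrete with a small name index. This is a sketch under stated assumptions: ToyField, Kind, NameIndex, IndexField and BuildNameIndex are hypothetical, list children are assumed to be named "element", map children "key" and "value", and a canonical name always wins over a conflicting short name. It does not reproduce how iceberg-cpp builds its index.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

enum class Kind { kPrimitive, kStruct, kList, kMap };

// Hypothetical field model; not the iceberg::SchemaField API.
struct ToyField {
  int32_t id;
  std::string name;
  Kind kind;
  std::vector<ToyField> children;  // list: {"element"}; map: {"key", "value"}
};

using NameIndex = std::map<std::string, int32_t>;

std::string Join(const std::string& prefix, const std::string& name) {
  return prefix.empty() ? name : prefix + "." + name;
}

// Records the canonical name of `field` and, unless its own name segment is
// omitted from short names, the short name as well.
void IndexField(const ToyField& field, const std::string& canonical_prefix,
                const std::string& short_prefix, bool omit_from_short,
                NameIndex& canonical, NameIndex& short_names) {
  const std::string canonical_name = Join(canonical_prefix, field.name);
  const std::string short_name =
      omit_from_short ? short_prefix : Join(short_prefix, field.name);
  canonical[canonical_name] = field.id;
  if (!omit_from_short) short_names[short_name] = field.id;
  for (const ToyField& child : field.children) {
    // "element" of a list and "value" of a map are skipped in short names
    // when they are structs, so their sub-fields hang off the parent name.
    const bool omit_child =
        child.kind == Kind::kStruct &&
        ((field.kind == Kind::kList && child.name == "element") ||
         (field.kind == Kind::kMap && child.name == "value"));
    IndexField(child, canonical_name, short_name, omit_child, canonical,
               short_names);
  }
}

NameIndex BuildNameIndex(const std::vector<ToyField>& fields) {
  NameIndex canonical, short_names;
  for (const ToyField& field : fields) {
    IndexField(field, "", "", false, canonical, short_names);
  }
  // emplace never overwrites, so canonical names win on conflict.
  for (const auto& [name, id] : short_names) canonical.emplace(name, id);
  return canonical;
}

// Usage: the list 'l' and map 'm' from the description above.
void RunExample() {
  const std::vector<ToyField> schema = {
      {1, "l", Kind::kList,
       {{2, "element", Kind::kStruct, {{3, "x", Kind::kPrimitive, {}}}}}},
      {4, "m", Kind::kMap,
       {{5, "key", Kind::kPrimitive, {}},
        {6, "value", Kind::kStruct, {{7, "x", Kind::kPrimitive, {}}}}}},
  };
  const NameIndex index = BuildNameIndex(schema);
  assert(index.at("l.element.x") == 3 && index.at("l.x") == 3);
  assert(index.at("m.value.x") == 7 && index.at("m.x") == 7);
}
```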

◆ GetAccessorById()

Result< std::unique_ptr< StructLikeAccessor > > iceberg::Schema::GetAccessorById ( int32_t  field_id) const

Get the accessor to access the field by field id.

Parameters
field_id  The id of the field to get the accessor for.
Returns
The accessor to access the field, or NotFound if the field is not found.

◆ HighestFieldId()

Result< int32_t > iceberg::Schema::HighestFieldId ( ) const

Get the highest field ID in the schema.

Returns
The highest field ID.
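
A rough sketch of what "highest field ID" means for a nested schema: the maximum is taken over every field at every depth, not only the top-level columns. ToyField and HighestId are hypothetical names; returning std::nullopt for a schema without fields is a choice made for this sketch, not a statement about what the library returns in that case.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

// Hypothetical field model; not the iceberg::SchemaField API.
struct ToyField {
  int32_t id;
  std::string name;
  std::vector<ToyField> children;
};

// Maximum id over all fields, including nested ones; nullopt if there are none.
std::optional<int32_t> HighestId(const std::vector<ToyField>& fields) {
  std::optional<int32_t> highest;
  for (const ToyField& field : fields) {
    int32_t candidate = field.id;
    if (auto nested = HighestId(field.children)) {
      candidate = std::max(candidate, *nested);
    }
    if (!highest || candidate > *highest) highest = candidate;
  }
  return highest;
}

// Usage: the highest id sits on a nested field.
void RunExample() {
  const std::vector<ToyField> schema = {
      {1, "id", {}},
      {2, "location", {{7, "lat", {}}, {4, "lon", {}}}},
  };
  assert(HighestId(schema) == 7);
  assert(!HighestId({}).has_value());
}
```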

◆ Make() [1/2]

Result< std::unique_ptr< Schema > > iceberg::Schema::Make ( std::vector< SchemaField >  fields,
int32_t  schema_id,
const std::vector< std::string > &  identifier_field_names 
)
static

Create a schema.

Parameters
fields  The fields that make up the schema.
schema_id  The unique identifier for this schema (default: kInitialSchemaId).
identifier_field_names  Canonical names of fields that uniquely identify rows in the table.
Returns
A new Schema instance, or an error Status on failure.

◆ Make() [2/2]

Result< std::unique_ptr< Schema > > iceberg::Schema::Make ( std::vector< SchemaField >  fields,
int32_t  schema_id,
std::vector< int32_t >  identifier_field_ids 
)
static

Create a schema.

Parameters
fields  The fields that make up the schema.
schema_id  The unique identifier for this schema (default: kInitialSchemaId).
identifier_field_ids  Field IDs that uniquely identify rows in the table.
Returns
A new Schema instance, or an error Status on failure.

◆ Project()

Result< std::unique_ptr< Schema > > iceberg::Schema::Project ( const std::unordered_set< int32_t > &  field_ids) const

Creates a projected schema from selected field IDs.

Parameters
field_ids  Set of field IDs to select.
Returns
Projected schema containing only the specified fields.
Note
The field ID of a nested field may not be projected unless at least one of its sub-fields has been projected.
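
One possible reading of the note above is sketched here: a primitive field survives only if its ID is in the set, and a nested field survives only if at least one of its sub-fields survives, in which case it is pruned to those sub-fields. ToyField, ProjectField and ProjectFields are hypothetical names, and a field without children is treated as primitive for simplicity. The library's actual projection rules may differ in detail.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <string>
#include <unordered_set>
#include <utility>
#include <vector>

// Hypothetical field model; not the iceberg::SchemaField API.
struct ToyField {
  int32_t id;
  std::string name;
  std::vector<ToyField> children;  // empty means "primitive" in this sketch
};

// Returns a pruned copy of `field`, or std::nullopt when nothing survives.
std::optional<ToyField> ProjectField(
    const ToyField& field, const std::unordered_set<int32_t>& field_ids) {
  if (field.children.empty()) {
    if (field_ids.count(field.id) == 0) return std::nullopt;
    return field;
  }
  ToyField pruned{field.id, field.name, {}};
  for (const ToyField& child : field.children) {
    if (auto kept = ProjectField(child, field_ids)) {
      pruned.children.push_back(std::move(*kept));
    }
  }
  // A nested field with no surviving sub-field is dropped, even if its own
  // id was requested.
  if (pruned.children.empty()) return std::nullopt;
  return pruned;
}

std::vector<ToyField> ProjectFields(
    const std::vector<ToyField>& fields,
    const std::unordered_set<int32_t>& field_ids) {
  std::vector<ToyField> projected;
  for (const ToyField& field : fields) {
    if (auto kept = ProjectField(field, field_ids)) {
      projected.push_back(std::move(*kept));
    }
  }
  return projected;
}

// Usage: selecting a leaf keeps its parent; selecting only the parent does not.
void RunExample() {
  const std::vector<ToyField> schema = {
      {1, "id", {}},
      {2, "location", {{3, "lat", {}}, {4, "lon", {}}}},
  };
  const auto by_leaf = ProjectFields(schema, {3});
  assert(by_leaf.size() == 1 && by_leaf[0].name == "location");
  assert(by_leaf[0].children.size() == 1 && by_leaf[0].children[0].id == 3);
  assert(ProjectFields(schema, {2}).empty());
}
```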

◆ schema_id()

int32_t iceberg::Schema::schema_id ( ) const

Get the schema ID.

A schema is identified by a unique ID for the purposes of schema evolution.

◆ Select()

Result< std::unique_ptr< Schema > > iceberg::Schema::Select ( std::span< const std::string >  names,
bool  case_sensitive = true 
) const

Creates a projected schema from selected field names.

Parameters
names  Selected field names; nested names are dot-concatenated.
case_sensitive  Whether name matching is case-sensitive (default: true).
Returns
Projected schema containing only selected fields.
Note
If the field name of a nested type has been selected, all of its sub-fields will be selected.
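
The note above implies that selecting a nested name pulls in its whole subtree. A minimal sketch of that resolution step follows, turning dot-concatenated names into a set of field IDs. ToyField, Resolve, AddSubtree and SelectIds are hypothetical names; the sketch is case-sensitive only, silently ignores unknown names, and assumes field names contain no dots, none of which is claimed about the library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <set>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical field model; not the iceberg::SchemaField API.
struct ToyField {
  int32_t id;
  std::string name;
  std::vector<ToyField> children;
};

// Walks a dot-concatenated path such as "location.lat" one segment at a time.
const ToyField* Resolve(const std::vector<ToyField>& fields,
                        std::string_view path) {
  const std::vector<ToyField>* level = &fields;
  while (true) {
    const std::size_t dot = path.find('.');
    const std::string_view part = path.substr(0, dot);
    const ToyField* match = nullptr;
    for (const ToyField& field : *level) {
      if (field.name == part) {
        match = &field;
        break;
      }
    }
    if (match == nullptr || dot == std::string_view::npos) return match;
    level = &match->children;
    path.remove_prefix(dot + 1);
  }
}

// Adds the field and every field nested below it.
void AddSubtree(const ToyField& field, std::set<int32_t>& ids) {
  ids.insert(field.id);
  for (const ToyField& child : field.children) AddSubtree(child, ids);
}

std::set<int32_t> SelectIds(const std::vector<ToyField>& fields,
                            const std::vector<std::string>& names) {
  std::set<int32_t> ids;
  for (const std::string& name : names) {
    if (const ToyField* field = Resolve(fields, name)) AddSubtree(*field, ids);
  }
  return ids;
}

// Usage: a nested name selects its sub-fields; a leaf name selects only itself.
void RunExample() {
  const std::vector<ToyField> schema = {
      {1, "id", {}},
      {2, "location", {{3, "lat", {}}, {4, "lon", {}}}},
  };
  assert((SelectIds(schema, {"location"}) == std::set<int32_t>{2, 3, 4}));
  assert((SelectIds(schema, {"location.lat", "id"}) == std::set<int32_t>{1, 3}));
}
```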

◆ ToString()

std::string iceberg::Schema::ToString ( ) const
override virtual

Get a user-readable string representation.

Implements iceberg::util::Formattable.

◆ Validate()

Status iceberg::Schema::Validate ( int32_t  format_version) const

Validate the schema for a given format version.

This validates that the schema does not contain types that were released in later format versions.

Parameters
format_version  The format version to validate against.
Returns
Error status if the schema is invalid.
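
The version check described above amounts to comparing each field's type against the format version that introduced it. The sketch below is illustrative only: ToyType, ToyField, MinFormatVersion and FirstVersionError are hypothetical, the version table is an assumption made for the example (nanosecond timestamps and variant treated as version 3 types), and only top-level fields are inspected where a real validator would recurse into nested types.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

// Hypothetical type tags; not the iceberg::TypeId enumeration.
enum class ToyType { kInt, kString, kTimestampNs, kVariant };

// Assumed minimum format version per type, for illustration only.
int32_t MinFormatVersion(ToyType type) {
  switch (type) {
    case ToyType::kTimestampNs:
    case ToyType::kVariant:
      return 3;
    default:
      return 1;
  }
}

struct ToyField {
  int32_t id;
  std::string name;
  ToyType type;
};

// Returns a message for the first field whose type is newer than
// `format_version`, or std::nullopt if every field is allowed.
std::optional<std::string> FirstVersionError(
    const std::vector<ToyField>& fields, int32_t format_version) {
  for (const ToyField& field : fields) {
    const int32_t required = MinFormatVersion(field.type);
    if (format_version < required) {
      return "field '" + field.name + "' requires format version " +
             std::to_string(required);
    }
  }
  return std::nullopt;
}

// Usage: the same schema passes for version 3 and fails for version 2.
void RunExample() {
  const std::vector<ToyField> schema = {
      {1, "id", ToyType::kInt},
      {2, "payload", ToyType::kVariant},
  };
  assert(!FirstVersionError(schema, 3).has_value());
  assert(FirstVersionError(schema, 2) ==
         "field 'payload' requires format version 3");
}
```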

◆ ValidateIdentifierFields()

Status iceberg::Schema::ValidateIdentifierFields ( int32_t  field_id,
const Schema &  schema,
const std::unordered_map< int32_t, int32_t > &  id_to_parent 
)
static

Validate that the identifier field with the given ID is valid for the schema.

This method checks that the specified field ID represents a valid identifier field according to Iceberg's identifier field requirements. It verifies that the field:

  • exists in the schema
  • is a primitive type
  • is not optional (required field)
  • is not a float or double type
  • is not nested within optional or non-struct parent fields
Parameters
field_id  The ID of the field to validate as an identifier field.
schema  The schema containing the field to validate.
id_to_parent  A mapping from field IDs to their parent field IDs for nested field validation.
Returns
Status indicating success or failure of the validation.
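
The five checks listed above can be sketched against a flattened field table plus an id_to_parent map. This is not the library's implementation: ToyField, Kind, FieldsById, ParentById and IdentifierFieldError are hypothetical, the sketch returns an optional error message instead of a Status, and it assumes id_to_parent contains no cycles.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <string>
#include <unordered_map>

enum class Kind { kInt, kString, kFloat, kDouble, kStruct, kList, kMap };

// Hypothetical flattened field record; not the iceberg::SchemaField API.
struct ToyField {
  int32_t id;
  Kind kind;
  bool is_optional;
};

using FieldsById = std::unordered_map<int32_t, ToyField>;
using ParentById = std::unordered_map<int32_t, int32_t>;

bool IsNested(Kind kind) {
  return kind == Kind::kStruct || kind == Kind::kList || kind == Kind::kMap;
}

// Returns a reason the field cannot be an identifier field, or std::nullopt.
std::optional<std::string> IdentifierFieldError(int32_t field_id,
                                                const FieldsById& fields,
                                                const ParentById& id_to_parent) {
  const auto it = fields.find(field_id);
  if (it == fields.end()) return "field does not exist";
  const ToyField& field = it->second;
  if (IsNested(field.kind)) return "field is not a primitive type";
  if (field.is_optional) return "field is optional";
  if (field.kind == Kind::kFloat || field.kind == Kind::kDouble) {
    return "field is a float or double";
  }
  // Walk up the ancestor chain: every parent must be a required struct.
  for (auto parent = id_to_parent.find(field_id); parent != id_to_parent.end();
       parent = id_to_parent.find(parent->second)) {
    const auto parent_field = fields.find(parent->second);
    if (parent_field == fields.end()) return "parent field does not exist";
    if (parent_field->second.kind != Kind::kStruct) {
      return "field is nested in a non-struct parent";
    }
    if (parent_field->second.is_optional) {
      return "field is nested in an optional parent";
    }
  }
  return std::nullopt;
}

// Usage: field 3 sits in a required struct, field 5 in an optional one.
void RunExample() {
  const FieldsById fields = {
      {1, {1, Kind::kInt, false}},    {2, {2, Kind::kStruct, false}},
      {3, {3, Kind::kString, false}}, {4, {4, Kind::kStruct, true}},
      {5, {5, Kind::kInt, false}},
  };
  const ParentById id_to_parent = {{3, 2}, {5, 4}};
  assert(!IdentifierFieldError(1, fields, id_to_parent).has_value());
  assert(!IdentifierFieldError(3, fields, id_to_parent).has_value());
  assert(IdentifierFieldError(5, fields, id_to_parent) ==
         "field is nested in an optional parent");
}
```

Checking the field itself before its ancestors lets the cheap, local failures report first; the order of checks here is a design choice of the sketch.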

The documentation for this class was generated from the following files: