iceberg-cpp
#include <schema.h>
Public Member Functions | |
| Schema (std::vector< SchemaField > fields, int32_t schema_id=kInitialSchemaId) | |
| int32_t | schema_id () const |
| Get the schema ID. | |
| std::string | ToString () const override |
| Get a user-readable string representation. | |
| Result< std::optional< std::reference_wrapper< const SchemaField > > > | FindFieldByName (std::string_view name, bool case_sensitive=true) const |
| Recursively find the SchemaField by field name. | |
| Result< std::optional< std::reference_wrapper< const SchemaField > > > | FindFieldById (int32_t field_id) const |
| Recursively find the SchemaField by field id. | |
| Result< std::optional< std::string_view > > | FindColumnNameById (int32_t field_id) const |
| Returns the canonical field name for the given id. | |
| Result< std::unique_ptr< StructLikeAccessor > > | GetAccessorById (int32_t field_id) const |
| Get the accessor to access the field by field id. | |
| Result< std::unique_ptr< Schema > > | Select (std::span< const std::string > names, bool case_sensitive=true) const |
| Creates a projected schema from selected field names. | |
| Result< std::unique_ptr< Schema > > | Project (const std::unordered_set< int32_t > &field_ids) const |
| Creates a projected schema from selected field IDs. | |
| const std::vector< int32_t > & | IdentifierFieldIds () const |
| Return the field IDs of the identifier fields. | |
| Result< std::vector< std::string > > | IdentifierFieldNames () const |
| Return the canonical field names of the identifier fields. | |
| Result< int32_t > | HighestFieldId () const |
| Get the highest field ID in the schema. | |
| bool | SameSchema (const Schema &other) const |
| Checks whether this schema is equivalent to another schema while ignoring the schema id. | |
| Status | Validate (int32_t format_version) const |
| Validate the schema for a given format version. | |
Public Member Functions inherited from iceberg::StructType | |
| StructType (std::vector< SchemaField > fields) | |
| TypeId | type_id () const override |
| Get the type ID. | |
| std::string | ToString () const override |
| Get a user-readable string representation. | |
| std::span< const SchemaField > | fields () const override |
| Get a view of the child fields. | |
| Result< std::optional< SchemaFieldConstRef > > | GetFieldById (int32_t field_id) const override |
| Get a field by field ID. | |
| Result< std::optional< SchemaFieldConstRef > > | GetFieldByIndex (int32_t index) const override |
| Get a field by index. | |
| Result< std::optional< SchemaFieldConstRef > > | GetFieldByName (std::string_view name, bool case_sensitive) const override |
| Get a field by name. Return an error Status if the field name is not unique; prefer GetFieldById or GetFieldByIndex when possible. | |
| std::unique_ptr< Schema > | ToSchema () const |
| virtual Result< std::optional< SchemaFieldConstRef > > | GetFieldByName (std::string_view name, bool case_sensitive) const=0 |
| Get a field by name. Return an error Status if the field name is not unique; prefer GetFieldById or GetFieldByIndex when possible. | |
| Result< std::optional< SchemaFieldConstRef > > | GetFieldByName (std::string_view name) const |
| Get a field by name (case-sensitive). | |
Public Member Functions inherited from iceberg::NestedType | |
| bool | is_primitive () const override |
| Is this a primitive type (may not have child fields)? | |
| bool | is_nested () const override |
| Is this a nested type (may have child fields)? | |
| Result< std::optional< SchemaFieldConstRef > > | GetFieldByName (std::string_view name) const |
| Get a field by name (case-sensitive). | |
Static Public Member Functions | |
| static Result< std::unique_ptr< Schema > > | Make (std::vector< SchemaField > fields, int32_t schema_id, std::vector< int32_t > identifier_field_ids) |
| Create a schema. | |
| static Result< std::unique_ptr< Schema > > | Make (std::vector< SchemaField > fields, int32_t schema_id, const std::vector< std::string > &identifier_field_names) |
| Create a schema. | |
| static Status | ValidateIdentifierFields (int32_t field_id, const Schema &schema, const std::unordered_map< int32_t, int32_t > &id_to_parent) |
| Validate that the identifier field with the given ID is valid for the schema. | |
| static const std::shared_ptr< Schema > & | EmptySchema () |
| Get an empty schema. | |
Static Public Attributes | |
| static constexpr int32_t | kInitialSchemaId = 0 |
| static constexpr int32_t | kInitialColumnId = 0 |
| static constexpr int32_t | kInvalidColumnId = -1 |
| static constexpr std::string_view | kAllColumns = "*" |
| Special value to select all columns from manifest files. | |
Static Public Attributes inherited from iceberg::StructType | |
| static constexpr TypeId | kTypeId = TypeId::kStruct |
Friends | |
| bool | operator== (const Schema &lhs, const Schema &rhs) |
Additional Inherited Members | |
Public Types inherited from iceberg::NestedType | |
| using | SchemaFieldConstRef = std::reference_wrapper< const SchemaField > |
Protected Member Functions inherited from iceberg::StructType | |
| bool | Equals (const Type &other) const override |
| Compare two types for equality. | |
Static Protected Member Functions inherited from iceberg::StructType | |
| static Result< std::unordered_map< int32_t, SchemaFieldConstRef > > | InitFieldById (const StructType &) |
| static Result< std::unordered_map< std::string_view, SchemaFieldConstRef > > | InitFieldByName (const StructType &) |
| static Result< std::unordered_map< std::string, SchemaFieldConstRef > > | InitFieldByLowerCaseName (const StructType &) |
Protected Attributes inherited from iceberg::StructType | |
| std::vector< SchemaField > | fields_ |
| Lazy< InitFieldById > | field_by_id_ |
| Lazy< InitFieldByName > | field_by_name_ |
| Lazy< InitFieldByLowerCaseName > | field_by_lowercase_name_ |
A schema for a Table.
A schema is a list of typed columns, along with a unique integer ID. A Table may have different schemas over its lifetime due to schema evolution.
| const std::shared_ptr< Schema > & iceberg::Schema::EmptySchema | ( | ) | static |
Get an empty schema.
An empty schema has no fields and a schema ID of 0.
| Result< std::optional< std::string_view > > iceberg::Schema::FindColumnNameById | ( | int32_t | field_id | ) | const |
Returns the canonical field name for the given id.
| field_id | The id of the field to get the canonical name for. |
| Result< std::optional< std::reference_wrapper< const SchemaField > > > iceberg::Schema::FindFieldById | ( | int32_t | field_id | ) | const |
Recursively find the SchemaField by field id.
| field_id | The id of the field to find. |
| Result< std::optional< std::reference_wrapper< const SchemaField > > > iceberg::Schema::FindFieldByName | ( | std::string_view | name, | bool | case_sensitive = true | ) | const |
Recursively find the SchemaField by field name.
Short names for maps and lists are included for any name that does not conflict with a canonical name. For example, a list 'l' of structs with field 'x' produces the short name 'l.x' in addition to the canonical name 'l.element.x'. A map 'm' whose value is a struct with field 'x' produces the short name 'm.x' in addition to the canonical name 'm.value.x'.
FIXME: Currently only handles ASCII lowercase conversion; extend to support non-ASCII characters (e.g., using std::towlower or ICU).
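The short-name rule described above can be sketched as follows. This is a minimal illustration using stand-in types, not the iceberg-cpp implementation: `Node`, `Kind`, `NameIndex`, `BuildNameIndex`, and `Find` are all hypothetical names, and the sketch assumes list children are named `element` and map children `key` and `value`.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a nested field; not the iceberg-cpp SchemaField.
enum class Kind { kPrimitive, kStruct, kList, kMap };

struct Node {
  std::string name;
  int id;
  Kind kind;
  std::vector<Node> children;  // struct: fields; list: {element}; map: {key, value}
};

struct NameIndex {
  std::map<std::string, int> canonical;    // e.g. "l.element.x"
  std::map<std::string, int> short_names;  // e.g. "l.x"
};

std::string Join(const std::string& prefix, const std::string& name) {
  return prefix.empty() ? name : prefix + "." + name;
}

void Index(const Node& node, const std::string& canon_prefix,
           const std::string& short_prefix, NameIndex& out) {
  assert(!node.name.empty());
  const std::string canon = Join(canon_prefix, node.name);
  const std::string shrt = Join(short_prefix, node.name);
  out.canonical[canon] = node.id;
  if (shrt != canon) out.short_names[shrt] = node.id;

  for (const Node& child : node.children) {
    // A struct-typed list "element" or map "value" is skipped in the short
    // prefix, so its fields hang directly off the container name.
    const bool skippable =
        child.kind == Kind::kStruct &&
        ((node.kind == Kind::kList && child.name == "element") ||
         (node.kind == Kind::kMap && child.name == "value"));
    if (skippable) {
      const std::string child_canon = Join(canon, child.name);
      out.canonical[child_canon] = child.id;
      for (const Node& grandchild : child.children) {
        Index(grandchild, child_canon, shrt, out);
      }
    } else {
      Index(child, canon, shrt, out);
    }
  }
}

NameIndex BuildNameIndex(const std::vector<Node>& top_level) {
  NameIndex out;
  for (const Node& n : top_level) Index(n, "", "", out);
  // Drop every short name that conflicts with a canonical name.
  for (auto it = out.short_names.begin(); it != out.short_names.end();) {
    if (out.canonical.count(it->first) != 0) {
      it = out.short_names.erase(it);
    } else {
      ++it;
    }
  }
  return out;
}

// Returns the field id for a canonical or short name, or -1 if unknown.
int Find(const NameIndex& index, const std::string& name) {
  auto c = index.canonical.find(name);
  if (c != index.canonical.end()) return c->second;
  auto s = index.short_names.find(name);
  if (s != index.short_names.end()) return s->second;
  return -1;
}
```

With a list `l` whose element is a struct containing `x`, both `Find(index, "l.element.x")` and `Find(index, "l.x")` resolve to the same id in this sketch.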
| Result< std::unique_ptr< StructLikeAccessor > > iceberg::Schema::GetAccessorById | ( | int32_t | field_id | ) | const |
Get the accessor to access the field by field id.
| field_id | The id of the field to get the accessor for. |
| Result< int32_t > iceberg::Schema::HighestFieldId | ( | ) | const |
Get the highest field ID in the schema.
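| Result< std::unique_ptr< Schema > > iceberg::Schema::Make | ( | std::vector< SchemaField > | fields, | int32_t | schema_id, | const std::vector< std::string > & | identifier_field_names | ) | static |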
Create a schema.
| fields | The fields that make up the schema. |
| schema_id | The unique identifier for this schema (default: kInitialSchemaId). |
| identifier_field_names | Canonical names of fields that uniquely identify rows in the table. |
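| Result< std::unique_ptr< Schema > > iceberg::Schema::Make | ( | std::vector< SchemaField > | fields, | int32_t | schema_id, | std::vector< int32_t > | identifier_field_ids | ) | static |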
Create a schema.
| fields | The fields that make up the schema. |
| schema_id | The unique identifier for this schema (default: kInitialSchemaId). |
| identifier_field_ids | Field IDs that uniquely identify rows in the table. |
| Result< std::unique_ptr< Schema > > iceberg::Schema::Project | ( | const std::unordered_set< int32_t > & | field_ids | ) | const |
Creates a projected schema from selected field IDs.
| field_ids | Set of field IDs to select. |
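One plausible reading of projection by ID is sketched below with a stand-in `Field` type rather than the iceberg-cpp classes. In this sketch a field whose ID is selected is kept with its whole subtree, and a struct that is not itself selected survives only if one of its descendants is. The pruning rule, `Field`, and `ProjectFields` are assumptions for illustration; the documentation above only states that `Project` builds a schema from the selected IDs.

```cpp
#include <cassert>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for a nested field; not the iceberg-cpp SchemaField.
struct Field {
  int id;
  std::string name;
  std::vector<Field> children;
};

// Keeps selected fields, plus any ancestor needed to reach a selected field.
std::vector<Field> ProjectFields(const std::vector<Field>& fields,
                                 const std::unordered_set<int>& ids) {
  std::vector<Field> out;
  for (const Field& f : fields) {
    if (ids.count(f.id) != 0) {
      out.push_back(f);  // selected: keep the whole subtree (assumed rule)
      continue;
    }
    std::vector<Field> kept = ProjectFields(f.children, ids);
    if (!kept.empty()) {
      out.push_back(Field{f.id, f.name, kept});  // keep a pruned parent
    }
  }
  return out;
}
```

Selecting IDs `{1, 3}` from `id(1), location(2){lat(3), lon(4)}, name(5)` yields `id` and a `location` struct containing only `lat` under these assumptions.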
| int32_t iceberg::Schema::schema_id | ( | ) | const |
Get the schema ID.
A schema is identified by a unique ID for the purposes of schema evolution.
| Result< std::unique_ptr< Schema > > iceberg::Schema::Select | ( | std::span< const std::string > | names, | bool | case_sensitive = true | ) | const |
Creates a projected schema from selected field names.
| names | Names of the fields to select; nested field names are dot-concatenated. |
| case_sensitive | Whether name matching is case-sensitive (default: true). |
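How a dot-concatenated name might be resolved against nested fields, with optional case-insensitive matching, can be sketched as follows. `Field`, `ToLowerAscii`, and `ResolveName` are hypothetical helpers, not part of iceberg-cpp. The lowercase conversion is deliberately ASCII-only, and the sketch assumes field names do not themselves contain dots.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a nested field; not the iceberg-cpp SchemaField.
struct Field {
  int id;
  std::string name;
  std::vector<Field> children;
};

// ASCII-only lowercase; non-ASCII bytes are left unchanged.
std::string ToLowerAscii(std::string s) {
  for (char& c : s) {
    if (c >= 'A' && c <= 'Z') c = static_cast<char>(c - 'A' + 'a');
  }
  return s;
}

// Resolves a dot-concatenated name such as "location.lat" to a field id,
// or returns -1 when no field matches.
int ResolveName(const std::vector<Field>& fields, const std::string& dotted,
                bool case_sensitive) {
  assert(!dotted.empty());
  const std::vector<Field>* level = &fields;
  int found = -1;
  std::size_t start = 0;
  while (start <= dotted.size()) {
    std::size_t dot = dotted.find('.', start);
    if (dot == std::string::npos) dot = dotted.size();
    std::string part = dotted.substr(start, dot - start);
    if (!case_sensitive) part = ToLowerAscii(part);

    const Field* match = nullptr;
    for (const Field& f : *level) {
      const std::string name = case_sensitive ? f.name : ToLowerAscii(f.name);
      if (name == part) {
        match = &f;
        break;
      }
    }
    if (match == nullptr) return -1;

    found = match->id;
    level = &match->children;  // descend for the next path segment
    start = dot + 1;
  }
  return found;
}
```

In this sketch, `"location.LAT"` resolves only when `case_sensitive` is false and the stored names differ from it solely in ASCII letter case.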
| std::string iceberg::Schema::ToString | ( | ) | const | override virtual |
Get a user-readable string representation.
Implements iceberg::util::Formattable.
| Status iceberg::Schema::Validate | ( | int32_t | format_version | ) | const |
Validate the schema for a given format version.
This validates that the schema does not contain types that were released in later format versions.
| format_version | The format version to validate against. |
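The version check described above can be sketched as a scan for any type whose introducing format version is newer than the requested one. `TypeInfo` and `ValidateForVersion` are hypothetical stand-ins; the real method returns a `Status`, whereas this sketch returns an error message string. The minimum versions a caller supplies are illustrative inputs, not values taken from iceberg-cpp.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in: a type name plus the format version that introduced it.
struct TypeInfo {
  std::string name;
  int min_format_version;
};

// Returns an empty string when every type is allowed in `format_version`,
// otherwise a message naming the first type that is too new.
std::string ValidateForVersion(const std::vector<TypeInfo>& types,
                               int format_version) {
  assert(format_version >= 1);
  for (const TypeInfo& t : types) {
    if (t.min_format_version > format_version) {
      return "type '" + t.name + "' requires format version " +
             std::to_string(t.min_format_version) + ", but the schema is " +
             "validated against version " + std::to_string(format_version);
    }
  }
  return "";
}
```

A schema whose newest type was introduced in version 3 passes this check for `format_version` 3 and fails it for 2.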
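| Status iceberg::Schema::ValidateIdentifierFields | ( | int32_t | field_id, | const Schema & | schema, | const std::unordered_map< int32_t, int32_t > & | id_to_parent | ) | static |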
Validate that the identifier field with the given ID is valid for the schema.
This method checks that the specified field ID represents a valid identifier field according to Iceberg's identifier field requirements. It verifies that the field:
| field_id | The ID of the field to validate as an identifier field. |
| schema | The schema containing the field to validate. |
| id_to_parent | A mapping from field IDs to their parent field IDs for nested field validation. |