feat: expose Arrow IPC reader via registerArrow and readArrow #37

@andygrove

Is your feature request related to a problem or challenge?

DataFusion 53.1 supports Arrow IPC files via
SessionContext::read_arrow / register_arrow. Since results already
return to the JVM as Arrow batches over the C Data Interface,
round-tripping through Arrow IPC files is a natural fit, and Arrow IPC
is the cheapest format to integrate.

Describe the solution you'd like

  • Add an ArrowReadOptions value class (file extension, schema).
  • Add proto/arrow_read_options.proto and pass options through the
    proto-over-JNI convention (see #29, "refactor: pass csv and parquet
    read options via protobuf").
  • Expose SessionContext.registerArrow(name, path[, options]) and
    readArrow(path[, options]).
  • Cover with tests; fixtures can be written with arrow-vector's own
    IPC writer to keep test dependencies minimal.
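
To make the first bullet concrete, here is a minimal sketch of what an immutable ArrowReadOptions value class might look like. Everything beyond the class name and its two fields (file extension, schema) is an assumption for illustration: the method names, the ".arrow" default (chosen to mirror DataFusion's default extension), and the use of raw bytes as a dependency-free stand-in for an Arrow schema are all hypothetical and not the project's actual API.

```java
import java.util.Arrays;
import java.util.Objects;
import java.util.Optional;

/** Hypothetical sketch of the proposed options value class; not the project's API. */
final class ArrowReadOptions {
    /** Assumed default, intended to mirror DataFusion's ".arrow" default extension. */
    static final String DEFAULT_FILE_EXTENSION = ".arrow";

    private final String fileExtension;
    /** Stand-in for an Arrow schema; null means "infer the schema from the file". */
    private final byte[] schemaBytes;

    private ArrowReadOptions(String fileExtension, byte[] schemaBytes) {
        this.fileExtension = Objects.requireNonNull(fileExtension, "fileExtension");
        // Defensive copy so callers cannot mutate the stored bytes.
        this.schemaBytes = schemaBytes == null ? null : schemaBytes.clone();
    }

    /** Options with the assumed default extension and no explicit schema. */
    static ArrowReadOptions defaults() {
        return new ArrowReadOptions(DEFAULT_FILE_EXTENSION, null);
    }

    /** Returns a copy with a different file extension; this instance is unchanged. */
    ArrowReadOptions withFileExtension(String ext) {
        return new ArrowReadOptions(ext, schemaBytes);
    }

    /** Returns a copy carrying an explicit (serialized) schema. */
    ArrowReadOptions withSchemaBytes(byte[] bytes) {
        return new ArrowReadOptions(fileExtension, Objects.requireNonNull(bytes, "bytes"));
    }

    String fileExtension() {
        return fileExtension;
    }

    Optional<byte[]> schemaBytes() {
        return Optional.ofNullable(schemaBytes).map(byte[]::clone);
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof ArrowReadOptions)) return false;
        ArrowReadOptions other = (ArrowReadOptions) o;
        return fileExtension.equals(other.fileExtension)
            && Arrays.equals(schemaBytes, other.schemaBytes);
    }

    @Override
    public int hashCode() {
        return 31 * fileExtension.hashCode() + Arrays.hashCode(schemaBytes);
    }
}
```

Under these assumptions, a caller would build options with something like ArrowReadOptions.defaults().withFileExtension(".ipc") and hand them to the proposed registerArrow(name, path, options) or readArrow(path, options) overloads, with the options serialized via the proto message before crossing JNI. A real implementation would more likely hold an arrow-vector Schema than raw bytes.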

Describe alternatives you've considered

Registering the file through SQL with CREATE EXTERNAL TABLE … STORED AS ARROW.

Additional context

Arrow IPC support is in the default DataFusion feature set, so no Cargo
flag changes are required.

Labels: enhancement