- ak.from_parquet(path, *, columns=None, row_groups=None, storage_options=None, max_gap=64000, max_block=256000000, footer_sample_size=1000000, generate_bitmasks=False, highlevel=True, behavior=None, attrs=None)
path (str) – Local filename or remote URL, passed to fsspec for resolution. May contain glob patterns.
columns (None, str, or list of str) – Glob pattern(s) with bash-like curly brackets for matching column names. Nested records are separated by dots. If a list of patterns, the logical-or is matched. If None, all columns are read.
row_groups (None or set of int) – Row groups to read; must be non-negative. Order is ignored: the output array is presented in the order specified by Parquet metadata. If None, all row groups/all rows are read.
storage_options – Passed to fsspec.
max_gap (int) – Passed to fsspec.
max_block (int) – Passed to fsspec.
footer_sample_size (int) – Passed to fsspec.
generate_bitmasks (bool) – If enabled and Arrow/Parquet does not have Awkward metadata, generate_bitmasks=True creates empty bitmasks for nullable types that don’t have bitmasks in the Arrow/Parquet data, so that the Form (BitMaskedForm vs UnmaskedForm) is predictable.
attrs (None or dict) – Custom attributes for the output array, if high-level.
Reads data from a local or remote Parquet file or collection of files.
The data are eagerly (not lazily) read and must fit into memory. Use
row_groups to select and filter manageable subsets of the data, and
ak.metadata_from_parquet to find column names and the range of row groups
that a dataset has.