- ak.metadata_from_parquet(path, *, storage_options=None, row_groups=None, ignore_metadata=False, scan_files=True)
path (str) – Local filename or remote URL, passed to fsspec for resolution. May contain glob patterns. A list of paths is also allowed, but they must be data files, not directories.
storage_options – Passed to
row_groups (None or set of int) – Row groups to read; must be non-negative. Order is ignored: the output array is presented in the order specified by Parquet metadata. If None, all row groups/all rows are read.
ignore_metadata (bool) – If True, ignore the dedicated _metadata file, if found, and instead derive metadata from the first data file.
scan_files (bool) – TODO
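The parameters above can be exercised with a small wrapper. This is a minimal sketch, not part of the library: `normalize_row_groups` and `read_parquet_metadata` are hypothetical helper names, and the validation only mirrors the constraint stated for `row_groups` (a set of non-negative integers whose order is ignored). The call itself uses the signature shown at the top of this page.

```python
def normalize_row_groups(row_groups):
    """Return None, or a set of non-negative ints (hypothetical helper)."""
    if row_groups is None:
        return None  # None means "all row groups"
    result = set(row_groups)  # order is ignored, so a set loses nothing
    for index in result:
        if not isinstance(index, int) or index < 0:
            raise ValueError(
                f"row group indices must be non-negative ints, got {index!r}"
            )
    return result


def read_parquet_metadata(path, row_groups=None):
    """Call ak.metadata_from_parquet with explicit keyword-only arguments."""
    # Imported lazily so this sketch can be loaded without awkward installed.
    import awkward as ak

    return ak.metadata_from_parquet(
        path,
        storage_options=None,
        row_groups=normalize_row_groups(row_groups),
        ignore_metadata=False,
        scan_files=True,
    )
```

Because everything after `path` is keyword-only, passing the options by name as above is the only accepted form.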
This function differs from ak.from_parquet._metadata as follows:
This function always uses a _metadata file, if present.
If there is no _metadata, the schema comes from _common_metadata or the first data file.
The total number of rows is always known.
Returns a dict containing:
form: an Awkward Form representing the low-level type of the data (use .type to get a high-level type),
fs: the fsspec filesystem object,
paths: a list of matching path names,
col_counts: the number of rows in each row group,
columns: the columns defined by the schema,
num_rows: the length of the array that would be read by
num_row_groups: the units that can be filtered (for the
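As an illustration of how the returned `col_counts` might be used, the sketch below picks the row groups that overlap a range of row indices, producing a set suitable for a `row_groups` argument. It assumes row groups are laid out consecutively in the order given by `col_counts`. The helper name `row_groups_for_range` is hypothetical, and the counts used here are made-up stand-ins for `meta["col_counts"]`, not real output.

```python
from itertools import accumulate


def row_groups_for_range(col_counts, start, stop):
    """Return the set of row-group indices overlapping rows [start, stop)."""
    selected = set()
    # Running totals give the exclusive end offset of each row group.
    ends = list(accumulate(col_counts))
    for index, end in enumerate(ends):
        begin = end - col_counts[index]
        if begin < stop and end > start:
            selected.add(index)
    return selected


# Illustrative stand-in for meta["col_counts"]: three row groups.
example_col_counts = [100, 100, 50]
wanted = row_groups_for_range(example_col_counts, 150, 220)
print(wanted)  # {1, 2}
```

With a real dataset, one would presumably obtain the counts from `ak.metadata_from_parquet(path)["col_counts"]` and pass the resulting set back as `row_groups`. The sum of `col_counts` should then match `num_rows` when no row groups are filtered out.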