Defined in awkward.operations.ak_to_parquet_dataset on line 11.

ak.to_parquet_dataset(directory, filenames=None, storage_options=None)
  • directory (str or Path) – A directory in which to write _common_metadata and _metadata, making the directory of Parquet files into a dataset.

  • filenames (None or list of str or Path) – If None, the directory will be recursively searched for files ending in filename_extension and sorted lexicographically. Otherwise, this explicit list of files is taken and row-groups are concatenated in its given order. If any filenames are relative, they are interpreted relative to directory.

  • filename_extension (str) – Filename extension (including .) to use to search for files recursively. Used only when filenames is None; ignored if an explicit list of filenames is given.
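
The file-selection rule described by the parameters above can be sketched in plain Python. This is an illustration of the documented behavior only, not the library's implementation: resolve_filenames is a hypothetical helper, the ".parquet" default is an assumption, and sorting by the full path string is one reasonable reading of "sorted lexicographically".

```python
from pathlib import Path


def resolve_filenames(directory, filenames=None, filename_extension=".parquet"):
    """Hypothetical helper mirroring the documented file-selection rule."""
    directory = Path(directory)
    if filenames is None:
        # No explicit list: search the directory recursively for files whose
        # names end in the extension, then sort lexicographically
        # (assumption: by full path string).
        found = (
            p
            for p in directory.rglob("*")
            if p.is_file() and p.name.endswith(filename_extension)
        )
        return sorted(str(p) for p in found)
    # Explicit list: keep the caller's order; relative names are
    # interpreted relative to the directory, absolute names are kept as-is.
    resolved = []
    for name in filenames:
        path = Path(name)
        resolved.append(str(path if path.is_absolute() else directory / path))
    return resolved
```

With this sketch, resolve_filenames("/directory") would list every matching file under /directory in sorted order, while resolve_filenames("/directory", ["arr2.parquet", "arr1.parquet"]) would keep arr2.parquet ahead of arr1.parquet, which is the order in which row-groups would then be concatenated.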

Creates a _common_metadata file and a _metadata file in a directory of Parquet files, making that directory readable as a single dataset.

>>> ak.to_parquet(array1, "/directory/arr1.parquet", parquet_compliant_nested=True)
>>> ak.to_parquet(array2, "/directory/arr2.parquet", parquet_compliant_nested=True)
>>> ak.to_parquet_dataset("/directory")

The _common_metadata contains the schema that all files share. (If the files have different schemas, this function raises an exception.)

The _metadata contains row-group metadata used to seek to specific row-groups within the multi-file dataset.