ak.to_dataframe
Defined in awkward.operations.ak_to_dataframe on line 24.
- ak.to_dataframe(array, *, how='inner', levelname=_default_levelname, anonymous='values')
- Parameters:
array – Array-like data (anything ak.to_layout recognizes).
how (None or str) – Passed to pd.merge to combine DataFrames for each multiplicity into one DataFrame. If None, a list of Pandas DataFrames is returned.
levelname (int -> str) – Computes a name for each level of the row index from the number of levels deep.
anonymous (str) – Column name to use if the array does not contain records; otherwise, column names are derived from record fields.
Converts Awkward data structures into Pandas MultiIndex rows and columns. The resulting DataFrame(s) contain no Awkward structures.
ak.Array structures can’t be losslessly converted into a single DataFrame; different fields in a record structure might have different nested list lengths, but a DataFrame can have only one index.
If how is None, this function always returns a list of DataFrames (even if it contains only one DataFrame); otherwise how is passed to pd.merge to merge them into a single DataFrame with the associated loss of data.
In the following example, nested lists are converted into MultiIndex rows. The index level names "entry", "subentry" and "subsubentry" can be controlled with the levelname parameter. The column name "values" is assigned because this array has no fields; it can be controlled with the anonymous parameter.
>>> ak.to_dataframe(ak.Array([[[1.1, 2.2], [], [3.3]],
...                           [],
...                           [[4.4], [5.5, 6.6]],
...                           [[7.7]],
...                           [[8.8]]]))
                            values
entry subentry subsubentry
0     0        0               1.1
               1               2.2
      2        0               3.3
2     0        0               4.4
      1        0               5.5
               1               6.6
3     0        0               7.7
4     0        0               8.8
In this example, nested records are converted into MultiIndex columns. (MultiIndex rows and columns can be mixed; these examples are deliberately simple.)
>>> ak.to_dataframe(ak.Array([
...     {"I": {"a": _, "b": {"i": _}}, "II": {"x": {"y": {"z": _}}}}
...     for _ in range(0, 50, 10)]))
        I      II
        a   b   x
            i   y
                z
entry
0       0   0   0
1      10  10  10
2      20  20  20
3      30  30  30
4      40  40  40
The following two examples show how fields whose lists have different lengths are merged. With how="inner" (default), only subentries that exist for all fields are preserved; with how="outer", all subentries are preserved at the expense of requiring missing values.
>>> ak.to_dataframe(ak.Array([{"x": [],           "y": [4.4, 3.3, 2.2, 1.1]},
...                           {"x": [1],          "y": [3.3, 2.2, 1.1]},
...                           {"x": [1, 2],       "y": [2.2, 1.1]},
...                           {"x": [1, 2, 3],    "y": [1.1]},
...                           {"x": [1, 2, 3, 4], "y": []}]),
...                 how="inner")
                x    y
entry subentry
1     0         1  3.3
2     0         1  2.2
      1         2  1.1
3     0         1  1.1
The same with how="outer":
>>> ak.to_dataframe(ak.Array([{"x": [],           "y": [4.4, 3.3, 2.2, 1.1]},
...                           {"x": [1],          "y": [3.3, 2.2, 1.1]},
...                           {"x": [1, 2],       "y": [2.2, 1.1]},
...                           {"x": [1, 2, 3],    "y": [1.1]},
...                           {"x": [1, 2, 3, 4], "y": []}]),
...                 how="outer")
                  x    y
entry subentry
0     0         NaN  4.4
      1         NaN  3.3
      2         NaN  2.2
      3         NaN  1.1
1     0         1.0  3.3
      1         NaN  2.2
      2         NaN  1.1
2     0         1.0  2.2
      1         2.0  1.1
3     0         1.0  1.1
      1         2.0  NaN
      2         3.0  NaN
4     0         1.0  NaN
      1         2.0  NaN
      2         3.0  NaN
      3         4.0  NaN