ak.cartesian
Defined in awkward.operations.ak_cartesian on line 31.
- ak.cartesian(arrays, axis=1, *, nested=None, parameters=None, with_name=None, highlevel=True, behavior=None, attrs=None)
- Parameters:
  - arrays (mapping or sequence of arrays) – Each value in this mapping or sequence can be any array-like data that ak.to_layout recognizes.
  - axis (int or str) – The dimension at which this operation is applied. The outermost dimension is 0, followed by 1, etc., and negative values count backward from the innermost: -1 is the innermost dimension, -2 is the next level up, etc. If a str, it is interpreted as the name of the axis, which maps to an int if named axes are present. Named axes are attached to an array using ak.with_named_axis and removed with ak.without_named_axis; also see the Named axes user guide.
  - nested (None, True, False, or iterable of str or int) – If None or False, all combinations of elements from the arrays are produced at the same level of nesting; if True, they are grouped in nested lists by combinations that share a common item from each of the arrays; if an iterable of str or int, group common items for a chosen set of keys from the arrays dict or integer slots of the arrays iterable.
  - parameters (None or dict) – Parameters for the new ak.contents.RecordArray node that is created by this operation.
  - with_name (None or str) – Assigns a "__record__" name to the new ak.contents.RecordArray node that is created by this operation (overriding parameters, if necessary).
  - highlevel (bool) – If True, return an ak.Array; otherwise, return a low-level ak.contents.Content subclass.
  - behavior (None or dict) – Custom ak.behavior for the output array, if high-level.
  - attrs (None or dict) – Custom attributes for the output array, if high-level.
Computes a Cartesian product (i.e. cross product) of data from a set of arrays. This operation creates records (if arrays is a dict) or tuples (if arrays is another kind of iterable) that hold the combinations of elements, and it can introduce new levels of nesting.

As a simple example with axis=0, the Cartesian product of

>>> one = ak.Array([1, 2, 3])
>>> two = ak.Array(["a", "b"])

is

>>> ak.cartesian([one, two], axis=0).show()
[(1, 'a'),
 (1, 'b'),
 (2, 'a'),
 (2, 'b'),
 (3, 'a'),
 (3, 'b')]
With nesting, a new level of nested lists is created to group combinations that share the same element from one into the same list.

>>> ak.cartesian([one, two], axis=0, nested=True).show()
[[(1, 'a'), (1, 'b')],
 [(2, 'a'), (2, 'b')],
 [(3, 'a'), (3, 'b')]]
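The two axis=0 results above can be modeled in plain Python with itertools.product. This is only an illustrative sketch of what the operation computes on ordinary lists, not the library's implementation; the names flat and grouped are chosen here for illustration.

```python
from itertools import product

one = [1, 2, 3]
two = ["a", "b"]

# Flat product (models nested=None/False at axis=0): a single list of pairs.
flat = list(product(one, two))

# Grouped product (models nested=True at axis=0): one inner list per element
# of `one`, holding every pair that starts with that element.
grouped = [[(x, y) for y in two] for x in one]

print(flat)
print(grouped)
```

Flattening grouped by one level gives back flat, which is why the two forms contain the same pairs in the same order.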
The primary purpose of this function, however, is to compute a different Cartesian product for each element of an array: in other words, axis=1. The following arrays each have four elements.

>>> one = ak.Array([[1, 2, 3], [], [4, 5], [6]])
>>> two = ak.Array([["a", "b"], ["c"], ["d"], ["e", "f"]])
The default axis=1 produces 6 pairs from the Cartesian product of [1, 2, 3] and ["a", "b"], 0 pairs from [] and ["c"], 2 pairs from [4, 5] and ["d"], and 2 pairs from [6] and ["e", "f"].

>>> ak.cartesian([one, two]).show()
[[(1, 'a'), (1, 'b'), (2, 'a'), (2, 'b'), (3, 'a'), (3, 'b')],
 [],
 [(4, 'd'), (5, 'd')],
 [(6, 'e'), (6, 'f')]]
The nesting depth is the same as the original arrays; with nested=True, the nesting depth is increased by 1 and tuples are grouped by their first element.

>>> ak.cartesian([one, two], nested=True).show()
[[[(1, 'a'), (1, 'b')], [(2, 'a'), (2, ...)], [(3, 'a'), (3, 'b')]],
 [],
 [[(4, 'd')], [(5, 'd')]],
 [[(6, 'e'), (6, 'f')]]]
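The axis=1 behavior amounts to pairing up the i-th list of each input and taking a separate product per pair. The following plain-Python sketch models that on ordinary nested lists; it is an assumption-level illustration rather than the library's code, and the variable names are hypothetical.

```python
from itertools import product

one = [[1, 2, 3], [], [4, 5], [6]]
two = [["a", "b"], ["c"], ["d"], ["e", "f"]]

# axis=1 model: zip the inputs entry by entry, one product per entry.
per_entry = [list(product(a, b)) for a, b in zip(one, two)]

# nested=True model: within each entry, group pairs by their first element,
# which adds one more level of list nesting.
per_entry_nested = [[[(x, y) for y in b] for x in a] for a, b in zip(one, two)]

# Number of pairs per entry is len(a) * len(b).
pair_counts = [len(pairs) for pairs in per_entry]
print(pair_counts)
```

An empty list on either side yields an empty product for that entry, so the outer length stays at four in both forms.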
These tuples are ak.contents.RecordArray nodes with unnamed fields. To name the fields, we can pass one and two in a dict, rather than a list.

>>> ak.cartesian({"x": one, "y": two}).show()
[[{x: 1, y: 'a'}, {x: 1, y: 'b'}, {...}, ..., {x: 3, y: 'a'}, {x: 3, y: 'b'}],
 [],
 [{x: 4, y: 'd'}, {x: 5, y: 'd'}],
 [{x: 6, y: 'e'}, {x: 6, y: 'f'}]]
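The difference between tuples and named records can be pictured with plain dicts. In this sketch, each combination becomes a dict keyed by the names in the input mapping; Python dicts stand in for records here as an assumption for illustration and are not how the library stores them.

```python
from itertools import product

one = [[1, 2, 3], [], [4, 5], [6]]
two = [["a", "b"], ["c"], ["d"], ["e", "f"]]
arrays = {"x": one, "y": two}

keys = list(arrays)

# For each entry, take the product of the i-th lists and label each
# combination with the field names from the mapping.
records = [
    [dict(zip(keys, combo)) for combo in product(*lists)]
    for lists in zip(*arrays.values())
]

print(records[2])
```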
With more than two elements in the Cartesian product, nested can specify which are grouped and which are not. For example,

>>> one = ak.Array([1, 2, 3, 4])
>>> two = ak.Array([1.1, 2.2, 3.3])
>>> three = ak.Array(["a", "b"])

can be left entirely ungrouped:

>>> ak.cartesian([one, two, three], axis=0).show()
[(1, 1.1, 'a'),
 (1, 1.1, 'b'),
 (1, 2.2, 'a'),
 (1, 2.2, 'b'),
 (1, 3.3, 'a'),
 (1, 3.3, 'b'),
 (2, 1.1, 'a'),
 (2, 1.1, 'b'),
 (2, 2.2, 'a'),
 (2, 2.2, 'b'),
 ...,
 (3, 2.2, 'b'),
 (3, 3.3, 'a'),
 (3, 3.3, 'b'),
 (4, 1.1, 'a'),
 (4, 1.1, 'b'),
 (4, 2.2, 'a'),
 (4, 2.2, 'b'),
 (4, 3.3, 'a'),
 (4, 3.3, 'b')]
can be grouped by one (adding 1 more dimension):

>>> ak.cartesian([one, two, three], axis=0, nested=[0]).show()
[[(1, 1.1, 'a'), (1, 1.1, 'b'), (1, 2.2, 'a')],
 [(1, 2.2, 'b'), (1, 3.3, 'a'), (1, 3.3, 'b')],
 [(2, 1.1, 'a'), (2, 1.1, 'b'), (2, 2.2, 'a')],
 [(2, 2.2, 'b'), (2, 3.3, 'a'), (2, 3.3, 'b')],
 [(3, 1.1, 'a'), (3, 1.1, 'b'), (3, 2.2, 'a')],
 [(3, 2.2, 'b'), (3, 3.3, 'a'), (3, 3.3, 'b')],
 [(4, 1.1, 'a'), (4, 1.1, 'b'), (4, 2.2, 'a')],
 [(4, 2.2, 'b'), (4, 3.3, 'a'), (4, 3.3, 'b')]]
can be grouped by one and two (adding 2 more dimensions):

>>> ak.cartesian([one, two, three], axis=0, nested=[0, 1]).show()
[[[(1, 1.1, 'a'), (1, 1.1, 'b')], [...], [(1, 3.3, 'a'), (1, 3.3, ...)]],
 [[(2, 1.1, 'a'), (2, 1.1, 'b')], [...], [(2, 3.3, 'a'), (2, 3.3, ...)]],
 [[(3, 1.1, 'a'), (3, 1.1, 'b')], [...], [(3, 3.3, 'a'), (3, 3.3, ...)]],
 [[(4, 1.1, 'a'), (4, 1.1, 'b')], [...], [(4, 3.3, 'a'), (4, 3.3, ...)]]]
or grouped by unique one-two pairs (adding 1 more dimension):

>>> ak.cartesian([one, two, three], axis=0, nested=[1]).show()
[[(1, 1.1, 'a'), (1, 1.1, 'b')],
 [(1, 2.2, 'a'), (1, 2.2, 'b')],
 [(1, 3.3, 'a'), (1, 3.3, 'b')],
 [(2, 1.1, 'a'), (2, 1.1, 'b')],
 [(2, 2.2, 'a'), (2, 2.2, 'b')],
 [(2, 3.3, 'a'), (2, 3.3, 'b')],
 [(3, 1.1, 'a'), (3, 1.1, 'b')],
 [(3, 2.2, 'a'), (3, 2.2, 'b')],
 [(3, 3.3, 'a'), (3, 3.3, 'b')],
 [(4, 1.1, 'a'), (4, 1.1, 'b')],
 [(4, 2.2, 'a'), (4, 2.2, 'b')],
 [(4, 3.3, 'a'), (4, 3.3, 'b')]]
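The shapes of the three grouped outputs above can be reproduced by taking the flat lexicographic product and regrouping it into fixed-size runs. The sketch below is a model inferred from those printed outputs for axis=0, not taken from the library's source; cartesian_axis0 and chunk are hypothetical helper names, and non-empty inputs are assumed.

```python
from itertools import product


def chunk(items, size):
    """Split a list into consecutive sublists of the given size."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def cartesian_axis0(arrays, nested=()):
    """Hypothetical model of the axis=0 outputs shown above (non-empty inputs)."""
    result = list(product(*arrays))
    # Walk the slots from last to first; each slot i listed in `nested`
    # regroups the current result into runs of len(arrays[i + 1]).
    for i in reversed(range(len(arrays) - 1)):
        if i in nested:
            result = chunk(result, len(arrays[i + 1]))
    return result


one = [1, 2, 3, 4]
two = [1.1, 2.2, 3.3]
three = ["a", "b"]

g0 = cartesian_axis0([one, two, three], nested=[0])      # 8 lists of 3 tuples
g01 = cartesian_axis0([one, two, three], nested=[0, 1])  # 4 x 3 x 2 nesting
g1 = cartesian_axis0([one, two, three], nested=[1])      # 12 lists of 2 tuples

print(len(g0), len(g01), len(g1))
```

Under this model, each listed slot adds exactly one level of nesting, which matches the "adding 1 more dimension" and "adding 2 more dimensions" notes above.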
The order of the output is fixed: it is always lexicographical in the order that the arrays are written.

To emulate an SQL or Pandas “group by” operation, put the keys that you wish to group by first and use nested=[0] or nested=[n] to group by unique n-tuples. If necessary, record keys can later be reordered with a list of strings in ak.Array.__getitem__.

To get list index positions in the tuples/records, rather than data from the original arrays, use ak.argcartesian instead of ak.cartesian. The ak.argcartesian form can be particularly useful as nested indexing in ak.Array.__getitem__.
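The relationship between data combinations and index positions can be sketched in plain Python. This models the idea only: it does not call ak.argcartesian, and the names positions and recovered are hypothetical.

```python
from itertools import product

one = [[1, 2, 3], [], [4, 5], [6]]
two = [["a", "b"], ["c"], ["d"], ["e", "f"]]

# Index-position model: the same combinations, but each tuple holds positions
# within the i-th lists instead of the values themselves.
positions = [
    list(product(range(len(a)), range(len(b))))
    for a, b in zip(one, two)
]

# Looking the positions back up in the original lists recovers the data pairs.
recovered = [
    [(a[i], b[j]) for i, j in idx]
    for a, b, idx in zip(one, two, positions)
]

print(positions[0])
print(recovered[0])
```

Because the positions within each entry come out already sorted, the sketch also illustrates the fixed lexicographical order described above.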