ak.run_lengths
--------------

.. py:module: ak.run_lengths

Defined in `awkward.operations.ak_run_lengths <https://github.com/scikit-hep/awkward/blob/3d14f2e74c7b6d99b09e436f90de4722f5959548/src/awkward/operations/ak_run_lengths.py>`__ on `line 17 <https://github.com/scikit-hep/awkward/blob/3d14f2e74c7b6d99b09e436f90de4722f5959548/src/awkward/operations/ak_run_lengths.py#L17>`__.

.. py:function:: ak.run_lengths(array, *, highlevel=True, behavior=None, attrs=None)

    :param array: Array-like data (anything :py:obj:`ak.to_layout` recognizes).
    :param highlevel: If True, return an :py:obj:`ak.Array`; otherwise, return a low-level :py:obj:`ak.contents.Content` subclass.
    :type highlevel: bool
    :param behavior: Custom :py:obj:`ak.behavior` for the output array, if high-level.
    :type behavior: None or dict
    :param attrs: Custom attributes for the output array, if high-level.
    :type attrs: None or dict

    Computes the lengths of sequences of identical values at the deepest level
    of nesting, returning an array with the same structure but with ``int64`` type.

    For example,

    .. code-block:: python

        >>> array = ak.Array([1.1, 1.1, 1.1, 2.2, 3.3, 3.3, 4.4, 4.4, 5.5])
        >>> ak.run_lengths(array)
        <Array [3, 1, 2, 2, 1] type='5 * int64'>

    There are 3 instances of 1.1, followed by 1 instance of 2.2, 2 instances of 3.3,
    2 instances of 4.4, and 1 instance of 5.5.

    The order and uniqueness of the input data doesn't matter,

    .. code-block:: python

        >>> array = ak.Array([1.1, 1.1, 1.1, 5.5, 4.4, 4.4, 1.1, 1.1, 5.5])
        >>> ak.run_lengths(array)
        <Array [3, 1, 2, 2, 1] type='5 * int64'>

    just the difference between each value and its neighbors.

    The data can be nested, but runs don't cross list boundaries.

    .. code-block:: python

        >>> array = ak.Array([[1.1, 1.1, 1.1, 2.2, 3.3], [3.3, 4.4], [4.4, 5.5]])
        >>> ak.run_lengths(array)
        <Array [[3, 1, 1], [1, 1], [1, 1]] type='3 * var * int64'>

    This function recognizes strings as distinguishable values.

    .. code-block:: python

        >>> array = ak.Array([["one", "one"], ["one", "two", "two"], ["three", "two", "two"]])
        >>> ak.run_lengths(array)
        <Array [[2], [1, 2], [1, 2]] type='3 * var * int64'>

    Note that this can be combined with :py:obj:`ak.argsort` and :py:obj:`ak.unflatten`
    to compute a "group by" operation:

    .. code-block:: python

        >>> array = ak.Array([{"x": 1, "y": 1.1}, {"x": 2, "y": 2.2}, {"x": 1, "y": 1.1},
        ...                   {"x": 3, "y": 3.3}, {"x": 1, "y": 1.1}, {"x": 2, "y": 2.2}])
        >>> sorted = array[ak.argsort(array.x)]
        >>> sorted.x
        <Array [1, 1, 1, 2, 2, 3] type='6 * int64'>
        >>> ak.run_lengths(sorted.x)
        <Array [3, 2, 1] type='3 * int64'>
        >>> ak.unflatten(sorted, ak.run_lengths(sorted.x)).show()
        [[{x: 1, y: 1.1}, {x: 1, y: 1.1}, {x: 1, y: 1.1}],
         [{x: 2, y: 2.2}, {x: 2, y: 2.2}],
         [{x: 3, y: 3.3}]]

    Unlike a database "group by," this operation can be applied in bulk
    to many sublists (though the run lengths need to be fully flattened to
    be used as ``counts`` for :py:obj:`ak.unflatten`, and you need to specify
    ``axis=-1`` as the depth).

    .. code-block:: python

        >>> array = ak.Array([[{"x": 1, "y": 1.1}, {"x": 2, "y": 2.2}, {"x": 1, "y": 1.1}],
        ...                   [{"x": 3, "y": 3.3}, {"x": 1, "y": 1.1}, {"x": 2, "y": 2.2}]])
        >>> sorted = array[ak.argsort(array.x)]
        >>> sorted.x
        <Array [[1, 1, 2], [1, 2, 3]] type='2 * var * int64'>
        >>> ak.run_lengths(sorted.x)
        <Array [[2, 1], [1, 1, 1]] type='2 * var * int64'>
        >>> counts = ak.flatten(ak.run_lengths(sorted.x), axis=None)
        >>> ak.unflatten(sorted, counts, axis=-1).show()
        [[[{x: 1, y: 1.1}, {x: 1, y: 1.1}], [{x: 2, y: 2.2}]],
         [[{x: 1, y: 1.1}], [{x: 2, y: 2.2}], [{x: 3, y: 3.3}]]]

    See also :py:obj:`ak.num`, :py:obj:`ak.argsort`, :py:obj:`ak.unflatten`.
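
The run-length semantics described above (runs depend only on whether each value equals its neighbor, and never cross list boundaries) can be illustrated with a small pure-Python reference. This is only a sketch for checking intuition, not how Awkward Array implements the function; the helper names ``run_lengths_flat`` and ``run_lengths_nested`` are hypothetical, and it handles only flat lists and one level of nesting.

```python
from itertools import groupby


def run_lengths_flat(values):
    """Return the length of each run of equal adjacent values in a flat list."""
    # groupby starts a new group whenever the value changes from its neighbor,
    # so counting the members of each group gives the run lengths.
    return [sum(1 for _ in group) for _, group in groupby(values)]


def run_lengths_nested(lists):
    """Compute run lengths per sublist, so runs never cross list boundaries."""
    return [run_lengths_flat(sublist) for sublist in lists]


# Inputs copied from the examples in this page.
flat = run_lengths_flat([1.1, 1.1, 1.1, 5.5, 4.4, 4.4, 1.1, 1.1, 5.5])
nested = run_lengths_nested([[1.1, 1.1, 1.1, 2.2, 3.3], [3.3, 4.4], [4.4, 5.5]])
strings = run_lengths_nested(
    [["one", "one"], ["one", "two", "two"], ["three", "two", "two"]]
)

print(flat)     # [3, 1, 2, 2, 1]
print(nested)   # [[3, 1, 1], [1, 1], [1, 1]]
print(strings)  # [[2], [1, 2], [1, 2]]
```

The printed lists match the values shown in the corresponding ``ak.run_lengths`` examples. Unlike this sketch, ``ak.run_lengths`` works at the deepest level of arbitrarily nested data and returns an array rather than Python lists.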