ak.sum
Defined in awkward.operations.ak_sum on line 24.
- ak.sum(array, axis=None, *, keepdims=False, mask_identity=False, highlevel=True, behavior=None, attrs=None)
- Parameters:
array – Array-like data (anything ak.to_layout recognizes).
axis (None or int or str) – If None, combine all values from the array into a single scalar result; if an int, group by that axis: 0 is the outermost, 1 is the first level of nested lists, etc., and negative axis counts from the innermost: -1 is the innermost, -2 is the next level up, etc.; if a str, it is interpreted as the name of the axis, which maps to an int if named axes are present. Named axes are attached to an array using ak.with_named_axis and removed with ak.without_named_axis; also see the Named axes user guide.
keepdims (bool) – If False, this reducer decreases the number of dimensions by 1; if True, the reduced values are wrapped in a new length-1 dimension so that the result of this operation may be broadcasted with the original array.
mask_identity (bool) – If True, reducing over empty lists results in None (an option type); otherwise, reducing over empty lists results in the operation’s identity.
highlevel (bool) – If True, return an ak.Array; otherwise, return a low-level ak.contents.Content subclass.
behavior (None or dict) – Custom ak.behavior for the output array, if high-level.
attrs (None or dict) – Custom attributes for the output array, if high-level.
Sums over array (many types supported, including all Awkward Arrays and Records). The identity of addition is 0 and it is usually not masked. This operation is the same as NumPy’s sum if all lists at a given dimension have the same length and no None values, but it generalizes to cases where they do not.

For example, consider this array, in which all lists at a given dimension have the same length.

>>> array = ak.Array([[ 0.1,  0.2,  0.3],
...                   [10.1, 10.2, 10.3],
...                   [20.1, 20.2, 20.3],
...                   [30.1, 30.2, 30.3]])
A sum over axis=-1 combines the inner lists, leaving one value per outer list:

>>> ak.sum(array, axis=-1)
<Array [0.6, 30.6, 60.6, 90.6] type='4 * float64'>

while a sum over axis=0 combines the outer lists, leaving one value per inner list:

>>> ak.sum(array, axis=0)
<Array [60.4, 60.8, 61.2] type='3 * float64'>
Now with some values missing,

>>> array = ak.Array([[ 0.1,  0.2      ],
...                   [10.1            ],
...                   [20.1, 20.2, 20.3],
...                   [30.1, 30.2      ]])
The sum over axis=-1 results in

>>> ak.sum(array, axis=-1)
<Array [0.3, 10.1, 60.6, 60.3] type='4 * float64'>

and the sum over axis=0 results in

>>> ak.sum(array, axis=0)
<Array [60.4, 50.6, 20.3] type='3 * float64'>
How we ought to sum over the innermost lists is unambiguous, but for all other axis values, we must choose whether to align contents to the left before summing, to the right before summing, or something else. As suggested by the way the text has been aligned, we choose the left-alignment convention: the first axis=0 result is the sum of all first elements:

60.4 = 0.1 + 10.1 + 20.1 + 30.1

the second is the sum of all second elements:

50.6 = 0.2 + 20.2 + 30.2

and the third is the sum of the only third element:

20.3 = 20.3
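The left-alignment convention can be modeled in plain Python. This is only a sketch of the convention, not how Awkward Array implements the reduction; the names rows and left_aligned_sums are illustrative.

```python
from itertools import zip_longest

rows = [[0.1, 0.2], [10.1], [20.1, 20.2, 20.3], [30.1, 30.2]]

# zip_longest walks the rows position by position from the left;
# rows that are too short contribute the additive identity, 0.0.
left_aligned_sums = [sum(column) for column in zip_longest(*rows, fillvalue=0.0)]

print(left_aligned_sums)  # approximately [60.4, 50.6, 20.3], up to floating-point rounding
```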
The same is true if the values were None, rather than gaps:
>>> array = ak.Array([[ 0.1,  0.2, None],
...                   [10.1, None, None],
...                   [20.1, 20.2, 20.3],
...                   [30.1, 30.2, None]])
>>> ak.sum(array, axis=-1)
<Array [0.3, 10.1, 60.6, 60.3] type='4 * float64'>
>>> ak.sum(array, axis=0)
<Array [60.4, 50.6, 20.3] type='3 * float64'>
However, the missing value placeholder, None, allows us to align the remaining data differently:
>>> array = ak.Array([[None,  0.1,  0.2],
...                   [None, None, 10.1],
...                   [20.1, 20.2, 20.3],
...                   [None, 30.1, 30.2]])
Now the axis=-1 result is the same but the axis=0 result has changed:

>>> ak.sum(array, axis=-1)
<Array [0.3, 10.1, 60.6, 60.3] type='4 * float64'>
>>> ak.sum(array, axis=0)
<Array [20.1, 50.4, 60.8] type='3 * float64'>

because:

20.1 = 20.1
50.4 = 0.1 + 20.2 + 30.1
60.8 = 0.2 + 10.1 + 20.3 + 30.2
If, instead of missing numbers, we had missing lists,
>>> array = ak.Array([[ 0.1,  0.2,  0.3],
...                   None,
...                   [20.1, 20.2, 20.3],
...                   [30.1, 30.2, 30.3]])
then the placeholder would pass through the axis=-1 sum because summing over the inner dimension shouldn’t change the length of the outer dimension.

>>> ak.sum(array, axis=-1)
<Array [0.6, None, 60.6, 90.6] type='4 * ?float64'>
However, the axis=0 sum loses information about the None value.

>>> ak.sum(array, axis=0)
<Array [50.3, 50.6, 50.9] type='3 * float64'>

which is:

50.3 = 0.1 + (None) + 20.1 + 30.1
50.6 = 0.2 + (None) + 20.2 + 30.2
50.9 = 0.3 + (None) + 20.3 + 30.3
An axis=0 sum would have reduced that information away even if the value had not been None. If the None values were replaced with 0, the result for axis=0 would be the same. The result for axis=-1 would not be the same because this None is in axis 0, not the axis that axis=-1 sums over.

The keepdims parameter ensures that the number of dimensions does not change: scalar results are put into new length-1 dimensions:

>>> ak.sum(array, axis=-1, keepdims=True)
<Array [[0.6], None, [60.6], [90.6]] type='4 * option[1 * float64]'>
>>> ak.sum(array, axis=0, keepdims=True)
<Array [[50.3, 50.6, 50.9]] type='1 * var * float64'>
and axis=None ignores all None values and adds up everything in the array (keepdims has no effect).

>>> ak.sum(array, axis=None)
151.8
The mask_identity parameter, which has no equivalent in NumPy, inserts None in the output wherever a reduction takes place over zero elements. This is different from reductions that are otherwise equal to the identity or are equal to the identity by cancellation.

>>> array = ak.Array([[2.2, 2.2], [4.4, -2.2, -2.2], [], [0.0]])
>>> ak.sum(array, axis=-1)
<Array [4.4, 0, 0, 0] type='4 * float64'>
>>> ak.sum(array, axis=-1, mask_identity=True)
<Array [4.4, 0, None, 0] type='4 * ?float64'>
The third list is reduced to 0 if mask_identity=False because 0 is the identity of addition, but it is reduced to None if mask_identity=True.

See also ak.nansum.