# How to create arrays of lists

```
import awkward as ak
import numpy as np
```

## From Python lists

If you have a collection of Python lists, the easiest way to turn them into an Awkward Array is to pass them to the `ak.Array` constructor, which recognizes a non-dict, non-NumPy iterable and calls `ak.from_iter()`.

```
python_lists = [[1, 2, 3], [], [4, 5], [6], [7, 8, 9, 10]]
python_lists
```

```
[[1, 2, 3], [], [4, 5], [6], [7, 8, 9, 10]]
```

```
awkward_array = ak.Array(python_lists)
awkward_array
```

[[1, 2, 3], [], [4, 5], [6], [7, 8, 9, 10]] --------------------- type: 5 * var * int64

The lists of lists can be arbitrarily deep.

```
python_lists = [[[[], [1, 2, 3]]], [[[4, 5]]], []]
python_lists
```

```
[[[[], [1, 2, 3]]], [[[4, 5]]], []]
```

```
awkward_array = ak.Array(python_lists)
awkward_array
```

[[[[], [1, 2, 3]]], [[[4, 5]]], []] --------------------------------- type: 3 * var * var * var * int64

The “`var *`” in the type string indicates nested levels of variable-length lists. This is an array of lists of lists of lists of integers.

The advantage of the Awkward Array is that the numerical data are now all in one array buffer, so calculations such as NumPy universal functions are vectorized across the whole array.

```
np.sqrt(awkward_array)
```

[[[[], [1, 1.41, 1.73]]], [[[2, 2.24]]], []] ----------------------------------- type: 3 * var * var * var * float64

Unlike Python lists, arrays consist of a homogeneous type. A Python list wouldn’t notice if numerical data were given at two different levels of nesting, but that makes a big difference to an Awkward Array.

```
union_array = ak.Array([[[[], [1, 2, 3]]], [[4, 5]], []])
union_array
```

[[[[], [1, 2, 3]]], [[4, 5]], []] ---------------------------- type: 3 * var * var * union[ var * int64, int64 ]

In this example, the data type is a “union” of integers nested two levels deep and integers nested three levels deep.

```
union_array.type
```

```
ArrayType(ListType(ListType(UnionType([ListType(NumpyType('int64')), NumpyType('int64')]))), 3, None)
```

Some operations are possible with union arrays, but not all. (Iteration in Numba is one example of an operation that is not supported.)

## From NumPy arrays

The `ak.Array` constructor loads NumPy arrays differently from Python lists. The inner dimensions of a NumPy array are guaranteed to have the same lengths, so they are interpreted as a fixed-length list type.

```
numpy_array = np.arange(2 * 3 * 5).reshape(2, 3, 5)
numpy_array
```

```
array([[[ 0,  1,  2,  3,  4],
        [ 5,  6,  7,  8,  9],
        [10, 11, 12, 13, 14]],

       [[15, 16, 17, 18, 19],
        [20, 21, 22, 23, 24],
        [25, 26, 27, 28, 29]]])
```

```
regular_array = ak.Array(numpy_array)
regular_array
```

[[[0, 1, 2, 3, 4], [5, 6, 7, 8, 9], [10, 11, 12, 13, 14]], [[15, 16, 17, 18, 19], [20, 21, 22, 23, 24], [25, 26, 27, 28, 29]]] -------------------------------------------------------------------- type: 2 * 3 * 5 * int64

The type in this case has no “`var *`” in it, only “`2 *`”, “`3 *`”, and “`5 *`”. It’s a length-2 array of length-3 lists containing length-5 lists of integers.

Furthermore, if NumPy arrays are *nested within* Python lists (or other iterables), they’ll be treated as variable-length (“`var *`”) because there’s no guarantee at the start of a sequence that all NumPy arrays in the sequence will have the same shape.

```
numpy_arrays = [
    np.arange(3 * 5).reshape(3, 5),
    np.arange(3 * 5, 2 * 3 * 5).reshape(3, 5),
]
numpy_arrays
```

```
[array([[ 0,  1,  2,  3,  4],
        [ 5,  6,  7,  8,  9],
        [10, 11, 12, 13, 14]]),
 array([[15, 16, 17, 18, 19],
        [20, 21, 22, 23, 24],
        [25, 26, 27, 28, 29]])]
```

```
irregular_array = ak.Array(numpy_arrays)
irregular_array
```

[[[0, 1, 2, 3, 4], [5, 6, 7, 8, 9], [10, 11, 12, 13, 14]], [[15, 16, 17, 18, 19], [20, 21, 22, 23, 24], [25, 26, 27, 28, 29]]] -------------------------------------------------------------------- type: 2 * var * var * int64

Both `regular_array` and `irregular_array` have the same data values:

```
regular_array.to_list() == irregular_array.to_list()
```

```
True
```

but they have different types:

```
regular_array.type, irregular_array.type
```

```
(ArrayType(RegularType(RegularType(NumpyType('int64'), 5), 3), 2, None),
ArrayType(ListType(ListType(NumpyType('int64'))), 2, None))
```

This can make a difference in some operations, such as broadcasting.

If you want more control over this, use the explicit `ak.from_iter()` and `ak.from_numpy()` functions instead of the general-purpose `ak.Array` constructor.

## Unflattening

Another difference between `ak.from_iter()` and `ak.from_numpy()` is that iteration over Python lists is slow and necessarily copies the data, whereas ingesting a NumPy array is zero-copy. (You can see that it’s zero-copy by changing the data in-place.)

In some cases, list-making can be vectorized. If you have a flat NumPy array of data and an array of “counts” that add up to the length of the data, then you can `ak.unflatten()` it.

```
data = np.array([1, 2, 3, 4, 5, 6, 7, 8])
counts = np.array([3, 0, 1, 4])
unflattened = ak.unflatten(data, counts)
unflattened
```

[[1, 2, 3], [], [4], [5, 6, 7, 8]] --------------------- type: 4 * var * int64

The first list has length `3`, the second has length `0`, the third has length `1`, and the last has length `4`. This is close to Awkward Array’s internal representation of variable-length lists, so it can be performed quickly.
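To see how counts relate to list boundaries, here is a plain NumPy illustration that turns the counts into cumulative "offsets" and slices the flat data with them. This is a sketch of the idea, not the library's implementation, and the name `offsets` is chosen here for illustration.

```python
import numpy as np

data = np.array([1, 2, 3, 4, 5, 6, 7, 8])
counts = np.array([3, 0, 1, 4])

# Cumulative sum of the counts, with a leading zero: list k spans
# data[offsets[k]:offsets[k + 1]].
offsets = np.concatenate(([0], np.cumsum(counts)))

lists = [
    data[start:stop].tolist()
    for start, stop in zip(offsets[:-1], offsets[1:])
]

print(offsets.tolist())  # [0, 3, 3, 4, 8]
print(lists)             # [[1, 2, 3], [], [4], [5, 6, 7, 8]]
```

Computing the offsets is a single vectorized pass over the counts, and the data buffer itself does not need to be rearranged.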

This function is named `ak.unflatten()` because it is the inverse of `ak.flatten()` and `ak.num()`, which recover the flat data and the counts, respectively:

```
ak.flatten(unflattened)
```

[1, 2, 3, 4, 5, 6, 7, 8] --------------- type: 8 * int64

```
ak.num(unflattened)
```

[3, 0, 1, 4] --------------- type: 4 * int64

## With ArrayBuilder

`ak.ArrayBuilder` is described in more detail in this tutorial, but you can also construct arrays of lists using the `begin_list`/`end_list` methods or the `list` context manager.

(This is what `ak.from_iter()` uses internally to accumulate lists.)

```
builder = ak.ArrayBuilder()
builder.begin_list()
builder.append(1)
builder.append(2)
builder.append(3)
builder.end_list()
builder.begin_list()
builder.end_list()
builder.begin_list()
builder.append(4)
builder.append(5)
builder.end_list()
array = builder.snapshot()
array
```

[[1, 2, 3], [], [4, 5]] --------------------- type: 3 * var * int64

```
builder = ak.ArrayBuilder()
with builder.list():
    builder.append(1)
    builder.append(2)
    builder.append(3)
with builder.list():
    pass  # an empty list
with builder.list():
    builder.append(4)
    builder.append(5)
array = builder.snapshot()
array
```

[[1, 2, 3], [], [4, 5]] --------------------- type: 3 * var * int64

## In Numba

Functions that Numba Just-In-Time (JIT) compiles can use `ak.ArrayBuilder` or construct flat data and “counts” arrays for `ak.unflatten()`.

(At this time, Numba can’t use context managers, the `with` statement, in fully compiled code. `ak.ArrayBuilder` can’t be constructed or converted to an array using `snapshot` inside a JIT-compiled function, but can be outside the compiled context. Similarly, `ak.*` functions like `ak.unflatten()` can’t be called inside a JIT-compiled function, but can be outside.)

```
import numba as nb
```

```
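@nb.jit
def append_list(builder, start, stop):
    # Append one list containing the integers in range(start, stop).
    builder.begin_list()
    for x in range(start, stop):
        builder.append(x)
    builder.end_list()


@nb.jit
def example(builder):
    append_list(builder, 1, 4)
    append_list(builder, 999, 999)  # empty range, so an empty list
    append_list(builder, 4, 6)
    return builder


# The builder is created and snapshotted outside the compiled functions.
builder = example(ak.ArrayBuilder())
array = builder.snapshot()
array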
```

[[1, 2, 3], [], [4, 5]] --------------------- type: 3 * var * int64

```
@nb.jit
def append_list(i, data, j, counts, start, stop):
    # Write range(start, stop) into data starting at i; record its length at j.
    for x in range(start, stop):
        data[i] = x
        i += 1
    counts[j] = stop - start
    j += 1
    return i, j


@nb.jit
def example():
    data = np.empty(5, np.int64)    # 5 values in total
    counts = np.empty(3, np.int64)  # 3 lists
    i, j = 0, 0
    i, j = append_list(i, data, j, counts, 1, 4)
    i, j = append_list(i, data, j, counts, 999, 999)
    i, j = append_list(i, data, j, counts, 4, 6)
    return data, counts


# ak.unflatten is called outside the compiled function.
data, counts = example()
array = ak.unflatten(data, counts)
array
```

[[1, 2, 3], [], [4, 5]] --------------------- type: 3 * var * int64