turicreate.SArray.unpack

SArray.unpack(column_name_prefix='X', column_types=None, na_value=None, limit=None)

Convert an SArray of list, array, or dict type to an SFrame with multiple columns.

unpack expands an SArray using the values of each list/array/dict as elements in a new SFrame of multiple columns. For example, an SArray of lists each of length 4 will be expanded into an SFrame of 4 columns, one for each list element. An SArray of lists/arrays of varying size will be expanded into a number of columns equal to the length of the longest list/array. An SArray of dictionaries will be expanded into as many columns as there are keys.

When unpacking an SArray of list or array type, new columns are named: column_name_prefix.0, column_name_prefix.1, etc. If unpacking a column of dict type, unpacked columns are named column_name_prefix.key1, column_name_prefix.key2, etc.
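The expansion and naming rules above can be modeled in plain Python. This is only an illustrative sketch, not the turicreate implementation: `unpack_model` is a hypothetical helper, it returns a dict of column name to values instead of an SFrame, and sorting the dict keys is a choice made here for determinism.

```python
def unpack_model(rows, column_name_prefix="X"):
    """Pure-Python model of the column layout (hypothetical helper)."""
    def name(key):
        # Without a prefix, the key or index alone is used as the column name.
        return f"{column_name_prefix}.{key}" if column_name_prefix else str(key)

    if all(isinstance(r, dict) for r in rows):
        # Dict rows: one column per distinct key (sorted here, an assumption).
        keys = sorted({k for r in rows for k in r})
        return {name(k): [r.get(k) for r in rows] for k in keys}

    # List rows: one column per position, up to the longest list.
    width = max(len(r) for r in rows)
    return {
        name(i): [r[i] if i < len(r) else None for r in rows]
        for i in range(width)
    }


lists = unpack_model([[1, 0, 1], [1, 1, 1], [0, 1]])
dicts = unpack_model([{"word": "a", "count": 1}, {"word": "cat"}])
print(list(lists))  # column names for the list case
print(list(dicts))  # column names for the dict case
```

Rows that are shorter than the longest list, or that lack a key, are padded with None in this model, standing in for missing values in the resulting columns.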

When unpacking an SArray of list or dictionary types, missing values in the original element remain as missing values in the resultant columns. If the na_value parameter is specified, all values equal to this given value are also replaced with missing values. In an SArray of array.array type, NaN is interpreted as a missing value.
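As a rough illustration of the three missing-value rules just described, the following plain-Python sketch maps a single element to either its value or None. `to_missing` and its `is_array` flag are hypothetical names introduced here; the library applies these rules internally and may differ in detail.

```python
import math


def to_missing(value, na_value=None, is_array=False):
    """Return None where the rules above treat a value as missing (sketch)."""
    if value is None:
        return None  # already missing, stays missing
    if na_value is not None and value == na_value:
        return None  # equal to na_value, becomes missing
    if is_array and isinstance(value, float) and math.isnan(value):
        return None  # NaN inside an array.array element
    return value


list_row = [to_missing(v, na_value=0) for v in [1, 0, None, 1]]
array_row = [to_missing(v, is_array=True) for v in [1.0, float("nan")]]
print(list_row)   # [1, None, None, 1]
print(array_row)  # [1.0, None]
```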

turicreate.SFrame.pack_columns() performs the reverse of unpack.

Parameters:
column_name_prefix: str, optional

If provided, unpacked column names start with the given prefix. Defaults to 'X'.

column_types: list[type], optional

Column types for the unpacked columns. If not provided, column types are automatically inferred from the first 100 rows. Defaults to None.

na_value: optional

If specified, all values equal to na_value are converted to missing values.

limit: list, optional

Limits the set of list/array indices or dict keys to unpack. For list/array SArrays, ‘limit’ must contain integer indices. For dict SArrays, ‘limit’ must contain dictionary keys.
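To make the ‘limit’ rule concrete, here is a small plain-Python sketch of the selection step for one element. `apply_limit` is a hypothetical helper, not a turicreate function; it returns a mapping from the selected index or key to its value, and how the library names the resulting columns is not modeled here.

```python
def apply_limit(row, limit=None):
    """Keep only the positions (list row) or keys (dict row) in limit (sketch)."""
    if limit is None:
        # No limit: every position or key is kept.
        return dict(row) if isinstance(row, dict) else dict(enumerate(row))
    if isinstance(row, dict):
        # Dict rows are selected by key; absent keys are skipped.
        return {k: row[k] for k in limit if k in row}
    # List rows are selected by integer index; out-of-range indices are skipped.
    return {i: row[i] for i in limit if 0 <= i < len(row)}


picked_dict = apply_limit({"word": "cat", "count": 2}, limit=["word"])
picked_list = apply_limit([10, 20, 30], limit=[0, 2])
print(picked_dict)  # {'word': 'cat'}
print(picked_list)  # {0: 10, 2: 30}
```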

Returns:
out : SFrame

A new SFrame that contains all unpacked columns

Examples

To unpack a dict SArray:

>>> sa = SArray([{ 'word': 'a',     'count': 1},
...              { 'word': 'cat',   'count': 2},
...              { 'word': 'is',    'count': 3},
...              { 'word': 'coming','count': 4}])

Normal case of unpacking SArray of type dict:

>>> sa.unpack(column_name_prefix=None)
Columns:
    count   int
    word    str

Rows: 4

Data:
+-------+--------+
| count |  word  |
+-------+--------+
|   1   |   a    |
|   2   |  cat   |
|   3   |   is   |
|   4   | coming |
+-------+--------+
[4 rows x 2 columns]

Unpack only the ‘word’ key:

>>> sa.unpack(limit=['word'])
Columns:
    X.word  str

Rows: 4

Data:
+--------+
| X.word |
+--------+
|   a    |
|  cat   |
|   is   |
| coming |
+--------+
[4 rows x 1 columns]

To unpack a list SArray:

>>> sa2 = SArray([
...               [1, 0, 1],
...               [1, 1, 1],
...               [0, 1]])

Convert all zeros to missing values:

>>> sa2.unpack(column_types=[int, int, int], na_value=0)
Columns:
    X.0     int
    X.1     int
    X.2     int

Rows: 3

Data:
+------+------+------+
| X.0  | X.1  | X.2  |
+------+------+------+
|  1   | None |  1   |
|  1   |  1   |  1   |
| None |  1   | None |
+------+------+------+
[3 rows x 3 columns]