turicreate.SArray.unpack
SArray.unpack(column_name_prefix='X', column_types=None, na_value=None, limit=None)
Convert an SArray of list, array, or dict type to an SFrame with multiple columns.
unpack expands an SArray using the values of each list/array/dict as elements in a new SFrame of multiple columns. For example, an SArray of lists each of length 4 is expanded into an SFrame of 4 columns, one for each list element. An SArray of lists/arrays of varying size is expanded into a number of columns equal to the length of the longest list/array. An SArray of dictionaries is expanded into as many columns as there are keys.
When unpacking an SArray of list or array type, new columns are named: column_name_prefix.0, column_name_prefix.1, etc. If unpacking a column of dict type, unpacked columns are named column_name_prefix.key1, column_name_prefix.key2, etc.
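The naming rule above can be modeled in plain Python. This is only a sketch: `unpacked_column_names` is a hypothetical helper, not part of turicreate, and it sorts dict keys purely to make the result deterministic (the order in which the library emits dict columns is not modeled). It assumes that a prefix of None yields bare names, as the first example on this page shows.

```python
def unpacked_column_names(rows, column_name_prefix="X"):
    """Hypothetical model of the column names unpack would produce."""

    def name(suffix):
        # Assumption: a None or empty prefix means the bare index/key is used.
        if column_name_prefix:
            return f"{column_name_prefix}.{suffix}"
        return str(suffix)

    rows = [r for r in rows if r is not None]
    if rows and all(isinstance(r, dict) for r in rows):
        # One column per distinct key; sorted only for a stable result.
        keys = sorted({k for r in rows for k in r}, key=str)
        return [name(k) for k in keys]
    # Lists/arrays: one column per position, up to the longest row.
    width = max((len(r) for r in rows), default=0)
    return [name(i) for i in range(width)]


list_names = unpacked_column_names([[1, 0, 1], [1, 1, 1], [0, 1]])
dict_names = unpacked_column_names(
    [{"word": "a", "count": 1}, {"word": "cat", "count": 2}],
    column_name_prefix=None,
)
```

Here `list_names` holds three names because the longest list has three elements, and `dict_names` holds one unprefixed name per key.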
When unpacking an SArray of list or dictionary types, missing values in the original element remain as missing values in the resultant columns. If the na_value parameter is specified, all values equal to this given value are also replaced with missing values. In an SArray of array.array type, NaN is interpreted as a missing value.
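The missing-value handling for list rows can be sketched in plain Python as follows. `unpack_list_rows` is a hypothetical helper written for illustration, not turicreate's implementation; it assumes missing values are represented by None and that short rows are padded on the right.

```python
def unpack_list_rows(rows, na_value=None):
    """Hypothetical model: pad rows to equal width and blank out na_value."""
    width = max((len(r) for r in rows if r is not None), default=0)
    unpacked = []
    for row in rows:
        values = [] if row is None else list(row)
        # Positions beyond the end of a short row become missing values.
        values += [None] * (width - len(values))
        if na_value is not None:
            # Values equal to na_value are also treated as missing.
            values = [None if v == na_value else v for v in values]
        unpacked.append(values)
    return unpacked


result = unpack_list_rows([[1, 0, 1], [1, 1, 1], [0, 1]], na_value=0)
```

With `na_value=0`, every zero becomes None and the short third row is padded with None, which mirrors the `na_value` example at the end of this page.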
turicreate.SFrame.pack_columns() has the reverse effect of unpack.
Parameters: - column_name_prefix: str, optional
If provided, unpacked column names start with the given prefix.
- column_types: list[type], optional
Column types for the unpacked columns. If not provided, column types are automatically inferred from the first 100 rows. Defaults to None.
- na_value: optional
If specified, all values equal to na_value are converted to missing values.
- limit: list, optional
Limits the set of list/array/dict keys to unpack. For list/array SArrays, ‘limit’ must contain integer indices. For dict SArray, ‘limit’ must contain dictionary keys.
Returns: - out : SFrame
A new SFrame that contains all unpacked columns.
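As a rough illustration of the limit parameter, the sketch below selects only the requested positions or keys from a single row. `apply_limit` is a hypothetical helper, not a turicreate function, and it assumes non-negative integer indices for list rows and None for anything that is absent.

```python
def apply_limit(row, limit):
    """Hypothetical model: keep only the indices/keys named in limit."""
    if isinstance(row, dict):
        # Dict rows: limit holds dictionary keys.
        return {key: row.get(key) for key in limit}
    # List/array rows: limit holds integer indices (assumed non-negative).
    return {i: (row[i] if i < len(row) else None) for i in limit}


dict_selection = apply_limit({"word": "a", "count": 1}, ["word"])
list_selection = apply_limit([0, 1], [0, 2])
```

Each entry of the returned mapping corresponds to one unpacked column, so `limit=['word']` on a dict row leaves a single column.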
Examples
To unpack a dict SArray:
>>> from turicreate import SArray
>>> sa = SArray([{ 'word': 'a',     'count': 1},
...              { 'word': 'cat',   'count': 2},
...              { 'word': 'is',    'count': 3},
...              { 'word': 'coming','count': 4}])
Normal case of unpacking SArray of type dict:
>>> sa.unpack(column_name_prefix=None)
Columns:
    count   int
    word    str
Rows: 4
Data:
+-------+--------+
| count |  word  |
+-------+--------+
|   1   |   a    |
|   2   |  cat   |
|   3   |   is   |
|   4   | coming |
+-------+--------+
[4 rows x 2 columns]
Unpack only keys with ‘word’:
>>> sa.unpack(limit=['word'])
Columns:
    X.word  str
Rows: 4
Data:
+--------+
| X.word |
+--------+
|   a    |
|  cat   |
|   is   |
| coming |
+--------+
[4 rows x 1 columns]
>>> sa2 = SArray([
...               [1, 0, 1],
...               [1, 1, 1],
...               [0, 1]])
Convert all zeros to missing values:
>>> sa2.unpack(column_types=[int, int, int], na_value=0)
Columns:
    X.0     int
    X.1     int
    X.2     int
Rows: 3
Data:
+------+------+------+
| X.0  | X.1  | X.2  |
+------+------+------+
|  1   | None |  1   |
|  1   |  1   |  1   |
| None |  1   | None |
+------+------+------+
[3 rows x 3 columns]