turicreate.SFrame.flat_map

SFrame.flat_map(self, column_names, fn, column_types='auto', seed=None)

Map each row of the SFrame to multiple rows in a new SFrame via a function.

The output of fn must have type List[List[…]]. Each inner list becomes a single row of the output, and the rows collected in the outer list make up the data for the output SFrame. All rows must have the same length and the same order of types so that the result columns are homogeneously typed. For example, if the first element emitted into the outer list by fn is [43, 2.3, 'string'], then every other element emitted into the outer list must be a list with three elements, where the first is an int, the second is a float, and the third is a string. If column_types is not specified, the column types of the returned SFrame are inferred by running fn on the first 10 rows of the input SFrame.
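The contract described above can be sketched in plain Python. This is only an illustration of the semantics, not the library's implementation: `flat_map_rows` and `rows_are_homogeneous` are hypothetical helper names, and rows are modeled as ordinary dicts keyed by column name. The type check is deliberately strict and does not account for missing values.

```python
def flat_map_rows(rows, fn):
    """Hypothetical emulation: apply fn to each row dict and concatenate the emitted rows."""
    out = []
    for row in rows:
        # fn must return a list of lists; each inner list is one output row.
        for new_row in fn(row):
            out.append(list(new_row))
    return out


def rows_are_homogeneous(rows):
    """Return True if every row has the same length and the same order of types."""
    if not rows:
        return True
    signature = [type(value) for value in rows[0]]
    return all([type(value) for value in row] == signature for row in rows)


# Input rows modeled as dicts (an assumption made for this sketch).
rows = [{'letter': 'a', 'number': 1}, {'letter': 'b', 'number': 2}]

# Repeat each row 'number' times, emitting [number, letter] pairs.
result = flat_map_rows(
    rows,
    lambda x: [[x['number'], x['letter']] for _ in range(x['number'])],
)
```

In this sketch, `rows_are_homogeneous(result)` holds, whereas a mix such as `[[43, 2.3, 'string'], [1, 2, 'x']]` fails the check because the second column switches from float to int.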

Parameters:
column_names : list[str]

The column names for the returned SFrame.

fn : function

The function that maps each row of the SFrame to multiple rows, returning List[List[…]]. All output rows must have the same length and the same order of types.

column_types : list[type], optional

The column types of the output SFrame. By default, the types are inferred automatically by running fn on the first 10 rows of the input. If the types cannot be inferred from the first 10 rows, an error is raised.

seed : int, optional

Used as the seed if a random number generator is included in fn.

Returns:
out : SFrame

A new SFrame containing the results of the flat_map of the original SFrame.

Examples

Repeat each row according to the value in the 'number' column.

>>> sf = turicreate.SFrame({'letter': ['a', 'b', 'c'],
...                       'number': [1, 2, 3]})
>>> sf.flat_map(['number', 'letter'],
...             lambda x: [[x['number'], x['letter']] for _ in range(x['number'])])
+--------+--------+
| number | letter |
+--------+--------+
|   1    |   a    |
|   2    |   b    |
|   2    |   b    |
|   3    |   c    |
|   3    |   c    |
|   3    |   c    |
+--------+--------+
[6 rows x 2 columns]
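As a further illustration, fn can emit a different number of rows per input row, including none. The sketch below defines a hypothetical `explode_words` function and exercises it on plain dicts standing in for rows, so it runs without turicreate. The column names 'id' and 'text' are assumptions made for this example, and the treatment of an empty emitted list is shown only for this plain-Python emulation.

```python
def explode_words(row):
    """Emit one [id, word] row per whitespace-separated word in row['text']."""
    return [[row['id'], word] for word in row['text'].split()]


# Plain dicts standing in for input rows (hypothetical data).
rows = [
    {'id': 1, 'text': 'hello world'},
    {'id': 2, 'text': ''},            # emits an empty list, so no output rows
    {'id': 3, 'text': 'flat map example'},
]

# Concatenate the rows emitted for each input row.
exploded = [new_row for row in rows for new_row in explode_words(row)]

# With an SFrame holding these columns, the same function could be passed as:
#   sf.flat_map(['id', 'word'], explode_words, column_types=[int, str])
# Giving column_types explicitly avoids relying on inference from the first 10 rows.
```

Every emitted row here is an [int, str] pair, which satisfies the requirement that all rows share the same length and order of types.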