turicreate.SFrame.dropna

SFrame.dropna(columns=None, how='any', recursive=False)

Remove rows with missing values from an SFrame. A missing value is either None or NaN. If how is ‘any’, a row is removed when at least one of the columns named in the columns parameter holds a missing value. If how is ‘all’, a row is removed only when every one of those columns holds a missing value.

If the columns parameter is not specified, the default is to consider all columns when searching for missing values.
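The rule above can be pictured with a small pure-Python model. This is only a sketch of the documented semantics, not Turi Create's implementation: the helpers is_missing and dropna_rows are hypothetical names, rows are modeled as plain dicts, and the handling of an empty column list is an assumption of the sketch.

```python
import math

def is_missing(value):
    """True for None or a float NaN, the two missing-value forms named above."""
    return value is None or (isinstance(value, float) and math.isnan(value))

def dropna_rows(rows, columns=None, how="any"):
    """Filter a list of dict rows following the 'any' / 'all' rule.

    Hypothetical helper for illustration; not part of turicreate.
    """
    if how not in ("any", "all"):
        raise ValueError("how must be 'any' or 'all'")
    if isinstance(columns, str):
        columns = [columns]  # accept a single column name, like the real parameter
    kept = []
    for row in rows:
        cols = list(row) if columns is None else columns
        flags = [is_missing(row[c]) for c in cols]
        if not flags:
            drop = False  # assumption of this sketch: no columns to check, keep the row
        elif how == "any":
            drop = any(flags)
        else:
            drop = all(flags)
        if not drop:
            kept.append(row)
    return kept

rows = [
    {"a": 1, "b": "a"},
    {"a": None, "b": "b"},
    {"a": None, "b": None},
]
any_result = dropna_rows(rows)                            # keeps only the first row
all_result = dropna_rows(rows, how="all")                 # keeps the first two rows
col_a_result = dropna_rows(rows, columns="a", how="all")  # keeps only the first row
```

With a single column, ‘any’ and ‘all’ give the same result, because that one column is both "at least one" and "every" column checked.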

Parameters:
columns : list or str, optional

The columns to use when looking for missing values. By default, all columns are used.

how : {‘any’, ‘all’}, optional

Specifies whether a row should be dropped if at least one of the checked columns has a missing value, or only if all of them do. ‘any’ is the default.

recursive : bool, optional

If True, the NaN check is also performed on each element inside a cell that holds a nested structure, such as a dict or list, traversing the structure depth-first. The default is False.

Returns:
out : SFrame

SFrame with missing values removed (according to the given rules).

See also

dropna_split
Drops missing rows from the SFrame and returns them.

Examples

Drop rows that contain any missing value.

>>> sf = turicreate.SFrame({'a': [1, None, None], 'b': ['a', 'b', None]})
>>> sf.dropna()
+---+---+
| a | b |
+---+---+
| 1 | a |
+---+---+
[1 rows x 2 columns]

Drop rows where every value is missing.

>>> sf.dropna(how="all")
+------+---+
|  a   | b |
+------+---+
|  1   | a |
| None | b |
+------+---+
[2 rows x 2 columns]

Drop rows where column ‘a’ has a missing value.

>>> sf.dropna('a', how="all")
+---+---+
| a | b |
+---+---+
| 1 | a |
+---+---+
[1 rows x 2 columns]
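
The depth-first check that the recursive parameter describes can be sketched in plain Python. This is an illustration under stated assumptions, not the library's code: has_missing is a hypothetical helper, and the sketch assumes that only dict values (not keys) and list or tuple elements are visited.

```python
import math

def has_missing(value, recursive=False):
    """Report whether a cell counts as missing.

    Hypothetical helper for illustration; not part of turicreate.
    With recursive=False only the cell itself is tested. With
    recursive=True nested dicts and lists are walked depth-first.
    """
    if value is None:
        return True
    if isinstance(value, float):
        return math.isnan(value)
    if recursive:
        if isinstance(value, dict):
            # Assumption of this sketch: only dict values are inspected.
            return any(has_missing(v, True) for v in value.values())
        if isinstance(value, (list, tuple)):
            return any(has_missing(v, True) for v in value)
    return False

nested_cell = [1.0, [2.0, float("nan")]]
shallow = has_missing(nested_cell)                # False: the list itself is not None or NaN
deep = has_missing(nested_cell, recursive=True)   # True: a NaN sits two levels down
```

Under this model, a row whose cell is a list containing NaN survives the default check and is dropped only when the recursive check is used.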