turicreate.boosted_trees_classifier.BoostedTreesClassifier.extract_features
BoostedTreesClassifier.extract_features(dataset, missing_value_action='auto')
For each example in the dataset, extract the leaf indices of each tree as features.
For multiclass classification, each leaf index contains num_class numbers, where num_class is the number of classes.
The returned feature vectors can be used as input to train another supervised learning model, such as a LogisticClassifier or an SVMClassifier.
Parameters: - dataset : SFrame
Dataset of new observations. Must include columns with the same names as the features used for model training, but does not require a target column. Additional columns are ignored.
- missing_value_action: str, optional
Action to perform when missing values are encountered. This can be one of:
- ‘auto’: Choose a model-dependent missing value policy.
- ‘impute’: Proceed with evaluation by filling in the missing values with the mean of the training data. Missing values are also imputed if an entire column of data is missing during evaluation.
- ‘none’: Treat missing values as is. The model must be able to handle missing values.
- ‘error’: Do not proceed with prediction and terminate with an error message.
Returns: - out : SArray
An SArray of dtype array.array containing extracted features.
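As a rough picture of the ‘impute’ policy described under missing_value_action, the sketch below fills missing values with per-column means computed from training rows. It is a conceptual illustration in plain Python, not Turi Create's implementation; the helper names (column_means, impute) and the toy data are hypothetical, and it assumes every column has at least one non-missing training value.

```python
# Conceptual sketch of mean imputation; not Turi Create's implementation.

def column_means(train_rows, columns):
    """Mean of each column over the training rows, skipping missing (None) values."""
    means = {}
    for col in columns:
        seen = [row[col] for row in train_rows if row.get(col) is not None]
        means[col] = sum(seen) / len(seen)  # assumes at least one observed value
    return means

def impute(rows, means):
    """Fill missing or absent values with the training mean of that column."""
    return [
        {col: (row[col] if row.get(col) is not None else mean)
         for col, mean in means.items()}
        for row in rows
    ]

# Made-up training rows; one value is missing.
train = [
    {"sqft": 1000.0, "bedrooms": 2.0},
    {"sqft": 2000.0, "bedrooms": None},
    {"sqft": 1500.0, "bedrooms": 4.0},
]
means = column_means(train, ["sqft", "bedrooms"])

# New rows: one with a missing value, one with the column absent entirely.
new_rows = [{"sqft": None, "bedrooms": 3.0}, {"bedrooms": 1.0}]
filled = impute(new_rows, means)
```

The second new row has no sqft key at all, which mirrors the case where an entire column is missing at evaluation time.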
Examples
>>> import turicreate
>>> data = turicreate.SFrame('https://static.turi.com/datasets/regression/houses.csv')
>>> # `model` is assumed to be an already trained boosted trees model.
>>> # Regression Tree Models
>>> data['regression_tree_features'] = model.extract_features(data)
>>> # Classification Tree Models
>>> data['classification_tree_features'] = model.extract_features(data)
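To make "leaf indices of each tree as features" concrete, here is a minimal sketch in plain Python. It does not use Turi Create and does not reflect its internals: the toy trees are hand-written nested dicts, the leaf numbering is arbitrary, and the names leaf_index and extract_leaf_features are hypothetical. Each row is routed through every tree, and the leaf it lands in becomes one entry of that row's feature vector.

```python
# Conceptual sketch only: toy trees as nested dicts, not Turi Create internals.

def leaf_index(tree, row):
    """Walk one toy tree and return the id of the leaf that `row` lands in."""
    node = tree
    while "leaf" not in node:
        # Go left when the feature value is below the split threshold.
        branch = "left" if row[node["feature"]] < node["threshold"] else "right"
        node = node[branch]
    return node["leaf"]

def extract_leaf_features(trees, rows):
    """Return one feature vector per row: the leaf index reached in each tree."""
    return [[float(leaf_index(tree, row)) for tree in trees] for row in rows]

# Two hypothetical trees over made-up house features.
toy_trees = [
    {"feature": "sqft", "threshold": 1500,
     "left": {"leaf": 0},
     "right": {"feature": "bedrooms", "threshold": 3,
               "left": {"leaf": 1}, "right": {"leaf": 2}}},
    {"feature": "bedrooms", "threshold": 2,
     "left": {"leaf": 0}, "right": {"leaf": 1}},
]

toy_rows = [
    {"sqft": 1200, "bedrooms": 2},
    {"sqft": 2000, "bedrooms": 4},
]

toy_features = extract_leaf_features(toy_trees, toy_rows)
print(toy_features)  # [[0.0, 1.0], [2.0, 1.0]]
```

Each output vector has one entry per tree. Because leaf indices are categorical rather than ordinal, a downstream linear model would typically treat them as categories (for example via one-hot encoding) rather than as numeric magnitudes.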