Turi Create
4.0
#include <toolkits/supervised_learning/standardization-inl.hpp>
Public Member Functions | |
virtual | ~standardization_interface ()=default |
standardization_interface ()=default | |
virtual void | transform (DenseVector &point) const =0 |
virtual void | inverse_transform (DenseVector &point) const =0 |
virtual void | inverse_transform (SparseVector &point) const =0 |
virtual void | transform (SparseVector &point) const =0 |
virtual void | save (turi::oarchive &oarc) const =0 |
virtual void | load (turi::iarchive &iarc)=0 |
size_t | get_total_size () const |
Protected Attributes | |
size_t | total_size |
Interface for affine transformation of data for machine learning and optimization purposes.
Feature scaling standardizes data for supervised learning methods. Because the range of values in raw data varies widely, the objective functions of some machine learning algorithms do not work properly without normalization. The range of every feature should therefore be normalized so that each feature contributes approximately equally.
The standardization interface makes sure that you can implement various types of data standardization methods without affecting much of the code base.
Each standardization scheme requires the following methods:
*) Construction based on metadata: Given a complete metadata object, we can construct the standardization object.
*) Transform: Perform a transformation from the original space to the standardized space.
*) Inverse-Transform: Perform a transformation from the standardized space to the original space.
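As a rough illustration of the three requirements above, the following sketch defines a reduced stand-in for the interface and one concrete scheme. It is not Turi Create's code: `DenseVector` is aliased to `std::vector<double>` here, the sparse overloads and `save`/`load` are left out, and `standardization_sketch` and `column_rescaling_sketch` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Stand-in for the DenseVector type (an assumption made for this sketch).
using DenseVector = std::vector<double>;

// Reduced, hypothetical version of the interface: dense overloads only.
class standardization_sketch {
 public:
  virtual ~standardization_sketch() = default;
  virtual void transform(DenseVector& point) const = 0;
  virtual void inverse_transform(DenseVector& point) const = 0;
  std::size_t get_total_size() const { return total_size; }

 protected:
  std::size_t total_size = 0;
};

// Hypothetical scheme: divide every coordinate by a fixed per-column scale.
class column_rescaling_sketch : public standardization_sketch {
 public:
  // "Construction based on metadata": the metadata is reduced here to the
  // per-column scale factors, which must all be non-zero.
  explicit column_rescaling_sketch(std::vector<double> scale)
      : scale_(std::move(scale)) {
    for (double s : scale_) assert(s != 0.0);
    total_size = scale_.size();
  }

  // Original space -> standardized space (in place).
  void transform(DenseVector& point) const override {
    assert(point.size() == total_size);
    for (std::size_t i = 0; i < total_size; ++i) point[i] /= scale_[i];
  }

  // Standardized space -> original space (in place).
  void inverse_transform(DenseVector& point) const override {
    assert(point.size() == total_size);
    for (std::size_t i = 0; i < total_size; ++i) point[i] *= scale_[i];
  }

 private:
  std::vector<double> scale_;
};
```

Applying `transform` and then `inverse_transform` to the same point should return it to its original value, up to floating-point rounding.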
Commonly used standardization schemes compare as follows:
1) Norm-Rescaling: Given a column of data x, norm re-scaling changes the column to: x' = x / ||x||
where ||x|| can be the L1, L2, or L-Inf norm.
PROS: Sparsity preserving. CONS: May not be the right thing to do for regularized problems.
2) Mean-Stdev: Given a column of data x, mean-stdev scaling changes the column to: x' = (x - mean(x)) / stdev(x)
PROS: Statistically well documented. CONS: Sparsity breaking
3) Min-Max: Given a column of data x, min-max scaling changes the column to: x' = (x - min(x)) / (max(x) - min(x))
PROS: Well documented for SVM. CONS: Sparsity breaking
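The three formulas above can be sketched on a single column as follows. This is a minimal illustration, not the library's implementation: the function names and the `Column` alias are hypothetical, the L2 norm is picked as one of the possible norms, the population standard deviation is assumed, and constant columns are mapped to zero to avoid dividing by zero.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

using Column = std::vector<double>;  // hypothetical alias for one data column

// 1) Norm re-scaling with the L2 norm: x' = x / ||x||_2.
//    Zero entries stay zero, so sparsity is preserved.
Column l2_rescale_sketch(Column x) {
  double sum_sq = 0.0;
  for (double v : x) sum_sq += v * v;
  const double norm = std::sqrt(sum_sq);
  if (norm > 0.0) {
    for (double& v : x) v /= norm;
  }
  return x;
}

// 2) Mean-stdev: x' = (x - mean) / stdev (population stdev assumed).
//    Zero entries become -mean / stdev, so sparsity is generally lost.
Column mean_stdev_sketch(Column x) {
  if (x.empty()) return x;
  double mean = 0.0;
  for (double v : x) mean += v;
  mean /= static_cast<double>(x.size());
  double var = 0.0;
  for (double v : x) var += (v - mean) * (v - mean);
  const double stdev = std::sqrt(var / static_cast<double>(x.size()));
  for (double& v : x) v = stdev > 0.0 ? (v - mean) / stdev : 0.0;
  return x;
}

// 3) Min-max: x' = (x - min(x)) / (max(x) - min(x)).
//    Zero entries move unless min(x) happens to be 0.
Column min_max_sketch(Column x) {
  if (x.empty()) return x;
  const auto mm = std::minmax_element(x.begin(), x.end());
  const double lo = *mm.first;
  const double range = *mm.second - lo;
  for (double& v : x) v = range > 0.0 ? (v - lo) / range : 0.0;
  return x;
}
```

For the column {0, 3, 4}, the first function leaves the leading zero untouched, while the mean-stdev version turns it into a non-zero value, which is the sparsity trade-off listed in the PROS and CONS.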
Definition at line 95 of file standardization-inl.hpp.
virtual ~standardization_interface ()=default | virtual, default
Default destructor.
standardization_interface ()=default | default
Default constructor.
size_t get_total_size () const | inline
Return the total size of all the variables in the space.
[out] | total_size | Size of all the variables in the space. |
Numeric : 1
Categorical : # Unique categories
Vector : Size of the vector
CategoricalVector : # Unique categories
Dictionary : # Keys
For reference encoding, subtract 1 from the Categorical and Categorical-Vector types.
Definition at line 191 of file standardization-inl.hpp.
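The size rules for get_total_size can be sketched as a small counting function. Everything here is hypothetical and only mirrors the rules listed above: `column_type`, `column_info`, and `total_size_sketch` are not Turi Create names, and `cardinality` stands for the number of unique categories, the vector length, or the number of keys, depending on the column type.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical column kinds, mirroring the list in the documentation.
enum class column_type {
  numeric,
  categorical,
  dense_vector,
  categorical_vector,
  dictionary
};

// Hypothetical per-column metadata.
struct column_info {
  column_type type;
  std::size_t cardinality;  // ignored for numeric columns
};

// Sum the number of variables each column contributes to the space.
std::size_t total_size_sketch(const std::vector<column_info>& columns,
                              bool reference_encoding) {
  std::size_t total = 0;
  for (const column_info& c : columns) {
    switch (c.type) {
      case column_type::numeric:
        total += 1;
        break;
      case column_type::categorical:
      case column_type::categorical_vector:
        // Reference encoding drops one category per categorical column.
        total += (reference_encoding && c.cardinality > 0) ? c.cardinality - 1
                                                           : c.cardinality;
        break;
      case column_type::dense_vector:
      case column_type::dictionary:
        total += c.cardinality;
        break;
    }
  }
  return total;
}
```

With one numeric column, one categorical column with 3 categories, a vector of length 4, and a dictionary with 2 keys, the sketch counts 1 + 3 + 4 + 2 variables without reference encoding and one fewer with it.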
virtual void inverse_transform (DenseVector &point) const =0 | pure virtual
Inverse transform a point from the standardized space to the original space.
[in,out] | point(DenseVector) | Point to be transformed. |
Implemented in turi::supervised::l2_rescaling.
virtual void inverse_transform (SparseVector &point) const =0 | pure virtual
Inverse transform a point from the standardized space to the original space.
[in,out] | point(SparseVector) | Point to be transformed. |
Implemented in turi::supervised::l2_rescaling.
virtual void load (turi::iarchive &iarc)=0 | pure virtual
Serialization – Load object
Load this class from a Turi iarc object.
[in] | iarc | Turi iarc object |
Implemented in turi::supervised::l2_rescaling.
virtual void save (turi::oarchive &oarc) const =0 | pure virtual
Serialization – Save object
Save this class to a Turi oarc object.
[in] | oarc | Turi oarc object |
Implemented in turi::supervised::l2_rescaling.
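The save/load pair is expected to round-trip the state of a standardization object. The sketch below shows that contract with standard C++ streams standing in for `turi::oarchive` and `turi::iarchive`, whose APIs are not reproduced here. `scale_state_sketch` is a hypothetical type holding per-column scale factors, and the text format is chosen only for readability.

```cpp
#include <cstddef>
#include <iomanip>
#include <istream>
#include <ostream>
#include <sstream>
#include <vector>

// Hypothetical state of a rescaling scheme: one scale factor per column.
struct scale_state_sketch {
  std::vector<double> scale;

  // Write the element count followed by the values. 17 significant digits
  // are enough to round-trip an IEEE 754 double through text.
  void save(std::ostream& out) const {
    out << scale.size() << std::setprecision(17);
    for (double s : scale) out << ' ' << s;
  }

  // Read back exactly what save() wrote, replacing the current state.
  void load(std::istream& in) {
    std::size_t n = 0;
    in >> n;
    scale.assign(n, 0.0);
    for (double& s : scale) in >> s;
  }
};
```

Saving into a `std::stringstream` and loading from it into a fresh object should yield the same scale factors, which is the property a concrete `save`/`load` implementation needs to keep.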
virtual void transform (DenseVector &point) const =0 | pure virtual
Transform a point from the original space to the standardized space.
[in,out] | point(DenseVector) | Point to be transformed. |
Implemented in turi::supervised::l2_rescaling.
virtual void transform (SparseVector &point) const =0 | pure virtual
Transform a point from the original space to the standardized space.
[in,out] | point(SparseVector) | Point to be transformed. |
Implemented in turi::supervised::l2_rescaling.
size_t total_size | protected
Definition at line 99 of file standardization-inl.hpp.