Turi Create
4.0
#include <core/storage/sframe_data/dataframe.hpp>
Public Member Functions | |
void | read_csv (const std::string &path, char delimiter, bool use_header) |
size_t | nrows () const |
bool | empty () const |
void | set_type (std::string key, flex_type_enum type) |
size_t | ncols () const |
Returns the number of columns in the dataframe. | |
bool | contains (std::string key) const |
bool | contains_nan (std::string key) const |
std::pair< flex_type_enum, std::vector< flexible_type > & > | operator[] (std::string key) |
std::pair< flex_type_enum, const std::vector< flexible_type > & > | operator[] (std::string key) const |
void | print () const |
void | set_column (std::string key, const std::vector< flexible_type > &val, flex_type_enum type) |
void | set_column (std::string key, std::vector< flexible_type > &&val, flex_type_enum type) |
void | remove_column (std::string key) |
void | save (oarchive &oarc) const |
Serializer. | |
void | load (iarchive &iarc) |
Deserializer. | |
void | clear () |
Clears the contents of the dataframe. | |
Public Attributes | |
std::vector< std::string > | names |
A vector storing the name of columns. | |
std::map< std::string, flex_type_enum > | types |
A map from the column name to the type of the column. | |
std::map< std::string, std::vector< flexible_type > > | values |
Type that represents a pandas-like dataframe: an in-memory, column-wise representation of a table. The dataframe_t is simply a map from column name to a column of records, where every column has the same length and all values within a column have the same type. This is also the type used for transferring data between pandas DataFrame objects and C++.
Each cell in the dataframe is represented by a flexible_type object. While this technically allows every cell to be of an arbitrary type, we do not permit that behavior: we require and assume that every cell in a column is of the same type. The one exception is empty cells (NaNs in pandas), which are of type UNDEFINED.
Definition at line 39 of file dataframe.hpp.
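The layout and invariants described above can be illustrated with a small self-contained sketch. It does not use the Turi Create headers: `cell`, `cell_type`, `mini_dataframe`, `type_of`, and `is_well_formed` are hypothetical stand-ins introduced only for illustration (a `std::variant` plays the role of flexible_type, and `std::monostate` the role of UNDEFINED). It assumes a C++17 compiler.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <variant>
#include <vector>

// Hypothetical stand-ins for flex_type_enum and flexible_type. The real
// Turi Create types cover more kinds and are not reproduced here.
enum class cell_type { INTEGER, FLOAT, STRING, UNDEFINED };
using cell = std::variant<std::monostate, long long, double, std::string>;

// Type tag of a single cell; std::monostate stands in for UNDEFINED.
inline cell_type type_of(const cell& c) {
  switch (c.index()) {
    case 1: return cell_type::INTEGER;
    case 2: return cell_type::FLOAT;
    case 3: return cell_type::STRING;
    default: return cell_type::UNDEFINED;
  }
}

// Mirrors the three public attributes documented for dataframe_t.
struct mini_dataframe {
  std::vector<std::string> names;
  std::map<std::string, cell_type> types;
  std::map<std::string, std::vector<cell>> values;
};

// Checks the two invariants from the description: every column has the
// same length, and every cell is either empty or of its column's type.
inline bool is_well_formed(const mini_dataframe& df) {
  if (df.names.size() != df.values.size()) return false;
  if (df.names.size() != df.types.size()) return false;
  std::size_t expected_rows = 0;
  bool first = true;
  for (const std::string& name : df.names) {
    auto col = df.values.find(name);
    auto ty = df.types.find(name);
    if (col == df.values.end() || ty == df.types.end()) return false;
    if (first) {
      expected_rows = col->second.size();
      first = false;
    }
    if (col->second.size() != expected_rows) return false;
    for (const cell& c : col->second) {
      const cell_type t = type_of(c);
      if (t != cell_type::UNDEFINED && t != ty->second) return false;
    }
  }
  return true;
}
```

In this sketch an empty cell is accepted in any column, while a column that is longer than the others, or that mixes two non-empty types, is rejected.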
bool turi::dataframe_t::contains | ( | std::string | key | ) | const |
inline |
Returns true if the dataframe contains a column with the given name.
Definition at line 94 of file dataframe.hpp.
bool turi::dataframe_t::contains_nan | ( | std::string | key | ) | const |
inline |
Returns true if the column contains an undefined flexible_type value.
Definition at line 101 of file dataframe.hpp.
bool turi::dataframe_t::empty | ( | ) | const |
inline |
Returns true if the dataframe is empty.
Definition at line 77 of file dataframe.hpp.
size_t turi::dataframe_t::nrows | ( | ) | const |
inline |
Returns the number of rows in the dataframe.
Definition at line 69 of file dataframe.hpp.
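std::pair<flex_type_enum, std::vector<flexible_type>&> turi::dataframe_t::operator[] | ( | std::string | key | ) |
inline |
Column index operator. Can be used to extract a column from the dataframe. Returns a pair of (type, reference to column).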
Definition at line 120 of file dataframe.hpp.
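std::pair<flex_type_enum, const std::vector<flexible_type>&> turi::dataframe_t::operator[] | ( | std::string | key | ) | const |
inline |
Const column index operator. Can be used to extract a column from the dataframe. Returns a pair of (type, const reference to column).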
Definition at line 129 of file dataframe.hpp.
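Because both index operators return a pair whose second member is a reference, copying the returned pair does not copy the column. The following sketch shows that behavior on a hypothetical stand-in type: `indexed_frame`, `cell`, and `cell_type` are not part of Turi Create, and the use of `at()` (which throws `std::out_of_range` for a missing key) is an assumption of the sketch, not documented behavior of dataframe_t.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

enum class cell_type { INTEGER, FLOAT };  // hypothetical stand-in for flex_type_enum
using cell = double;                      // hypothetical stand-in for flexible_type

struct indexed_frame {
  std::map<std::string, cell_type> types;
  std::map<std::string, std::vector<cell>> values;

  // Mutable lookup: the pair's second member refers to the stored column.
  std::pair<cell_type, std::vector<cell>&> operator[](const std::string& key) {
    return std::pair<cell_type, std::vector<cell>&>(types.at(key),
                                                    values.at(key));
  }

  // Const lookup: same shape, but the column cannot be modified through it.
  std::pair<cell_type, const std::vector<cell>&> operator[](
      const std::string& key) const {
    return std::pair<cell_type, const std::vector<cell>&>(types.at(key),
                                                          values.at(key));
  }
};

// Appends one value through the pair returned by the mutable operator[].
inline void append_value(indexed_frame& df, const std::string& key, cell v) {
  auto column = df[key];       // copies the pair, not the column
  column.second.push_back(v);  // mutates the column stored inside df
}
```

A caller that wants an independent copy of the column has to request one explicitly, for example `std::vector<cell> copy = df["x"].second;`.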
void turi::dataframe_t::print | ( | ) | const |
Prints the contents of the dataframe to std::cerr.
void turi::dataframe_t::read_csv | ( | const std::string & | path, |
char | delimiter, | ||
bool | use_header | ||
) |
Fills the dataframe with the contents of a CSV file.
void turi::dataframe_t::remove_column | ( | std::string | key | ) |
Removes the column with the given name.
void turi::dataframe_t::set_column | ( | std::string | key, |
const std::vector< flexible_type > & | val, | ||
flex_type_enum | type | ||
) |
Sets the value of a column of the dataframe.
void turi::dataframe_t::set_column | ( | std::string | key, |
std::vector< flexible_type > && | val, | ||
flex_type_enum | type | ||
) |
Sets the value of a column of the dataframe, consuming the vector value.
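The difference between the two set_column overloads is the usual copy versus move distinction. The sketch below reproduces it on a hypothetical stand-in (`column_store`, `cell`, `cell_type`, and the private helper `register_name` are illustrative names, not Turi Create code). How the real implementation handles an existing column or a length mismatch is not stated here, so the sketch simply overwrites and performs no length check.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

enum class cell_type { INTEGER, FLOAT };  // hypothetical stand-in for flex_type_enum
using cell = double;                      // hypothetical stand-in for flexible_type

struct column_store {
  std::vector<std::string> names;
  std::map<std::string, cell_type> types;
  std::map<std::string, std::vector<cell>> values;

  // Copying overload: the caller's vector is left untouched.
  void set_column(const std::string& key, const std::vector<cell>& val,
                  cell_type type) {
    register_name(key);
    types[key] = type;
    values[key] = val;
  }

  // Consuming overload: the vector's buffer is moved into the store.
  void set_column(const std::string& key, std::vector<cell>&& val,
                  cell_type type) {
    register_name(key);
    types[key] = type;
    values[key] = std::move(val);
  }

 private:
  // Records the column name once, preserving insertion order.
  void register_name(const std::string& key) {
    if (std::find(names.begin(), names.end(), key) == names.end()) {
      names.push_back(key);
    }
  }
};
```

Passing an lvalue selects the copying overload; wrapping the argument in `std::move` (or passing a temporary) selects the consuming one, which avoids duplicating a large column.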
void turi::dataframe_t::set_type | ( | std::string | key, |
flex_type_enum | type | ||
) |
Converts the values in the column to the specified type. Throws an exception if the column is not found or if the conversion cannot be made.
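A column-wide conversion with the two failure modes named above might look like the following sketch. Everything in it is an assumption made for illustration: `typed_frame`, `cell`, and `cell_type` are hypothetical stand-ins, only integer and string cells are supported, and the exception types (`std::out_of_range`, `std::invalid_argument`) are choices of the sketch, since the exception type thrown by dataframe_t is not specified here. It assumes a C++17 compiler.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <stdexcept>
#include <string>
#include <utility>
#include <variant>
#include <vector>

enum class cell_type { INTEGER, STRING };  // hypothetical stand-in for flex_type_enum
// Hypothetical stand-in for flexible_type; std::monostate is an empty cell.
using cell = std::variant<std::monostate, long long, std::string>;

struct typed_frame {
  std::map<std::string, cell_type> types;
  std::map<std::string, std::vector<cell>> values;

  // Converts every cell of the column, or throws and leaves it unchanged.
  void set_type(const std::string& key, cell_type target) {
    auto col = values.find(key);
    if (col == values.end()) throw std::out_of_range("no such column: " + key);
    std::vector<cell> converted;
    converted.reserve(col->second.size());
    for (const cell& c : col->second) converted.push_back(convert(c, target));
    col->second = std::move(converted);  // commit only after full success
    types[key] = target;
  }

 private:
  static cell convert(const cell& c, cell_type target) {
    if (std::holds_alternative<std::monostate>(c)) return c;  // stays empty
    if (target == cell_type::INTEGER) {
      if (std::holds_alternative<long long>(c)) return c;
      const std::string& s = std::get<std::string>(c);
      std::size_t used = 0;
      const long long v = std::stoll(s, &used);  // throws on non-numeric input
      if (used != s.size()) throw std::invalid_argument("not an integer: " + s);
      return cell{v};
    }
    // target == cell_type::STRING
    if (std::holds_alternative<std::string>(c)) return c;
    return cell{std::to_string(std::get<long long>(c))};
  }
};
```

Building the converted column in a temporary vector and swapping it in at the end means a failed conversion leaves the original column intact, which is one reasonable way to make the operation all-or-nothing.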
std::map<std::string, std::vector<flexible_type> > turi::dataframe_t::values |
A map from the column name to the values of the column. Every column must have the same length, and all values within a column must be of the same type. The UNDEFINED type is an exception to the rule and may be used anywhere to designate an empty entry.
Definition at line 51 of file dataframe.hpp.