Turi Create
4.0
Modules
Technical Details: Serialization
Classes
class | turi::dir_archive
class | turi::iarchive
The serialization input archive object which, provided with a reference to an istream, will read from the istream, providing deserialization capabilities.
class | turi::iarchive_soft_fail
When this archive is used to deserialize an object, and the object does not support serialization, failure will only occur at runtime. Otherwise equivalent to turi::iarchive.
struct | turi::IS_POD_TYPE
Inheriting from this type will force the serializer to treat the derived type as a POD type.
struct | turi::gl_is_pod< T >
Tests if T is a POD type.
class | turi::oarchive
The serialization output archive object which, provided with a reference to an ostream, will write to the ostream, providing serialization capabilities.
class | turi::oarchive_soft_fail
When this archive is used to serialize an object, and the object does not support serialization, failure will only occur at runtime. Otherwise equivalent to turi::oarchive.
struct | turi::unsupported_serialize
Inheriting from this class will prevent the serialization of the derived class. Used for debugging purposes.
Macros
#define | BEGIN_OUT_OF_PLACE_LOAD(arc, tname, tval)
Macro to make it easy to define out-of-place loads.
#define | BEGIN_OUT_OF_PLACE_SAVE(arc, tname, tval)
Macro to make it easy to define out-of-place saves.
#define | TURI_UNSERIALIZABLE(tname)
A macro which disables the serialization of a type so that it will fault at runtime.
Functions
template<typename OutArcType, typename RandomAccessIterator>
void | turi::serialize_iterator (OutArcType &oarc, RandomAccessIterator begin, RandomAccessIterator end)
Serializes the contents between the iterators begin and end.
template<typename OutArcType, typename InputIterator>
void | turi::serialize_iterator (OutArcType &oarc, InputIterator begin, InputIterator end, size_t vsize)
Serializes the contents between the iterators begin and end.
template<typename InArcType, typename T, typename OutputIterator>
void | turi::deserialize_iterator (InArcType &iarc, OutputIterator result)
The accompanying function to serialize_iterator(). Reads elements from the stream and writes them to the output iterator.
template<typename T>
std::string | turi::serialize_to_string (const T &t)
Serializes an object to a string.
template<typename T>
void | turi::deserialize_from_string (const std::string &s, T &t)
Deserializes an object from a string.
We have a custom serialization scheme which is designed for performance rather than compatibility. It does not perform type checking, it does not perform pointer tracking, and it has only limited support across platforms. It has been tested, and should be compatible across x86 platforms.
For a summary of all serialization functionality, see Serialization. For more technical details, see Technical Details: Serialization.
There are two serialization classes: turi::oarchive and turi::iarchive. The former does output, while the latter does input. To include all serialization headers, #include <turicreate/serialization/serialization_includes.hpp>.
To serialize data to disk, you just create an output archive and associate it with an output stream.
For instance, to serialize to a file called "file.bin":
The << stream operators are then used to write data into the archive.
To read back, you use the iarchive with an input stream, and read back the variables in the same order:
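The write-then-read pattern described here can be sketched without the Turi Create headers. The `sketch::oarchive` and `sketch::iarchive` classes below are hypothetical stand-ins, not the real turi classes: they only copy the raw bytes of arithmetic values to and from a standard stream, and a `std::stringstream` takes the place of "file.bin" so the example has no file system dependency. The point is the usage shape: wrap a stream, write with `<<`, and read back with `>>` in the same order.

```cpp
#include <cassert>
#include <istream>
#include <ostream>
#include <sstream>
#include <type_traits>

// Hypothetical stand-ins for turi::oarchive / turi::iarchive. They copy the
// raw bytes of arithmetic values to and from a standard stream.
namespace sketch {
class oarchive {
 public:
  explicit oarchive(std::ostream& out) : out_(out) {}
  template <typename T>
  oarchive& operator<<(const T& value) {
    static_assert(std::is_arithmetic<T>::value, "sketch handles basic types only");
    out_.write(reinterpret_cast<const char*>(&value), sizeof(T));
    return *this;
  }

 private:
  std::ostream& out_;
};

class iarchive {
 public:
  explicit iarchive(std::istream& in) : in_(in) {}
  template <typename T>
  iarchive& operator>>(T& value) {
    static_assert(std::is_arithmetic<T>::value, "sketch handles basic types only");
    in_.read(reinterpret_cast<char*>(&value), sizeof(T));
    return *this;
  }

 private:
  std::istream& in_;
};
}  // namespace sketch

// Writes three values, then reads them back in the same order.
// A std::ofstream / std::ifstream opened on "file.bin" in binary mode could
// replace the std::stringstream used here.
bool round_trip_in_order() {
  std::stringstream file(std::ios::in | std::ios::out | std::ios::binary);
  const int a = 42;
  const double b = 2.5;
  const char c = 'x';
  sketch::oarchive oarc(file);
  oarc << a << b << c;

  int a2 = 0;
  double b2 = 0.0;
  char c2 = '\0';
  sketch::iarchive iarc(file);
  iarc >> a2 >> b2 >> c2;  // same order as the writes
  return a2 == a && b2 == b && c2 == c;
}
```

Reading in a different order would not raise an error in this sketch. The bytes would simply be reinterpreted, which is consistent with the absence of type checking noted above.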
So what type of data is serializable?
All integer datatypes are serializable.
bool
char and unsigned char
short and unsigned short
int and unsigned int
long and unsigned long
long long and unsigned long long
Since all fixed width integer types from stdint (int16_t, int32_t, etc) are typedefs of these basic types, all fixed width integer types are also serializable.
int16_t and uint16_t
int32_t and uint32_t
int64_t and uint64_t
All integer types are saved in their raw binary form without any additional re-encoding. It is therefore important to deserialize with the same integer width as what was serialized.
The following code will fail in dramatic ways:
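As an illustration of the width problem, the sketch below uses a plain `std::stringstream` and raw byte copies rather than the turi archives (an assumption made so that the example is self-contained). It writes a 64-bit integer followed by a 32-bit integer, then reads two 32-bit integers. The function name is hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>

// Writes a 64-bit integer followed by a 32-bit integer as raw bytes, then
// (incorrectly) reads two 32-bit integers back. Returns the second value read.
std::int32_t read_with_wrong_width() {
  std::stringstream stream(std::ios::in | std::ios::out | std::ios::binary);
  const std::int64_t a = -1;  // eight bytes, every bit set
  const std::int32_t b = 7;   // four bytes
  stream.write(reinterpret_cast<const char*>(&a), sizeof(a));
  stream.write(reinterpret_cast<const char*>(&b), sizeof(b));

  std::int32_t x = 0;
  std::int32_t y = 0;
  stream.read(reinterpret_cast<char*>(&x), sizeof(x));  // first half of a
  stream.read(reinterpret_cast<char*>(&y), sizeof(y));  // second half of a, not b
  return y;
}
```

No error is reported: the second read silently returns half of `a` instead of `b`, and every later read is shifted by four bytes.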
All floating point data types are serializable.
float
double
long double (if your compiler supports quad precision)
Similar to integer types, all floating point types are saved in raw binary form without re-encoding. You must deserialize with the same floating point width as what was serialized (i.e. if you serialize a double, you must deserialize a double).
The following template containers are serializable as long as the contained types are all serializable. This can be recursively applied.
std::vector
std::list
std::set
std::map
boost::unordered_set
boost::unordered_map
For instance, a std::vector<int> is serializable. A std::list<std::vector<int> > is therefore also serializable.
There is special handling for std::vector<T> for performance in the event that T is a simple POD (Plain Old Data) data type. POD types are data types which occupy a contiguous region in memory, for instance basic types (double, int, etc), or structs which contain only basic types. Such types can be copied or replicated using a simple mem-copy operation, so their serialization / deserialization can be greatly accelerated. All basic data types are automatically POD types. We will discuss structs and other user types in the next section.
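The idea behind the POD fast path can be sketched as follows. This is not the Turi Create implementation: the helper names and the on-stream layout (a 64-bit element count followed by the raw element bytes) are assumptions chosen for illustration. What it shows is that a vector of trivially copyable elements can be written and read with one bulk copy instead of one call per element.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <type_traits>
#include <vector>

// Hypothetical helper: writes the element count, then all elements in one
// bulk write. Only valid for trivially copyable element types.
template <typename T>
void save_pod_vector(std::ostream& out, const std::vector<T>& values) {
  static_assert(std::is_trivially_copyable<T>::value,
                "bulk byte copies are only safe for trivially copyable types");
  const std::uint64_t count = values.size();
  out.write(reinterpret_cast<const char*>(&count), sizeof(count));
  if (count > 0) {
    out.write(reinterpret_cast<const char*>(values.data()),
              static_cast<std::streamsize>(count * sizeof(T)));
  }
}

// Hypothetical helper: reads the count, sizes the vector once, then fills it
// with a single bulk read.
template <typename T>
std::vector<T> load_pod_vector(std::istream& in) {
  static_assert(std::is_trivially_copyable<T>::value,
                "bulk byte copies are only safe for trivially copyable types");
  std::uint64_t count = 0;
  in.read(reinterpret_cast<char*>(&count), sizeof(count));
  std::vector<T> values(static_cast<std::size_t>(count));
  if (count > 0) {
    in.read(reinterpret_cast<char*>(values.data()),
            static_cast<std::streamsize>(count * sizeof(T)));
  }
  return values;
}

// Saves and reloads a vector of doubles through an in-memory stream.
bool pod_vector_round_trip() {
  std::stringstream stream(std::ios::in | std::ios::out | std::ios::binary);
  const std::vector<double> original = {1.5, -2.0, 3.25};
  save_pod_vector(stream, original);
  return load_pod_vector<double>(stream) == original;
}
```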
To serialize a struct/class, all you need to do is to define a public load/save function. For instance:
The save() and load() function prototypes must match exactly. Other conditions are that the class must be Default Constructible:
And that the class must be Assignable:
After which, TestClass becomes serializable, and can be stored and read from an archive:
Since TestClass is now serializable, containers of TestClass listed in Containers are also serializable.
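A self-contained sketch of the member save/load pattern follows. The `sketch::oarchive` and `sketch::iarchive` classes are hypothetical stand-ins for the turi archives, and the members of `TestClass` are invented for the example. The prototypes assumed here, `void save(oarchive&) const` and `void load(iarchive&)`, follow the shape the text describes; the exact signatures required by Turi Create should be taken from its headers.

```cpp
#include <cassert>
#include <istream>
#include <ostream>
#include <sstream>
#include <type_traits>

// Hypothetical stand-ins for turi::oarchive / turi::iarchive.
namespace sketch {
class oarchive {
 public:
  explicit oarchive(std::ostream& out) : out_(out) {}

  // Arithmetic values are written as raw bytes.
  template <typename T>
  typename std::enable_if<std::is_arithmetic<T>::value, oarchive&>::type
  operator<<(const T& value) {
    out_.write(reinterpret_cast<const char*>(&value), sizeof(T));
    return *this;
  }

  // Class types are expected to provide a public member save().
  template <typename T>
  typename std::enable_if<std::is_class<T>::value, oarchive&>::type
  operator<<(const T& value) {
    value.save(*this);
    return *this;
  }

 private:
  std::ostream& out_;
};

class iarchive {
 public:
  explicit iarchive(std::istream& in) : in_(in) {}

  // Arithmetic values are read back as raw bytes.
  template <typename T>
  typename std::enable_if<std::is_arithmetic<T>::value, iarchive&>::type
  operator>>(T& value) {
    in_.read(reinterpret_cast<char*>(&value), sizeof(T));
    return *this;
  }

  // Class types are expected to provide a public member load().
  template <typename T>
  typename std::enable_if<std::is_class<T>::value, iarchive&>::type
  operator>>(T& value) {
    value.load(*this);
    return *this;
  }

 private:
  std::istream& in_;
};
}  // namespace sketch

// Illustrative user type; the members are invented for this example.
class TestClass {
 public:
  int i = 0;
  int j = 0;
  double weight = 0.0;

  void save(sketch::oarchive& oarc) const { oarc << i << j << weight; }
  void load(sketch::iarchive& iarc) { iarc >> i >> j >> weight; }
};

// The two extra requirements named in the text, checked at compile time.
static_assert(std::is_default_constructible<TestClass>::value,
              "TestClass must be Default Constructible");
static_assert(std::is_copy_assignable<TestClass>::value,
              "TestClass must be Assignable");

// Stores a TestClass in an archive and reads it back.
bool test_class_round_trip() {
  TestClass original;
  original.i = 3;
  original.j = -4;
  original.weight = 0.5;

  std::stringstream stream(std::ios::in | std::ios::out | std::ios::binary);
  sketch::oarchive oarc(stream);
  oarc << original;

  TestClass restored;
  sketch::iarchive iarc(stream);
  iarc >> restored;
  return restored.i == 3 && restored.j == -4 && restored.weight == 0.5;
}
```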
As mentioned in Containers, POD data types occupy a contiguous region in memory and hence can be serialized and deserialized very quickly. Ideally, determination of whether a data type is POD or not should be handled by the compiler. However, this capability is only available in C++11 and not all compilers support it yet. We therefore implemented a simple workaround which will allow you to identify to the serializer that a class is POD, and avoid writing a save/load function.
We consider the following Coordinate struct.
This struct can be marked as a POD type, and so use the accelerated serializer, by simply inheriting from turi::IS_POD_TYPE.
Now, Coordinate variables, or even vector<Coordinate> variables will serialize/deserialize faster. Also, you avoid writing a save() and load() function.
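How a marker base class can drive this decision is sketched below. The names `pod_marker`, `treat_as_pod`, `NotMarked` and `copy_bytes` are hypothetical, and the trait is only a guess at the general mechanism, not the definition of turi::IS_POD_TYPE or turi::gl_is_pod. With the real library, `Coordinate` would inherit from turi::IS_POD_TYPE instead.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>

// Hypothetical marker playing the role of turi::IS_POD_TYPE.
struct pod_marker {};

// Hypothetical trait: a type is treated as POD if it is arithmetic or if it
// inherits from the marker.
template <typename T>
struct treat_as_pod
    : std::integral_constant<bool, std::is_arithmetic<T>::value ||
                                       std::is_base_of<pod_marker, T>::value> {};

// Opting in requires no save() or load() member, only the base class.
struct Coordinate : pod_marker {
  int x, y, z;
};

// A struct that does not opt in is not picked up by the trait.
struct NotMarked {
  int x;
};

// Copies a marked type byte-for-byte, as a POD fast path would.
Coordinate copy_bytes(const Coordinate& source) {
  static_assert(treat_as_pod<Coordinate>::value, "Coordinate is marked as POD");
  unsigned char buffer[sizeof(Coordinate)];
  std::memcpy(buffer, &source, sizeof(Coordinate));
  Coordinate result;
  std::memcpy(&result, buffer, sizeof(Coordinate));
  return result;
}
```

The marker is a promise made by the author of the type. Inheriting from it on a struct that holds pointers or owning containers would make the byte copy incorrect.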
In some situations, you may find that you need to make a data type serializable, but the data type is implemented by someone else, in a different library, making it impossible to extend and write a member save() and load() function as described in User Structs and Classes.
In this situation, it is necessary to implement an "Out of place" serializer. This is unfortunately somewhat more complicated.
For instance, suppose there is an external type called Matrix, implemented by some other library, which I would like to make serializable. The serializer code has to be written in the global namespace.
To facilitate reading and writing of data from the archives, the output oarchive object provides a turi::oarchive::write() function which directly writes a sequence of bytes to the stream. Similarly, the input iarchive object provides a turi::iarchive::read() function which directly reads a sequence of bytes from the stream.
For instance, if the Matrix type example above is defined in the following way:
An "out of place" serializer could be implemented the following way:
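The shape of an out-of-place serializer can be sketched with free functions over standard streams. Everything here is an assumption made for illustration: the layout of `Matrix`, the function names, and the use of `std::ostream::write` / `std::istream::read` in place of the archive write() and read() calls. In Turi Create itself the bodies would be wrapped by the BEGIN_OUT_OF_PLACE_SAVE and BEGIN_OUT_OF_PLACE_LOAD macros rather than written as named functions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <vector>

// Hypothetical external type; imagine it lives in a library we cannot edit.
struct Matrix {
  std::int32_t width = 0;
  std::int32_t height = 0;
  std::vector<double> data;  // width * height values
};

// Out-of-place save: a free function, since Matrix has no save() member.
void save_matrix(std::ostream& out, const Matrix& mat) {
  out.write(reinterpret_cast<const char*>(&mat.width), sizeof(mat.width));
  out.write(reinterpret_cast<const char*>(&mat.height), sizeof(mat.height));
  // The element buffer is written directly as a sequence of bytes.
  out.write(reinterpret_cast<const char*>(mat.data.data()),
            static_cast<std::streamsize>(mat.data.size() * sizeof(double)));
}

// Out-of-place load: reads the dimensions first so the buffer can be sized
// before the raw bytes are read into it.
void load_matrix(std::istream& in, Matrix& mat) {
  in.read(reinterpret_cast<char*>(&mat.width), sizeof(mat.width));
  in.read(reinterpret_cast<char*>(&mat.height), sizeof(mat.height));
  mat.data.resize(static_cast<std::size_t>(mat.width) *
                  static_cast<std::size_t>(mat.height));
  in.read(reinterpret_cast<char*>(mat.data.data()),
          static_cast<std::streamsize>(mat.data.size() * sizeof(double)));
}

// Saves a 2 x 3 matrix and loads it into a fresh object.
bool matrix_round_trip() {
  Matrix original;
  original.width = 2;
  original.height = 3;
  original.data = {1.0, 2.0, 3.0, 4.0, 5.0, 6.0};

  std::stringstream stream(std::ios::in | std::ios::out | std::ios::binary);
  save_matrix(stream, original);

  Matrix restored;
  load_matrix(stream, restored);
  return restored.width == 2 && restored.height == 3 &&
         restored.data == original.data;
}
```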
#define BEGIN_OUT_OF_PLACE_LOAD(arc, tname, tval)
Macro to make it easy to define out-of-place loads.
In the event that it is impractical to implement a save() and load() function in the class one wants to serialize, it is necessary to define an "out of place" save and load.
See Out of Place Serialization for an example.
Definition at line 314 of file iarchive.hpp.
#define BEGIN_OUT_OF_PLACE_SAVE(arc, tname, tval)
Macro to make it easy to define out-of-place saves.
In the event that it is impractical to implement a save() and load() function in the class one wants to serialize, it is necessary to define an "out of place" save and load.
See Out of Place Serialization for an example.
Definition at line 346 of file oarchive.hpp.
#define TURI_UNSERIALIZABLE(tname)
A macro which disables the serialization of a type so that it will fault at runtime.
Writing TURI_UNSERIALIZABLE(T) for some typename T in the global namespace will result in an assertion failure if any attempt is made to serialize or deserialize the type T. This is largely used for debugging purposes to enforce that certain types are never serialized.
Definition at line 46 of file unsupported_serialize.hpp.
inline void turi::deserialize_from_string(const std::string &s, T &t)
Deserializes an object from a string.
Deserializes a serializable object t from a string using the deserializer.
T | the type of object to deserialize. Typically will be inferred by the compiler. |
s | The string to deserialize |
t | A reference to the object which will contain the deserialized object when the function returns |
Definition at line 54 of file serialize_to_from_string.hpp.
void turi::deserialize_iterator(InArcType &iarc, OutputIterator result)
The accompanying function to serialize_iterator(). Reads elements from the stream and writes them to the output iterator.
Note that this requires an additional template parameter T, which is the "type of object to deserialize". This is necessary, for instance, for the map type. The map<T,U>::value_type is pair<const T,U>, which is not useful since I cannot assign to it. In this case, T=pair<T,U>.
InArcType | The input archive type. |
T | The type of values to deserialize |
OutputIterator | The type of the output iterator to be written to. This should not need to be specified. The compiler will typically infer this correctly. |
iarc | A reference to the input archive |
result | The output iterator to write to |
Definition at line 100 of file iterator.hpp.
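The reason for the explicit T parameter of deserialize_iterator() can be illustrated with a small stand-alone sketch. The helper `read_pairs` and the stream layout (a 64-bit count, then each key and value as raw bytes) are assumptions and do not reproduce the Turi Create implementation. The temporary that each element is read into must be writable, so for a map it is declared as `pair<K, V>` and only converted to the map's `pair<const K, V>` when it is handed to the insert iterator.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <istream>
#include <iterator>
#include <map>
#include <ostream>
#include <sstream>
#include <utility>

// Hypothetical helper: reads a count, then that many pairs, passing each to
// the output iterator. T is the temporary that is read into, so its members
// must be writable: pair<K, V> works, pair<const K, V> does not.
template <typename T, typename OutputIterator>
void read_pairs(std::istream& in, OutputIterator result) {
  std::uint64_t count = 0;
  in.read(reinterpret_cast<char*>(&count), sizeof(count));
  for (std::uint64_t n = 0; n < count; ++n) {
    T value;
    in.read(reinterpret_cast<char*>(&value.first), sizeof(value.first));
    in.read(reinterpret_cast<char*>(&value.second), sizeof(value.second));
    *result = value;  // the insert iterator converts to the map's value_type
    ++result;
  }
}

// Writes a small map by hand, then restores it through read_pairs.
std::map<int, double> map_round_trip() {
  std::stringstream stream(std::ios::in | std::ios::out | std::ios::binary);
  const std::map<int, double> original = {{1, 0.5}, {2, 1.5}};

  const std::uint64_t count = original.size();
  stream.write(reinterpret_cast<const char*>(&count), sizeof(count));
  for (const auto& entry : original) {
    stream.write(reinterpret_cast<const char*>(&entry.first), sizeof(entry.first));
    stream.write(reinterpret_cast<const char*>(&entry.second), sizeof(entry.second));
  }

  std::map<int, double> restored;
  read_pairs<std::pair<int, double> >(stream,
                                      std::inserter(restored, restored.end()));
  return restored;
}
```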
void turi::serialize_iterator(OutArcType &oarc, RandomAccessIterator begin, RandomAccessIterator end)
Serializes the contents between the iterators begin and end.
This function prefers random access iterators since it needs a distance between the begin and end iterator. This function as implemented will work for other input iterators but is extremely inefficient.
OutArcType | The output archive type. This should not need to be specified. The compiler will typically infer this correctly. |
RandomAccessIterator | The iterator type. This should not need to be specified. The compiler will typically infer this correctly. |
oarc | A reference to the output archive to write to. |
begin | The start of the iterator range to write. |
end | The end of the iterator range to write. |
Definition at line 36 of file iterator.hpp.
void turi::serialize_iterator(OutArcType &oarc, InputIterator begin, InputIterator end, size_t vsize)
Serializes the contents between the iterators begin and end.
This function takes all iterator types, but takes a "count" for efficiency. This count is checked, and the function will report a failure if the number of elements serialized does not match the count.
OutArcType | The output archive type. This should not need to be specified. The compiler will typically infer this correctly. |
InputIterator | The iterator type. This should not need to be specified. The compiler will typically infer this correctly. |
oarc | A reference to the output archive to write to. |
begin | The start of the iterator range to write. |
end | The end of the iterator range to write. |
vsize | The distance between the iterators begin and end. Must match std::distance(begin, end). |
Definition at line 67 of file iterator.hpp.
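The counted variant of serialize_iterator() can be sketched as below. The helper `write_counted`, its boolean return value, and the stream layout are assumptions for illustration; how Turi Create actually reports a mismatch is not specified here. The sketch shows why a caller-supplied count helps: the length can be written up front without walking a non-random-access range twice, and the walk itself verifies the claim.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <list>
#include <ostream>
#include <sstream>

// Hypothetical helper: writes the caller-supplied count first, then each
// element, and reports whether the count matched the elements actually seen.
template <typename InputIterator>
bool write_counted(std::ostream& out, InputIterator begin, InputIterator end,
                   std::size_t vsize) {
  const std::uint64_t count = vsize;
  out.write(reinterpret_cast<const char*>(&count), sizeof(count));
  std::size_t written = 0;
  for (; begin != end; ++begin) {
    const auto value = *begin;
    out.write(reinterpret_cast<const char*>(&value), sizeof(value));
    ++written;
  }
  return written == vsize;
}

// Serializes a three-element std::list (not random access) with a claimed size.
bool counted_write_matches(std::size_t claimed) {
  std::ostringstream out;
  const std::list<int> values = {4, 5, 6};
  return write_counted(out, values.begin(), values.end(), claimed);
}
```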
inline std::string turi::serialize_to_string(const T &t)
Serializes an object to a string.
Converts a serializable object t to a string using the serializer.
T | the type of object to serialize. Typically will be inferred by the compiler. |
t | The object to serialize |
Definition at line 28 of file serialize_to_from_string.hpp.