Turi Create  4.0
turi::v2_block_impl Namespace Reference

Namespaces

 DOUBLE_RESERVED_FLAGS
 
 VECTOR_RESERVED_FLAGS
 

Classes

struct  block_info
 
class  block_manager
 
class  block_writer
 
struct  decode_double_stream
 
struct  decode_double_stream_legacy
 
struct  decode_ndvector_stream
 
struct  decode_number_stream
 
struct  decode_string_stream
 
struct  decode_vector_stream
 
class  encoded_block
 
class  encoded_block_range
 
struct  typed_decode_stream
 

Typedefs

typedef std::tuple< size_t, size_t > column_address
 
typedef std::tuple< size_t, size_t, size_t > block_address
 

Enumerations

enum  BLOCK_FLAGS
 

Functions

void encode_number (block_info &info, oarchive &oarc, const std::vector< flexible_type > &data)
 
void decode_number (iarchive &iarc, std::vector< flexible_type > &ret, size_t num_undefined)
 
void encode_double (block_info &info, oarchive &oarc, const std::vector< flexible_type > &data)
 
void decode_double (iarchive &iarc, std::vector< flexible_type > &ret, size_t num_undefined)
 
void decode_double_legacy (iarchive &iarc, std::vector< flexible_type > &ret, size_t num_undefined)
 
bool typed_decode (const block_info &info, char *start, size_t len, std::vector< flexible_type > &ret)
 
void typed_encode (const std::vector< flexible_type > &data, block_info &info, oarchive &oarc)
 

Detailed Description

SFrame v2 Format Implementation Detail

Typedef Documentation

◆ block_address

typedef std::tuple<size_t, size_t, size_t> turi::v2_block_impl::block_address

A block address is a tuple of (segment_id, column number, block number within the segment).

Definition at line 65 of file sarray_v2_block_types.hpp.

◆ column_address

typedef std::tuple<size_t, size_t> turi::v2_block_impl::column_address

A column address is a tuple of (segment_id, column number within the segment).

Definition at line 59 of file sarray_v2_block_types.hpp.
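
As a minimal sketch of how the two address typedefs relate, the snippet below redeclares them locally and converts between them. The helpers make_block_address() and column_of() are hypothetical names introduced for illustration only; they are not part of Turi Create.

```cpp
#include <cassert>
#include <cstddef>
#include <tuple>

// Local copies of the two typedefs so the sketch compiles on its own.
typedef std::tuple<std::size_t, std::size_t> column_address;
typedef std::tuple<std::size_t, std::size_t, std::size_t> block_address;

// Hypothetical helper: extends a column address with a block number.
inline block_address make_block_address(const column_address& col,
                                        std::size_t block_number) {
  return block_address(std::get<0>(col), std::get<1>(col), block_number);
}

// Hypothetical helper: drops the block number to recover the column address.
inline column_address column_of(const block_address& blk) {
  return column_address(std::get<0>(blk), std::get<1>(blk));
}

// Usage: segment 2, column 5, block 7 within that column.
inline void example_addresses() {
  column_address col(2, 5);
  block_address blk = make_block_address(col, 7);
  assert(std::get<2>(blk) == 7);
  assert(column_of(blk) == col);
}
```

Because both types are plain std::tuple instances, they compare and order lexicographically with no extra code.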

Enumeration Type Documentation

◆ BLOCK_FLAGS

Types of blocks

Definition at line 30 of file sarray_v2_block_types.hpp.

Function Documentation

◆ decode_double()

void turi::v2_block_impl::decode_double ( iarchive &  iarc,
std::vector< flexible_type > &  ret,
size_t  num_undefined 
)

Decodes a collection of doubles into 'ret'. Entries in 'ret' which are of type flex_type_enum::UNDEFINED will be skipped, and there must be exactly num_undefined of them.

This is the 2nd generation floating point encoder. Its use is flagged by turning on the block flag BLOCK_ENCODING_EXTENSION. The format is:

  • one byte: encoding format, either LEGACY or INTEGER.
    • If LEGACY, the old encoder is used.
    • If INTEGER, the floating point values are encoded as integers.
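
The leading format byte can be pictured with the following sketch. The numeric tag values and the names double_format and read_double_format() are assumptions made for illustration; the documentation names the formats LEGACY and INTEGER but does not give their encodings, and the payload decoding is left out.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical tag values: the real numeric encodings are not documented here.
enum class double_format : std::uint8_t { LEGACY = 0, INTEGER = 1 };

// Reads the first byte of a block and reports which decoder path applies.
// Decoding the payload that follows is out of scope for this sketch.
inline double_format read_double_format(const std::vector<std::uint8_t>& block) {
  if (block.empty()) throw std::runtime_error("empty block: no format byte");
  switch (block[0]) {
    case static_cast<std::uint8_t>(double_format::LEGACY):
      return double_format::LEGACY;   // the old decoder would handle the rest
    case static_cast<std::uint8_t>(double_format::INTEGER):
      return double_format::INTEGER;  // values were stored as integers
    default:
      throw std::runtime_error("unknown double encoding format");
  }
}

// Usage: only the first byte is inspected.
inline void example_format_byte() {
  std::vector<std::uint8_t> block;
  block.push_back(1);
  block.push_back(42);
  assert(read_double_format(block) == double_format::INTEGER);
}
```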

◆ decode_double_legacy()

void turi::v2_block_impl::decode_double_legacy ( iarchive &  iarc,
std::vector< flexible_type > &  ret,
size_t  num_undefined 
)

Decodes a collection of doubles into 'ret'. Entries in 'ret' which are of type flex_type_enum::UNDEFINED will be skipped, and there must be exactly num_undefined of them. It simply decodes a block using frame_of_reference_decode_128() and fills in 'ret' with it.

Note
This explicit implementation is equivalent to decode_number_stream. It exists for performance reasons, since this is a very commonly encountered function.

◆ decode_number()

void turi::v2_block_impl::decode_number ( iarchive &  iarc,
std::vector< flexible_type > &  ret,
size_t  num_undefined 
)

Decodes a collection of numbers into 'ret'. Entries in 'ret' which are of type flex_type_enum::UNDEFINED will be skipped, and there must be exactly num_undefined of them. It simply decodes a block using frame_of_reference_decode_128() and fills in 'ret' with it.

Note
This explicit implementation is equivalent to decode_number_stream. It exists for performance reasons, since this is a very commonly encountered function.
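
The "skip UNDEFINED entries" contract can be sketched as follows. The slot struct is a stand-in for flexible_type and fill_defined() is a hypothetical helper; the values are assumed to have been decoded already, so nothing here reproduces frame_of_reference_decode_128().

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Stand-in for flexible_type: either UNDEFINED or an integer value.
struct slot {
  bool undefined;
  std::int64_t value;
};

// Copies already-decoded numbers into the defined slots of `ret`, leaving
// UNDEFINED slots untouched. Throws if the counts are inconsistent.
inline void fill_defined(const std::vector<std::int64_t>& decoded,
                         std::vector<slot>& ret,
                         std::size_t num_undefined) {
  std::size_t seen_undefined = 0;
  for (std::size_t i = 0; i < ret.size(); ++i) {
    if (ret[i].undefined) ++seen_undefined;
  }
  if (seen_undefined != num_undefined) {
    throw std::runtime_error("num_undefined does not match the output slots");
  }
  if (decoded.size() != ret.size() - seen_undefined) {
    throw std::runtime_error("decoded value count does not match defined slots");
  }
  std::size_t next = 0;
  for (std::size_t i = 0; i < ret.size(); ++i) {
    if (!ret[i].undefined) ret[i].value = decoded[next++];
  }
}

// Usage: the middle slot is UNDEFINED, so two decoded values fill slots 0 and 2.
inline void example_fill_defined() {
  std::vector<slot> ret = {{false, 0}, {true, 0}, {false, 0}};
  std::vector<std::int64_t> decoded = {10, 20};
  fill_defined(decoded, ret, 1);
  assert(ret[0].value == 10 && ret[2].value == 20);
}
```

Validating both counts before writing keeps the sketch from reading past the end of the decoded values when the caller passes a wrong num_undefined.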

◆ encode_double()

void turi::v2_block_impl::encode_double ( block_info &  info,
oarchive &  oarc,
const std::vector< flexible_type > &  data 
)

Encodes a collection of doubles in data, skipping all UNDEFINED values. It simply loops through the data, collecting a block of up to MAX_INTEGERS_PER_BLOCK numbers and calls frame_of_reference_encode_128() on it.

This is the 2nd generation floating point encoder. Its use is flagged by turning on the block flag BLOCK_ENCODING_EXTENSION.

Note
The coding does not store the number of values stored. The decoder decode_number() requires the number of values to decode correctly.

◆ encode_number()

void turi::v2_block_impl::encode_number ( block_info &  info,
oarchive &  oarc,
const std::vector< flexible_type > &  data 
)

Encodes a collection of numbers in data, skipping all UNDEFINED values. It simply loops through the data, collecting a block of up to MAX_INTEGERS_PER_BLOCK numbers and calls frame_of_reference_encode_128() on it.

Note
The coding does not store the number of values stored. The decoder decode_number() requires the number of values to decode correctly.
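
The chunking loop described above can be sketched roughly as below. This is a simplified illustration, not the Turi Create implementation: the value of MAX_INTEGERS_PER_BLOCK is assumed, slot stands in for flexible_type, and encode_chunk() is a textbook frame-of-reference step (a base plus offsets) rather than frame_of_reference_encode_128(), whose bit-level format is not described here.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumed value; this page does not state what MAX_INTEGERS_PER_BLOCK is.
const std::size_t MAX_INTEGERS_PER_BLOCK = 128;

// Stand-in for flexible_type: either UNDEFINED or an unsigned integer.
struct slot {
  bool undefined;
  std::uint64_t value;
};

// Simplified frame-of-reference chunk: the minimum value plus offsets from it.
struct for_chunk {
  std::uint64_t base;
  std::vector<std::uint64_t> offsets;
};

inline for_chunk encode_chunk(const std::vector<std::uint64_t>& values) {
  assert(!values.empty());
  for_chunk out;
  out.base = *std::min_element(values.begin(), values.end());
  for (std::size_t i = 0; i < values.size(); ++i) {
    out.offsets.push_back(values[i] - out.base);  // never negative
  }
  return out;
}

// Skips UNDEFINED entries and emits one chunk per MAX_INTEGERS_PER_BLOCK values.
inline std::vector<for_chunk> encode_numbers(const std::vector<slot>& data) {
  std::vector<for_chunk> chunks;
  std::vector<std::uint64_t> pending;
  for (std::size_t i = 0; i < data.size(); ++i) {
    if (data[i].undefined) continue;
    pending.push_back(data[i].value);
    if (pending.size() == MAX_INTEGERS_PER_BLOCK) {
      chunks.push_back(encode_chunk(pending));
      pending.clear();
    }
  }
  if (!pending.empty()) chunks.push_back(encode_chunk(pending));
  return chunks;
}
```

With 300 defined values and the assumed chunk size of 128, this yields chunks of 128, 128 and 44 values. Unlike the real encoding, the sketch keeps the count implicitly in each offsets vector.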

◆ typed_decode()

bool turi::v2_block_impl::typed_decode ( const block_info &  info,
char *  start,
size_t  len,
std::vector< flexible_type > &  ret 
)

Decodes a collection of flexible_type values. The array must be of contiguous type, although undefined values are permitted.

See typed_encode() for details.

Note
The encoding does not store the number of values. That count is stored in the block_info (block.num_elem).

◆ typed_encode()

void turi::v2_block_impl::typed_encode ( const std::vector< flexible_type > &  data,
block_info &  info,
oarchive &  oarc 
)

Encodes a collection of flexible_type values. The array must be of contiguous type, although undefined values are permitted.

There is a two byte header to the block.

  • num_types: 1 byte
    • if 0, the block is empty.
    • if 1, the array is of contiguous type (see next byte)
    • if 2, the array is of contiguous type, but has missing values.
  • type: 1 byte.
  • [undefined bitfield]: if num_types is 2, this contains a bitfield of round_up(#elem / 8) bytes listing the positions of all the UNDEFINED fields.
  • type specific encoding:
    • if integer or float, encode_number() is called
    • if string, encode_string() is called
    • if UNDEFINED (i.e. the array consists entirely of UNDEFINED values), nothing is written
    • otherwise, direct serialization is currently used.
Note
The encoding does not store the number of values. That count is stored in the block_info (block.num_elem).
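
The header layout above can be illustrated with a small sketch that computes the num_types byte and the undefined bitfield. Several details are assumptions: slot stands in for flexible_type, the helper names are hypothetical, the bitfield size is taken to be the element count divided by 8 and rounded up, and the bit order within each byte (least significant bit first) is a guess, since the documentation does not specify it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Stand-in for flexible_type: either UNDEFINED or an integer value.
struct slot {
  bool undefined;
  std::int64_t value;
};

// Returns the num_types header byte: 0 for an empty block, 1 for a single
// type (including an all-UNDEFINED array), 2 for one type plus missing values.
inline std::uint8_t num_types_byte(const std::vector<slot>& data) {
  if (data.empty()) return 0;
  std::size_t undefined = 0;
  for (std::size_t i = 0; i < data.size(); ++i) {
    if (data[i].undefined) ++undefined;
  }
  return (undefined == 0 || undefined == data.size()) ? 1 : 2;
}

// Builds a bitfield with one bit per element, set where the element is
// UNDEFINED. Bit order within a byte is an assumption (LSB first).
inline std::vector<std::uint8_t> undefined_bitfield(const std::vector<slot>& data) {
  std::vector<std::uint8_t> bits((data.size() + 7) / 8, 0);
  for (std::size_t i = 0; i < data.size(); ++i) {
    if (data[i].undefined) {
      bits[i / 8] = static_cast<std::uint8_t>(bits[i / 8] | (1u << (i % 8)));
    }
  }
  return bits;
}

// Usage: ten elements with positions 1 and 9 UNDEFINED need two bitfield bytes.
inline void example_header() {
  std::vector<slot> data(10, slot{false, 0});
  data[1].undefined = true;
  data[9].undefined = true;
  assert(num_types_byte(data) == 2);
  assert(undefined_bitfield(data).size() == 2);
}
```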