Turi Create  4.0

Classes

class  turi::query_eval::operator_impl< planner_node_type::APPEND_NODE >
 
class  turi::query_eval::operator_impl< planner_node_type::BINARY_TRANSFORM_NODE >
 
struct  turi::query_eval::operator_impl< planner_node_type::CONSTANT_NODE >
 
class  turi::query_eval::operator_impl< planner_node_type::GENERALIZED_TRANSFORM_NODE >
 
struct  turi::query_eval::operator_impl< planner_node_type::GENERALIZED_UNION_PROJECT_NODE >
 
class  turi::query_eval::operator_impl< planner_node_type::LAMBDA_TRANSFORM_NODE >
 
class  turi::query_eval::operator_impl< planner_node_type::LOGICAL_FILTER_NODE >
 
struct  turi::query_eval::query_operator_attributes
 
class  turi::query_eval::query_operator
 
struct  turi::query_eval::operator_impl< planner_node_type::PROJECT_NODE >
 
struct  turi::query_eval::operator_impl< planner_node_type::RANGE_NODE >
 
struct  turi::query_eval::operator_impl< planner_node_type::REDUCE_NODE >
 
struct  turi::query_eval::operator_impl< planner_node_type::SFRAME_SOURCE_NODE >
 
class  turi::query_eval::operator_impl< planner_node_type::TERNARY_OPERATOR >
 
class  turi::query_eval::operator_impl< planner_node_type::TRANSFORM_NODE >
 
struct  turi::query_eval::operator_impl< planner_node_type::UNION_NODE >
 
struct  turi::query_eval::planner_node
 

Typedefs

typedef std::shared_ptr< planner_nodeturi::query_eval::pnode_ptr
 A handy typedef.
 

Functions

 turi::query_eval::operator_impl< planner_node_type::SARRAY_SOURCE_NODE >::DECL_CORO_STATE (execute)
 
static size_t turi::query_eval::operator_impl< planner_node_type::SARRAY_SOURCE_NODE >::unique_sarray_tag (const std::shared_ptr< sarray< flexible_type > > &sa)
 

Detailed Description

All the available query plan operators are defined in here.

Introduction

There are 2 parts to an operator. 1: How to execute the operator (a specialization of operator_impl) 2: The logical node in the graph (planner_node)

First, to define a new operator, add to the enum in operator_properties and define your new new operator as a class operator_impl<enum> (the operator itself is just a specialization of the operator_impl class around the enum)

There are no constraints on the constructor or destructor of the class, or what members the class should have. The class should contain enough information to perform the execute() method.

The functions which must be implemented are

A couple of routines to convert back of forth between planner_node and the operator implementation:

operator_impl<enum>::make_planner_node(...)
operator_impl<enum>::plan_to_operator(...)

Additionally the following static methods must be provided

operator_impl<enum>::infer_type(...)

A length inference routine:

operator_impl<enum>::infer_length(...)

Example

To give an example, to implement a transform operator (see the source in transform.hpp). The example here defines a simplified version that only transforms a single column of flexible_type.

The class is:

template<>
class operator_impl<planner_node_type::TRANSFORM_NODE> : public query_operator {

To define a transform, we define constructor to accept a generic function. We need to know the actual output type of the function ahead of time.

inline operator_impl(const std::function<flexible_type(flexible_type)>& f,
flex_type_enum output_type) {
: m_transform_fn(f), m_output_type(output_type)
{ }
inline std::shared_ptr<query_operator> clone() const {
return std::make_shared<operator_impl>(*this);
}

We next define the name of the operator, as well as the attributes. Since transform takes 1 input and returns 1 output, we just return LINEAR.

static std::string name() { return "transform"; }
static query_operator_attributes attributes() {
query_operator_attributes ret;
ret.attribute_bitfield = query_operator_attributes::LINEAR;
ret.num_inputs = 1;
return ret;
}

Execute is a little subtle. It takes a query_context, reads from it writing to an output buffer (also provided by the query_context) and calls emit() to write the output buffer out.

inline void execute(query_context& context) {
while(1) {
// gets the next batch of rows from the 0'th input stream. Since this
// operator only takes 1 input stream (unlike say... a binary transform
// like '+' which would take 2 input streams.
// rows here is a shared pointer to a \ref sframe_rows object
auto rows = context.get_next(0);
// when rows returns a null pointer, we are done
if (rows == nullptr) break;
// get the next output buffer and resize to fit
auto output = context.get_output_buffer();
output->resize(1, rows->num_rows());
// iterate through the input set of rows
// writing to the output set of rows
auto iter = rows->cbegin();
auto output_iter = output->begin();
while(iter != rows->cend()) {
auto outval = m_transform_fn((*iter[0]));
(*output_iter)[0] = outval;
++output_iter;
++iter;
}
// emit and loop for the next batch
// This will actually perform a user mode context switch to the
// next operator in the sequence.
context.emit(output);
}
}

We need converters between planner_node and this operator. The planner_node really just stores a bunch of information in a generic map->any and we have to figure out our own schema in it. A bunch of assertions checks are removed for readability.

static std::shared_ptr<planner_node> make_planner_node(
std::shared_ptr<planner_node> source,
std::function<flexible_type(flexible_type)>fn,
flex_type_enum output_type) {
return planner_node::make_shared(planner_node_type::TRANSFORM_NODE,
{{"output_type", (int)(output_type)},
{{"function", any(fn)}},
{source});
}
static std::shared_ptr<query_operator> from_planner_node(
std::shared_ptr<planner_node> pnode) {
typedef std::function<flexible_type(flexible_type)> transform_type;
transform_type fn;
flex_type_enum output_type =
(flex_type_enum)(flex_int)(pnode->operator_parameters["output_type"]);
fn = pnode->any_operator_parameters["function"].as<transform_type>();
// call my own constructor
return std::make_shared<operator_impl>(fn, output_type);
}

To guide the graph verification and planning process of the logical graph, a couple of methods around type or length inference has to be provided.

static std::vector<flex_type_enum> infer_type(std::shared_ptr<planner_node> pnode) {
// we know our output type since we set it in ahead of time
return {(flex_type_enum)(int)(pnode->operator_parameters["output_type"])};
}
static int64_t infer_length(std::shared_ptr<planner_node> pnode) {
// The length of our output is the length of our input.
// So we recursively call the infer_planner_node_length on the input node.
return infer_planner_node_length(pnode->inputs[0]);
}

Finally, for convenience, we might typedef the whole class to something easier to read

typedef operator_impl<planner_node_type::TRANSFORM_NODE> op_transform;

Function Documentation

◆ DECL_CORO_STATE()

turi::query_eval::operator_impl< planner_node_type::SARRAY_SOURCE_NODE >::DECL_CORO_STATE ( execute  )

A "sarray_source" operator generates values from a physical sarray.

◆ unique_sarray_tag()

static size_t turi::query_eval::operator_impl< planner_node_type::SARRAY_SOURCE_NODE >::unique_sarray_tag ( const std::shared_ptr< sarray< flexible_type > > &  sa)
inlinestatic

Given an sarray, returns a small number uniquely associated with that sarray. This number is unique over the course of the program run.

Definition at line 148 of file sarray_source.hpp.