Turi Create  4.0
Unity FFI

The Unity FFI layer describes the basic C++ <–> Python interface layer. Using this layer directly requires a decent breadth of knowledge of the underlying architecture. It is recommended that you first take a look at the serialization documentation, the CPPIPC documentation.

The unity system basically is a client-server architecture which uses the CPPIPC library to share objects on the server, to the client. The server is implemented in C++ toolkits/unity, and the client is implemented mostly in Cython in src/cython_interface. Much of the common code is in src/sframe/unity. This in theory allows the C++ and Python processes to be located in different processes, or even different machines / systems across the network. This is less efficient than sharing pointers between C++ and Python directly, but this provides a more generic interface for which the Python end can be easily swapped out to, for instance, support new languages.

Basic Types

To allow effective communication between the server and the client (where the client is Python), we need to be able to translate between Python types and C++ types. However, since Python types are dynamic types, and C++ does not quite have an equivalent, we need to build a solution for this.

These two are the two key dynamic object types:

  • turi::flexible_type A runtime typed object capable of storing an integer, a double, a string, or vector<double>. It also can contain two recursive types, but these are at the moment not used. In Python, these are used in settings where I need to transport an object (say... a dictionary) containing elements of variable types to C++. For instance, a regular dictionary gets converted to std::map<std::string, flexible_type>.
  • turi::variant_type A more generic type that describes a union over a fixed set of types to support other stuff like passing SFrames, SArrays, SGraphs, or other user defined classes around. The flex

Unity Types

To expose a type X

  • X_interface.hpp contains the CPPIPC interface descriptor. i.e. this contains the call to GENERATE_INTERFACE_AND_PROXY. The base class must be called X_base and the proxy class must be called X_proxy.
  • X.hpp/cpp contains the server-side implementation. X.hpp must NOT be included by X_interface.hpp. The server-side class is called X.

For an example of the above convention, see unity_global.

For the unity subsystem, the following key types are exposed from the server to the client. We follow the following file naming convention for any exposed type.

  • turi::unity_sframe This represents the immutable scalable Dataframe object. It is a lazy evaluated datastructure.
  • turi::unity_sgraph This represents the immutable graph object and implements all the graph manipulation primitives. (object naming and file names all follow the convention above). Like the unity_sframe, the unity_sgraph implementation is that it is an immutable, dataflow, lazy-evaluated datastructure. The details of the object are out of scope of this brief paragraph, but to summarize quickly, the unity_sgraph datastructure wraps a shared_ptr to a node in a lazy evaluation graph (turi::lazy_eval_future), which in turn evaluates to a unity_sgraph_impl object, where all the graph manipulation logic really lives.
  • turi::model_base The model object describes the interface for the return type of all toolkit executions. This object breaks convention in that it can have numerous server-side implementations, as long as it implements the described interface. The interface is highly generic (essentially representing a generalized key-value map) to suit the purposes of almost all toolkits. This is what the SDK (turicreate_extension_interface) uses to export functionality to Python.

Exposing New Types to Python

While The Extension interface (turicreate_extension_interface) should be sufficient for most purposes, on occasion it might be necessary to expose a new fundamental type to Python. This is unfortunately, a slightly involving process, but we do not expect this to happen too frequently (You should really use the extension interface).

On a high level, you first implement an object exposed via the CPPIPC interface. Then you need to persuade Cython to pick up the proxy object, and then further wrapping the proxy object in a Python class if necessary.

Lets walk through a simple example. Say we are going to expose a new type called "counter" which implements a simple integer counter with two functions, increment(int), and int get(). The tasks involved are essentially:

Creating Base and Proxy

First, we create a new header for the CPPIPC base and proxy classes. Going with the convention, the file is called counter_interface.hpp

Note
Much of this example is written without testing... If someone tries it, that would be fantastic
/// This is in src/model_server/lib/counter_interface.hpp
#ifndef TURI_COUNTER_INTERFACE_HPP
#define TURI_COUNTER_INTERFACE_HPP
#include <core/system/cppipc/cppipc.hpp>
#include <core/system/cppipc/magic_macros.hpp>
namespace turi {
GENERATE_INTERFACE_AND_PROXY(counter_base, counter_proxy,
(int, get, )
(void, increment, (int))
)
} // namespace turi

Creating the Implementation

The implementation header is in in counter.hpp. (it must inherit from counter_base)

/// This is in src/model_server/lib/counter.hpp
#ifndef TURI_COUNTER_HPP
#define TURI_COUNTER_HPP
#include <unity/lib/counter_interface.hpp>
namespace turi {
class counter: public counter_base {
private:
int val;
public:
counter();
int get();
void increment(int v);
};
#endif
} // namespace turi

And the implementation itself is in counter.cpp. While the hpp/cpp split is not strictly necessary (especially for such a simple class, it is still good practice.

/// This is in src/model_server/lib/counter.cpp
#ifndef TURI_COUNTER_HPP
#define TURI_COUNTER_HPP
#include <unity/lib/counter.hpp>
namespace turi {
counter::counter():val(0) { }
int counter::get() {
return val;
}
void counter::increment(int v) {
val += v;
}
} // namespace turi

counter.cpp has to be added to the UNITY_FILES list in the unity CMakeLists.txt

Registering the New Type

Finally, the type has to be registered with CPPIPC on the server side. See unity/server/unity_server.cpp (which implements the main() function for the unity_server) and look for a sequence of calls to

cppipc::comm_server::register_type, and add the lines:

server->register_type<turi::counter>([](){
return new turi::counter();
});

Remember you have to include src/model_server/lib/counter.hpp at the top of unity_server.cpp.

Done!. Thats it to create a counter object and expose it via CPPIPC. Next, we need Python to be able to see the object type. And that is the tricky part.

Exposing the Proxy to Cython

The counter_proxy object must now be exposed to python via cython. This is can be a mildly annoying process due to Cython oddities, especially if your functions take interesting types (like maps and vectors).

Note
When interesting types are used, we suggest the use of dataframe, flexible_types (or the containers of flexible_types which are appropriate translators), or simple types which Cython have already implemented the translators for. The Cython documentation (http://docs.cython.org) is an crucial resource (as poorly organized as it is).

All the Cython classes are implemented in src/python/turicreate/cython. To expose the counter object, first create a counter.pxd file (definition).

# This is in src/python/turicreate/cython/counter.pxd
cdef extern from "<unity/lib/counter_interface.hpp>" namespace 'turi':
cdef cppclass counter_proxy:
int get() except +
void increment(int) except +

Now, this just makes Cython aware of the counter_proxy type. We further wrap the counter_proxy object time with a Cython class which can ensure correct allocation and deletion behavior. This wrapper can also further wrap functions to provide translators from Python types to C++ types. It is also at this point, we can switch from C++ conventions to Python conventions.

# Continuing on in src/python/turicreate/cython/counter.pxd
cdef class Counter:
# the wrapped proxy object
cdef counter_proxy* thisptr
# a reference to the comm_client
cdef _cli
cpdef get(self)
cpdef increment(int v)

This new Counter class needs an implementation in counter.pyx

# In src/python/turicreate/cython/counter.pyx
from .ipc cimport PyCommClient
cdef class Counter:
def __cinit__(self, PyCommClient cli):
assert cli, "CommClient is Null"
self.thisptr = new counter_proxy(deref(cli.thisptr))
self._cli = cli
def __dealloc__(self):
# cleanup
del self.thisptr
cpdef get(self):
return self.thisptr.get()
cpdef increment(int v)
self.thisptr.get(v)

Using the Counter type

Now, in Python, once a a connection has been established, a counter object can be created with

c = Counter(turicreate.connect.getConn())
c.increment(5)
c.get() # returns 5
c.increment(2)
c.get() # returns 7

Note that in some situations, you may in fact want to rename the Counter class as CounterProxy, and further wrap it with a Counter class fully implemented in Python. For instance, this is done with the Graph object to provide a fully Pythonic interface (The Cython cdef limitations can make it very difficult to implement interesting functions).

Implementing Toolkits

This section describes the old toolkit interface, a deprecated method for exporting objects and methods to Python. The new SDK method via turicreate_extension_interface is preferred.

A toolkit is very simply stated, a collection of functions, all with a common input/output interface. That is:

turi::toolkit_response_type exec(turi::toolkit_invocation& invoke);

Both toolkit_invocation and toolkit_response are highly generic, and mainly contain a special map datastructure called turi::variant_map_type which can contain within it, a graph, a dataframe, a model or a flexible_type, and has automatic translators to and from Python.

A toolkit may contain many functions of the above form, and to expose it to the unity_server requires the writing of a "registration" function of the type:

std::vector<turi::toolkit_specification> get_registration();

which basically lists all the functions to expose.

Lets try this out with a simple example. We are going to implement a toolkit which simply adds one to an input integer/float. This is actually fully implemented in unity/server/toolkits/demo_addone.* files.

  • unity_implementing_toolkits_add_one
  • unity_implementing_toolkits_registration
  • unity_implementing_toolkits_usage

Toolkit Implementation

The invoke datastructure basically contains within it a member called "params" which is really a map from string to a turi::variant_type variant type. This type is special because it is wrappers are implemented so that it is correctly recognized by Python and can be correctly converted to and from a Python dictionary.

We first implement the toolkit. Since the toolkit essentially takes a map as an input, and a map as an output, it is up to us to specify the actual argument interface (i.e. what the contents of the map should be). Here, we state that the input map should have a field "x" which contains either an integer, or a float. The return map should also have a field "x" which is of the same type as the input, but contains the value incremented by one.

// in demo_addone.cpp
namespace turi {
namespce demo_addone {
toolkit_response_type add_one(toolkit_invocation& invoke) {
toolkit_response_type ret_status;
// you should always parse all the arguments first.
// turi::safe_varmap_map will throw a descriptive string exception
// on failure.
flexible_type x = safe_varmap_get<flexible_type>(invoke.params, "x");
// Check if the types are correct. If anything is incorrect,
// we throw a string which will get printed on the client side.
if (x.get_type() == flex_type_enum::INTEGER ||
x.get_type() == flex_type_enum::FLOAT) {
x = x + 1;
} else {
throw("x of invalid type. Must be an integer or a float!");
}
// return the output, flagging success.
ret_status.params["x"] = x;
ret_status.success = true;
return ret_status;
}
} // namespace demo_addone
} // namespace turi

Thats it!

Toolkit Registration

Now, to implement the registration function. This is quite straightforward. Basically, it just associates the C++ function with a name.

// in demo_addone.cpp
namespace turi {
namespce demo_addone {
std::vector<toolkit_specification> get_registration() {
toolkit_specification spec;
spec.name = "demo_addone";
spec.toolkit_execute_function = add_one;
return {spec};
}
} // namespace demo_addone
} // namespace turi

A demo_addone.hpp should also be created which exposes the get_registration() function. Finally, the unity_server binary (in src/server/unity_server.cpp) must register the toolkit. (Search for a collection of register_toolkit calls)

g_toolkits->register_toolkit(turi::demo_addone::get_registration());

The demo_addone.hpp must of course be included at the top of unity_server.cpp

Toolkit Usage

Thats it! To run from Python:

val = {'x':5}
import turicreate.toolkits.main as main
ret = main.run("demo_addone", val)

Advanced Toolkits

While the above is a simple demonstration of the toolkit interface, it covers all the key concepts of toolkit implementation. Much of the rest of the work of implementing a toolkit is in making the interface friendly.

  • Wrapping the toolkit call in Python to make the interface friendly.
  • Providing incremental metrics via the simple metrics interface
  • If toolkit should return a complex state that should not be all communicated back to the client, toolkit should instead store an return a model object, in which case the Python-side has to provide a friendly interface for the model object.

It might be useful to see the pagerank toolkit to see how this is done.