Turi Create
4.0
#include <core/data/sframe/gl_sgraph.hpp>
Public Member Functions | |
gl_sgraph (const gl_sframe &vertices, const gl_sframe &edges, const std::string &vid_field="__id", const std::string &src_field="__src_id", const std::string &dst_field="__dst_id") | |
gl_sgraph (const std::string &directory) | |
gl_sframe | get_edges (const std::vector< vid_pair > &ids=std::vector< vid_pair >(), const std::map< std::string, flexible_type > &fields=std::map< std::string, flexible_type >()) const |
gl_sframe | get_vertices (const std::vector< flexible_type > &ids=std::vector< flexible_type >(), const std::map< std::string, flexible_type > &fields=std::map< std::string, flexible_type >()) const |
std::map< std::string, flexible_type > | summary () const |
size_t | num_vertices () const |
size_t | num_edges () const |
std::vector< std::string > | get_fields () const |
std::vector< std::string > | get_vertex_fields () const |
std::vector< std::string > | get_edge_fields () const |
std::vector< flex_type_enum > | get_vertex_field_types () const |
std::vector< flex_type_enum > | get_edge_field_types () const |
gl_sgraph | add_vertices (const gl_sframe &vertices, const std::string &vid_field) const |
gl_sgraph | add_edges (const gl_sframe &edges, const std::string &src_field, const std::string &dst_field) const |
gl_sgraph | select_vertex_fields (const std::vector< std::string > &fields) const |
gl_sgraph | select_edge_fields (const std::vector< std::string > &fields) const |
gl_sgraph | select_fields (const std::vector< std::string > &fields) const |
gl_gframe | vertices () |
gl_gframe | edges () |
gl_sgraph | triple_apply (const lambda_triple_apply_fn &lambda, const std::vector< std::string > &mutated_fields) const |
void | save (const std::string &directory) const |
void | save_reference (const std::string &directory) const |
void | add_vertex_field (gl_sarray column_data, const std::string &field) |
void | add_vertex_field (const flexible_type &column_data, const std::string &field) |
void | remove_vertex_field (const std::string &field) |
void | rename_vertex_fields (const std::vector< std::string > &oldnames, const std::vector< std::string > &newnames) |
void | add_edge_field (gl_sarray column_data, const std::string &field) |
void | add_edge_field (const flexible_type &column_data, const std::string &field) |
void | remove_edge_field (const std::string &field) |
void | rename_edge_fields (const std::vector< std::string > &oldnames, const std::vector< std::string > &newnames) |
virtual std::shared_ptr< unity_sgraph > | get_proxy () const |
A scalable graph data structure backed by persistent storage (gl_sframe).
The SGraph (gl_sgraph) data structure allows arbitrary dictionary attributes on vertices and edges, provides flexible vertex and edge query functions, and seamless transformation to and from SFrame.
There are several ways to create an SGraph. The simplest is to create an empty graph and then add vertices and edges with the add_vertices and add_edges methods.
Columns in the gl_sframes that are not used as id fields are assumed to be vertex or edge attributes.
gl_sgraph objects can also be created from vertex and edge lists stored in gl_sframe.
The most convenient way to access vertex and edge data is through the vertices() and edges() functions. Both return a GFrame (gl_gframe) object. A GFrame is like an SFrame but is bound to the host SGraph, so that modifications to the GFrame are applied to the SGraph, and vice versa.
For instance, the following code shows how to add/remove columns to/from the vertex data. The change is applied to SGraph.
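The binding between a GFrame and its host graph can be pictured with a small standalone sketch. It does not use the Turi Create API: ToyGraph, ToyVertexView, and their members are hypothetical names, and columns are simplified to vectors of double. The only point is that a handle holding a pointer to the graph, rather than a copy of its data, makes column edits visible on the graph itself.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a graph that owns its vertex columns.
struct ToyGraph {
  std::size_t num_vertices;
  std::map<std::string, std::vector<double> > vertex_columns;
};

// Hypothetical stand-in for a gl_gframe-style handle. It stores a pointer to
// the graph instead of a copy, so edits go straight to the graph's data.
class ToyVertexView {
 public:
  explicit ToyVertexView(ToyGraph* graph) : graph_(graph) {
    assert(graph != nullptr);
  }

  // Add (or overwrite) a column holding one constant value per vertex.
  void add_column(const std::string& name, double value) {
    graph_->vertex_columns[name] =
        std::vector<double>(graph_->num_vertices, value);
  }

  // Remove a column; removing a missing column is a no-op in this sketch.
  void remove_column(const std::string& name) {
    graph_->vertex_columns.erase(name);
  }

  // True if the graph currently has a vertex column with this name.
  bool has_column(const std::string& name) const {
    return graph_->vertex_columns.count(name) > 0;
  }

 private:
  ToyGraph* graph_;
};
```

In this sketch, a column added through the view appears in ToyGraph::vertex_columns, and a column added directly to the graph is visible through has_column, which mirrors the two-way behavior described above.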
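You can also query for specific vertices and edges using get_vertices and get_edges.
For instance, calling get_edges with the ID pairs {0, FLEX_UNDEFINED}, {FLEX_UNDEFINED, 1}, and {2, 3} and the field constraint {"like_fish", 1} selects the outgoing edges of 0, the incoming edges of 1, and the edge 2->3, keeping only those whose edge attribute "like_fish" evaluates to 1.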
In addition, you can perform other non-mutating gl_sframe operations like groupby, join, logical_filter in the same way, and the returned object will be gl_sframe.
For vertex-specific operations, such as "gather"/"scatter" over the neighborhood of each vertex, triple_apply provides what is essentially a "parallel for" over (Vertex, Edge, Vertex) triplets.
For instance, the following code shows how to implement the update function for synchronous pagerank.
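As a rough illustration of such an update, the sketch below runs one synchronous PageRank iteration over a plain edge list in standalone C++. It does not call triple_apply: RankEdge and pagerank_step are hypothetical names, and the unnormalized rank formula with a damping factor of 0.85 is an assumption rather than something stated above. Each edge plays the role of one (Vertex, Edge, Vertex) triplet.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical edge record: indices of the source and target vertices.
struct RankEdge {
  std::size_t src;
  std::size_t dst;
};

// One synchronous PageRank iteration (unnormalized variant, an assumption).
// prev_rank[v] holds the rank from the previous iteration; the new ranks are
// computed only from prev_rank, never from partially updated values.
std::vector<double> pagerank_step(const std::vector<RankEdge>& edges,
                                  const std::vector<double>& prev_rank,
                                  double damping = 0.85) {
  const std::size_t n = prev_rank.size();

  // Count outgoing edges per vertex.
  std::vector<std::size_t> out_degree(n, 0);
  for (const RankEdge& e : edges) {
    assert(e.src < n && e.dst < n);
    ++out_degree[e.src];
  }

  // Per-triplet update: the target accumulates the source's share.
  std::vector<double> gathered(n, 0.0);
  for (const RankEdge& e : edges) {
    gathered[e.dst] += prev_rank[e.src] / static_cast<double>(out_degree[e.src]);
  }

  // Combine the accumulated shares with the reset term.
  std::vector<double> next(n);
  for (std::size_t v = 0; v < n; ++v) {
    next[v] = (1.0 - damping) + damping * gathered[v];
  }
  return next;
}
```

Keeping prev_rank read-only during the pass is what makes the iteration synchronous; a graph-based version would typically store the previous and current ranks in two separate vertex fields.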
gl_sgraph is structurally immutable, but its data (fields) are mutable. You can add new vertices and edges, but the operation returns a new gl_sgraph.
See turicreate/sdk_example/sgraph_example.cpp for a concrete example.
Definition at line 148 of file gl_sgraph.hpp.
turi::gl_sgraph::gl_sgraph | ( | const gl_sframe & | vertices, |
const gl_sframe & | edges, | ||
const std::string & | vid_field = "__id", |
const std::string & | src_field = "__src_id", |
const std::string & | dst_field = "__dst_id" |
) |
Construct gl_sgraph with given vertex data and edge data.
vertices | Vertex data. Must include an ID column with the name specified by "vid_field". Additional columns are treated as vertex attributes. |
edges | Edge data. Must include source and destination ID columns as specified by "src_field" and "dst_field". Additional columns are treated as edge attributes. |
vid_field | Optional. The name of vertex ID column in the "vertices" gl_sframe. |
src_field | Optional. The name of source ID column in the "edges" gl_sframe. |
dst_field | Optional. The name of destination ID column in the "edges" gl_sframe. |
Example
Produces output:
explicit turi::gl_sgraph::gl_sgraph | ( | const std::string & | directory | ) |
void turi::gl_sgraph::add_edge_field | ( | gl_sarray | column_data, |
const std::string & | field | ||
) |
void turi::gl_sgraph::add_edge_field | ( | const flexible_type & | column_data, |
const std::string & | field | ||
) |
Add a new edge field filled with constant data. Using edges() is preferred.
column_data | the constant data to fill the new field column. |
field | name of the new edge field. |
gl_sgraph turi::gl_sgraph::add_edges | ( | const gl_sframe & | edges, |
const std::string & | src_field, | ||
const std::string & | dst_field | ||
) | const |
Add edges to the gl_sgraph and return the new graph.
Input edges should be in the form of a gl_sframe; "src_field" and "dst_field" specify which two columns contain the source and target vertex IDs. Remaining columns are assumed to hold additional edge attributes. If these attributes are not already present in the graph's edge data, they are added, with existing edges acquiring the missing value FLEX_UNDEFINED.
edges | Edge data. A gl_sframe whose "src_field" and "dst_field" columns contain the source and target vertex IDs. Additional columns are treated as edge attributes. |
src_field | Specifies the source ID column in the edges gl_sframe. |
dst_field | Specifies the target ID column in the edges gl_sframe. |
Example:
Produces output:
void turi::gl_sgraph::add_vertex_field | ( | gl_sarray | column_data, |
const std::string & | field | ||
) |
Add a new vertex field with given field name and column_data. Using vertices() is preferred.
column_data | gl_sarray whose size equals num_vertices. The order of column_data is aligned with the order in which vertices are stored. |
field | name of the new vertex field. |
void turi::gl_sgraph::add_vertex_field | ( | const flexible_type & | column_data, |
const std::string & | field | ||
) |
Add a new vertex field filled with constant data. Using vertices() is preferred.
column_data | the constant data to fill the new field column. |
field | name of the new vertex field. |
gl_sgraph turi::gl_sgraph::add_vertices | ( | const gl_sframe & | vertices, |
const std::string & | vid_field | ||
) | const |
Add vertices to the gl_sgraph and return the new graph.
Input vertices should be in the form of a gl_sframe; "vid_field" specifies which column contains the vertex ID. Remaining columns are assumed to hold additional vertex attributes. If these attributes are not already present in the graph's vertex data, they are added, with existing vertices acquiring the missing value FLEX_UNDEFINED.
vertices | Vertex data. A gl_sframe whose "vid_field" column contains the vertex IDs. Additional columns are treated as vertex attributes. |
vid_field | Specifies the vertex ID column in the vertices gl_sframe. |
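The two behaviors described above, returning a new graph and padding absent attributes with a missing value, can be sketched in standalone C++. This is not the Turi Create implementation: ToyValue, ToyRow, ToyVertices, and toy_add_vertices are hypothetical names, the `defined` flag merely plays the role of FLEX_UNDEFINED, and overwriting the attributes of an ID that already exists is an assumption of this sketch.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

// Hypothetical attribute value; defined == false plays the role of
// FLEX_UNDEFINED in this sketch.
struct ToyValue {
  bool defined;
  std::string text;
};

typedef std::map<std::string, ToyValue> ToyRow;  // field name -> value
typedef std::map<int, ToyRow> ToyVertices;       // vertex ID -> attributes

// Returns a new vertex table; the input `graph` is never modified.
ToyVertices toy_add_vertices(const ToyVertices& graph, const ToyVertices& added) {
  // Collect the union of all field names across old and new vertices.
  std::set<std::string> fields;
  for (const auto& v : graph)
    for (const auto& f : v.second) fields.insert(f.first);
  for (const auto& v : added)
    for (const auto& f : v.second) fields.insert(f.first);

  // Start from a copy and merge in the new rows
  // (assumption: an existing ID has its attributes overwritten).
  ToyVertices result = graph;
  for (const auto& v : added)
    for (const auto& f : v.second) result[v.first][f.first] = f.second;

  // Pad every row so that all vertices carry every field.
  for (auto& v : result) {
    for (const std::string& name : fields) {
      if (v.second.count(name) == 0) v.second[name] = ToyValue{false, ""};
    }
  }
  assert(result.size() >= graph.size());
  return result;
}
```

Because the function works on a copy, the caller keeps the original table and receives the extended one as the return value, which matches the const, value-returning shape of add_vertices.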
Example:
Produces output:
gl_gframe turi::gl_sgraph::edges | ( | ) |
Returns a convenient "SFrame like" handler for the edges in this gl_sgraph.
While a regular gl_sframe is independent of any gl_sgraph, a gl_gframe is bound (or points) to a gl_sgraph. Modifying fields of the returned gl_gframe changes the edge data of the gl_sgraph. Likewise, modifications to the fields in the gl_sgraph are reflected in the gl_gframe.
Example:
Produces output:
std::vector<flex_type_enum> turi::gl_sgraph::get_edge_field_types | ( | ) | const |
Return the types of edge fields in the graph.
std::vector<std::string> turi::gl_sgraph::get_edge_fields | ( | ) | const |
Return the names of edge fields in the graph.
gl_sframe turi::gl_sgraph::get_edges | ( | const std::vector< vid_pair > & | ids = std::vector< vid_pair >(), |
const std::map< std::string, flexible_type > & | fields = std::map< std::string, flexible_type >() |
) | const |
Return a collection of edges and their attributes.
This function is used to find edges by vertex IDs, filter on edge attributes, or list in-out neighbors of vertex sets.
ids | Optional. Array of pairs of source and target vertices, each corresponding to an edge to fetch. Only edges in this list are returned. FLEX_UNDEFINED can be used to designate a wild card. For instance, {{1,3}, {2,FLEX_UNDEFINED}, {FLEX_UNDEFINED, 5}} will fetch the edge 1->3, all outgoing edges of 2 and all incoming edges of 5. ids may be left empty, which implies an array of all wild cards. |
fields | Optional. Dictionary specifying equality constraints on field values. For example, { {"relationship", "following"} } returns only edges whose 'relationship' field equals 'following'. FLEX_UNDEFINED can be used as a value to designate a wild card. For example, { {"relationship", FLEX_UNDEFINED} } will find all edges with the field 'relationship' regardless of the value. |
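The selection rules for ids and fields can be sketched as a standalone filter over a plain edge list. Nothing here is Turi Create API: IdPattern, QueryEdge, EdgePattern, and toy_select_edges are hypothetical names, IdPattern::any() stands in for FLEX_UNDEFINED in an ID position, and attribute values are simplified to strings. Wild card values in field constraints are left out of the sketch.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical vertex ID pattern; wildcard == true matches any ID.
struct IdPattern {
  bool wildcard;
  int id;
  static IdPattern any() { return IdPattern{true, 0}; }
  static IdPattern of(int v) { return IdPattern{false, v}; }
  bool matches(int v) const { return wildcard || id == v; }
};

// Hypothetical edge record with string-valued attributes.
struct QueryEdge {
  int src;
  int dst;
  std::map<std::string, std::string> fields;
};

// One (source, target) entry of the ids argument.
struct EdgePattern {
  IdPattern src;
  IdPattern dst;
};

// Keep an edge if it matches at least one ID pattern (or ids is empty)
// and satisfies every equality constraint.
std::vector<QueryEdge> toy_select_edges(
    const std::vector<QueryEdge>& edges,
    const std::vector<EdgePattern>& ids,
    const std::map<std::string, std::string>& constraints) {
  std::vector<QueryEdge> out;
  for (const QueryEdge& e : edges) {
    bool id_ok = ids.empty();  // empty ids behaves like all wild cards
    for (const EdgePattern& p : ids) {
      if (p.src.matches(e.src) && p.dst.matches(e.dst)) {
        id_ok = true;
        break;
      }
    }
    bool fields_ok = true;
    for (const auto& c : constraints) {
      std::map<std::string, std::string>::const_iterator it = e.fields.find(c.first);
      if (it == e.fields.end() || it->second != c.second) {
        fields_ok = false;
        break;
      }
    }
    if (id_ok && fields_ok) out.push_back(e);
  }
  return out;
}
```

With the patterns (1, 3), (2, any), and (any, 5), the filter keeps the edge 1->3, every outgoing edge of 2, and every incoming edge of 5, which is the combination described for the ids parameter.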
Example:
Produces output:
std::vector<std::string> turi::gl_sgraph::get_fields | ( | ) | const |
Return the names of both vertex fields and edge fields in the graph.
virtual std::shared_ptr<unity_sgraph> turi::gl_sgraph::get_proxy | ( | ) | const |
Retrieves a pointer to the underlying unity_sgraph.
std::vector<flex_type_enum> turi::gl_sgraph::get_vertex_field_types | ( | ) | const |
Return the types of vertex fields in the graph.
std::vector<std::string> turi::gl_sgraph::get_vertex_fields | ( | ) | const |
Return the names of vertex fields in the graph.
gl_sframe turi::gl_sgraph::get_vertices | ( | const std::vector< flexible_type > & | ids = std::vector< flexible_type >(), |
const std::map< std::string, flexible_type > & | fields = std::map< std::string, flexible_type >() |
) | const |
Return a collection of vertices and their attributes.
ids | List of vertex IDs to retrieve. Only vertices in this list will be returned. |
fields | Dictionary specifying equality constraints on field values. For example, { {"gender", "M"} } returns only vertices whose 'gender' field is 'M'. FLEX_UNDEFINED can be used to designate a wild card. For example, { {"relationship", FLEX_UNDEFINED} } will find all vertices with the field 'relationship' regardless of the value. |
Example:
Produces output:
size_t turi::gl_sgraph::num_edges | ( | ) | const |
Return the number of edges in the graph.
size_t turi::gl_sgraph::num_vertices | ( | ) | const |
Return the number of vertices in the graph.
void turi::gl_sgraph::remove_edge_field | ( | const std::string & | field | ) |
Removes the edge field.
field | name of the field to be removed |
void turi::gl_sgraph::remove_vertex_field | ( | const std::string & | field | ) |
Removes the vertex field.
field | name of the field to be removed |
void turi::gl_sgraph::rename_edge_fields | ( | const std::vector< std::string > & | oldnames, |
const std::vector< std::string > & | newnames | ||
) |
Renames the edge fields.
oldnames | list of names of the fields to be renamed |
newnames | list of new names of the fields, aligned with oldnames. |
void turi::gl_sgraph::rename_vertex_fields | ( | const std::vector< std::string > & | oldnames, |
const std::vector< std::string > & | newnames | ||
) |
Renames the vertex fields.
oldnames | list of names of the fields to be renamed |
newnames | list of new names of the fields, aligned with oldnames. |
void turi::gl_sgraph::save | ( | const std::string & | directory | ) | const |
Save the sgraph into a directory.
void turi::gl_sgraph::save_reference | ( | const std::string & | directory | ) | const |
Save the sgraph using reference to other SFrames.
gl_sgraph turi::gl_sgraph::select_edge_fields | ( | const std::vector< std::string > & | fields | ) | const |
Return a new gl_sgraph with only the selected edge fields. Other edge fields are discarded, while fields that do not exist in the gl_sgraph are ignored. Vertex fields remain the same in the new graph.
fields | A list of field names to select. |
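The "select what exists, ignore the rest" rule can be sketched in a few lines of standalone C++. ToyColumns and toy_select_fields are hypothetical names, columns are simplified to vectors of double, and the sketch does not model the ID columns of a real graph.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

typedef std::map<std::string, std::vector<double> > ToyColumns;

// Return a new column set holding only the requested fields.
// Requested names that do not exist are skipped silently, and the
// input column set is left unchanged.
ToyColumns toy_select_fields(const ToyColumns& columns,
                             const std::vector<std::string>& fields) {
  ToyColumns out;
  for (const std::string& name : fields) {
    ToyColumns::const_iterator it = columns.find(name);
    if (it != columns.end()) out[name] = it->second;
  }
  assert(out.size() <= columns.size());
  return out;
}
```

Asking for "weight" and a nonexistent "missing" on a table with "weight" and "cost" columns yields a table with just "weight", and no error is raised for the unknown name.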
Example:
Produces output:
gl_sgraph turi::gl_sgraph::select_fields | ( | const std::vector< std::string > & | fields | ) | const |
gl_sgraph turi::gl_sgraph::select_vertex_fields | ( | const std::vector< std::string > & | fields | ) | const |
Return a new gl_sgraph with only the selected vertex fields. Other vertex fields are discarded, while fields that do not exist in the gl_sgraph are ignored. Edge fields remain the same in the new graph.
fields | A list of field names to select. |
Example:
Produces output:
std::map<std::string, flexible_type> turi::gl_sgraph::summary | ( | ) | const |
Return the number of vertices and edges as a dictionary.
Example:
Produces output:
gl_sgraph turi::gl_sgraph::triple_apply | ( | const lambda_triple_apply_fn & | lambda, |
const std::vector< std::string > & | mutated_fields | ||
) | const |
Applies a user-defined lambda function to each edge triple and returns the new graph.
An edge_triple is a simple struct containing source, edge and target, each of type std::map<std::string, flexible_type>. The lambda function is applied once to each edge_triple in parallel, with locking on both source and target vertices to prevent race conditions. The following pseudo code describes the effect of the function:
This function makes it straightforward to implement common graph computations such as degree_count, weighted_pagerank, and connected_component.
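The effect described above can be approximated with a sequential, standalone sketch. It is not the library's implementation: ToyFields, ToyTriple, ToyTripleEdge, ToyTripleGraph, and toy_triple_apply are hypothetical names, field values are simplified to double, and the loop runs on one thread, so the locking is omitted. The rule that only names listed in mutated_fields are copied back, and only onto records that already have that field, is an assumption about what the parameter means.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <string>
#include <vector>

typedef std::map<std::string, double> ToyFields;

// Hypothetical stand-in for edge_triple: copies of the three records.
struct ToyTriple {
  ToyFields source;
  ToyFields edge;
  ToyFields target;
};

struct ToyTripleEdge {
  std::size_t src;
  std::size_t dst;
  ToyFields fields;
};

struct ToyTripleGraph {
  std::vector<ToyFields> vertices;  // index = vertex ID
  std::vector<ToyTripleEdge> edges;
};

// Copy back only the listed fields, and only if the record already has them.
static void write_back(const ToyFields& from, ToyFields& to,
                       const std::vector<std::string>& mutated_fields) {
  for (const std::string& f : mutated_fields) {
    ToyFields::iterator dst = to.find(f);
    ToyFields::const_iterator src = from.find(f);
    if (dst != to.end() && src != from.end()) dst->second = src->second;
  }
}

// Sequential sketch: the real function is described as parallel with locking.
// Self-loops are not handled specially here.
ToyTripleGraph toy_triple_apply(const ToyTripleGraph& graph,
                                const std::function<void(ToyTriple&)>& fn,
                                const std::vector<std::string>& mutated_fields) {
  ToyTripleGraph result = graph;  // the input graph is left untouched
  for (ToyTripleEdge& e : result.edges) {
    assert(e.src < result.vertices.size() && e.dst < result.vertices.size());
    ToyTriple t{result.vertices[e.src], e.fields, result.vertices[e.dst]};
    fn(t);
    write_back(t.source, result.vertices[e.src], mutated_fields);
    write_back(t.target, result.vertices[e.dst], mutated_fields);
    write_back(t.edge, e.fields, mutated_fields);
  }
  return result;
}
```

A degree count then amounts to a lambda that increments a "degree" field on both t.source and t.target, called with mutated_fields set to {"degree"}; every vertex must already carry a "degree" field initialized to 0 for the write-back to take effect in this sketch.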
Example
Produces output:
gl_gframe turi::gl_sgraph::vertices | ( | ) |
Returns a convenient "SFrame like" handler for the vertices in this gl_sgraph.
While a regular gl_sframe is independent of any gl_sgraph, a gl_gframe is bound (or points) to a gl_sgraph. Modifying fields of the returned gl_gframe changes the vertex data of the gl_sgraph. Likewise, modifications to the fields in the gl_sgraph are reflected in the gl_gframe.
Example:
Produces output: