SmartGraphs
SmartGraphs are only available in the Enterprise Edition, also available as a managed service.
This chapter describes the smart-graph module, which enables you to manage graphs at scale. It gives a substantial performance benefit for all graphs sharded in an ArangoDB Cluster. On a single server there is no sharding and therefore nothing to gain from this feature, hence it is only available in cluster mode.
In terms of querying there is no difference between SmartGraphs and General Graphs. The former is a transparent replacement for the latter. For graph querying please refer to the AQL Graph Operations and General Graph Functions sections. The optimizer identifies on its own whether a graph is a SmartGraph or not.
The difference is only in the management section: creating and modifying the underlying collections of the graph. For a detailed API reference please refer to SmartGraph Management.
Do the hands-on ArangoDB SmartGraphs Tutorial to learn more.
What makes a graph smart?
Most graphs have one feature that divides the entire graph into several smaller subgraphs. These subgraphs have a large number of edges that only connect vertices in the same subgraph, and only a few edges that connect to vertices in other subgraphs.
Examples of such graphs are:
Social Networks: Typically the feature here is the region/country users live in. Every user typically has more contacts in the same region/country than they have in other regions/countries.
Transport Systems: Here the feature is also the region/country. There are many local connections but only a few across countries.
E-Commerce: In this case the product category is probably a good feature. Products of the same category are often bought together.
If this feature is known, SmartGraphs can make use of it.
When creating a SmartGraph you have to define a smartAttribute, which is the name of an attribute stored in every vertex. The graph will then be automatically sharded in such a way that all vertices with the same value are stored on the same physical machine. All edges connecting vertices with identical smartAttribute values are stored on this machine as well. At query time the query optimizer and the query executor both know exactly where every document is stored and can thereby minimize network overhead. Everything that can be computed locally will be computed locally.
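The placement rule can be illustrated with a small stand-alone JavaScript sketch. It is a toy model and not ArangoDB's actual sharding code: the hash function, the `shardFor` helper and the sample vertices are all assumptions made for illustration. It only demonstrates the property described above, namely that vertices sharing a smartAttribute value end up on the same shard.

```javascript
// Toy model of smartAttribute-based placement (NOT ArangoDB's real algorithm).
const NUMBER_OF_SHARDS = 9;

// Hypothetical string hash; any deterministic hash would serve the same purpose.
function hashString(s) {
  let h = 0;
  for (const ch of s) {
    h = (h * 31 + ch.codePointAt(0)) >>> 0; // keep the value an unsigned 32-bit integer
  }
  return h;
}

// Map a vertex to a shard number using only the value of its smartAttribute.
function shardFor(vertex, smartAttribute) {
  return hashString(String(vertex[smartAttribute])) % NUMBER_OF_SHARDS;
}

// Made-up sample vertices; "region" plays the role of the smartAttribute.
const vertices = [
  { name: "Alice", region: "US" },
  { name: "Bob", region: "US" },
  { name: "Chen", region: "CN" },
];

const placement = vertices.map((v) => ({ name: v.name, shard: shardFor(v, "region") }));
console.log(placement);
```

Because the shard depends only on the attribute value, Alice and Bob are always placed together, whatever their other attributes are.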
Benefits of SmartGraphs
Because of the guaranteed sharding described above, queries that only cover one subgraph perform almost as well as a purely local computation. Queries that cover more than one subgraph require some network overhead. The more subgraphs are touched, the more network cost applies. However, the overall performance is never worse than that of the same query using a General Graph.
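To make the cost argument more concrete, the following stand-alone sketch counts how many edges of a small sample graph stay inside one subgraph and how many cross a subgraph boundary. The data and the `countEdgeLocality` helper are hypothetical. The sketch only models the idea that edges between vertices with the same smartAttribute value can be followed locally, while the remaining edges involve the network.

```javascript
// Hypothetical sample data: vertex name -> value of the smartAttribute ("region").
const regionOf = new Map([
  ["alice", "US"],
  ["bob", "US"],
  ["chen", "CN"],
  ["dana", "CN"],
]);

// Made-up edges between the vertices above.
const edges = [
  { from: "alice", to: "bob" },
  { from: "chen", to: "dana" },
  { from: "bob", to: "chen" },
];

// Count edges whose endpoints share a region (local) and those that do not (remote).
function countEdgeLocality(edgeList, regions) {
  let local = 0;
  let remote = 0;
  for (const e of edgeList) {
    if (regions.get(e.from) === regions.get(e.to)) {
      local++;
    } else {
      remote++;
    }
  }
  return { local, remote };
}

console.log(countEdgeLocality(edges, regionOf)); // { local: 2, remote: 1 }
```

A traversal that only follows the two local edges never leaves its shard in this model. Only the edge from bob to chen would require communication between machines.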
Getting started
First of all, SmartGraphs cannot use existing collections. When switching to a SmartGraph from an existing dataset you have to import the data into a fresh SmartGraph. This switch can be easily achieved with arangodump and arangorestore. The only thing you have to change in this pipeline is that you create the new collections with the SmartGraph module before starting arangorestore.
Create a graph
In contrast to General Graphs we have to add more options when creating the graph. The two options smartGraphAttribute (the name of the smartAttribute) and numberOfShards are required and cannot be modified later.
arangosh> var graph_module = require("@arangodb/smart-graph");
arangosh> var graph = graph_module._create("myGraph", [], [], {smartGraphAttribute: "region", numberOfShards: 9});
arangosh> graph_module._graph("myGraph");
{[SmartGraph]
}
Add vertex collections
This is analogous to General Graphs. Unlike with General Graphs, the collections must not exist when creating the SmartGraph. The SmartGraph module will create them for you automatically to set up the sharding for all these collections correctly. However, if you create collections via the SmartGraph module and remove them from the graph definition, you may re-add them without trouble, as they will have the correct sharding.
arangosh> graph._addVertexCollection("shop");
arangosh> graph._addVertexCollection("customer");
arangosh> graph._addVertexCollection("pet");
arangosh> graph_module._graph("myGraph");
{[SmartGraph]
  "customer" : [ArangoCollection 2010166, "customer" (type document, status loaded)],
  "pet" : [ArangoCollection 2010176, "pet" (type document, status loaded)],
  "shop" : [ArangoCollection 2010156, "shop" (type document, status loaded)]
}
Define relations on the Graph
Adding edge collections works the same as with General Graphs, but again, the collections are created by the SmartGraph module to set up sharding correctly, so they must not exist when creating the SmartGraph (unless they have the correct sharding already).
arangosh> var rel = graph_module._relation("isCustomer", ["shop"], ["customer"]);
arangosh> graph._extendEdgeDefinitions(rel);
arangosh> graph_module._graph("myGraph");
{[SmartGraph]
  "isCustomer" : [ArangoCollection 2010216, "isCustomer" (type edge, status loaded)],
  "shop" : [ArangoCollection 2010186, "shop" (type document, status loaded)],
  "customer" : [ArangoCollection 2010196, "customer" (type document, status loaded)],
  "pet" : [ArangoCollection 2010206, "pet" (type document, status loaded)]
}
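An edge definition such as the one above names one edge collection together with the vertex collections its edges may start from and point to. The stand-alone sketch below models this with plain objects. The helpers `relation`, `collectionOf` and `isAllowed` are hypothetical and written only for illustration; they are not part of the SmartGraph module, and the document IDs are made up.

```javascript
// Plain-object stand-in for an edge definition (illustrative only).
function relation(collection, from, to) {
  return { collection, from, to };
}

// A document ID has the form "<collection>/<key>"; extract the collection part.
function collectionOf(id) {
  return id.split("/")[0];
}

// Check whether an edge's endpoints belong to the collections the definition permits.
function isAllowed(def, edge) {
  return def.from.includes(collectionOf(edge._from)) &&
         def.to.includes(collectionOf(edge._to));
}

const rel = relation("isCustomer", ["shop"], ["customer"]);
console.log(isAllowed(rel, { _from: "shop/s1", _to: "customer/c1" })); // true
console.log(isAllowed(rel, { _from: "pet/p1", _to: "customer/c1" }));  // false
```

In this model an edge from a pet to a customer is rejected because "pet" is not listed as a source collection of the isCustomer relation.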