Gremlin API

Gremlin is a language specialized for working with property graphs. It is part of the TinkerPop family of open source products.

To learn more about Gremlin and TinkerPop’s products, subscribe to the Gremlin Group.

Get Started

Launch the gremlin.sh (or gremlin.bat on Windows OS) console script located in the bin directory:

    > gremlin.bat
             \,,,/
             (o o)
    -----oOOo-(_)-oOOo-----

Open the graph database

Before using Gremlin you need a valid OrientGraph instance that points to an OrientDB database. For the available database types, see Storage types.

When you’re working with a local or an in-memory database, the database is created automatically if it does not exist. With a remote connection, you must create the database on the target server before using it. This is due to security restrictions.

After creating the OrientGraph instance with a proper URL, assign it to a variable. Gremlin is written in Groovy, so it supports all Groovy syntax, and the two can be mixed in the same script.

Example with a local database (see below for more information about it):

    gremlin> g = new OrientGraph("plocal:/home/gremlin/db/demo");
    ==>orientgraph[plocal:/home/gremlin/db/demo]

Working with a local database

This is the most commonly used mode. The console opens the database and locks it for exclusive use. It doesn’t require starting an OrientDB server.

    gremlin> g = new OrientGraph("plocal:/home/gremlin/db/demo");
    ==>orientgraph[plocal:/home/gremlin/db/demo]

Working with a remote database

To open a database on a remote server, make sure the server is up and running first. To start the server, launch the server.sh script (or server.bat on Windows). For more information, see OrientDB Server.

    gremlin> g = new OrientGraph("remote:localhost/demo");
    ==>orientgraph[remote:localhost/demo]

Working with an in-memory database

In this mode the database is volatile and no changes are persisted. Use it in a clustered configuration (where the cluster itself keeps the database alive) or just for testing.

    gremlin> g = new OrientGraph("memory:demo");
    ==>orientgraph[memory:demo]

Use security

OrientDB supports security through multiple users and roles, each associated with certain privileges. For details, see Security. To open the graph database as a user other than the default, pass the user name and password as additional parameters:

    gremlin> g = new OrientGraph("memory:demo", "reader", "reader");
    ==>orientgraph[memory:demo]

Create a new Vertex

To create a new vertex, use the addVertex() method. The vertex will be created and a unique id will be displayed as the return value.

    gremlin> g.addVertex();
    ==>v[#5:0]

Create an edge

To create a new edge between two vertices, use the addEdge(v1, v2, label) method. The edge will be created with the label specified.

In the example below two vertices are created and assigned to a variable (Gremlin is based on Groovy), then an edge is created between them.

    gremlin> v1 = g.addVertex();
    ==>v[#5:0]
    gremlin> v2 = g.addVertex();
    ==>v[#5:1]
    gremlin> e = g.addEdge(v1, v2, 'friend');
    ==>e[#6:0][#5:0-friend->#5:1]

Save changes

OrientDB assigns a temporary identifier to each vertex and edge that is created. To save them to the database, call stopTransaction(SUCCESS):

    gremlin> g.stopTransaction(SUCCESS)

Retrieve a vertex

To retrieve a vertex by its ID, use the v(id) method passing the RecordId as an argument (with or without the prefix ‘#’). This example retrieves the first vertex created in the above example.

    gremlin> g.v('5:0')
    ==>v[#5:0]
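
The "with or without the prefix" rule can be pictured as a normalization step applied to the ID string. The plain-Java sketch below is only an illustration of that idea: RecordIds and normalize are hypothetical names, not part of the OrientDB or TinkerPop API, and the sketch assumes the cluster:position form of the record IDs shown on this page.

```java
final class RecordIds {
    private RecordIds() {}

    // Hypothetical helper: returns the id in "#cluster:position" form,
    // accepting input with or without the leading '#'.
    static String normalize(String id) {
        String body = id.startsWith("#") ? id.substring(1) : id;
        String[] parts = body.split(":");
        if (parts.length != 2) {
            throw new IllegalArgumentException("Expected cluster:position, got: " + id);
        }
        // Parsing also validates that both parts are numeric
        int cluster = Integer.parseInt(parts[0]);
        long position = Long.parseLong(parts[1]);
        return "#" + cluster + ":" + position;
    }

    public static void main(String[] args) {
        System.out.println(normalize("5:0"));   // prints #5:0
        System.out.println(normalize("#5:0"));  // prints #5:0
    }
}
```

Both spellings map to the same record ID, which is why g.v('5:0') and g.v('#5:0') can be treated as equivalent lookups.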

Get all the vertices

To retrieve all the vertices in the opened graph use .V (V in upper-case):

    gremlin> g.V
    ==>v[#5:0]
    ==>v[#5:1]

Retrieve an edge

Retrieving an edge is very similar to retrieving a vertex. Use the e(id) method passing the RecordId as an argument (with or without the prefix ‘#’). This example retrieves the first edge created in the previous example.

    gremlin> g.e('6:0')
    ==>e[#6:0][#5:0-friend->#5:1]

Get all the edges

To retrieve all the edges in the opened graph use .E (E in upper-case):

    gremlin> g.E
    ==>e[#6:0][#5:0-friend->#5:1]

Traversal

The power of Gremlin is in traversal. Once you have a graph loaded in your database you can traverse it in many different ways.

Basic Traversal

To display all the outgoing edges of the first vertex created above, append .outE to the vertex:

    gremlin> v1.outE
    ==>e[#6:0][#5:0-friend->#5:1]

To display all the incoming edges of the second vertex created in the previous examples, append .inE to the vertex:

    gremlin> v2.inE
    ==>e[#6:0][#5:0-friend->#5:1]

In this case the edge is the same because it’s the outgoing edge of #5:0 and the incoming edge of #5:1.

For more information, see Basic Traversal with Gremlin.

Filter results

This example returns the outgoing edges of all the vertices, keeping only the edges whose label is ‘friend’.

    gremlin> g.V.outE('friend')
    ==>e[#6:0][#5:0-friend->#5:1]
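
To make the label filter concrete, here is a minimal plain-Java sketch of what the expression above computes. It is an in-memory model, not the Blueprints or OrientDB API: the Edge class, the outEdgesWithLabel method and the extra ‘colleague’ edge are all assumptions introduced for illustration, and the order of results in a real traversal may differ.

```java
import java.util.ArrayList;
import java.util.List;

final class LabelFilterSketch {
    // Minimal stand-in for a labeled, directed edge (not the Blueprints Edge interface).
    static final class Edge {
        final String out;   // id of the source vertex
        final String label;
        final String in;    // id of the target vertex

        Edge(String out, String label, String in) {
            this.out = out;
            this.label = label;
            this.in = in;
        }

        @Override
        public String toString() {
            return "e[" + out + "-" + label + "->" + in + "]";
        }
    }

    // Mimics g.V.outE(label): every edge is the outgoing edge of exactly one vertex,
    // so collecting the outgoing edges of all vertices amounts to scanning each edge once.
    static List<Edge> outEdgesWithLabel(List<Edge> edges, String label) {
        List<Edge> result = new ArrayList<>();
        for (Edge e : edges) {
            if (e.label.equals(label)) {
                result.add(e);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Edge> edges = new ArrayList<>();
        edges.add(new Edge("#5:0", "friend", "#5:1"));
        edges.add(new Edge("#5:1", "colleague", "#5:0")); // hypothetical extra edge
        System.out.println(outEdgesWithLabel(edges, "friend")); // prints [e[#5:0-friend->#5:1]]
    }
}
```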

Close the database

To close a graph use the shutdown() method:

    gremlin> g.shutdown()
    ==>null

This is not strictly necessary because OrientDB always closes the database when the Gremlin console quits.

Create complex paths

Gremlin allows you to concatenate expressions to create more complex traversals in a single line:

    v1.outE.inV

Of course this could be much more complex. Below is an example with the graph taken from the official documentation:

    g = new OrientGraph('memory:test')
    // calculate basic collaborative filtering for vertex 1
    m = [:]
    g.v(1).out('likes').in('likes').out('likes').groupCount(m)
    m.sort{a,b -> a.value <=> b.value}
    // calculate the primary eigenvector (eigenvector centrality) of a graph
    m = [:]; c = 0;
    g.V.out.groupCount(m).loop(2){c++ < 1000}
    m.sort{a,b -> a.value <=> b.value}
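
The collaborative filtering traversal can be easier to follow when spelled out as nested loops. The sketch below is a plain-Java model of out('likes').in('likes').out('likes').groupCount(m) followed by the ascending sort. It does not use the Gremlin or OrientDB API; the class and method names and the sample data (alice, bob, carol and items a to d) are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class CollaborativeFilteringSketch {
    // likes: user -> items that user likes (a stand-in for 'likes' edges).
    static Map<String, Integer> groupCount(Map<String, List<String>> likes, String user) {
        // Reverse index: item -> users who like it (the in('likes') direction)
        Map<String, List<String>> likedBy = new HashMap<>();
        for (Map.Entry<String, List<String>> e : likes.entrySet()) {
            for (String item : e.getValue()) {
                likedBy.computeIfAbsent(item, k -> new ArrayList<>()).add(e.getKey());
            }
        }
        Map<String, Integer> counts = new HashMap<>();
        for (String item : likes.getOrDefault(user, List.of())) {                // out('likes')
            for (String other : likedBy.getOrDefault(item, List.of())) {         // in('likes')
                for (String candidate : likes.getOrDefault(other, List.of())) {  // out('likes')
                    counts.merge(candidate, 1, Integer::sum);                    // groupCount(m)
                }
            }
        }
        return counts;
    }

    // Same ordering as m.sort{a,b -> a.value <=> b.value}: ascending by count.
    static Map<String, Integer> sortByValue(Map<String, Integer> counts) {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        entries.sort(Map.Entry.comparingByValue());
        Map<String, Integer> sorted = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> e : entries) {
            sorted.put(e.getKey(), e.getValue());
        }
        return sorted;
    }

    public static void main(String[] args) {
        Map<String, List<String>> likes = new HashMap<>();
        likes.put("alice", List.of("a", "b"));
        likes.put("bob", List.of("a", "c"));
        likes.put("carol", List.of("b", "c", "d"));
        // Counts for alice: a=3, b=3, c=2, d=1 (printed in ascending order of count)
        System.out.println(sortByValue(groupCount(likes, "alice")));
    }
}
```

As in the Gremlin expression, the starting user is also reached by the in('likes') step, so items the user already likes are counted too. A recommender built on this idea would typically filter those out afterwards.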

Passing input parameters

Some Gremlin expressions need input parameters to be declared before they can run. This is the case, for example, with bound variables, as described in JSR223 Gremlin Script Engine. OrientDB provides a mechanism to pass variables to a Gremlin pipeline declared in a command, as shown below:

    import java.util.HashMap;
    import java.util.Map;
    import com.orientechnologies.orient.core.sql.OCommandSQL;

    // db is an already opened OrientDB database instance
    Map<String, Object> params = new HashMap<String, Object>();
    params.put("map1", new HashMap());
    params.put("map2", new HashMap());
    db.command(new OCommandSQL("select gremlin('"
        + "current.as('id').outE.label.groupCount(map1).optional('id').sideEffect{map2=it.map();map2+=map1;}"
        + "')")).execute(params);

GremlinPipeline

You can also use the native Java GremlinPipeline, for example:

    new GremlinPipeline(g.getVertex(1)).out("knows").property("name")
        // keep only the names that start with "j"
        .filter(new PipeFunction<String, Boolean>() {
            public Boolean compute(String argument) {
                return argument.startsWith("j");
            }
        })
        // go back two steps, to the vertices whose names passed the filter
        .back(2).out("created");

For more information: Using Gremlin through Java

Declaring output

In the simplest case, the output of the last step (https://github.com/tinkerpop/gremlin/wiki/Gremlin-Steps) in the Gremlin pipeline corresponds to the output of the overall Gremlin expression. However, it is possible to instruct the Gremlin engine to consider any of the input variables as output. This can be declared as:

    // db is an already opened OrientDB database instance
    Map<String, Object> params = new HashMap<String, Object>();
    params.put("map1", new HashMap());
    params.put("map2", new HashMap());
    // "output" names the bound variable to return instead of the last step's result
    params.put("output", "map2");
    db.command(new OCommandSQL("select gremlin('"
        + "current.as('id').outE.label.groupCount(map1).optional('id').sideEffect{map2=it.map();map2+=map1;}"
        + "')")).execute(params);

There are other ways the output of a Gremlin pipeline could be defined, so this mechanism is expected to be extended in the future. To discuss customized outputs, contact the OrientDB mailing list.

Conclusions

Now you’ve learned how to use Gremlin on top of OrientDB. The best place to explore this language in more depth is the Gremlin Wiki.