Creating a new Op: Python implementation
So suppose you have looked through the library documentation and you don’t seea function that does what you want.
If you can implement something in terms of existing Ops, you should do that.Odds are your function that uses existing Theano expressions is short,has no bugs, and potentially profits from optimizations that have already beenimplemented.
However, if you cannot implement an Op in terms of existing Ops, you have towrite a new one. Don’t worry, Theano was designed to make it easy to add newOps, Types, and Optimizations.
As an illustration, this tutorial shows how to write a simple Python-basedoperations which performs operations onType, double<Double>
… It also shows how to implement tests that.. ensure the proper working of an op.
Note
This is an introductury tutorial and as such it does not cover how to makean op that returns a view or modifies the values in its inputs. Thus, allops created with the instructions described here MUST return newlyallocated memory or reuse the memory provided in the parameteroutput_storage
of the perform()
function. SeeViews and inplace operations for an explanation on how to do this.
If your op returns a view or changes the value of its inputswithout doing as prescribed in that page, Theano will run, but willreturn correct results for some graphs and wrong results for others.
It is recommended that you run your tests in DebugMode (Theano flagmode=DebugMode
) since it verifies if your op behaves correctly in thisregard.
Theano Graphs refresher
Theano represents symbolic mathematical computations as graphs. Those graphsare bi-partite graphs (graphs with 2 types of nodes), they are composed ofinterconnected Apply and Variable nodes.Variable nodes represent data in the graph, either inputs, outputs orintermediary values. As such, Inputs and Outputs of a graph are lists of TheanoVariable nodes. Apply nodes perform computation on thesevariables to produce new variables. Each Apply node has a link to aninstance of Op which describes the computation to perform. This tutorialdetails how to write such an Op instance. Please refers toGraph Structures for a more detailed explanation about the graphstructure.
Op’s basic methods
An op is any Python object which inherits from gof.Op
.This section provides an overview of the basic methods you typically have toimplement to make a new op. It does not provide extensive coverage of all thepossibilities you may encounter or need. For that refer toOp’s contract.
- import theano
- class MyOp(theano.Op):
- # Properties attribute
- __props__ = ()
- #itypes and otypes attributes are
- #compulsory if make_node method is not defined.
- #They're the type of input and output respectively
- itypes = None
- otypes = None
- #Compulsory if itypes and otypes are not defined
- def make_node(self, *inputs):
- pass
- # Python implementation:
- def perform(self, node, inputs_storage, output_storage):
- pass
- # Other type of implementation
- # C implementation: [see theano web site for other functions]
- def c_code(self, node, inputs, outputs, sub):
- pass
- # Other implementations (pycuda, ...):
- def make_thunk(self, node, storage_map, _, _2, impl=None):
- pass
- # optional:
- check_input = True
- def __init__(self, *args):
- pass
- def grad(self, inputs, g):
- pass
- def R_op(self, inputs, eval_points):
- pass
- def infer_shape(node, input_shapes):
- pass
An op has to implement some methods defined in the the interface ofgof.Op
. More specifically, it is mandatory for an op to define eitherthe method make_node()
or itypes
, otypes
and one of theimplementation methods, either perform()
, Op.c_code()
or make_thunk()
.
make_node()
method creates an Apply node representing the application of the op on the inputs provided. This method is reponsible for three things:
- it first checks that the input Variables types are compatible with the current op. If the op cannot be applied on the provided input types, it must raises an exception (such as
TypeError
).- it operates on the Variables found in
*inputs
in Theano’s symbolic language to infer the type of the symbolic output Variables. It creates output Variables of a suitable symbolic Type to serve as the outputs of this op’s application.- it creates an Apply instance with the input and output Variable, and return the Apply instance.
perform()
method defines the Python implementation of an op. It takes several arguments:
node
is a reference to an Apply node which was previously obtained via theOp
‘smake_node()
method. It is typically not used in simple ops, but it contains symbolic information that could be required for complex ops.inputs
is a list of references to data which can be operated on using non-symbolic statements, (i.e., statements in Python, Numpy).output_storage
is a list of storage cells where the output is to be stored. There is one storage cell for each output of the op. The data put inoutput_storage
must match the type of the symbolic output. It is forbidden to change the length of the list(s) contained inoutput_storage
. A function Mode may allowoutput_storage
elements to persist between evaluations, or it may resetoutput_storage
cells to hold a value ofNone
. It can also pre-allocate some memory for the op to use. This feature can allowperform
to reuse memory between calls, for example. If there is something preallocated in theoutput_storage
, it will be of the good dtype, but can have the wrong shape and have any stride pattern.
perform()
method must be determined by the inputs. That is to say, when applied to identical inputs the method must return the same outputs.
gof.Op
allows some other way to define the op implentation. For instance, it is possible to defineOp.c_code()
to provide a C-implementation to the op. Please refers to tutorial Extending Theano with a C Op for a description ofOp.c_code()
and other related c_methods. Note that an op can provide both Python and C implementation.
make_thunk()
method is another alternative toperform()
. It returns a thunk. A thunk is defined as a zero-arguments function which encapsulates the computation to be performed by an op on the arguments of its corresponding node. It takes several parameters:
node
is the Apply instance for which a thunk is requested,storage_map
is a dict of lists which maps variables to a one-element lists holding the variable’s current value. The one-element list acts as pointer to the value and allows sharing that “pointer” with other nodes and instances.compute_map
is also a dict of lists. It maps variables to one-element lists holding booleans. If the value is 0 then the variable has not been computed and the value should not be considered valid. If the value is 1 the variable has been computed and the value is valid. If the value is 2 the variable has been garbage-collected and is no longer valid, but shouldn’t be required anymore for this call. The returned function must ensure that it sets the computed variables as computed in the compute_map.impl
allow to select between multiple implementation. It should have a default value of None.
make_thunk()
is useful if you want to generate code and compile it yourself. For example, this allows you to use PyCUDA to compile GPU code and keep state in the thunk.If
make_thunk()
is defined by an op, it will be used by Theano to obtain the op’s implementation.perform()
andOp.c_code()
will be ignored.If
make_node()
is not defined, theitypes
andotypes
are used by the Op’smake_node()
method to implement the functionality ofmake_node()
method mentioned above.
Op’s auxiliary methods
There are other methods that can be optionally defined by the op:
The
str()
method provides a meaningful string representation of your op.
eq()
andhash()
define respectivelly equality between two ops and the hash of an op instance. They will be used by the optimization phase to merge nodes that are doing equivalent computations (same inputs, same operation). Two ops that are equal accordingeq()
should return the same output when they are applied on the same inputs.The
props
lists the properties that influence how the computation is performed (Ususally these are those that you set ininit()
). It must be a tuple. If you don’t have any properties, then you should set this attribute to the emtpy tuple ().
props
enables the automatic generation of appropriateeq()
andhash()
. Given the methodeq()
, automatically generated fromprops
, two ops will be equal if they have the same values for all the properties listed inprops
. Given to the methodhash()
automatically generated fromprops
, two ops will be have the same hash if they have the same values for all the properties listed inprops
.props
will also generate a suitablestr()
for your op. This requires development version after September 1st, 2014 or version 0.7.The
infer_shape()
method allows to infer the shape of the op output variables, without actually computing the outputs. It takes as inputnode
, a reference to the op Apply node, and a list of Theano symbolic Varables (i0_shape
,i1_shape
, …) which are the shape of the op input Variables.infer_shape()
returns a list where each element is a tuple representing the shape of one output. This could be helpful if one only needs the shape of the output instead of the actual outputs, which can be useful, for instance, for optimization procedures.The
grad()
method is required if you want to differentiate some cost whose expression includes your op. The gradient may be specified symbolically in this method. It takes two argumentsinputs
andoutput_gradients
which are both lists of symbolic Theano Variables and those must be operated on using Theano’s symbolic language. The grad method must return a list containing one Variable for each input. Each returned Variable represents the gradient with respect to that input computed based on the symbolic gradients with respect to each output. If the output is not differentiable with respect to an input then this method should be defined to return a variable of type NullType for that input. Likewise, if you have not implemented the grad computation for some input, you may return a variable of type NullType for that input. Please refer tograd()
for a more detailed view.The
R_op()
method is needed if you wanttheano.tensor.Rop
to work with your op. This function implements the application of the R-operator on the function represented by your op. Let assume that function is , with input , applying the R-operator means computing the Jacobian of and right-multiplying it by , the evaluation point, namely: .The optional boolean
check_input
attribute is used to specify if you want the types used in your op to check their inputs in their c_code. It can be used to speed up compilation, reduce overhead (particularly for scalars) and reduce the number of generated C files.
Example: Op definition
- import theano
- #Using make_node
- class DoubleOp1(theano.Op):
- __props__ = ()
- def make_node(self, x):
- x = theano.tensor.as_tensor_variable(x)
- # Note: using x_.type() is dangerous, as it copies x's broadcasting
- # behaviour
- return theano.Apply(self, [x], [x.type()])
- def perform(self, node, inputs, output_storage):
- x = inputs[0]
- z = output_storage[0]
- z[0] = x * 2
- def infer_shape(self, node, i0_shapes):
- return i0_shapes
- def grad(self, inputs, output_grads):
- return [output_grads[0] * 2]
- def R_op(self, inputs, eval_points):
- # R_op can receive None as eval_points.
- # That mean there is no diferientiable path through that input
- # If this imply that you cannot compute some outputs,
- # return None for those.
- if eval_points[0] is None:
- return eval_points
- return self.grad(inputs, eval_points)
- doubleOp1 = DoubleOp1()
- #Using itypes and otypes
- class DoubleOp2(theano.Op):
- __props__ = ()
- itypes = [theano.tensor.dmatrix]
- otypes = [theano.tensor.dmatrix]
- def perform(self, node, inputs, output_storage):
- x = inputs[0]
- z = output_storage[0]
- z[0] = x * 2
- def infer_shape(self, node, i0_shapes):
- return i0_shapes
- def grad(self, inputs, output_grads):
- return [output_grads[0] * 2]
- def R_op(self, inputs, eval_points):
- # R_op can receive None as eval_points.
- # That mean there is no diferientiable path through that input
- # If this imply that you cannot compute some outputs,
- # return None for those.
- if eval_points[0] is None:
- return eval_points
- return self.grad(inputs, eval_points)
- doubleOp2 = DoubleOp2()
At a high level, the code fragment declares a class (e.g., DoubleOp1
) and thencreates one instance of it (e.g., doubleOp1
).
We often gloss over this distinction, but will be precise here:doubleOp1
(the instance) is an Op, not DoubleOp1
(the class which is asubclass of theano.Op
). You can call doubleOp1(tensor.vector())
on aVariable to build an expression, and in the expression there will bea .op
attribute that refers to doubleOp1
.
The make_node
method creates a node to be included in the expression graph.It runs when we apply our Op (doubleOp1
) to the Variable (x
), asin doubleOp1(tensor.vector())
.When an Op has multiple inputs, their order in the inputs argument to Apply
is important: Theano will call make_node(*inputs)
to copy the graph,so it is important not to change the semantics of the expression by changingthe argument order.
All the inputs
and outputs
arguments to Apply
must be Variables.A common and easy way to ensure inputs are variables is to run them throughas_tensor_variable
. This function leaves TensorType variables alone, raisesan error for non-TensorType variables, and copies any numpy.ndarray
intothe storage for a TensorType Constant. The make_node
method dictates theappropriate Type for all output variables.
The perform
method implements the Op’s mathematical logic in Python.The inputs (here x
) are passed by value, but a single output is returnedindirectly as the first element of single-element lists. If doubleOp1
hada second output, it would be stored in output_storage[1][0]
.
In some execution modes, the output storage might contain the return value ofa previous call. That old value can be reused to avoid memory re-allocation,but it must not influence the semantics of the Op output.
You can try the new Op as follows:
- import theano
- x = theano.tensor.matrix()
- f = theano.function([x], DoubleOp1()(x))
- import numpy
- inp = numpy.random.rand(5, 4)
- out = f(inp)
- assert numpy.allclose(inp * 2, out)
- print(inp)
- print(out)
- [[ 0.08257206 0.34308357 0.5288043 0.06582951]
- [ 0.65977826 0.10040307 0.5402353 0.55472296]
- [ 0.82358552 0.29502171 0.97387481 0.0080757 ]
- [ 0.77327215 0.65401857 0.76562992 0.94145702]
- [ 0.8452076 0.30500101 0.88430501 0.95818655]]
- [[ 0.16514411 0.68616713 1.0576086 0.13165902]
- [ 1.31955651 0.20080613 1.08047061 1.10944593]
- [ 1.64717104 0.59004341 1.94774962 0.0161514 ]
- [ 1.5465443 1.30803715 1.53125983 1.88291403]
- [ 1.6904152 0.61000201 1.76861002 1.9163731 ]]
- import theano
- x = theano.tensor.matrix()
- f = theano.function([x], DoubleOp2()(x))
- import numpy
- inp = numpy.random.rand(5, 4)
- out = f(inp)
- assert numpy.allclose(inp * 2, out)
- print(inp)
- print(out)
- [[ 0.02443785 0.67833979 0.91954769 0.95444365]
- [ 0.60853382 0.7770539 0.78163219 0.92838837]
- [ 0.04427765 0.37895602 0.23155797 0.4934699 ]
- [ 0.20551517 0.7419955 0.34500905 0.49347629]
- [ 0.24082769 0.49321452 0.24566545 0.15351132]]
- [[ 0.04887571 1.35667957 1.83909538 1.90888731]
- [ 1.21706764 1.55410779 1.56326439 1.85677674]
- [ 0.08855531 0.75791203 0.46311594 0.9869398 ]
- [ 0.41103034 1.48399101 0.69001811 0.98695258]
- [ 0.48165539 0.98642904 0.4913309 0.30702264]]
Example: props definition
We can modify the previous piece of code in order to demonstratethe usage of the props
attribute.
We create an Op that takes a variable x
and returns a*x+b
.We want to say that two such ops are equal when their values of a
and b
are equal.
- import theano
- class AXPBOp(theano.Op):
- """
- This creates an Op that takes x to a*x+b.
- """
- __props__ = ("a", "b")
- def __init__(self, a, b):
- self.a = a
- self.b = b
- super(AXPBOp, self).__init__()
- def make_node(self, x):
- x = theano.tensor.as_tensor_variable(x)
- return theano.Apply(self, [x], [x.type()])
- def perform(self, node, inputs, output_storage):
- x = inputs[0]
- z = output_storage[0]
- z[0] = self.a * x + self.b
- def infer_shape(self, node, i0_shapes):
- return i0_shapes
- def grad(self, inputs, output_grads):
- return [a * output_grads[0] + b]
The use of props
savesthe user the trouble of implementing eq()
and hash()
manually. It also generates a default str()
method that prints theattribute names and their values.
We can test this by running the following segment:
- mult4plus5op = AXPBOp(4, 5)
- another_mult4plus5op = AXPBOp(4, 5)
- mult2plus3op = AXPBOp(2, 3)
- assert mult4plus5op == another_mult4plus5op
- assert mult4plus5op != mult2plus3op
- x = theano.tensor.matrix()
- f = theano.function([x], mult4plus5op(x))
- g = theano.function([x], mult2plus3op(x))
- import numpy
- inp = numpy.random.rand(5, 4).astype(numpy.float32)
- assert numpy.allclose(4 * inp + 5, f(inp))
- assert numpy.allclose(2 * inp + 3, g(inp))
How To Test it
Theano has some functionalities to simplify testing. These help test theinfer_shape
, grad
and R_op
methods. Put the following codein a file and execute it with the theano-nose
program.
Basic Tests
Basic tests are done by you just by using the op and checking that itreturns the right answer. If you detect an error, you must raise anexception. You can use the assert
keyword to automatically raise anAssertionError
.
- import numpy
- import theano
- from theano.tests import unittest_tools as utt
- from theano import config
- class test_Double(utt.InferShapeTester):
- def setUp(self):
- super(test_Double, self).setUp()
- self.op_class = DoubleOp
- self.op = DoubleOp()
- def test_basic(self):
- x = theano.tensor.matrix()
- f = theano.function([x], self.op(x))
- inp = numpy.asarray(numpy.random.rand(5, 4), dtype=config.floatX)
- out = f(inp)
- # Compare the result computed to the expected value.
- utt.assert_allclose(inp * 2, out)
We call utt.assert_allclose(expected_value, value)
to compareNumPy ndarray.This raise an error message with more information. Also,the default tolerance can be changed with the Theano flagsconfig.tensor.cmp_sloppy
that take values in 0, 1 and 2. Thedefaul value do the most strict comparison, 1 and 2 make less strictcomparison.
Testing the infer_shape
When a class inherits from the InferShapeTester
class, it gets theself._compile_and_check
method that tests the op’s infer_shape
method. It tests that the op gets optimized out of the graph if onlythe shape of the output is needed and not the outputitself. Additionally, it checks that the optimized graph computesthe correct shape, by comparing it to the actual shape of the computedoutput.
self._compile_and_check
compiles a Theano function. It takes asparameters the lists of input and output Theano variables, as would beprovided to theano.function
, and a list of real values to pass to thecompiled function. It also takes the op class as a parameterin order to verify that no instance of it appears in the shape-optimized graph.
If there is an error, the function raises an exception. If you want tosee it fail, you can implement an incorrect infer_shape
.
When testing with input values with shapes that take the same valueover different dimensions (for instance, a square matrix, or a tensor3with shape (n, n, n), or (m, n, m)), it is not possible to detect ifthe output shape was computed correctly, or if some shapes with thesame value have been mixed up. For instance, if the infer_shape usesthe width of a matrix instead of its height, then testing with onlysquare matrices will not detect the problem. This is why theself._compile_and_check
method prints a warning in such a case. Ifyour op works only with such matrices, you can disable the warning with thewarn=False
parameter.
- from theano.tests import unittest_tools as utt
- from theano import config
- class test_Double(utt.InferShapeTester):
- # [...] as previous tests.
- def test_infer_shape(self):
- x = theano.tensor.matrix()
- self._compile_and_check([x], # theano.function inputs
- [self.op(x)], # theano.function outputs
- # Always use not square matrix!
- # inputs data
- [numpy.asarray(numpy.random.rand(5, 4),
- dtype=config.floatX)],
- # Op that should be removed from the graph.
- self.op_class)
Testing the gradient
The function verify_gradverifies the gradient of an op or Theano graph. It compares theanalytic (symbolically computed) gradient and the numericgradient (computed through the Finite Difference Method).
If there is an error, the function raises an exception. If you want tosee it fail, you can implement an incorrect gradient (for instance, by removingthe multiplication by 2).
- def test_grad(self):
- theano.tests.unittest_tools.verify_grad(self.op,
- [numpy.random.rand(5, 7, 2)])
Testing the Rop
The class RopLop_checker
defines the functionsRopLop_checker.check_mat_rop_lop()
, RopLop_checker.check_rop_lop()
andRopLop_checker.check_nondiff_rop()
. These allow to test theimplementation of the Rop method of a particular op.
For instance, to verify the Rop method of the DoubleOp, you can use this:
- import numpy
- import theano.tests
- from theano.tests.test_rop import RopLop_checker
- class test_DoubleRop(RopLop_checker):
- def setUp(self):
- super(test_DoubleRop, self).setUp()
- def test_double_rop(self):
- self.check_rop_lop(DoubleRop()(self.x), self.in_shape)
Testing GPU Ops
When using the old GPU backend, Ops to be executed on the GPU should inheritfrom theano.sandbox.cuda.GpuOp
and not theano.Op
. This allowsTheano to distinguish them. Currently, we use this to test if theNVIDIA driver works correctly with our sum reduction code on the GPU.
Running Your Tests
To perform your tests, you may select either one of the threefollowing methods:
theano-nose
The method of choice to conduct tests is to run the filetheano-nose
. In a regular Theano installation, the latter will beon the operating system’s path and directly accessible from anyfolder. Otherwise, it can be accessed in the Theano/bin
folder. The following command lines may be used for the correspondingpurposes:
theano-nose —theano
: Run every test found in Theano’s path.theano-nose foldername
: Run every test found in the folder _folder_name.theano-nose testfile.py
: Run every test found in the file _test_file.py.
The following are particularly useful for development purposes sincethey call for particular classes or even for particular tests:
theano-nose testfile.py:test_DoubleRop
: Run every test found inside theclass _test_DoubleRop.theano-nose testfile.py:test_DoubleRop.test_double_op
: Run only the test_test_double_op in the class test_DoubleRop.
Help with the use and functionalities of theano-nose
may beobtained by running it with the command line parameter —help
(-h)
.
nosetests
The command nosetests
can also be used. Although it lacks theuseful functionalities that theano-nose
provides, nosetests
can be called similarly to theano-nose
from any folder in Python’spath like so:
nosetests [suffix similar to the above]
.
More documentation on nosetests
is available here:nosetests.
In-file
One may also add a block of code similar to the following at the endof the file containing a specific test of interest and run thefile. In this example, the test test_DoubleRop in the classtest_double_op would be performed.
- if __name__ == '__main__':
- t = test_DoubleRop("test_double_rop")
- t.setUp()
- t.test_double_rop()
We recommend that when we execute a file, we run all tests in thatfile. This can be done by adding this at the end of your test files:
- if __name__ == '__main__':
- unittest.main()
Exercise
Run the code of the DoubleOp example above.
Modify and execute to compute: x * y.
Modify and execute the example to return two outputs: x + y and x - y.
You can omit the Rop functions. Try to implement the testing apparatusdescribed above.
(Notice that Theano’s current elemwise fusion optimization isonly applicable to computations involving a single output. Hence, to gainefficiency over the basic solution that is asked here, the two operations wouldhave to be jointly optimized explicitly in the code.)
Random numbers in tests
Making tests errors more reproducible is a good practice. To make yourtests more reproducible, you need a way to get the same randomnumbers. You can do this by seeding NumPy’s random numbergenerator.
For convenience, the classes InferShapeTester and RopLop_checkeralready do this for you. If you implement your own setUp
function,don’t forget to call the parent setUp
function.
For more details see Using Random Values in Test Cases.
as_op
as_op is a python decorator that converts a python function into abasic Theano op that will call the supplied function during execution.
This isn’t the recommended way to build an op, but allows for a quickimplementation.
It takes an optional infer_shape()
parameter that must have thissignature:
- def infer_shape(node, input_shapes):
- # ...
- return output_shapes
- - `input_shapes` and `output_shapes` are lists of tuples that
- represent the shape of the corresponding inputs/outputs.
Note
Not providing the infer_shape method prevents shape-relatedoptimizations from working with this op. For exampleyour_op(inputs, …).shape will need the op to be executed justto get the shape.
Note
As no grad is defined, this means you won’t be able todifferentiate paths that include this op.
Note
It converts the Python function to a callable object that takes asinputs Theano variables that were declared.
Note
The python function wrapped by the as_op decorator needs to return a newdata allocation, no views or in place modification of the input.
as_op Example
- import theano
- import numpy
- from theano import function
- from theano.compile.ops import as_op
- def infer_shape_numpy_dot(node, input_shapes):
- ashp, bshp = input_shapes
- return [ashp[:-1] + bshp[-1:]]
- @as_op(itypes=[theano.tensor.fmatrix, theano.tensor.fmatrix],
- otypes=[theano.tensor.fmatrix], infer_shape=infer_shape_numpy_dot)
- def numpy_dot(a, b):
- return numpy.dot(a, b)
You can try it as follows:
- x = theano.tensor.fmatrix()
- y = theano.tensor.fmatrix()
- f = function([x, y], numpy_dot(x, y))
- inp1 = numpy.random.rand(5, 4).astype('float32')
- inp2 = numpy.random.rand(4, 7).astype('float32')
- out = f(inp1, inp2)
Exercise
Run the code of the numpy_dot example above.
Modify and execute to compute: numpy.add and numpy.subtract.
- Modify and execute the example to return two outputs: x + y
- and x - y.
Documentation and Coding Style
Please always respect the Requirements for Quality Contributions or your contributionwill not be accepted.
NanGuardMode and AllocEmpty
NanGuardMode help users find where in the graph NaN appear. Butsometimes, we want some variables to not be checked. For example, inthe old GPU back-end, we use a float32 CudaNdarray to store the MRGrandom number generator state (they are integers). So if NanGuardModecheck it, it will generate false positive. Another case is related to[Gpu]AllocEmpty or some computation on it (like done by Scan).
You can tell NanGuardMode to do not check a variable with:variable.tag.nan_guard_mode_check
. Also, this tag automaticallyfollow that variable during optimization. This mean if you tag avariable that get replaced by an inplace version, it will keep thattag.
Final Note
A more extensive discussion of this section’s content may be found inthe advanced tutorial Extending Theano.
The section Other ops includes more instructions forthe following specific cases: