- Debugging Theano: FAQ and Troubleshooting
- Isolating the Problem/Testing Theano Compiler
- Interpreting Error Messages
- Using Test Values
- “How do I Print an Intermediate Value in a Function?”
- “How do I Print a Graph?” (before or after compilation)
- “The Function I Compiled is Too Slow, what’s up?”
- “Why does my GPU function seem to be slow?”
- “How do I Step through a Compiled Function?”
- How to Use pdb
- Dumping a Function to help debug
- Breakpoint during Theano function execution
Debugging Theano: FAQ and Troubleshooting
There are many kinds of bugs that might come up in a computer program.This page is structured as a FAQ. It provides recipes to tackle commonproblems, and introduces some of the tools that we use to find problems in ourown Theano code, and even (it happens) in Theano’s internals, inUsing DebugMode.
Isolating the Problem/Testing Theano Compiler
You can run your Theano function in a DebugMode.This tests the Theano optimizations and helps to find where NaN, inf and other problems come from.
Interpreting Error Messages
Even in its default configuration, Theano tries to display useful errormessages. Consider the following faulty code.
- import numpy as np
- import theano
- import theano.tensor as T
- x = T.vector()
- y = T.vector()
- z = x + x
- z = z + y
- f = theano.function([x, y], z)
- f(np.ones((2,)), np.ones((3,)))
Running the code above we see:
- Traceback (most recent call last):
- ...
- ValueError: Input dimension mis-match. (input[0].shape[0] = 3, input[1].shape[0] = 2)
- Apply node that caused the error: Elemwise{add,no_inplace}(<TensorType(float64, vector)>, <TensorType(float64, vector)>, <TensorType(float64, vector)>)
- Inputs types: [TensorType(float64, vector), TensorType(float64, vector), TensorType(float64, vector)]
- Inputs shapes: [(3,), (2,), (2,)]
- Inputs strides: [(8,), (8,), (8,)]
- Inputs scalar values: ['not scalar', 'not scalar', 'not scalar']
- HINT: Re-running with most Theano optimization disabled could give you a back-traces when this node was created. This can be done with by setting the Theano flags 'optimizer=fast_compile'. If that does not work, Theano optimization can be disabled with 'optimizer=None'.
- HINT: Use the Theano flag 'exception_verbosity=high' for a debugprint of this apply node.
Arguably the most useful information is approximately half-way throughthe error message, where the kind of error is displayed along with itscause (ValueError: Input dimension mis-match. (input[0].shape[0] = 3,input[1].shape[0] = 2).Below it, some other information is given, such as the apply node thatcaused the error, as well as the input types, shapes, strides andscalar values.
The two hints can also be helpful when debugging. Using the theano flagoptimizer=fast_compile
or optimizer=None
can often tell youthe faulty line, while exception_verbosity=high
will display adebugprint of the apply node. Using these hints, the end of the errormessage becomes :
- Backtrace when the node is created:
- File "test0.py", line 8, in <module>
- z = z + y
- Debugprint of the apply node:
- Elemwise{add,no_inplace} [id A] <TensorType(float64, vector)> ''
- |Elemwise{add,no_inplace} [id B] <TensorType(float64, vector)> ''
- | |<TensorType(float64, vector)> [id C] <TensorType(float64, vector)>
- | |<TensorType(float64, vector)> [id C] <TensorType(float64, vector)>
- |<TensorType(float64, vector)> [id D] <TensorType(float64, vector)>
We can here see that the error can be traced back to the line z = z + y
.For this example, using optimizer=fast_compile
worked. If it did not,you could set optimizer=None
or use test values.
Using Test Values
As of v.0.4.0, Theano has a new mechanism by which graphs are executedon-the-fly, before a theano.function
is ever compiled. Since optimizationshaven’t been applied at this stage, it is easier for the user to locate thesource of some bug. This functionality is enabled through the config flagtheano.config.compute_test_value
. Its use is best shown through thefollowing example. Here, we use exception_verbosity=high
andoptimizer=fast_compile
, which would not tell you the line at fault.optimizer=None
would and it could therefore be used instead of test values.
- import numpy
- import theano
- import theano.tensor as T
- # compute_test_value is 'off' by default, meaning this feature is inactive
- theano.config.compute_test_value = 'off' # Use 'warn' to activate this feature
- # configure shared variables
- W1val = numpy.random.rand(2, 10, 10).astype(theano.config.floatX)
- W1 = theano.shared(W1val, 'W1')
- W2val = numpy.random.rand(15, 20).astype(theano.config.floatX)
- W2 = theano.shared(W2val, 'W2')
- # input which will be of shape (5,10)
- x = T.matrix('x')
- # provide Theano with a default test-value
- #x.tag.test_value = numpy.random.rand(5, 10)
- # transform the shared variable in some way. Theano does not
- # know off hand that the matrix func_of_W1 has shape (20, 10)
- func_of_W1 = W1.dimshuffle(2, 0, 1).flatten(2).T
- # source of error: dot product of 5x10 with 20x10
- h1 = T.dot(x, func_of_W1)
- # do more stuff
- h2 = T.dot(h1, W2.T)
- # compile and call the actual function
- f = theano.function([x], h2)
- f(numpy.random.rand(5, 10))
Running the above code generates the following error message:
- Traceback (most recent call last):
- File "test1.py", line 31, in <module>
- f(numpy.random.rand(5, 10))
- File "PATH_TO_THEANO/theano/compile/function_module.py", line 605, in __call__
- self.fn.thunks[self.fn.position_of_error])
- File "PATH_TO_THEANO/theano/compile/function_module.py", line 595, in __call__
- outputs = self.fn()
- ValueError: Shape mismatch: x has 10 cols (and 5 rows) but y has 20 rows (and 10 cols)
- Apply node that caused the error: Dot22(x, DimShuffle{1,0}.0)
- Inputs types: [TensorType(float64, matrix), TensorType(float64, matrix)]
- Inputs shapes: [(5, 10), (20, 10)]
- Inputs strides: [(80, 8), (8, 160)]
- Inputs scalar values: ['not scalar', 'not scalar']
- Debugprint of the apply node:
- Dot22 [id A] <TensorType(float64, matrix)> ''
- |x [id B] <TensorType(float64, matrix)>
- |DimShuffle{1,0} [id C] <TensorType(float64, matrix)> ''
- |Flatten{2} [id D] <TensorType(float64, matrix)> ''
- |DimShuffle{2,0,1} [id E] <TensorType(float64, 3D)> ''
- |W1 [id F] <TensorType(float64, 3D)>
- HINT: Re-running with most Theano optimization disabled could give you a back-traces when this node was created. This can be done with by setting the Theano flags 'optimizer=fast_compile'. If that does not work, Theano optimization can be disabled with 'optimizer=None'.
If the above is not informative enough, by instrumenting the code everso slightly, we can get Theano to reveal the exact source of the error.
- # enable on-the-fly graph computations
- theano.config.compute_test_value = 'warn'
- ...
- # input which will be of shape (5, 10)
- x = T.matrix('x')
- # provide Theano with a default test-value
- x.tag.test_value = numpy.random.rand(5, 10)
In the above, we are tagging the symbolic matrix x with a special testvalue. This allows Theano to evaluate symbolic expressions on-the-fly (bycalling the perform
method of each op), as they are being defined. Sourcesof error can thus be identified with much more precision and much earlier inthe compilation pipeline. For example, running the above code yields thefollowing error message, which properly identifies line 24 as the culprit.
- Traceback (most recent call last):
- File "test2.py", line 24, in <module>
- h1 = T.dot(x, func_of_W1)
- File "PATH_TO_THEANO/theano/tensor/basic.py", line 4734, in dot
- return _dot(a, b)
- File "PATH_TO_THEANO/theano/gof/op.py", line 545, in __call__
- required = thunk()
- File "PATH_TO_THEANO/theano/gof/op.py", line 752, in rval
- r = p(n, [x[0] for x in i], o)
- File "PATH_TO_THEANO/theano/tensor/basic.py", line 4554, in perform
- z[0] = numpy.asarray(numpy.dot(x, y))
- ValueError: matrices are not aligned
The compute_test_value
mechanism works as follows:
- Theano
constants
andshared
variables are used as is. No need to instrument them. - A Theano variable (i.e.
dmatrix
,vector
, etc.) should begiven a special test value through the attributetag.test_value
. - Theano automatically instruments intermediate results. As such, any quantityderived from x will be given a
tag.test_value
automatically.
compute_test_value
can take the following values:
off
: Default behavior. This debugging mechanism is inactive.raise
: Compute test values on the fly. Any variable for which a testvalue is required, but not provided by the user, is treated as an error. Anexception is raised accordingly.warn
: Idem, but a warning is issued instead of an Exception.ignore
: Silently ignore the computation of intermediate test values, if avariable is missing a test value.
Note
This feature is currently incompatible with Scan
and also with opswhich do not implement a perform
method.
It is also possible to override variables repr
method to have them return tag.test_value.
- x = T.scalar('x')
- # Assigning test value
- x.tag.test_value = 42
- # Enable test value printing
- theano.config.print_test_value = True
- print(x.__repr__())
- # Disable test value printing
- theano.config.print_test_value = False
- print(x.__repr__())
Running the code above returns the following output:
- x
- array(42.0)
- x
“How do I Print an Intermediate Value in a Function?”
Theano provides a ‘Print’ op to do this.
- import numpy
- import theano
- x = theano.tensor.dvector('x')
- x_printed = theano.printing.Print('this is a very important value')(x)
- f = theano.function([x], x * 5)
- f_with_print = theano.function([x], x_printed * 5)
- #this runs the graph without any printing
- assert numpy.all( f([1, 2, 3]) == [5, 10, 15])
- #this runs the graph with the message, and value printed
- assert numpy.all( f_with_print([1, 2, 3]) == [5, 10, 15])
- this is a very important value __str__ = [ 1. 2. 3.]
Since Theano runs your program in a topological order, you won’t have precisecontrol over the order in which multiple Print()
ops are evaluated. For a moreprecise inspection of what’s being computed where, when, and how, see the discussion“How do I Step through a Compiled Function?”.
Warning
Using this Print
Theano Op can prevent some Theanooptimization from being applied. This can also happen withstability optimization. So if you use this Print and have nan, tryto remove them to know if this is the cause or not.
“How do I Print a Graph?” (before or after compilation)
Theano provides two functions (theano.pp()
andtheano.printing.debugprint()
) to print a graph to the terminal before or aftercompilation. These two functions print expression graphs in different ways:pp()
is more compact and math-like, debugprint()
is more verbose.Theano also provides theano.printing.pydotprint()
that creates a png image of the function.
You can read about them in printing – Graph Printing and Symbolic Print Statement.
“The Function I Compiled is Too Slow, what’s up?”
First, make sure you’re running in FAST_RUN
mode. Even thoughFAST_RUN
is the default mode, insist by passing mode='FAST_RUN'
to theano.function
(or theano.make
) or by setting config.mode
to FAST_RUN
.
Second, try the Theano profiling. This will tell you whichApply
nodes, and which ops are eating up your CPU cycles.
Tips:
- Use the flags
floatX=float32
to require type float32 instead of float64;Use the Theano constructors matrix(),vector(),… instead of dmatrix(), dvector(),…since they respectively involve the default types float32 and float64. - Check in the
profile
mode that there is noDot
op in the post-compilationgraph while you are multiplying two matrices of the same type.Dot
should beoptimized todot22
when the inputs are matrices and of the same type. This canstill happen when usingfloatX=float32
when one of the inputs of the graph isof type float64.
“Why does my GPU function seem to be slow?”
When you compile a theano function, if you do not get the speedup that you expect over theCPU performance of the same code. It is oftentimes due to the fact that some Ops might be runningon CPU instead GPU. If that is the case, you can use assert_no_cpu_op to check if thereis a CPU Op on your computational graph. assert_no_cpu_op can take the following one of the threeoptions:
warn
: Raise a warningpdb
: Stop with a pdb in the computational graph during the compilationraise
: Raise an error,if there is a CPU Op in the computational graph.
It is possible to use this mode by providing the flag in THEANO_FLAGS, such as:THEANO_FLAGS="float32,device=gpu,assert_no_cpu_op='raise'" python test.py
But note that this optimization will not catch all the CPU Ops, it might miss someOps.
“How do I Step through a Compiled Function?”
You can use MonitorMode
to inspect the inputs and outputs of eachnode being executed when the function is called. The code snipped belowshows how to print all inputs and outputs:
- from __future__ import print_function
- import theano
- def inspect_inputs(i, node, fn):
- print(i, node, "input(s) value(s):", [input[0] for input in fn.inputs],
- end='')
- def inspect_outputs(i, node, fn):
- print(" output(s) value(s):", [output[0] for output in fn.outputs])
- x = theano.tensor.dscalar('x')
- f = theano.function([x], [5 * x],
- mode=theano.compile.MonitorMode(
- pre_func=inspect_inputs,
- post_func=inspect_outputs))
- f(3)
- 0 Elemwise{mul,no_inplace}(TensorConstant{5.0}, x) input(s) value(s): [array(5.0), array(3.0)] output(s) value(s): [array(15.0)]
When using these inspect_inputs
and inspect_outputs
functionswith MonitorMode
, you should see [potentially a lot of] printed output.Every Apply
node will be printed out,along with its position in the graph, the arguments to the functions perform
orc_code
and the output it computed.Admittedly, this may be a huge amount ofoutput to read through if you are using big tensors… but you can choose toadd logic that would, for instance, printsomething out only if a certain kind of op were used, at a certain programposition, or only if a particular value showed up in one of the inputs or outputs.A typical example is to detect when NaN values are added into computations, whichcan be achieved as follows:
- import numpy
- import theano
- # This is the current suggested detect_nan implementation to
- # show you how it work. That way, you can modify it for your
- # need. If you want exactly this method, you can use
- # ``theano.compile.monitormode.detect_nan`` that will always
- # contain the current suggested version.
- def detect_nan(i, node, fn):
- for output in fn.outputs:
- if (not isinstance(output[0], numpy.random.RandomState) and
- numpy.isnan(output[0]).any()):
- print('*** NaN detected ***')
- theano.printing.debugprint(node)
- print('Inputs : %s' % [input[0] for input in fn.inputs])
- print('Outputs: %s' % [output[0] for output in fn.outputs])
- break
- x = theano.tensor.dscalar('x')
- f = theano.function([x], [theano.tensor.log(x) * x],
- mode=theano.compile.MonitorMode(
- post_func=detect_nan))
- f(0) # log(0) * 0 = -inf * 0 = NaN
- *** NaN detected ***
- Elemwise{Composite{(log(i0) * i0)}} [id A] ''
- |x [id B]
- Inputs : [array(0.0)]
- Outputs: [array(nan)]
To help understand what is happening in your graph, you candisable the local_elemwise_fusion
and all inplace
optimizations. The first is a speed optimization that merges elemwiseoperations together. This makes it harder to know which particularelemwise causes the problem. The second optimization makes some ops’outputs overwrite their inputs. So, if an op creates a bad output, youwill not be able to see the input that was overwritten in the post_func
function. To disable those optimizations (with a Theano version after0.6rc3), define the MonitorMode like this:
- mode = theano.compile.MonitorMode(post_func=detect_nan).excluding(
- 'local_elemwise_fusion', 'inplace')
- f = theano.function([x], [theano.tensor.log(x) * x],
- mode=mode)
Note
The Theano flags optimizer_including
, optimizer_excluding
and optimizer_requiring
aren’t used by the MonitorMode, theyare used only by the default
mode. You can’t use the default
mode with MonitorMode, as you need to define what you monitor.
To be sure all inputs of the node are available during the call topost_func
, you must also disable the garbage collector. Otherwise,the execution of the node can garbage collect its inputs that aren’tneeded anymore by the Theano function. This can be done with the Theanoflag:
- allow_gc=False
How to Use pdb
In the majority of cases, you won’t be executing from the interactive shellbut from a set of Python scripts. In such cases, the use of the Pythondebugger can come in handy, especially as your models become more complex.Intermediate results don’t necessarily have a clear name and you can getexceptions which are hard to decipher, due to the “compiled” nature of thefunctions.
Consider this example script (“ex.py”):
- import theano
- import numpy
- import theano.tensor as T
- a = T.dmatrix('a')
- b = T.dmatrix('b')
- f = theano.function([a, b], [a * b])
- # matrices chosen so dimensions are unsuitable for multiplication
- mat1 = numpy.arange(12).reshape((3, 4))
- mat2 = numpy.arange(25).reshape((5, 5))
- f(mat1, mat2)
This is actually so simple the debugging could be done easily, but it’s forillustrative purposes. As the matrices can’t be multiplied element-wise(unsuitable shapes), we get the following exception:
- File "ex.py", line 14, in <module>
- f(mat1, mat2)
- File "/u/username/Theano/theano/compile/function_module.py", line 451, in __call__
- File "/u/username/Theano/theano/gof/link.py", line 271, in streamline_default_f
- File "/u/username/Theano/theano/gof/link.py", line 267, in streamline_default_f
- File "/u/username/Theano/theano/gof/cc.py", line 1049, in execute ValueError: ('Input dimension mis-match. (input[0].shape[0] = 3, input[1].shape[0] = 5)', Elemwise{mul,no_inplace}(a, b), Elemwise{mul,no_inplace}(a, b))
The call stack contains some useful information to trace back the sourceof the error. There’s the script where the compiled function was called –but if you’re using (improperly parameterized) prebuilt modules, the errormight originate from ops in these modules, not this script. The last linetells us about the op that caused the exception. In this case it’s a “mul”involving variables with names “a” and “b”. But suppose we instead had anintermediate result to which we hadn’t given a name.
After learning a few things about the graph structure in Theano, we can usethe Python debugger to explore the graph, and then we can get runtimeinformation about the error. Matrix dimensions, especially, are useful topinpoint the source of the error. In the printout, there are also 2 of the 4dimensions of the matrices involved, but for the sake of example say we’dneed the other dimensions to pinpoint the error. First, we re-launch withthe debugger module and run the program with “c”:
- python -m pdb ex.py
- > /u/username/experiments/doctmp1/ex.py(1)<module>()
- -> import theano
- (Pdb) c
Then we get back the above error printout, but the interpreter breaks inthat state. Useful commands here are
- “up” and “down” (to move up and down the call stack),
- “l” (to print code around the line in the current stack position),
- “p variable_name” (to print the string representation of ‘variable_name’),
- “p dir(object_name)”, using the Python dir() function to print the list of an object’s members
Here, for example, I do “up”, and a simple “l” shows me there’s a localvariable “node”. This is the “node” from the computation graph, so byfollowing the “node.inputs”, “node.owner” and “node.outputs” links I canexplore around the graph.
That graph is purely symbolic (no data, just symbols to manipulate itabstractly). To get information about the actual parameters, you explore the“thunk” objects, which bind the storage for the inputs (and outputs) withthe function itself (a “thunk” is a concept related to closures). Here, toget the current node’s first input’s shape, you’d therefore do “pthunk.inputs[0][0].shape”, which prints out “(3, 4)”.
Dumping a Function to help debug
If you are reading this, there is high chance that you emailed ourmailing list and we asked you to read this section. This sectionexplain how to dump all the parameter passed totheano.function()
. This is useful to help us reproduce a problemduring compilation and it doesn’t request you to make a self containedexample.
For this to work, we need to be able to import the code for all Op inthe graph. So if you create your own Op, we will need thiscode. Otherwise, we won’t be able to unpickle it. We already have allthe Ops from Theano and Pylearn2.
- # Replace this line:
- theano.function(...)
- # with
- theano.function_dump(filename, ...)
- # Where filename is a string to a file that we will write to.
Then send us filename.
Breakpoint during Theano function execution
You can set a breakpoint during the execution of a Theano function withPdbBreakpoint
.PdbBreakpoint
automaticallydetects available debuggers and uses the first available in the following order:pudb, ipdb, or pdb.