Visualize task graphs
visualize(*args, **kwargs) | Visualize several Dask graphs at once.
Before executing your computation you might consider visualizing the underlying task graph. By looking at the inter-connectedness of tasks you can learn more about potential bottlenecks where parallelism may not be possible, or areas where many tasks depend on each other, which may cause a great deal of communication.
The .visualize method and dask.visualize function work exactly like the .compute method and dask.compute function, except that rather than computing the result, they produce an image of the task graph.
By default the task graph is rendered from top to bottom. If you prefer to visualize it from left to right, pass rankdir="LR" as a keyword argument to .visualize.
- import dask.array as da
- x = da.ones((15, 15), chunks=(5, 5))
- y = x + x.T
- # y.compute()
- y.visualize(filename='transpose.svg')
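The function form can be sketched in the same way. The example below is a minimal illustration, assuming that dask.visualize forwards rankdir to Graphviz just as the .visualize method does; the helper name render_left_to_right, the extra array z, and the filename graphs.svg are made up for this example.

```python
import dask
import dask.array as da

# Same arrays as in the example above, plus one reduction built on top of y.
x = da.ones((15, 15), chunks=(5, 5))
y = x + x.T
z = y.sum(axis=0)


def render_left_to_right(*collections, filename="graphs.svg"):
    """Draw the task graphs of several collections in one image.

    Hypothetical helper: it only wraps dask.visualize, which accepts
    several collections at once, much like dask.compute does.
    """
    # rankdir="LR" asks for a left-to-right layout instead of the default.
    return dask.visualize(*collections, filename=filename, rankdir="LR")


# Nothing is rendered until the helper is called, for example:
# render_left_to_right(y, z)
```

Calling render_left_to_right(y, z) should write a single image containing both graphs, provided Graphviz is installed as described below.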
Note that the visualize function is powered by the Graphviz system library. This library has a few considerations:
- You must install both the graphviz system library (with tools like apt-get, yum, or brew) and the graphviz Python library. If you use Conda then you need to install python-graphviz, which will bring along the graphviz system library as a dependency.
- Graphviz takes a while on graphs larger than about 100 nodes. For large computations you might have to simplify your computation a bit for the visualize method to work well.
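Both considerations can be checked before rendering. The sketch below is one possible approach, not part of Dask itself: the helper names, the MAX_TASKS threshold, and the test arrays are assumptions for illustration. It looks for the graphviz Python package and the dot executable that the system library normally provides, and it counts tasks through the collection's __dask_graph__ method.

```python
import importlib.util
import shutil

import dask.array as da

MAX_TASKS = 100  # rough threshold taken from the note above, not a hard limit


def graphviz_available():
    """Return True if both the Python package and the `dot` binary are found."""
    has_python_lib = importlib.util.find_spec("graphviz") is not None
    has_system_lib = shutil.which("dot") is not None
    return has_python_lib and has_system_lib


def count_tasks(collection):
    """Return the number of tasks in a Dask collection's graph."""
    return len(collection.__dask_graph__())


def visualize_if_ready(collection, filename, max_tasks=MAX_TASKS):
    """Render the graph only if it is small and Graphviz appears installed."""
    n_tasks = count_tasks(collection)
    if n_tasks > max_tasks:
        print(f"Skipping {filename}: {n_tasks} tasks exceeds {max_tasks}")
        return None
    if not graphviz_available():
        print(f"Skipping {filename}: Graphviz does not appear to be installed")
        return None
    return collection.visualize(filename=filename)


small = da.ones((15, 15), chunks=(5, 5))
small = small + small.T

# 20 x 20 = 400 chunks, so this graph is well past the threshold.
large = da.ones((200, 200), chunks=(10, 10))

print(count_tasks(small), count_tasks(large))
```

When a graph is too large, one option is to build the same computation on a smaller array with the same chunking pattern and visualize that instead.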