
Memory management utils

Utility functions for memory management. Currently primarily for GPU.


gpu_mem_get(id=None)

get total, used and free memory (in MBs) for gpu id. If id is not passed, the currently selected torch device is used.


  • for gpu returns GPUMemory(total, free, used)
  • for cpu returns GPUMemory(0, 0, 0)
  • for invalid gpu id returns GPUMemory(0, 0, 0)


gpu_mem_get_all()

get total, used and free memory (in MBs) for each available gpu


  • for gpu returns [ GPUMemory(total_0, free_0, used_0), GPUMemory(total_1, free_1, used_1), .... ]
  • for cpu returns []


gpu_mem_get_free()

get free memory (in MBs) for the currently selected gpu id, w/o emptying the cache


gpu_mem_get_free_no_cache()

get free memory (in MBs) for the currently selected gpu id, after emptying the cache


gpu_mem_get_used()

get used memory (in MBs) for the currently selected gpu id, w/o emptying the cache


gpu_mem_get_used_no_cache()

get used memory (in MBs) for the currently selected gpu id, after emptying the cache



gpu_mem_get_used_fast(gpu_handle)

get used memory (in MBs) for the currently selected gpu id, w/o emptying the cache; requires the gpu_handle arg



gpu_with_max_free_mem()

get [gpu_id, its_free_ram] for the first gpu with highest available RAM


  • for gpu returns: gpu_with_max_free_ram_id, its_free_ram
  • for cpu returns: None, 0
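The selection rule can be sketched without a GPU. The snippet below is a stand-in, not the fastai implementation: `pick_max_free` is a hypothetical helper, the `GPUMemory` tuple is re-declared locally, and the readings are made-up values in MBs. It assumes that "first gpu with highest available RAM" means ties go to the lowest gpu id.

```python
from collections import namedtuple

# Local stand-in with the documented field order (total, free, used).
GPUMemory = namedtuple("GPUMemory", ["total", "free", "used"])

def pick_max_free(mems):
    """Return (gpu_id, free_mb) of the first entry with the most free memory.

    Hypothetical helper: `mems` plays the role of a per-gpu list of readings.
    An empty list (the cpu case) yields (None, 0).
    """
    best_id, best_free = None, 0
    for gpu_id, mem in enumerate(mems):
        # strict '>' keeps the earliest gpu when several tie for the maximum
        if best_id is None or mem.free > best_free:
            best_id, best_free = gpu_id, mem.free
    return best_id, best_free

readings = [GPUMemory(8000, 3000, 5000),
            GPUMemory(8000, 6000, 2000),
            GPUMemory(8000, 6000, 2000)]
print(pick_max_free(readings))  # (1, 6000)
print(pick_max_free([]))        # (None, 0)
```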


preload_pytorch()

preload_pytorch is helpful when GPU memory is being measured: the first time pytorch performs any operation on cuda, the CUDA context usually takes up about 0.5GB, so calling it beforehand keeps that one-time cost out of later measurements.

class GPUMemory

GPUMemory(total, free, used) :: tuple

GPUMemory is a namedtuple that is returned by functions like gpu_mem_get and gpu_mem_get_all.
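As a rough illustration of what being a namedtuple buys you, the sketch below re-declares a tuple with the documented field order and reads it both by name and by position. The local `GPUMemory` here is a stand-in rather than the class from fastai, and the numbers are made up.

```python
from collections import namedtuple

# Stand-in with the documented field order (total, free, used).
GPUMemory = namedtuple("GPUMemory", ["total", "free", "used"])

mem = GPUMemory(total=8000, free=6500, used=1500)  # made-up values, in MBs

# access a field by name ...
print(mem.free)  # 6500

# ... or unpack positionally, in the order (total, free, used)
total, free, used = mem

# the documented cpu / invalid-id result is an all-zero tuple
no_gpu = GPUMemory(0, 0, 0)
print(any(no_gpu))  # False
```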


b2mb(num)

convert Bs to MBs and round down

b2mb is a helper utility that just does int(bytes/2**20)
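A minimal re-creation of that one-liner, written here only to show the rounding behaviour (the function body follows the expression quoted above; the sample inputs are arbitrary):

```python
def b2mb(num):
    """Convert bytes to MBs, dropping the fractional part: int(bytes/2**20)."""
    return int(num / 2**20)

print(b2mb(2**20))          # 1
print(b2mb(2**20 - 1))      # 0
print(b2mb(5 * 2**20 + 7))  # 5
```

Note that Python's int() truncates toward zero, so "round down" holds for the non-negative byte counts this helper is meant for.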

Memory Tracing Utils

class GPUMemTrace

GPUMemTrace(silent=False, ctx=None, on_exit_report=True)

Trace allocated and peaked GPU memory usage (deltas).


  • silent: a shortcut to make report and report_n_reset silent w/o needing to remove those calls. Set it in the constructor, or call the silent method anywhere to do the same.
  • ctx: default context note in reports
  • on_exit_report: auto-report on ctx manager exit (default True)


  • Delta Used is the difference between current used memory and used memory at the start of the counter.

  • Delta Peaked is the memory overhead if any. It’s calculated in two steps:

    1. The base measurement is the difference between the peak memory and the used memory at the start of the counter.
    2. Then if delta used is positive it gets subtracted from the base value.

      It indicates the size of the blip.

      Warning: currently the peak memory usage tracking is implemented using a python thread, which is very unreliable, since there is no guarantee the thread will get a chance to run at the moment the peak memory is occurring (or it might not get a chance to run at all). Therefore we need pytorch to implement multiple concurrent and resettable torch.cuda.max_memory_allocated counters. Please vote for this feature request.
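The two-step Delta Peaked calculation described above can be sketched as plain arithmetic. `deltas` is a hypothetical helper written for this illustration; it follows only the steps listed here and is not taken from the GPUMemTrace source. All values are in MBs and made up.

```python
def deltas(used_start, used_now, used_peak):
    """Return (delta_used, delta_peaked) following the two documented steps."""
    delta_used = used_now - used_start
    # step 1: the base measurement is the peak minus the starting usage
    delta_peaked = used_peak - used_start
    # step 2: a positive delta_used is subtracted, leaving only the transient blip
    if delta_used > 0:
        delta_peaked -= delta_used
    return delta_used, delta_peaked

# usage grew by 200 MB and briefly peaked 500 MB above the start
print(deltas(used_start=1000, used_now=1200, used_peak=1500))  # (200, 300)
# usage shrank by 100 MB, so nothing is subtracted from the base value
print(deltas(used_start=1000, used_now=900, used_peak=1100))   # (-100, 100)
```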

Usage Examples:


    from fastai.utils.mem import GPUMemTrace
    def some_code(): pass
    mtrace = GPUMemTrace()

Example 1: basic measurements via report (prints) and via data (returns) accessors

    some_code()
    mtrace.report()
    delta_used, delta_peaked = mtrace.data()
    some_code()
    mtrace.report('2nd run of some_code()')
    delta_used, delta_peaked = mtrace.data()

report's optional subctx argument can be helpful if you have many report calls and you want to understand which is which in the outputs.

Example 2: measure in a loop, resetting the counter before each run

    for i in range(10):
        mtrace.reset()
        some_code()
        mtrace.report(f'i={i}')

reset resets all the counters.

Example 3: like example 2, but having report automatically reset the counters

    mtrace.reset()
    for i in range(10):
        some_code()
        mtrace.report_n_reset(f'i={i}')

The tracing starts immediately upon the GPUMemTrace object creation, and stops when that object is deleted. It can also be started and stopped manually:

    mtrace.start()
    mtrace.stop()

stop is particularly useful if you want to freeze the GPUMemTrace object and query its data, as of the moment it was stopped, some time down the road.


In reports you can print a main context passed via the constructor:

    mtrace = GPUMemTrace(ctx="foobar")
    mtrace.report()

    Used Peaked MB: 0 0 (foobar)

and then add subcontext notes as needed:

    mtrace = GPUMemTrace(ctx="foobar")
    mtrace.report('1st try')
    mtrace.report('2nd try')

    Used Peaked MB: 0 0 (foobar: 1st try)
    Used Peaked MB: 0 0 (foobar: 2nd try)

Both context and sub-context are optional, and are very useful if you sprinkle GPUMemTrace in different places around the code.

You can silence report calls w/o needing to remove them via constructor or silent:

    mtrace = GPUMemTrace(silent=True)
    mtrace.report() # nothing will be printed
    mtrace.silent(silent=False)
    mtrace.report() # printing resumed
    mtrace.silent(silent=True)
    mtrace.report() # nothing will be printed

Context Manager:

GPUMemTrace can also be used as a context manager:

Report the used and peaked deltas automatically:

    with GPUMemTrace(): some_code()

If you wish to add context:

    with GPUMemTrace(ctx='some context'): some_code()

The context manager uses the subcontext "exit" to indicate that the report comes after the context exited.

The reporting is done automatically, which is especially useful in functions that leave the context via a return call:

    def some_func():
        with GPUMemTrace(ctx='some_func'):
            # some code
            return 1
    some_func()

    Used Peaked MB: 0 0 (some_func: exit)

so you still get a complete report despite the return call here. ctx is useful for specifying the context in case you have many of those calls throughout your code and you want to know which is which.

And, of course, instead of doing the above, you can use gpu_mem_trace decorator to do it automatically, including using the function or method name as the context. Therefore, the example below does the same without modifying the function.

    from fastai.utils.mem import gpu_mem_trace

    @gpu_mem_trace
    def some_func():
        # some code
        return 1
    some_func()

If you don’t want the automatic reporting, just pass on_exit_report=False in the constructor:

    with GPUMemTrace(ctx='some_func', on_exit_report=False) as mtrace:
        some_code()
    mtrace.report("measured in ctx")

or the same w/o the context note:

    with GPUMemTrace(on_exit_report=False) as mtrace: some_code()
    print(mtrace) # or mtrace.report()

And, of course, you can get the numerical data (in rounded MBs):

    with GPUMemTrace() as mtrace: some_code()
    delta_used, delta_peaked = mtrace.data()


gpu_mem_trace(func)

A decorator that runs GPUMemTrace w/ report on func

This allows you to decorate any function or method with:

    @gpu_mem_trace
    def my_function(): pass
    # run:
    my_function()

and it will automatically print the report including the function name as a context:

    Used Peaked MB: 0 0 (my_function: exit)

In the case of methods it’ll print the fully qualified method name, e.g.:

    Used Peaked MB: 0 0 (Class.function: exit)
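To show how a decorator of this kind can derive its context from the function or method name, here is a GPU-free sketch. Everything in it is hypothetical: `trace_stub` is a stand-in that prints a fixed 0/0 report instead of measuring anything, `trace_by_name` is not the fastai decorator, and using `__qualname__` for the context is an assumption that happens to reproduce the two labels shown above.

```python
import functools
from contextlib import contextmanager

@contextmanager
def trace_stub(ctx=None):
    """Hypothetical stand-in for a tracing context manager; measures nothing."""
    try:
        yield
    finally:
        # runs even when the wrapped code returns early or raises
        print(f"Used Peaked MB: 0 0 ({ctx}: exit)")

def trace_by_name(func):
    """Sketch of a decorator that uses the function's qualified name as context."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        with trace_stub(ctx=func.__qualname__):
            return func(*args, **kwargs)
    return wrapper

@trace_by_name
def my_function():
    return 1

class Class:
    @trace_by_name
    def function(self):
        return 2

my_function()        # at module level prints: Used Peaked MB: 0 0 (my_function: exit)
Class().function()   # at module level prints: Used Peaked MB: 0 0 (Class.function: exit)
```

Because the report is issued in a `finally` clause, the wrapped function's return value still passes through unchanged.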

©2021 fast.ai. All rights reserved.
Site last generated: Jan 5, 2021