Profiling

The Python standard library includes functionality to profile code. In that mode function invocations and the time spent in them are recorded. The accelerate.profiler module extends that functionality by also recording the functions’ signatures, which is useful as often the precise control flow (and thus function performance) depends on the argument types. For numpy array types, this includes not only the dtype attribute, but also the array’s shape. To demonstrate this, let us define a simple dot function and profile it without signatures, to match the behaviour of the Python standard library profile module.

from accelerate import profiler

def dot(a, b):
  sum=0
  for i in range(len(a)):
      sum += a[i]*b[i]
  return sum

a = np.arange(16, dtype=np.float32)
b = np.arange(16, dtype=np.float32)

p = profiler.Profile(signatures=False)
p.enable()
dot(a, b)
p.disable()
p.print_stats()

which will generate output like this:

      3 function calls in 0.000 seconds

Ordered by: standard name

ncalls  tottime  percall  cumtime  percall filename:lineno(function)
     1    0.000    0.000    0.000    0.000 builtins.len
     1    0.000    0.000    0.000    0.000 dot.py:7(dot)
     1    0.000    0.000    0.000    0.000 {method 'disable' of 'prof.Profiler' objects}

However, by default the Profile constructor’s signature flag is set to True, resulting in this output instead:

      3 function calls (2 primitive calls) in 0.000 seconds

Ordered by: standard name

ncalls  tottime  percall  cumtime  percall filename:lineno(function)
     1    0.000    0.000    0.000    0.000 dot.py:1(disable())
   2/1    0.000    0.000    0.000    0.000 dot.py:7(dot(a:ndarray(dtype=float32, shape=(16,)), b:ndarray(dtype=float32, shape=(16,))))

For more realistic code the call graph (and thus table of function calls) is obviously much bigger, so working with the data in tabular form is not very convenient. The accelerate.profiler module therefore also provides functionality to visualize the data. Instead of calling the print_stats() method we may call the accelerate.profiler.plot() function. Running in an interactive notebook, this results in output like this:

../_images/profiling.png

The accelerate.profiler API

class accelerate.profiler.Profile(custom_timer=None, time_unit=None, subcalls=True, builtins=True, signatures=True)

Builds a profiler object using the specified timer function. The default timer is a fast built-in one based on real time. For custom timer functions returning integers, time_unit can be a float specifying a scale (i.e. how long each integer unit is, in seconds).

clear()

Clear all profiling information collected so far.

disable()

Stop collecting profiling information.

enable(subcalls=True, builtins=True)

Start collecting profiling information. If ‘subcalls’ is True, also records for each function statistics separated according to its current caller. If ‘builtins’ is True, records the time spent in built-in functions separately from their caller.

getstats() → list of profiler_entry objects

Return all information collected by the profiler. Each profiler_entry is a tuple-like object with the following attributes:

code code object callcount how many times this was called reccallcount how many times called recursively totaltime total time in this entry inlinetime inline time in this entry (not in subcalls) calls details of the calls

The calls attribute is either None or a list of profiler_subentry objects:

code called code object callcount how many times this is called reccallcount how many times this is called recursively totaltime total time spent in this call inlinetime inline time (not in further subcalls)
print_stats(sort=-1)

Print a table with profile statistics.

accelerate.profiler.plot(profiler)

Generate visualization of the current profile statistics in profiler. Right now this is only supported for interactive notebooks