Profiling is the process of measuring how much time the various parts of our code take to run. It lets us find the parts that slow the rest of the program down, which helps us make better decisions about using the underlying resources efficiently.
Apart from time, we can also profile the memory used by various parts of our code. This helps us organize memory better and optimize memory usage.
A default Python installation comes with two useful libraries, "cProfile" and "profile", for profiling Python code. We have already covered them in a separate tutorial, which explains how to measure the time taken by the various functions of our code.
Apart from these two, there are many other Python libraries that let us profile Python code/script/program.
cProfile and profile report results per function: they show the time taken by each function call but say nothing about the individual lines inside a function. Python has a library called line_profiler that fills this gap by reporting the time taken by each line of our code.
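To make the contrast concrete, here is a minimal sketch of the function-level view that cProfile produces. The function build_average and its workload are made up for illustration and are not part of this tutorial's script. The report has a row for the function as a whole, but nothing that tells us which line inside it was expensive.

```python
import cProfile
import io
import pstats


def build_average(n):
    # Hypothetical workload: build a list, then average it.
    values = [i * i for i in range(n)]
    return sum(values) / len(values)


profiler = cProfile.Profile()
profiler.runcall(build_average, 10000)

# Write the report into a string buffer so it can be inspected or printed.
buffer = io.StringIO()
pstats.Stats(profiler, stream=buffer).sort_stats("cumulative").print_stats(5)
report = buffer.getvalue()
print(report)
```

This function-only granularity is the limitation that line_profiler addresses.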
As a part of this tutorial, we have explained how to use Python library "line_profiler" to profile Python code/script/program. We have explained how to profile whole Python script as well as individual functions using "line_profiler". We have also covered how to use "line_profiler" in different contexts like from command line, in Python script, and in Jupyter notebook.
Our first example of using line_profiler explains how to use it from a command prompt/shell.
We have designed a simple Python script with a main_func() function that calls three other functions. Each of the three generates 100000 random integers between 1 and 100 and returns their average. We have inserted a different time.sleep() delay into each function so that each takes a different time to complete even though all perform the same work. Every function that we want to profile must be decorated with the @profile decorator. We do not import this decorator ourselves; kernprof makes it available when it runs the script.
random_number_average.py
import time
import random


@profile
def very_slow_random_generator():
    time.sleep(5)
    arr = [random.randint(1,100) for i in range(100000)]
    return sum(arr) / len(arr)


@profile
def slow_random_generator():
    time.sleep(2)
    arr = [random.randint(1,100) for i in range(100000)]
    return sum(arr) / len(arr)


@profile
def fast_random_generator():
    time.sleep(1)
    arr = [random.randint(1,100) for i in range(100000)]
    return sum(arr) / len(arr)


@profile
def main_func():
    result = fast_random_generator()
    print(result)

    result = slow_random_generator()
    print(result)

    result = very_slow_random_generator()
    print(result)


main_func()
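Because kernprof supplies the @profile decorator at run time, running the script above directly with python is expected to fail with a NameError. One common workaround, sketched below, is to fall back to a do-nothing decorator when the name is missing. The function average is a hypothetical stand-in for the tutorial's functions.

```python
import builtins

# `kernprof -l` adds `profile` to builtins while it runs the script.
# Without kernprof the name is missing, so fall back to a decorator
# that returns the function unchanged.
profile = getattr(builtins, "profile", lambda func: func)


@profile
def average(values):
    return sum(values) / len(values)


print(average([1, 2, 3, 4]))
```

With this guard at the top of the file, the same script can be run both with and without kernprof.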
The "kernprof" command lets us profile a script using either "cProfile" or the line-by-line profiler. We'll concentrate on the line-by-line profiler. To use it, we need to pass the '-l' or '--line-by-line' option; otherwise kernprof uses "cProfile".
The kernprof command generates a file named after the script with an .lprof suffix (random_number_average.py.lprof in our case) once profiling has completed. The profiling results are stored in this file, which is created in the same directory.
If our script is taking any parameters then we can give them as well after script name (exactly same way as we execute script in command prompt/shell).
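As an illustration of passing parameters, the sketch below is a hypothetical script (call it count_average.py, a name not used elsewhere in this tutorial) that reads the number of random values from its first command line argument. Assuming it is saved under that name with @profile added above random_average, it could be profiled with kernprof -l count_average.py 5000.

```python
import random
import sys


def random_average(count):
    # Add @profile above this function when running under kernprof.
    values = [random.randint(1, 100) for _ in range(count)]
    return sum(values) / len(values)


def parse_count(argv, default=100000):
    # argv[0] is the script name; argv[1], if present, is the count.
    return int(argv[1]) if len(argv) > 1 else default


if __name__ == "__main__":
    print(random_average(parse_count(sys.argv)))
```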
Please make a NOTE that our command starts with '!' because we have executed them in Jupyter notebook. If you are executing them in a command prompt or shell then you don't need to include an exclamation mark at the beginning of a command.
!kernprof -l random_number_average.py
We can then view the profiling results by running the command below.
!python -m line_profiler random_number_average.py.lprof
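The result generated by line_profiler has one table per function. Each table has one row per line of code in that function, with columns for the line number (Line #), the number of times the line was executed (Hits), the total time spent on the line in timer units (Time), the average time per execution (Per Hit), the share of the function's total time (% Time), and the source code (Line Contents).
If we want kernprof to print the results to the shell as soon as the script finishes, we can pass the -v or --view option. This is useful when the profiling results are small; otherwise it will flood the output.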
We can check version of "line_profiler" with '-V' or '--version' options.
!kernprof --version
The default timer unit is 1e-6 seconds, so times are reported in microseconds. We can change it using the '-u' or '--unit' option.
Below, we have changed the unit from 1e-6 to 1e-3, so times are now reported in milliseconds.
!kernprof -l --view --unit 1e-3 random_number_average.py
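The effect of the unit option is plain rescaling, which the sketch below illustrates. The helper rescale and the sample reading are assumptions made for this example; they are not part of line_profiler.

```python
def rescale(raw_time, timer_unit=1e-6, output_unit=1e-3):
    """Convert a raw timer reading into the unit used for display."""
    return raw_time * timer_unit / output_unit


# Hypothetical reading for a line that sleeps 5 seconds, in microseconds.
sleep_ticks = 5_005_205

print(f"{rescale(sleep_ticks, output_unit=1e-6):.1f} microseconds")
print(f"{rescale(sleep_ticks, output_unit=1e-3):.1f} milliseconds")
print(f"{rescale(sleep_ticks, output_unit=1.0):.3f} seconds")
```

Larger units keep the Time column readable for slow code, while the default microseconds suit fast lines.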
We can tell the profiler to hide functions that were never called using the '-z' or '--skip-zero' option.
We can change the default filename to which "line_profiler" writes profiling results using the '-o' or '--outfile' option.
!kernprof -l -o random_generator.dat random_number_average.py
!python -m line_profiler random_generator.dat
As a part of our second example of using line_profiler, we'll explain how to profile Python code by adding a few extra lines to our script rather than passing it to kernprof from the command line/shell.
Below, we have reproduced the code from our previous example without the @profile decorators.
import time
import random


def very_slow_random_generator():
    time.sleep(5)
    arr = [random.randint(1,100) for i in range(100000)]
    return sum(arr) / len(arr)


def slow_random_generator():
    time.sleep(2)
    arr = [random.randint(1,100) for i in range(100000)]
    return sum(arr) / len(arr)


def fast_random_generator():
    time.sleep(1)
    arr = [random.randint(1,100) for i in range(100000)]
    return sum(arr) / len(arr)


def main_func():
    result = fast_random_generator()
    print(result)

    result = slow_random_generator()
    print(result)

    result = very_slow_random_generator()
    print(result)
We first need to create an instance of the LineProfiler class. We then create a wrapper around main_func() by calling the LineProfiler instance with main_func as its argument. Executing that wrapper runs main_func() under the profiler.
from line_profiler import LineProfiler
lprofiler = LineProfiler()
lp_wrapper = lprofiler(main_func)
lp_wrapper()
Once the wrapper around main_func completes, we can call the print_stats() method on the line profiler instance to print the profiling results.
The results printed below show that only the code of main_func was profiled. We need to modify the code slightly if we want to profile the other functions as well.
lprofiler.print_stats()
Below, we create the LineProfiler instance again. We then call add_function() on it three times, passing each of the three random average generator functions so that their code is profiled as well. As before, we wrap main_func by calling the line profiler instance and then call the wrapper, which executes main_func(). This time the three generator functions are profiled along with main_func.
lprofiler = LineProfiler()
lprofiler.add_function(fast_random_generator)
lprofiler.add_function(slow_random_generator)
lprofiler.add_function(very_slow_random_generator)
lp_wrapper = lprofiler(main_func)
lp_wrapper()
lprofiler.print_stats()
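Besides printing a report, the collected numbers can be inspected from code. The sketch below assumes the object returned by lprofiler.get_stats() exposes a timings dictionary keyed by (filename, first line, function name) with a list of (line number, hits, raw time) tuples as values, plus a unit attribute. The helper slowest_lines and the sample data are hypothetical, so the block runs even without line_profiler installed.

```python
def slowest_lines(timings, unit, top=3):
    """Return the `top` slowest lines as (seconds, function, line number)."""
    rows = []
    for (filename, first_line, func_name), lines in timings.items():
        for line_number, hits, raw_time in lines:
            rows.append((raw_time * unit, func_name, line_number))
    return sorted(rows, reverse=True)[:top]


# Hand-written sample in the shape assumed for LineStats.timings.
sample = {
    ("demo.py", 1, "work"): [(2, 1, 5000000), (3, 1, 300000), (4, 1, 400)],
}
for seconds, func_name, line_number in slowest_lines(sample, 1e-6, top=2):
    print(f"{func_name}:{line_number} took {seconds:.3f} s")

# After a profiled run with line_profiler installed:
#   stats = lprofiler.get_stats()
#   print(slowest_lines(stats.timings, stats.unit))
```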
lprofiler.functions
lprofiler.timer_unit
Our third example explains how to use line_profiler in a Jupyter notebook.
We first need to load line_profiler as an external extension into the Jupyter notebook using the %load_ext magic command. This makes the %lprun line magic command available, which we can use to profile Python code.
If you are someone who is new to magic commands in Notebooks then we would recommend that you check our tutorial on it in your free time. It'll help you better manage notebooks.
%load_ext line_profiler
Below, we have created a simple function that generates a 1000x1000 numpy array of random integers from 1 to 99 (the upper bound of np.random.randint is exclusive) and returns the average of the array. At the start of the function, we also pause execution for 5 seconds.
import time
import numpy as np


def very_slow_random_generator():
    time.sleep(5)
    arr1 = np.random.randint(1,100, size=(1000,1000))
    avg = arr1.mean()
    return avg
We can then call the very_slow_random_generator() function by first registering it with the %lprun line magic command and then calling it so that code inside it will be profiled line by line.
Please make a NOTE that we have to register all functions that we need to profile using '-f' option.
%lprun -f very_slow_random_generator very_slow_random_generator()
Timer unit: 1e-06 s

Total time: 5.01748 s
File: <ipython-input-13-8e4f20603e75>
Function: very_slow_random_generator at line 5

Line #  Hits        Time     Per Hit  % Time  Line Contents
==============================================================
     5                                        def very_slow_random_generator():
     6     1   5005205.0   5005205.0    99.8      time.sleep(5)
     7     1     10931.0     10931.0     0.2      arr1 = np.random.randint(1,100, size=(1000,1000))
     8     1      1344.0      1344.0     0.0      avg = arr1.mean()
     9     1         1.0         1.0     0.0      return avg
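The derived columns of the report above can be recomputed from the Hits and Time columns alone. The short sketch below does so with the figures from this run; the variable names are ours and are not part of line_profiler.

```python
# (line number, hits, time in timer units) copied from the report above.
rows = [
    (6, 1, 5005205),
    (7, 1, 10931),
    (8, 1, 1344),
    (9, 1, 1),
]
timer_unit = 1e-6  # seconds per timer tick

total = sum(time for _, _, time in rows)
print(f"Total time: {total * timer_unit:.5f} s")

report = []
for line, hits, time in rows:
    per_hit = time / hits          # Per Hit = Time / Hits
    percent = 100 * time / total   # % Time = share of the function's total
    report.append((line, per_hit, round(percent, 1)))
    print(f"{line:>6} {hits:>5} {time:>11.1f} {per_hit:>11.1f} {percent:>7.1f}")
```

Since every line here is hit once, Per Hit equals Time; the two diverge for lines inside loops.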
Below we have re-written code from the previous example. We'll try to profile all functions in the below code by using the %lprun line magic command.
import time
import random


def very_slow_random_generator():
    time.sleep(5)
    arr = [random.randint(1,100) for i in range(100000)]
    return sum(arr) / len(arr)


def slow_random_generator():
    time.sleep(2)
    arr = [random.randint(1,100) for i in range(100000)]
    return sum(arr) / len(arr)


def fast_random_generator():
    time.sleep(1)
    arr = [random.randint(1,100) for i in range(100000)]
    return sum(arr) / len(arr)


def main_func():
    result = fast_random_generator()
    print(result)

    result = slow_random_generator()
    print(result)

    result = very_slow_random_generator()
    print(result)
Below, we show how to profile all functions using the %lprun line magic command. We first register every function with %lprun using the -f option and then call main_func(). This way, the code of all four functions is profiled.
%lprun -f main_func -f fast_random_generator -f slow_random_generator -f very_slow_random_generator main_func()
Timer unit: 1e-06 s

Total time: 5.37966 s
File: <ipython-input-38-c9c4218f4095>
Function: very_slow_random_generator at line 4

Line #  Hits        Time     Per Hit  % Time  Line Contents
==============================================================
     4                                        def very_slow_random_generator():
     5     1   5005309.0   5005309.0    93.0      time.sleep(5)
     6     1    373891.0    373891.0     7.0      arr = [random.randint(1,100) for i in range(100000)]
     7     1       457.0       457.0     0.0      return sum(arr) / len(arr)

Total time: 2.33906 s
File: <ipython-input-38-c9c4218f4095>
Function: slow_random_generator at line 9

Line #  Hits        Time     Per Hit  % Time  Line Contents
==============================================================
     9                                        def slow_random_generator():
    10     1   2002214.0   2002214.0    85.6      time.sleep(2)
    11     1    336388.0    336388.0    14.4      arr = [random.randint(1,100) for i in range(100000)]
    12     1       454.0       454.0     0.0      return sum(arr) / len(arr)

Total time: 1.35545 s
File: <ipython-input-38-c9c4218f4095>
Function: fast_random_generator at line 14

Line #  Hits        Time     Per Hit  % Time  Line Contents
==============================================================
    14                                        def fast_random_generator():
    15     1   1001058.0   1001058.0    73.9      time.sleep(1)
    16     1    353907.0    353907.0    26.1      arr = [random.randint(1,100) for i in range(100000)]
    17     1       482.0       482.0     0.0      return sum(arr) / len(arr)

Total time: 9.07528 s
File: <ipython-input-38-c9c4218f4095>
Function: main_func at line 19

Line #  Hits        Time     Per Hit  % Time  Line Contents
==============================================================
    19                                        def main_func():
    20     1   1355544.0   1355544.0    14.9      result = fast_random_generator()
    21     1       118.0       118.0     0.0      print(result)
    22
    23     1   2339153.0   2339153.0    25.8      result = slow_random_generator()
    24     1       351.0       351.0     0.0      print(result)
    25
    26     1   5379754.0   5379754.0    59.3      result = very_slow_random_generator()
    27     1       362.0       362.0     0.0      print(result)
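One thing worth noticing in this report is that the time charged to a calling line in main_func includes everything the called function did. The sketch below checks this with the figures printed above; the small remainder is presumably call and profiling overhead, which is our reading rather than something the report states.

```python
# Per-line times (timer units) of each generator, copied from the report.
callee_lines = {
    "fast_random_generator": [1001058, 353907, 482],
    "slow_random_generator": [2002214, 336388, 454],
    "very_slow_random_generator": [5005309, 373891, 457],
}
# Time main_func spent on the line that calls each generator.
caller_lines = {
    "fast_random_generator": 1355544,
    "slow_random_generator": 2339153,
    "very_slow_random_generator": 5379754,
}

overheads = {}
for name, times in callee_lines.items():
    callee_total = sum(times)
    overheads[name] = caller_lines[name] - callee_total
    print(f"{name}: callee total {callee_total}, "
          f"caller line {caller_lines[name]}, difference {overheads[name]}")
```

When reading such a report, it helps to start from the caller's table and then drill into the function behind its most expensive line.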
This ends our small tutorial explaining how to use 'line_profiler' to profile Python code in detail. We have covered how to use "line_profiler" to profile Python code from command line ("kernprof" command), in Jupyter notebooks ("%lprun" Magic Command), and in Python script ("LineProfiler" object).