9. Methods of Inspection#
9.1. Line profiler#
line_profiler is a Python module that measures execution time line by line. It helps us locate performance bottlenecks, understand where a function spends its time, and target optimizations at the lines that matter most.
9.1.1. Install#
Run the following command in a terminal: pip install line-profiler
9.1.2. Example#
Let’s take a simple example: calculating the sum of squares of a large array.
import numpy as np
large_arr = np.random.randint(1, 100, size=100000)
def sum_of_squares(arr):
    result = 0
    for num in arr:
        result += num ** 2
    return result
import line_profiler
profiler = line_profiler.LineProfiler()
profiler.add_function(sum_of_squares)
profiler.run("sum_of_squares(large_arr)")
profiler.print_stats()
Timer unit: 1e-07 s

Total time: 0.105808 s
File: C:\Users\DELL\AppData\Local\Temp\ipykernel_15492\2110549466.py
Function: sum_of_squares at line 3

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
     3                                           def sum_of_squares(arr):
     4         1          4.0      4.0      0.0      result = 0
     5    100000     406596.0      4.1     38.4      for num in arr:
     6    100000     651475.0      6.5     61.6          result += num ** 2
     7         1          3.0      3.0      0.0      return result
Line #: The line number in the code.
Hits: The number of times the line was executed.
Time: The total time accumulated over all executions of the line, expressed in timer units (here 1e-07 s, as shown on the Timer unit line).
Per Hit: The average time per execution of the line (Time/Hits), also in timer units.
% Time: The percentage of the function's total recorded time that was spent on the line.
In the output for the sum_of_squares function, each row corresponds to one line of the function (lines 3 to 7) and lists how many times that line was executed, its total elapsed time, its average elapsed time, and its share of the total time.
For example, line 5 of the code, for num in arr:, was executed 100,000 times and took a total of 406,596 timer units. With a timer unit of 1e-07 s, that is about 0.0407 seconds in total, or an average of 4.1 timer units (about 0.41 microseconds) per execution, which accounts for 38.4% of the function's runtime.
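The Time and Per Hit columns are multiples of the Timer unit printed at the top of the report, so converting them to seconds is a single multiplication. The following is a minimal sketch of that arithmetic; the numbers are copied from the sample output, and the variable names are only illustrative.

```python
# Values copied from the sample report for line 5 (for num in arr:).
timer_unit = 1e-07      # seconds per timer tick, from the "Timer unit" line
hits = 100000           # "Hits" column
time_ticks = 406596.0   # "Time" column, expressed in timer units

# Convert the accumulated time to seconds, then average over all executions.
total_seconds = time_ticks * timer_unit
per_hit_seconds = total_seconds / hits

print(f"total:   {total_seconds:.7f} s")
print(f"per hit: {per_hit_seconds * 1e6:.2f} microseconds")
```

The same conversion applies to every row, which is why the Time column sums (in timer units) to the Total time reported in seconds.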
In this case, the statement result += num ** 2 inside the for loop takes the largest share of the time (61.6%), and the loop header itself takes most of the rest. This suggests that when working with large-scale data, we may get better performance by replacing the for loop with NumPy's vectorized operations: result = np.sum(arr ** 2).
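To check that the vectorized expression is a drop-in replacement, the two versions can be run side by side. This is a rough sketch rather than a benchmark: the function names and the use of time.perf_counter are choices made for this illustration, dtype=np.int64 is set explicitly so the result does not depend on the platform's default integer size, and the measured times will vary from machine to machine.

```python
import time
import numpy as np

# Same kind of input as in the example; int64 is requested explicitly.
large_arr = np.random.randint(1, 100, size=100000, dtype=np.int64)

def sum_of_squares_loop(arr):
    # Original version: one Python-level iteration per element.
    result = 0
    for num in arr:
        result += num ** 2
    return result

def sum_of_squares_vectorized(arr):
    # Vectorized version: squaring and summing both run inside NumPy.
    return np.sum(arr ** 2)

start = time.perf_counter()
loop_result = sum_of_squares_loop(large_arr)
loop_seconds = time.perf_counter() - start

start = time.perf_counter()
vec_result = sum_of_squares_vectorized(large_arr)
vec_seconds = time.perf_counter() - start

# Both versions must agree before the timings mean anything.
assert loop_result == vec_result
print(f"loop:       {loop_seconds:.6f} s")
print(f"vectorized: {vec_seconds:.6f} s")
```

A single timed run like this is noisy; repeating the measurement, for example with the standard timeit module, gives a more reliable comparison.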
Therefore, using line_profiler can help us locate performance bottlenecks and identify potential optimization points to improve code execution efficiency.
9.2. View check#
A view of an array shares the same data buffer as the original array. Using a view is typically faster than making a copy because it avoids copying data and allocating additional memory. It is therefore useful to check whether an array is a view or owns its own data.
9.2.1. Example#
Let’s start by creating an array C.
import numpy as np
C = np.arange(12).reshape(4, 3)
C
array([[ 0,  1,  2],
       [ 3,  4,  5],
       [ 6,  7,  8],
       [ 9, 10, 11]])
Make x equal to the first row of the C array:
x = C[0,:]
x
array([0, 1, 2])
When we change the first element of C, x changes accordingly because x is a view of C.
C[0,0] = 100
C
array([[100,   1,   2],
       [  3,   4,   5],
       [  6,   7,   8],
       [  9,  10,  11]])
x
array([100,   1,   2])
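Not every indexing operation returns a view, which is why an explicit check is useful. The sketch below contrasts basic slicing with an explicit copy and with integer-array indexing, using np.shares_memory to test whether two arrays overlap in memory. The names row_view, row_copy, and row_fancy are introduced here only for illustration.

```python
import numpy as np

C = np.arange(12).reshape(4, 3)

row_view = C[0, :]          # basic slicing returns a view of C's buffer
row_copy = C[0, :].copy()   # an explicit copy gets its own buffer
row_fancy = C[[0], :]       # integer-array indexing also returns a copy

# Modify the original array after the three rows were taken.
C[0, 0] = 100

print(row_view[0])      # 100: the view sees the change
print(row_copy[0])      # 0: the copy is unaffected
print(row_fancy[0, 0])  # 0: the integer-array result is unaffected

print(np.shares_memory(C, row_view))   # True
print(np.shares_memory(C, row_copy))   # False
print(np.shares_memory(C, row_fancy))  # False
```

np.shares_memory is used here instead of comparing .base with C, because C is itself a view of the array returned by np.arange(12), so a slice of C does not have C as its .base.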
As a second example, we first create a larger NumPy array.
import time
import numpy as np
data_list = [x**2 for x in range(1000**2)]
data_np = np.array(data_list)
Next, we compute the square root of every element and measure the execution time.
start_time = time.time()
result = np.sqrt(data_np)
end_time = time.time()
print(f"execution time: {end_time - start_time} seconds")
execution time: 0.007866621017456055 seconds
We can use the .base attribute to check whether data_np is a view.
print(data_np.base)
None
The result is None, which means data_np owns its data and is not a view. Here we create a view of it and recompute the square root.
start_time = time.time()
data_np_view = data_np.view()
result = np.sqrt(data_np_view)
end_time = time.time()
print(f"execution time: {end_time - start_time} seconds")
execution time: 0.00401759147644043 seconds
print(data_np_view.base)
[ 0 1 4 ... 999994000009 999996000004
999998000001]
np.shares_memory(data_np_view, data_np)
True
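Single measurements taken with time.time() fluctuate from run to run, so it helps to separate what a view actually saves. The sketch below is an illustration under stated assumptions: the array is built with np.arange instead of the list comprehension above, and the repeat counts passed to timeit.repeat are arbitrary. It compares the cost of creating a view with the cost of creating a copy, and confirms that both give the same result under np.sqrt.

```python
import timeit
import numpy as np

# One million squares, built directly in NumPy for this sketch.
data_np = np.arange(1_000_000, dtype=np.int64) ** 2

data_np_view = data_np.view()   # shares data_np's buffer
data_np_copy = data_np.copy()   # owns a separate buffer

print(data_np_view.base is data_np)              # True
print(data_np_copy.base is None)                 # True
print(np.shares_memory(data_np, data_np_view))   # True
print(np.shares_memory(data_np, data_np_copy))   # False

# Take the best of several repeats to reduce timing noise.
view_seconds = min(timeit.repeat(data_np.view, number=20, repeat=3))
copy_seconds = min(timeit.repeat(data_np.copy, number=20, repeat=3))
print(f"20 views:  {view_seconds:.6f} s")
print(f"20 copies: {copy_seconds:.6f} s")

# The computation reads the same buffer either way, so the results match.
same_result = np.array_equal(np.sqrt(data_np), np.sqrt(data_np_view))
print(same_result)   # True
```

On most machines creating the view should be much cheaper than creating the copy, because no data is moved. Applying np.sqrt to the view reads the same buffer as applying it to the original, so any difference between those two timings is likely measurement noise.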