The NumPy Library

NumPy (Numerical Python) is a package of numerical functions for working efficiently with multidimensional data structures in Python. In plain Python, it is possible to use nested lists to represent multidimensional structures (arrays and matrices), but this is not efficient. The NumPy library defines the array object, an efficient and convenient way to define multidimensional structures.

To use Numpy in your Notebooks and programs, you first need to import the package (in this example we use the alias np):

[1]:
import numpy as np

The Numpy Array

A NumPy array has a structure similar to a Python list but, as mentioned above, it provides additional functionality to easily create and manipulate multidimensional data structures. The data in an array are called elements, and they are accessed using brackets, just as with Python lists. The dimensions of a NumPy array are called axes. The elements within an axis are separated by commas and surrounded by brackets. Axes are also delimited by brackets, so a NumPy array is written as a nested Python list. The rank is the number of axes of an array. The shape is a tuple giving the number of elements along each axis. The elements can be of any numerical type, but all elements of a given array share the same type.

[3]:
a = np.array([1, 2, 3, 4]) #this creates a one dimensional array of size 4
print("My first Numpy array:")
print(a)
b = np.array([[1,2,3,4],[5,6,7,8]]) #This creates a 2-dimensional (rank 2) 2x4 array
print("My second Numpy array:")
print(b)

#You can use indexing as in lists, with one index per axis:
print("element in position (1,2) is:")
print(b[1,2])

print("Number of dimensions:")
print(b.ndim) #number of dimensions or rank

print("Shape of array:")
print(b.shape) #shape (eg n rows, m columns)

print("Total number of elements:")
print(b.size) #number of elements
My first Numpy array:
[1 2 3 4]
My second Numpy array:
[[1 2 3 4]
 [5 6 7 8]]
element in position (1,2) is:
7
Number of dimensions:
2
Shape of array:
(2, 4)
Total number of elements:
8
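
The element type of an array is available through its dtype attribute. The following is a minimal sketch of how NumPy picks that type; the variable names are illustrative only, and the exact integer type depends on the platform.

```python
import numpy as np

ints = np.array([1, 2, 3])                      # only integers: integer dtype
mixed = np.array([1, 2.5, 3])                   # integers mixed with a float: upcast to float
forced = np.array([1, 2, 3], dtype=np.float64)  # dtype requested explicitly

print(ints.dtype)    # platform-dependent integer type
print(mixed.dtype)   # float64
print(forced.dtype)  # float64
```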

Create Numpy Arrays

NumPy includes several functions that create arrays of a given shape, filled with constant or random values.

Some examples:

[8]:
o = np.ones((3,2)) # array of 3x2 1s
print(o)

b=np.zeros((3,4))  # array of 3x4 zeroes
print(b)

c=np.random.random(3) # 1-dimensional array of 3 random numbers in the interval [0, 1)
print(c)

d=np.full((2,2),12)  # array of 2x2 12s
print(d)

e = np.random.randint(low=10, high=100, size=(4,4)) # 4x4 array of integers drawn from a discrete uniform distribution between 10 and 99 (high is excluded)
print(e)

identity_matrix =np.eye(3,3) # identity array of size 3x3
print(identity_matrix)

[[1. 1.]
 [1. 1.]
 [1. 1.]]
[[0. 0. 0. 0.]
 [0. 0. 0. 0.]
 [0. 0. 0. 0.]]
[0.71574091 0.54968971 0.72723399]
[[12 12]
 [12 12]]
[[1. 0. 0.]
 [0. 1. 0.]
 [0. 0. 1.]]

Creating sequences

Some useful functions for creating sequences are arange, linspace and random.randint:

  • arange(start, end, step): creates a NumPy array with elements ranging from start up to, but not including, end, incrementing by step. Only end is required; using only end creates the integers from 0 to end-1.

  • linspace(start,end,numvalues): creates a NumPy array of numvalues evenly spaced values ranging from start to end, both included. The increment is calculated by the function so that the resulting number of elements matches the numvalues input parameter.

  • random.randint(low, high, size): creates a NumPy array of size size with integer values selected at random in the interval between low and high-1, both included.

[ ]:
a = np.arange(0, 10, 2)
print(a)

b=np.linspace(0,10,6)
print(b)

c = np.random.randint(0, 2, 10)
print(c)
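
As a small sketch of how the endpoints and default values behave in these functions (the variable names are illustrative only):

```python
import numpy as np

only_end = np.arange(5)          # start defaults to 0 and step to 1
print(only_end)                  # [0 1 2 3 4]

stepped = np.arange(0, 10, 2)    # the end value 10 is not included
print(stepped)                   # [0 2 4 6 8]

# retstep=True also returns the increment computed by linspace
values, step = np.linspace(0, 10, 6, retstep=True)
print(values)                    # [ 0.  2.  4.  6.  8. 10.]
print(step)                      # 2.0

draws = np.random.randint(0, 2, 10)  # ten random draws, each either 0 or 1
print(draws)
```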

Element-wise operations

You can apply element-wise arithmetic and logical calculations to NumPy arrays using arithmetic or logical operators. Functions such as np.exp(), np.sqrt(), and np.log() also operate on each element of a NumPy array. You can check the entire list of available functions in the official NumPy documentation. Some examples:

[11]:
x =np.array([[1,2,3,4],[5,6,7,8]])
y =np.array([[9,10,11,12],[13,14,15,16]])
print(x+y)
print(y-x)
print(np.sqrt(y))
print(np.log(x))
print(x**2)
print(x+5)
[[10 12 14 16]
 [18 20 22 24]]
[[8 8 8 8]
 [8 8 8 8]]
[[3.         3.16227766 3.31662479 3.46410162]
 [3.60555128 3.74165739 3.87298335 4.        ]]
[[0.         0.69314718 1.09861229 1.38629436]
 [1.60943791 1.79175947 1.94591015 2.07944154]]
[[ 1  4  9 16]
 [25 36 49 64]]
[[ 6  7  8  9]
 [10 11 12 13]]

Note that in the last example we add a scalar value to a NumPy array. In general, we can apply arithmetic operations to arrays of different shapes, provided the shapes are compatible: comparing the shapes from the last axis backwards, each pair of sizes must either be equal or one of them must be 1 (missing leading axes are treated as size 1). When this condition is met, NumPy virtually expands the smaller array to match the shape of the larger one, an operation called broadcasting.
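
The following sketch illustrates broadcasting with a row vector and a column vector, plus a case where the shapes do not match. The arrays row and col are made up for this illustration.

```python
import numpy as np

x = np.array([[1, 2, 3, 4],
              [5, 6, 7, 8]])        # shape (2, 4)
row = np.array([10, 20, 30, 40])    # shape (4,)
col = np.array([[100], [200]])      # shape (2, 1)

print(x + row)   # row is repeated along axis 0 (added to every row of x)
print(x + col)   # col is repeated along axis 1 (added to every column of x)

# Shapes (2, 4) and (2,) are not compatible: the trailing sizes 4 and 2 differ
try:
    x + np.array([1, 2])
except ValueError as err:
    print("cannot broadcast:", err)
```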

Indexing

One of the main advantages of using NumPy is that it makes it easy to access the data in an array, with a convenient syntax that reaches elements in every dimension through indexing and slicing. The syntax is:

array_name[start:end:step, start:end:step, ...]

That is, you can use the same syntax as with Python lists to access elements in each dimension, using commas to separate the dimensions. For instance, take a look at the following examples:

[ ]:
x =np.array([[1, 2, 3, 4],
             [5, 6, 7, 8],
             [7, 8, 9, 10]])

# Access all the elements of the first row
print("first row:")
print(x[0, :])

# Access all the elements of the first column
print("first column:")
print(x[:, 0])

# Access a 2x2 sub-matrix using slicing
print("2x2 sub-matrix")
print(x[0:2, 1:3])

Note that an array accepts up to one start:end:step triplet per dimension; trailing dimensions that are omitted are taken in full. All three values are optional. As a reminder:

  • The start value is optional and the default value is 0.

  • The end value is optional and the default value is the size of the dimension.

  • The step value is optional and the default value is 1.

  • If the step value is negative, the array is traversed in reverse order, and the defaults change so that the traversal runs from the last element to the first.

  • If the start value is negative, it is assumed to be the size of the dimension minus the absolute value of the start value.

  • If the end value is negative, it is assumed to be the size of the dimension minus the absolute value of the end value.
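
As a short sketch of these rules, using the same array x as in the indexing examples:

```python
import numpy as np

x = np.array([[1, 2, 3, 4],
              [5, 6, 7, 8],
              [7, 8, 9, 10]])

print(x[-1, :])     # negative index: the last row
print(x[:, -2:])    # negative start: the last two columns of every row
print(x[::2, :])    # step of 2: rows 0 and 2
print(x[:, ::-1])   # negative step: columns in reverse order
print(x[1, 1:-1])   # negative end: row 1 without its first and last element
```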

CSV files with Numpy

Luckily for us, NumPy provides functions to load data from a CSV file into an array and to write arrays to CSV files.

The function loadtxt loads data from a CSV file into a NumPy array, for example:

[ ]:
import numpy as np

my_arr = np.loadtxt('exercise1.csv', delimiter=',', skiprows=1, usecols=(2, 3))
print(my_arr.mean(axis=0))

In the example, we loaded the CSV file and indicated that the field delimiter is a comma using the named argument delimiter. We also ignored the header using the skiprows named argument, specifying that we want to skip exactly one row. Finally, since we are only interested in the temperature and humidity readings (the only ones containing numerical values), we use the named argument usecols to load only the data in columns 2 and 3 (yeah, you guessed it, column indexing starts at 0). The result is an array with two columns, so we can for instance calculate the mean temperature and humidity using the mean() method with axis=0, which averages over the rows and returns one value per column.
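
To try the same call without the data file, here is a self-contained sketch that reads from an in-memory string instead. The column names and values are hypothetical and only mimic the layout described above; the real exercise1.csv may differ.

```python
import io
import numpy as np

# Hypothetical file contents (assumption: two text columns, then two numeric ones)
csv_text = """date,station,temperature,humidity
2021-01-01,A,20.0,40.0
2021-01-02,A,22.0,50.0
2021-01-03,B,24.0,60.0
"""

# loadtxt accepts a file-like object as well as a file name
readings = np.loadtxt(io.StringIO(csv_text), delimiter=',', skiprows=1, usecols=(2, 3))
print(readings.shape)         # (3, 2): three rows, two selected columns
print(readings.mean(axis=0))  # [22. 50.]: one mean per column
```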

Conversely, the function savetxt saves the values of a NumPy array into a CSV file:

[ ]:
my_arr = np.arange(1,9)
my_arr = my_arr.reshape((2,4))
print(my_arr)
np.savetxt('my_array.csv', my_arr, delimiter=",", fmt='%i')

This will create a CSV file named ‘my_array.csv’ in the working directory, containing the contents of the my_arr array, using commas as the field delimiter and formatting numbers as integers.
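
As a quick check, the file can be read back with loadtxt. The sketch below writes to a temporary directory (an assumption made here so that the working directory is left untouched) and compares the restored array with the original.

```python
import os
import tempfile
import numpy as np

my_arr = np.arange(1, 9).reshape((2, 4))

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'my_array.csv')
    np.savetxt(path, my_arr, delimiter=',', fmt='%i')

    # Inspect the raw text that was written
    with open(path) as f:
        text = f.read()
    print(text)

    # Read the file back as integers
    restored = np.loadtxt(path, delimiter=',', dtype=int)

print(np.array_equal(my_arr, restored))  # True
```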