Numpy

Numpy#

Numpy – short for numerical Python
- A library that is used extensively used in scientific computing
- Provides:
  - A multidimensional array object called ndarray (or just array)
  - Fast routines, i.e., functions for sorting, selecting, mathematical and linear algebra operations etc.

Differences between numpy arrays and Python sequences (lists and tuples)
- Numpy arrays are mutable while sequences can be immutable
- Numpy arrays are homogenous (same element type) while sequences can be heterogenous
- In general, numpy arrays have a fixed size in memory unlike Python sequences

Consider a problem asking you to find the resultant of two vectors
- $F_1 = 2i + 3j - 4k$
- $F_2 = 3i - 1j + 4k$
- $R = F_1 + F_2$

You need to add these vectors element-wise
- Requires loops and indexing

F1 = (2, 3, -4)
F2 = (3, -1, 4)

# Elementwise addition of a and b
c = []
for i in range(len(F1)):
    c.append(F1[i] + F2[i])
    
print(c, '\n')
print(f"R = {c[0]}i + {c[1]}j + {c[2]}k")

[5, 2, 0] 

R = 5i + 2j + 0k

The same program using numpy

import numpy as np

F1 = np.array([2, 3, -4])
F2 = np.array((3, -1, 4))
R = F1 + F2

print(R, '\n')
print(f"R = {R[0]}i + {R[1]}j + {R[2]}k")

---------------------------------------------------------------------------
ModuleNotFoundError                       Traceback (most recent call last)
Cell In[2], line 1
----> 1 import numpy as np
      3 F1 = np.array([2, 3, -4])
      4 F2 = np.array((3, -1, 4))

ModuleNotFoundError: No module named 'numpy'

Numpy allows for more direct element-wise operations using the ndarray datatype
Numpy arrays are fast and efficient compared to python sequences
- Vectorization - no explicit looping and indexing
  - These operations happen in the background with pre-compiled C code
- Broadcasting - all operations are done element-wise, in general
  - This is the method used to carry out element-wise operation
  - Makes up for concise, readable code with less scope for bugs
  - Real math-like expressions and enables vectorization

Numpy Arrays#

ndarray – n-dimensional array, the basic numpy data type
- Can be used to represent vectors (1D arrays), matrices (2D arrays), …

Creating numpy arrays
- Using the np.array function
  - Input is a list, tuple or array-like object
  - Or any object that has the magic method __array__

The data type of each element in the ndarray is automatically inferred and stored in the dtype attribute

a1 = np.array([1, 2, 5, 6])    # 1d-array

a2 = np.array([[1.4, 2],
               [2, 3]])        # 2d-array

ndarray attributes
- dtype – the datatype of the stored elements
- ndim – the dimension of the array
- shape – a tuple with the size of each dimension (also referred to as an axis)
- size – the total number of elements in the array
- itemsize – the size of each element of the array in bytes

a1 = np.array([1, 2, 5, 6])    # 1d-array
a2 = np.array([[1.4, 2],
               [2, 3]])        # 2d-array
a3 = np.array([3, 5, 9j])      # 1d-array with a complex number

print(a1.dtype, a2.dtype, a3.dtype, sep='\t')
print(a1.shape, a2.shape, a3.shape, sep='\t')

int64	float64	complex128
(4,)	(2, 2)	(3,)

Array Creation#

`arange`#

Generating a sequence of numbers
- np.arange – like the range but can create non-integer sequences
  - Inputs – start, stop, and increment
- The start and increment values can be optionally discarded
  - Default values of 0 and 1 will be used respectively

a = np.arange(1.2, 2.00005, 0.2)
a

array([1.2, 1.4, 1.6, 1.8, 2. ])

np.arange(2.2) # start = 0, increment = 1 - defaults

array([0., 1., 2.])

Example - function evaluation
- Consider the function $y = 3x^2$
- Evaluate $y$ uniformly in an interval of $[0, 10]$

x = np.arange(0, 10.001, 0.2)
y = 3*x**2
y

array([0.0000e+00, 1.2000e-01, 4.8000e-01, 1.0800e+00, 1.9200e+00,
       3.0000e+00, 4.3200e+00, 5.8800e+00, 7.6800e+00, 9.7200e+00,
       1.2000e+01, 1.4520e+01, 1.7280e+01, 2.0280e+01, 2.3520e+01,
       2.7000e+01, 3.0720e+01, 3.4680e+01, 3.8880e+01, 4.3320e+01,
       4.8000e+01, 5.2920e+01, 5.8080e+01, 6.3480e+01, 6.9120e+01,
       7.5000e+01, 8.1120e+01, 8.7480e+01, 9.4080e+01, 1.0092e+02,
       1.0800e+02, 1.1532e+02, 1.2288e+02, 1.3068e+02, 1.3872e+02,
       1.4700e+02, 1.5552e+02, 1.6428e+02, 1.7328e+02, 1.8252e+02,
       1.9200e+02, 2.0172e+02, 2.1168e+02, 2.2188e+02, 2.3232e+02,
       2.4300e+02, 2.5392e+02, 2.6508e+02, 2.7648e+02, 2.8812e+02,
       3.0000e+02])

`linspace`#

Generating a sequence of numbers
- np.linspace – evenly spaced values
  - Inputs – start, stop, and number
- The number can be left out; default value of 50 will be used
- Can also use np.logspace and np.geomspace

np.linspace(0, 10, 5)

array([ 0. ,  2.5,  5. ,  7.5, 10. ])

np.linspace(2.5, -2.5)

array([ 2.5       ,  2.39795918,  2.29591837,  2.19387755,  2.09183673,
        1.98979592,  1.8877551 ,  1.78571429,  1.68367347,  1.58163265,
        1.47959184,  1.37755102,  1.2755102 ,  1.17346939,  1.07142857,
        0.96938776,  0.86734694,  0.76530612,  0.66326531,  0.56122449,
        0.45918367,  0.35714286,  0.25510204,  0.15306122,  0.05102041,
       -0.05102041, -0.15306122, -0.25510204, -0.35714286, -0.45918367,
       -0.56122449, -0.66326531, -0.76530612, -0.86734694, -0.96938776,
       -1.07142857, -1.17346939, -1.2755102 , -1.37755102, -1.47959184,
       -1.58163265, -1.68367347, -1.78571429, -1.8877551 , -1.98979592,
       -2.09183673, -2.19387755, -2.29591837, -2.39795918, -2.5       ])

`zeros`, `ones`, `full`#

Generating arrays filled with constant values
- np.zeros(shape) – creates an ndarray with zeros given the shape
  - shape can either be an int or a tuple or list of integers
- np.ones(shape) – same as np.zeros but fills the array with ones
- np.full(shape, fill) – creates an array with the given shape filled with the fill value

np.zeros(4)

array([0., 0., 0., 0.])

np.full([2,3],2)

array([[2, 2, 2],
       [2, 2, 2]])

`eye`#

Identity matrix
- np.eye(N) – creates an identity matrix with N rows and columns
- np.eye(N, M) – creates an identity matrix with N rows and M columns
- np.identity(N) – equivalent to np.eye(N)

np.eye(4)

array([[1., 0., 0., 0.],
       [0., 1., 0., 0.],
       [0., 0., 1., 0.],
       [0., 0., 0., 1.]])

np.eye(4, 3) # truncated identity matrix

array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.],
       [0., 0., 0.]])

`diag`#

Diagonal matrices
- np.diag(arr) – creates a diagonal matrix given an array-like object (lists, tuples, 1D ndarray etc.)
  - Note: if the input is a 2D ndarray, the output is a 1D ndarray containing the diagonal elements

np.diag((1, 2, 3))

array([[1, 0, 0],
       [0, 2, 0],
       [0, 0, 3]])

Importing Data#

From a text file
- np.loadtxt(filename, delimiter, skiprows, dtype) – creates an array with data from a file

File contents

x, y
0, 0
1, 1
2, 4
3, 9
4, 16
5, 25

a = np.loadtxt('Data.csv', delimiter=',', skiprows = 1, dtype = 'i')
print(a)

[[ 0  0]
 [ 1  1]
 [ 2  4]
 [ 3  9]
 [ 4 16]
 [ 5 25]]

Check the links below for more info on array creation routines
- https://numpy.org/doc/stable/user/basics.creation.html
- https://numpy.org/doc/stable/reference/routines.array-creation.html#routines-array-creation

Datatypes#

Almost all array creation methods accept an optional parameter dtype to specify the data type
The default data type in numpy is float

a_int = np.array((1, 2, 4), dtype=int)
a_flt = np.array((1, 2, 4), dtype=float)
a_cplx = np.array((1, 2, 4), dtype=complex)
a_bool = np.array((1, 2, 4), dtype=bool)

print(a_int, a_flt, a_cplx, a_bool, sep="\n\n", end="\n\n")
print(a_int.dtype, a_flt.dtype, a_cplx.dtype, a_bool.dtype, sep="\t")

[1 2 4]

[1. 2. 4.]

[1.+0.j 2.+0.j 4.+0.j]

[ True  True  True]

int64	float64	complex128	bool

Check this webpage for more info on data types

Data Type Conversion#

Arrays can be converted from one data type to another using the astype method

mat = np.full((3, 3), 2, dtype = int)
print(mat)

[[2 2 2]
 [2 2 2]
 [2 2 2]]

mat_complex = mat.astype(complex)
print(mat_complex)

[[2.+0.j 2.+0.j 2.+0.j]
 [2.+0.j 2.+0.j 2.+0.j]
 [2.+0.j 2.+0.j 2.+0.j]]

Accessing Elements#

Indexing and slicing (introduced with Python sequences) works with numpy arrays too
- [𝑖 :𝑗 :Δ] – extracts elements beginning from index 𝑖 to 𝑗−1 in increments of Δ
- If 𝑗 is not specified, it is assumed to be the length of the sequence
- If 𝑖 is not specified, it is assumed to be 0
- If Δ is not specified, it is assumed to be 1

arr1d = np.array([1, 4, 9, 16, 25, 36])
arr1d.shape

(6,)

# Basic indexing
arr1d[-2]

# Slicing
arr1d[0::2]

array([ 1,  9, 25])

# Slicing
arr1d[::-1]

array([36, 25, 16,  9,  4,  1])

Advanced indexing with 1D arrays
https://numpy.org/doc/stable/user/quickstart.html#advanced-indexing-and-index-tricks

arr1d = np.array([1, 4, 9, 16, 25, 36])
arr1d[[0, 3, 2, 0]]

array([ 1, 16,  9,  1])

j = np.array([[3, 4],
              [1, 2]])
arr1d[j]

array([[16, 25],
       [ 4,  9]])

Indexing and slicing of 2D arrays
- Indices are [row, column]
- : used to indicates all elements along the axis are to be selected

arr2d = np.array([[1, 2, 3],
                  [4, 5, 6],
                  [7, 8, 9]])

# Basic indexing
arr2d[1] # row

array([4, 5, 6])

# Basic indexing
arr2d[2, 0] # row, col

# Basic indexing
arr2d[(2, 0)] # Same result with a tuple

# Slicing
arr2d[0::2]

array([[1, 2, 3],
       [7, 8, 9]])

# Slicing + Indexing
arr2d[:, :]

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

Advanced indexing for 2D arrays

arr2d = np.array([[1, 2, 3],
                  [4, 5, 6],
                  [7, 8, 9]])

inds = np.array([[1, 2], 
                [2, 0]])
arr2d[inds]

array([[[4, 5, 6],
        [7, 8, 9]],

       [[7, 8, 9],
        [1, 2, 3]]])

Treats the 2D array as a 1D array and generated a higher dimensional ndarray

arr2d[0, [2, 1]]

array([3, 2])

Elements with indices (0, 1) and (0, 2)

arr2d[[0, 0, 2, 2], 
      (0, 2, 0, 2)] # Tuples or lists can be used for specifying indices

array([1, 3, 7, 9])

Elements with indices (0, 0), (0, 2), (2, 0), and (2, 2)

rows = np.array([[0, 0], 
                 [2, 2]])
columns = [[0, 2], 
           [0, 2]]
arr2d[rows, columns]

array([[1, 3],
       [7, 9]])

Same elements as before are selected but the results is now a 2D array

Indexing and slicing for 3D arrays

a = np.array([1, 2, 3, 4, 5, 6, 7, 8])
arr3d = a.reshape((2, 2, 2))
arr3d

array([[[1, 2],
        [3, 4]],

       [[5, 6],
        [7, 8]]])

The reshape method changes the shape of the ndarray given a tuple provided the number of elements remain the same
np.reshape module function does the same thing

arr3d[0, 1, 0]

arr3d[(0, 1, 0)] # Same as above, but using a tuple; list will result in advanced slicing

arr3d[:, 1, 1]

array([4, 8])

Each element is a 2D array

Boolean array indexing#

The comparison operator is broadcast to all elements in the array

arr2d = np.array([[1, 2, 3],
                 [4, 5, 6],
                 [7, 8, 9]])

arr2d > 5

array([[False, False, False],
       [False, False,  True],
       [ True,  True,  True]])

arr2d[arr2d > 5]

array([6, 7, 8, 9])

arr2d[arr2d % 2 == 1]

array([1, 3, 5, 7, 9])

np.isnan and np.isinf can be used to filter bad data

arr = np.array([[1, np.inf, 3],
                [4, 5, 6],
                [7, 8, np.nan]])

arr = arr[~np.isinf(arr)] # ~ is the bitwise not operator
arr = arr[~np.isnan(arr)]
print(arr)

[1. 3. 4. 5. 6. 7. 8.]

Assignment#

Numpy arrays are mutable
- They are modified just like Python lists
Augmented assignment can also be used

a = np.linspace(1, 9, 9, dtype=int)
m = a.reshape((3, 3)) # np.reshape(a, (3, 3))
m

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

Assigning a constant value to a row

m[1] = 0
m

array([[1, 2, 3],
       [0, 0, 0],
       [7, 8, 9]])

Assigning a constant value to a column

m[:, 1] = 10
m

array([[ 1, 10,  3],
       [ 0, 10,  0],
       [ 7, 10,  9]])

Assign an ndarray or list or tuple to a row
- Dimensions of the LHS and RHS must match

m[2] = [2, 3, 4]
m

array([[ 1, 10,  3],
       [ 0, 10,  0],
       [ 2,  3,  4]])

Arithmetic Operations#

Arithmetic operations result in a new instance of ndarray

a = np.linspace(1, 9, 9, dtype=int).reshape((3, 3))
a

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

a + 2

array([[ 3,  4,  5],
       [ 6,  7,  8],
       [ 9, 10, 11]])

Values are broadcast across array
- In other words, the scale 2 is stretched to match the dimensions of a

Add a constant value to the array via broadcasting

a + 2

🠓 Broadcasting

$$ \begin{array}{|c|c|c|} \hline 1 & 2 & 3 \\ \hline 4 & 5 & 6 \\ \hline 7 & 8 & 9 \\ \hline \end{array} \quad + \quad \begin{array}{|c|c|c|} \hline 2 & \color{#969696} 2 & \color{#969696} 2 \\ \hline \color{#969696} 2 & \color{#969696} 2 & \color{#969696} 2 \\ \hline \color{#969696} 2 & \color{#969696} 2 & \color{#969696} 2 \\ \hline \end{array} \quad = \quad \begin{array}{|c|c|c|} \hline 3 & 4 & 5 \\ \hline 6 & 7 & 8 \\ \hline 9 & 10 & 11 \\ \hline \end{array} $$

a ** [1, 2, 3]

array([[  1,   4,  27],
       [  4,  25, 216],
       [  7,  64, 729]])

The list, tuple, or ndarray being multiplied will be stretched to match the dimensions of a

Multiplying a 2D array with an 1D ndarray, list, or tuple

a * [1, 2, 3]

🠓 Broadcasting

$$ \begin{array}{|c|c|c|} \hline 1 & 2 & 3 \\ \hline 4 & 5 & 6 \\ \hline 7 & 8 & 9 \\ \hline \end{array} \quad + \quad \begin{array}{|c|c|c|} \hline 1 & 2 & 3 \\ \hline \color{#969696} 1 & \color{#969696} 2 & \color{#969696} 3 \\ \hline \color{#969696} 1 & \color{#969696} 2 & \color{#969696} 3 \\ \hline \end{array} \quad = \quad \begin{array}{|c|c|c|} \hline 1 & 4 & 9 \\ \hline 4 & 10 & 18 \\ \hline 7 & 16 & 27 \\ \hline \end{array} $$

All the other Python operators like -, /, //, %, <, >, and == work in the same manner

Broadcasting works on any two arrays with compatible dimensions

# A 2D array with 4 rows and 1 column
x = np.arange(4, dtype=int).reshape((4, 1))
x

array([[0],
       [1],
       [2],
       [3]])

# A 2D array with 1 row and 3 columns
# Not the same as a 1D array!
y = np.ones(3).reshape((1, 3))
y

array([[1., 1., 1.]])

x + y

array([[1., 1., 1.],
       [2., 2., 2.],
       [3., 3., 3.],
       [4., 4., 4.]])

x + y

🠓 Broadcasting

$$ \begin{array}{|c|c|c|} \hline 0 & \color{#969696} 0 & \color{#969696} 0 \\ \hline 1 & \color{#969696} 1 & \color{#969696} 1 \\ \hline 2 & \color{#969696} 2 & \color{#969696} 2 \\ \hline 3 & \color{#969696} 3 & \color{#969696} 3 \\ \hline \end{array} \quad + \quad \begin{array}{|c|c|c|} \hline 1 & 1 & 1 \\ \hline \color{#969696} 1 & \color{#969696} 1 & \color{#969696} 1 \\ \hline \color{#969696} 1 & \color{#969696} 1 & \color{#969696} 1 \\ \hline \color{#969696} 1 & \color{#969696} 1 & \color{#969696} 1 \\ \hline \end{array} \quad = \quad \begin{array}{|c|c|c|} \hline 1 & 1 & 1 \\ \hline 2 & 2 & 2 \\ \hline 3 & 3 & 3 \\ \hline 4 & 4 & 4 \\ \hline \end{array} $$

Matrix Math#

By default, all operations in numpy are element-wise.

a = np.linspace(1, 9, 9, dtype=int)
x = a.reshape((3, 3))
y = np.eye(3)
y

array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]])

x*y

array([[1., 0., 0.],
       [0., 5., 0.],
       [0., 0., 9.]])

To perform matrix multiplication, use @ operator instead of *
- Alternately, np.matmul(x, y) can be used

x@y

array([[1., 2., 3.],
       [4., 5., 6.],
       [7., 8., 9.]])

@ operators works with list and tuple objects too in the same way it does with ndarray

b = (1, 1, 1)
x@b # b is treated as a column vector

array([ 6, 15, 24])

c = [1, 2, 3]
c@x # c is treated as a row vector

Augmented Assignment#

Augmented assignment modifies the nadrray in place

a = np.linspace(1, 9, 9, dtype=int).reshape((3, 3))
a

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

Multiply a constant value to a row

a[1] *= 2
a

array([[ 1,  2,  3],
       [ 8, 10, 12],
       [ 7,  8,  9]])

Subtract an ndarray, list, or tuple to a column (dimensions must match)

a[:, 2] -= (2, 3, 4)
a

array([[ 1,  2,  1],
       [ 8, 10,  9],
       [ 7,  8,  5]])

Multiply the entire array by an ndarray (dimensions should match)

a += [1, 2, 4]
a

array([[ 2,  4,  5],
       [ 9, 12, 13],
       [ 8, 10,  9]])

Other operations work the same way
The two operands should have the same data type
For division, the dtype of ndarray should be float

Broadcasting Rules#

The basic rules of broadcasting:
- The trailing dimension of each array is either equal or one of them is 1
  - The array with trailing dimension 1 will be stretched to match the other array’s dimension
- This is repeated for the other dimensions of the array going right to left

Example 1:

`ndarray`	shape	Description
`A`	(512, 512, 4)	A color image with 512x512 pixels and rgba values for each pixel
`B`	(4,)	Scaling values, one for each channel (, in the shape can be ignored)
`A*B`	(512, 512, 4)	A new image with scaled rgba values

Example 2:

`ndarray`	shape
`A`	(8, 1, 6, 1)
`B`	(7, 1, 5)
`A<op>B`	(8, 7, 6, 5)

Example 3:

`ndarray`	shape
`A`	(2, 1)
`B`	(8, 4, 3)
`A<op>B`	?

In this case, the dimensions are incompatible and broadcasting is not possible

More info on broadcasting
- https://numpy.org/doc/stable/user/basics.broadcasting.html

Numpy Array Methods#

Some ndarray methods
- transpose – transpose of the array (equivalent to the attribute, T)
- max, min, mean – return the max, min and mean of the array, respectively, along a given axis
- sum, prod – return the sum and product, respectively, of all the elements along a given axis
- sort – sort the array in-place
- argsort – returns an array of indices that would sort the array
- flatten – return a flattened copy of the array (i.e., collapsed to 1d)

Check ndarray documentation for a full list
- https://numpy.org/doc/stable/reference/generated/numpy.ndarray.html

a = np.array([[2, 3, 1],
              [4, 15, 6],
              [7, 8, 9]])
a.sort(0) # Sort along axis 0 (columns)
a

a.sort(1) # Sort along axis 1 (rows)
a

Numpy Functions#

Numpy provides functions/routines for
- Array creation
  - array, zeros, ones, full, arange, linspace, …
- Array manipulation
  - reshape, delete, insert, append, hstack, vstack, split, transpose, …
- Linear algebra
  - dot/inner, outer, linalg.inv, linalg.eig, linalg.det, …
- Mathematical functions (https://numpy.org/doc/1.21/reference/routines.math.html)
  - Vectorized – apply the function element-wise
  - Trigonometric functions – sin, cos, tan, arcsin, arccos, arctan, arctan2
  - Hyperbolic functions – sinh, cosh, tanh, arcsinh, arccosh, arctanh
  - exp, log, ceil, floor, abs, …

x = np.arange(0, 2.01*np.pi, 0.2*np.pi)
y = np.sin(x)

a = np.vstack((x, y)).T
a

print(f'{"x":^4}\t{"y":^5}')
print(f'{"-"*4}\t{"-"*5}')
for p in a:
    print(f"{p[0]:1.2f}\t{p[1]: 1.2f}")

Vectorization#

Python functions like math.sin do not work with ndarray objects

import math

x = np.arange(0, 2, 0.01)
y = [math.sin(i) for i in x] # List comprehension - alternate syntax for a simple for loop
# The above code is equivalent to the commented code below
# y = []
# for i in x:
#     y += math.sin(i)
print(y)

%timeit [math.sin(i) for i in x]

np.sin works readily with ndarrays and is much faster

%timeit np.sin(x)

Use np.vectorize(func) for functions that are not numpy aware
- Returns a vectorized version of the function
- Just wraps a for loop around numpy unaware function; do not expect a speedup
- The ouput datatype is determined by evaluating func with the first element

def mySqrt(x):
    if x < 0:
        # If this is 0, output of the vectorized function will be of type int
        return 0.0
    else:
        return x**0.5

Cannot use ndarray as an input

x = np.arange(-1, 2, 0.1)
mySqrt_vectorized = np.vectorize(mySqrt)
y = mySqrt_vectorized(x)
y

array([0.        , 0.        , 0.        , 0.        , 0.        ,
       0.        , 0.        , 0.        , 0.        , 0.        ,
       0.        , 0.31622777, 0.4472136 , 0.54772256, 0.63245553,
       0.70710678, 0.77459667, 0.83666003, 0.89442719, 0.9486833 ,
       1.        , 1.04880885, 1.09544512, 1.14017543, 1.18321596,
       1.22474487, 1.26491106, 1.30384048, 1.34164079, 1.37840488])

Python to Matlab#

Matlab for Python Programmers

Numpy

Contents

Numpy#

Numpy Arrays#

Array Creation#

arange#

linspace#

zeros, ones, full#

eye#

diag#

Importing Data#

Datatypes#

Data Type Conversion#

Accessing Elements#

Boolean array indexing#

Assignment#

Arithmetic Operations#

Matrix Math#

Augmented Assignment#

Broadcasting Rules#

Numpy Array Methods#

Numpy Functions#

Vectorization#

Python to Matlab#

`arange`#

`linspace`#

`zeros`, `ones`, `full`#

`eye`#

`diag`#