x <- 1:10
y <- seq(3, 30, by = 3)
x + y
## [1] 4 8 12 16 20 24 28 32 36 40
x - y
## [1] -2 -4 -6 -8 -10 -12 -14 -16 -18 -20
x * y
## [1] 3 12 27 48 75 108 147 192 243 300
x / y
## [1] 0.3333333 0.3333333 0.3333333 0.3333333 0.3333333 0.3333333 0.3333333
## [8] 0.3333333 0.3333333 0.3333333
x^2
## [1] 1 4 9 16 25 36 49 64 81 100
t(x) %*% y
## [,1]
## [1,] 1155
11 Matrix Calculations
This entire section is only appropriate if you’ve already had linear algebra.
11.1 Objectives
- Understand how to do matrix algebra in relevant programming languages
While R and Python are extremely powerful statistical programming languages, the core of most programming languages is the ability to do basic calculations and matrix arithmetic. As almost every dataset is stored as a matrix-like structure (data sets and data frames both allow for multiple types, which isn’t quite compatible with more canonical matrices), it is useful to know how to do matrix-level calculations in whatever language you are planning to use to work with data.
In this section, we will essentially be using our programming language as overgrown calculators.
11.2 Matrix Operations
Operation | R | Python |
---|---|---|
Addition | + | + |
Subtraction | - | - |
Elementwise Multiplication | * | * |
Division | / | / |
Modulo (Remainder) | %% | % |
Integer Division | %/% | // |
Elementwise Exponentiation | ^ | ** |
Matrix/Vector Multiplication | %*% | np.dot() |
Matrix Exponentiation | ^ | np.exp() |
Matrix Transpose | t(A) |
np.transpose(A) |
Matrix Determinant | det(A) |
np.linalg.det(A) |
Matrix Diagonal | diag(A) |
np.linalg.diag(A) |
Matrix Inverse | solve(A) |
np.linalg.inv(A) |
11.2.1 Basic Mathematical Operators
import numpy as np
= np.array(range(1, 11))
x = np.array(range(3, 33, 3)) # python indexes are not inclusive
y
+ y
x ## array([ 4, 8, 12, 16, 20, 24, 28, 32, 36, 40])
- y
x ## array([ -2, -4, -6, -8, -10, -12, -14, -16, -18, -20])
* y
x ## array([ 3, 12, 27, 48, 75, 108, 147, 192, 243, 300])
/ y
x ## array([0.33333333, 0.33333333, 0.33333333, 0.33333333, 0.33333333,
## 0.33333333, 0.33333333, 0.33333333, 0.33333333, 0.33333333])
** 2
x ## array([ 1, 4, 9, 16, 25, 36, 49, 64, 81, 100])
np.dot(x.T, y)## 1155
11.2.2 Matrix Operations
Other matrix operations, such as determinants and extraction of the matrix diagonal, are similarly easy:
mat <- matrix(c(1, 2, 3, 6, 4, 5, 7, 8, 9), nrow = 3, byrow = T)
mat
## [,1] [,2] [,3]
## [1,] 1 2 3
## [2,] 6 4 5
## [3,] 7 8 9
t(mat) # transpose
## [,1] [,2] [,3]
## [1,] 1 6 7
## [2,] 2 4 8
## [3,] 3 5 9
det(mat) # get the determinant
## [1] 18
diag(mat) # get the diagonal
## [1] 1 4 9
diag(diag(mat)) # get a square matrix with off-diag 0s
## [,1] [,2] [,3]
## [1,] 1 0 0
## [2,] 0 4 0
## [3,] 0 0 9
diag(1:3) # diag() also will create a diagonal matrix if given a vector
## [,1] [,2] [,3]
## [1,] 1 0 0
## [2,] 0 2 0
## [3,] 0 0 3
import numpy as np
= np.array([[1, 2, 3],[6, 4, 5],[7, 8, 9]], dtype = int, order ='C')
mat
mat## array([[1, 2, 3],
## [6, 4, 5],
## [7, 8, 9]])
mat.T## array([[1, 6, 7],
## [2, 4, 8],
## [3, 5, 9]])
# numerical precision...
np.linalg.det(mat) ## 18.000000000000004
np.diag(mat)## array([1, 4, 9])
np.diag(np.diag(mat))## array([[1, 0, 0],
## [0, 4, 0],
## [0, 0, 9]])
range(1, 4))
np.diag(## array([[1, 0, 0],
## [0, 2, 0],
## [0, 0, 3]])
11.2.3 Matrix Inverse
The other important matrix-related function is the inverse. In R, A^-1
will get you the elementwise reciprocal of the matrix. Not exactly what we’d like to see… Instead, we use the solve()
function. The inverse is defined as the matrix B such that AB = I
where I
is the identity matrix (1’s on diagonal, 0’s off-diagonal). So if we solve(A)
(in R) or solve(A, diag(n))
in SAS (where n is a vector of 1s the size of A), we will get the inverse matrix. In Python, we use the np.linalg.inv()
function to invert a matrix, which may be a bit more linguistically familiar.
mat <- matrix(c(1, 2, 3, 6, 4, 5, 7, 8, 9), nrow = 3, byrow = T)
minv <- solve(mat) # get the inverse
minv
## [,1] [,2] [,3]
## [1,] -0.2222222 0.3333333 -0.1111111
## [2,] -1.0555556 -0.6666667 0.7222222
## [3,] 1.1111111 0.3333333 -0.4444444
mat %*% minv
## [,1] [,2] [,3]
## [1,] 1 0 0
## [2,] 0 1 0
## [3,] 0 0 1
import numpy as np
= np.array([[1, 2, 3],[6, 4, 5],[7, 8, 9]], dtype = int, order ='C')
mat
= np.linalg.inv(mat)
minv
minv## array([[-0.22222222, 0.33333333, -0.11111111],
## [-1.05555556, -0.66666667, 0.72222222],
## [ 1.11111111, 0.33333333, -0.44444444]])
np.dot(mat, minv)## array([[ 1.00000000e+00, 0.00000000e+00, 1.11022302e-16],
## [-8.88178420e-16, 1.00000000e+00, -5.55111512e-16],
## [ 0.00000000e+00, 2.22044605e-16, 1.00000000e+00]])
round(np.dot(mat, minv), 2)
np.## array([[ 1., 0., 0.],
## [-0., 1., -0.],
## [ 0., 0., 1.]])