Sets are collections of objects. Unlike lists, they are unordered, and every element of a set is unique (no repetitions).
We create sets by enclosing elements in braces:
s = {1, 2, 3}
print(s)
To convert a list or a string into a set we can use the set() function:
mylist = [1, 2, 3, 4, 3, 2, 1, 0]
myset = set(mylist)
print(myset)
t = 'mississippi'
t_set = set(t)
print(t_set)
s = {'a', 2, 'b', 3, 'hello'}
print(s)
Checking if an element is in a set:
3 in s
'c' in s
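Note. Checking if an element is in a set is usually much faster than checking if an element is in a list. A set is stored as a hash table, so a lookup takes roughly constant time, while a list has to be scanned element by element: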
from time import time
mylist = list(range(10**7))
myset = set(mylist)
st = time()
for n in range(10**7, 10**7+10):
    print(n, n in mylist)
print(time()-st)
st = time()
for n in range(10**7, 10**7+10):
    print(n, n in myset)
print(time()-st)
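The same comparison can be sketched with the standard timeit module, which repeats a statement many times and reports the total running time. The names items_list, items_set and missing, and the smaller size 10**5, are assumptions chosen here so that the example runs quickly; the exact timings depend on the machine.

```python
import timeit

# Hypothetical, smaller data set so that the comparison runs quickly.
n = 10**5
items_list = list(range(n))
items_set = set(items_list)
missing = n  # a value that is in neither collection (worst case for the list)

# Each membership test is executed 100 times; timeit returns total seconds.
list_time = timeit.timeit(lambda: missing in items_list, number=100)
set_time = timeit.timeit(lambda: missing in items_set, number=100)

print("list:", list_time)
print("set: ", set_time)
```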
for loops work with sets (the iteration order is arbitrary):
print(s)
for x in s:
    print(x)
s1 = {'a', 'b', 'c', 'd'}
s2 = {'c', 'd', 'e', 'f'}
Union of sets:
s = s1 | s2
print(s)
Intersection of sets:
s = s1 & s2
print(s)
Difference of sets:
s = s1-s2
print(s)
s = s2-s1
print(s)
Symmetric difference (elements that are in either set but not in their intersection):
s = s1^s2
print(s)
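Each of the operators above also has an equivalent set method. The short sketch below uses the same two sets; the result names (union, common, only_s1, either, with_list) are chosen here for illustration.

```python
s1 = {'a', 'b', 'c', 'd'}
s2 = {'c', 'd', 'e', 'f'}

# Each operator has an equivalent method.
union = s1.union(s2)                    # same as s1 | s2
common = s1.intersection(s2)            # same as s1 & s2
only_s1 = s1.difference(s2)             # same as s1 - s2
either = s1.symmetric_difference(s2)    # same as s1 ^ s2

# sorted() returns a list, which gives a predictable order for printing.
print(sorted(union))
print(sorted(common))
print(sorted(only_s1))
print(sorted(either))

# Unlike the operators, the methods accept any iterable, for example a list.
with_list = s1.union(['y', 'z'])
print(sorted(with_list))
```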
Adding an element to a set:
print(s1)
s1.add('x')
print(s1)
Removing an element from a set:
s1.discard('a')
print(s1)
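Sets also have a remove() method. A small sketch of how it differs from discard(): discard() silently ignores a missing element, while remove() raises KeyError. The set letters and the flag removed are hypothetical names used only in this example.

```python
letters = {'a', 'b', 'c'}

# 'z' is not in the set: discard() does nothing and raises no error.
letters.discard('z')

# remove() raises KeyError when the element is missing.
try:
    letters.remove('z')
    removed = True
except KeyError:
    removed = False

print(removed)
print(sorted(letters))
```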
The pop() method removes an arbitrary element from a set and returns this element:
print(s1)
x = s1.pop()
print(s1)
print(x)
Checking if a set is a subset of another set (the < operator tests for a proper subset; use <= to also allow equal sets):
s1 = {'a', 'b', 'c'}
s2 = {'a', 'b'}
s1 < s2
s2 < s1
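The difference between the comparison operators shows up when a set is compared with itself. The sketch below uses two hypothetical sets, big and small, and also shows the issubset() and issuperset() methods.

```python
big = {'a', 'b', 'c'}
small = {'a', 'b'}

proper = small < big           # small is a proper subset of big
not_proper = big < big         # a set is never a proper subset of itself
subset_or_equal = big <= big   # <= allows the two sets to be equal

print(proper, not_proper, subset_or_equal)

# Method forms of the same tests.
print(small.issubset(big))
print(big.issuperset(small))
```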
Note. Sets are mutable:
s1 = {'a', 'b', 'c'}
s2 = s1
s2.discard('a')
print(s2)
print(s1)
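If an independent set is needed, the copy() method can be used. In the sketch below (with the hypothetical names original, alias and clone) the alias is a second name for the same object, while the clone is a separate set.

```python
original = {'a', 'b', 'c'}
alias = original          # a second name for the same set object
clone = original.copy()   # a new, independent set with the same elements

alias.discard('a')

print(sorted(original))   # 'a' is gone here too, since alias is original
print(sorted(clone))      # the copy is unchanged
```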
Note. Elements of a set must be immutable (hashable) objects:
s1 = {'a', 1, [1,2]}  # raises TypeError: unhashable type: 'list'
s1 = {'a', 1, tuple([1,2])}
print(s1)
s2 = frozenset(s1)
print(s2)
s = {s2}
print(s)
Note: Empty braces {} denote the empty dictionary. To create an empty set use set():
s = set()
print(s)
s.add(1)
print(s)
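A quick way to confirm which object each expression creates is to inspect its type. The names empty_dict and empty_set are used here only for illustration.

```python
empty_dict = {}
empty_set = set()

print(type(empty_dict))   # <class 'dict'>
print(type(empty_set))    # <class 'set'>
print(len(empty_set))     # 0
```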
s1 = {'a', 'b'}
s2 = {'c', 'd'}
s = s1 & s2
print(s)
from IPython.display import Image
Image("web.png", width=300)
System of equations for PageRank computations in the above network:
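$$ \begin{cases} x_1 - x_2 - \frac{1}{2}x_4 = 0 \\ x_2 - \frac{1}{3} x_1 - \frac{1}{2}x_3 - \frac{1}{2}x_4 = 0 \\ x_3 - \frac{2}{3}x_1 = 0 \\ x_4 - \frac{1}{2}x_3= 0 \\ x_1 + x_2 + x_3 + x_4 = 1 \\ \end{cases} $$
Matrix equation:
$$ \begin{bmatrix} 1 & -1 & 0 & -\frac{1}{2} \\ -\frac{1}{3} & 1 & -\frac{1}{2} & -\frac{1}{2} \\ -\frac{2}{3} & 0 & 1 & 0 \\ 0 & 0 & -\frac{1}{2} & 1 \\ 1 & 1 & 1 & 1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \\ 0 \\ 1 \end{bmatrix} $$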
The numpy function np.linalg.solve(A, b) gives a solution of the matrix equation $Ax = b$:
import numpy as np
A = np.array([[1, 1], [1, -1]])
b = np.array([2,3])
np.linalg.solve(A, b)
Note: This function works only if A is a square invertible matrix:
A = np.array([[1,1], [1,1]])
b = np.array([1,1])
np.linalg.solve(A, b)  # raises LinAlgError: Singular matrix
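Due to rounding errors this function may fail even for matrices that are mathematically invertible. In the example below the entry 1.00000000000000000000001 is rounded to 1.0 when it is stored as a floating point number, so the matrix that numpy sees is singular: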
A = np.array([[1,1], [1,1.00000000000000000000001]])
b = np.array([1,1])
np.linalg.solve(A, b)
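One way to see what happens here is to inspect the stored entry and the numerical rank of the matrix before solving. The sketch below is only an illustration, using np.linalg.matrix_rank and catching np.linalg.LinAlgError; the variable names rank and x are chosen for this example.

```python
import numpy as np

A = np.array([[1, 1], [1, 1.00000000000000000000001]])
b = np.array([1, 1])

# The literal is rounded to exactly 1.0 when it is parsed as a float.
print(A[1, 1] == 1.0)

# A rank lower than the number of rows signals a singular matrix.
rank = np.linalg.matrix_rank(A)
print(rank)

# solve() raises LinAlgError for a singular matrix; catch it instead of crashing.
try:
    x = np.linalg.solve(A, b)
except np.linalg.LinAlgError as err:
    x = None
    print("solve failed:", err)
```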
The numpy function np.linalg.lstsq(A, b) computes a least-squares solution of a matrix equation Ax = b. This works for any matrix A, square or not:
A = np.array([[1, 2], [3, 4], [5, 6]])
b = np.array([1, 1, 1])
sol = np.linalg.lstsq(A, b, rcond=None)  # rcond=None avoids a FutureWarning on older numpy versions
print(sol)
This function returns a tuple with four elements.
The first element is a numpy array with the least-squares solution of the matrix equation:
sol[0]
The second element is the sum of squared residuals, that is, the squared Euclidean distance between Ax and b, where x is the computed solution:
sol[1]
The third element is the rank of the matrix A:
sol[2]
The last element contains the singular values of the matrix A:
sol[3]
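The four returned values can also be unpacked into separate names and checked by hand. The sketch below repeats the example above; the names x, residuals, rank, sing_vals and manual are chosen here for illustration, and rcond=None is passed explicitly to avoid a warning on older numpy versions.

```python
import numpy as np

A = np.array([[1, 2], [3, 4], [5, 6]])
b = np.array([1, 1, 1])

# Unpack the tuple instead of indexing it.
x, residuals, rank, sing_vals = np.linalg.lstsq(A, b, rcond=None)

# Recompute the sum of squared residuals directly from the solution.
manual = np.sum((A @ x - b) ** 2)

print(x)
print(np.allclose(residuals[0], manual))
print(rank)
print(sing_vals)
```

In this particular example b equals the second column of A minus the first, so the system has an exact solution and the residual is zero up to rounding.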
Application: computation of rankings in our sample network:
A = np.array([[1, -1, 0, -1/2], [-1/3, 1, -1/2, -1/2], [-2/3, 0, 1, 0], [0,0,-1/2, 1], [1, 1, 1, 1]])
b = np.array([0,0,0,0,1])
print(np.linalg.lstsq(A, b))
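To read off the rankings, only the first element of the returned tuple is needed. The sketch below extracts it and lists the pages from the highest to the lowest score; the names ranks and order are assumptions made for this example, as is the convention of numbering pages from 1.

```python
import numpy as np

A = np.array([[1, -1, 0, -1/2],
              [-1/3, 1, -1/2, -1/2],
              [-2/3, 0, 1, 0],
              [0, 0, -1/2, 1],
              [1, 1, 1, 1]])
b = np.array([0, 0, 0, 0, 1])

# Keep only the solution vector (first element of the returned tuple).
ranks = np.linalg.lstsq(A, b, rcond=None)[0]

for page, score in enumerate(ranks, start=1):
    print(f"x_{page} = {score:.4f}")

# Page numbers sorted from the highest to the lowest score.
order = np.argsort(-ranks) + 1
print(order)

# The scores should add up to 1, as required by the last equation.
print(np.isclose(ranks.sum(), 1.0))
```

Solving the system by hand gives $x_1 = 6/17$, $x_2 = 5/17$, $x_3 = 4/17$, $x_4 = 2/17$, which the computed values should match up to rounding.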