Different types of Python
Some notable implementations available on Linux:
- CPython -> The original (reference) Python implementation, written in C
- PyPy -> A fast Python implementation with a JIT compiler
- Jython -> Python running on the Java Virtual Machine
- Stackless -> A branch of CPython supporting microthreads, which seem similar to goroutines in the Go programming language.
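To check which of these implementations is actually running a script, the standard library can be asked directly. This is only a small sketch; the printed values depend on the interpreter in use, and the variable name implementation is just for illustration.

```python
import platform
import sys

# Name of the running implementation, for example 'CPython' or 'PyPy'.
implementation = platform.python_implementation()
print(implementation)

# On Python 3.3+ sys.implementation.name holds a lower-case identifier.
print(sys.implementation.name)
```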
Profiling and optimizing Python
Timing a function
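import random
import time
from functools import wraps


def timing(f):
    @wraps(f)  # keep the name and docstring of the decorated function
    def wrap(*args, **kwargs):
        # perf_counter() is a monotonic clock meant for measuring intervals
        time1 = time.perf_counter()
        ret = f(*args, **kwargs)
        time2 = time.perf_counter()
        print('{:s} function took {:.3f} ms'.format(f.__name__, (time2 - time1) * 1000.0))
        return ret
    return wrap


@timing
def random_sort(n):
    return sorted([random.random() for i in range(n)])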
Using timeit
import timeit


def linear_search(mylist, find):
    for x in mylist:
        if x == find:
            return True
    return False


def linear_time():
    SETUP_CODE = '''
from __main__ import linear_search
from random import randint'''
    TEST_CODE = '''
mylist = [x for x in range(10000)]
find = randint(0, len(mylist) - 1)  # randint() includes both endpoints
linear_search(mylist, find)
'''
    # timeit.repeat runs the statement `number` times, and does that `repeat` times
    times = timeit.repeat(setup=SETUP_CODE, stmt=TEST_CODE, repeat=3, number=10000)
    # printing the minimum execution time
    print('Linear search time: {}'.format(min(times)))


if __name__ == "__main__":
    linear_time()
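timeit also accepts a callable instead of code strings, which avoids the import from __main__. The sketch below is one possible variant, not the only way to do it: the name data and the choice of searching for the last element (the worst case) are assumptions made for the example.

```python
import timeit


def linear_search(mylist, find):
    for x in mylist:
        if x == find:
            return True
    return False


# Build the list once, outside the timed code, so only the search is measured.
data = list(range(10000))

# Each of the 3 repeats calls the lambda 100 times and reports the total time.
times = timeit.repeat(lambda: linear_search(data, 9999), repeat=3, number=100)
print('Worst-case linear search: {:.6f} s per 100 calls'.format(min(times)))
```

Taking min(times) is the usual choice here, since the larger values mostly reflect other activity on the machine.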
Timing a script
time -p python script.py
Using cProfile
python -m cProfile -s cumulative script.py
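cProfile can also be driven from inside a program when only part of a script is of interest. The following is a rough sketch under that assumption; random_sort is reused from the timing example, and the names profiler, stream and report are illustrative.

```python
import cProfile
import io
import pstats
import random


def random_sort(n):
    return sorted([random.random() for _ in range(n)])


profiler = cProfile.Profile()
profiler.enable()       # start collecting statistics
random_sort(100000)
profiler.disable()      # stop collecting

# Sort by cumulative time, like "-s cumulative" on the command line,
# and write the five most expensive entries into a string.
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream).sort_stats('cumulative')
stats.print_stats(5)
report = stream.getvalue()
print(report)
```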
Profiling memory
pip install memory_profiler
pip install psutil
python -m memory_profiler script.py
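memory_profiler reports line-by-line memory usage for the functions decorated with @profile. There is also a tool called guppy (guppy3 on Python 3), a really good library for inspecting the objects on the heap.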
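Both tools above are third-party packages. As a dependency-free alternative (a different technique from the tools named above), the standard library's tracemalloc module can give a first impression of memory usage. A minimal sketch; the list data exists only to allocate something measurable.

```python
import tracemalloc

tracemalloc.start()

# Allocate something noticeable so that it shows up in the snapshot.
data = [str(i) * 10 for i in range(100000)]

# Sizes are in bytes: memory traced right now and the peak since start().
current, peak = tracemalloc.get_traced_memory()
snapshot = tracemalloc.take_snapshot()
tracemalloc.stop()

print('current: {:.1f} KiB, peak: {:.1f} KiB'.format(current / 1024, peak / 1024))

# Show the three source lines responsible for the most allocated memory.
for stat in snapshot.statistics('lineno')[:3]:
    print(stat)
```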
Some lesser-known data structures
This part of the blog is taken from PyMOTW (Python Module of the Week).
ChainMap
The ChainMap class manages a sequence of dictionaries and searches through them in the order they are given to find the value associated with a key. The first dictionary that contains the key wins.
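import collections

a = {1: 10, 2: 20}
b = {3: 30, 4: 40, 2: 200}
m1 = collections.ChainMap(a, b)
m2 = collections.ChainMap(b, a)
print(m1[2], end=" ")
# it will print 20 (a is searched first)
print(m2[2], end="\n")
# it will print 200 (b is searched first)
print(list(m1.keys()))
print(list(m1.values()))
for k, v in m1.items():
    print('{} = {}'.format(k, v))
# reversing the search order makes b take priority
m1.maps = list(reversed(m1.maps))
print('m1 = {}'.format(m1[2]))
# changes to the underlying dictionaries are visible through the ChainMap
a[5] = 50
print(m1[5])
# new_child() puts a new empty dict in front; writes go only to that dict
m3 = m1.new_child()
m3[2] = 2000
m3[10] = 209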
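The effect of new_child() is easier to see with a layered-settings example. This is only an illustrative sketch: the dictionaries defaults and overrides and their keys are made up for the example.

```python
import collections

defaults = {'color': 'red', 'user': 'guest'}
overrides = {'user': 'admin'}
settings = collections.ChainMap(overrides, defaults)

# new_child() returns a new ChainMap with an empty dict at the front.
child = settings.new_child()
child['color'] = 'blue'    # stored only in the new front dict

print(child['color'])      # blue
print(settings['color'])   # red, the parent chain is untouched
print(child['user'])       # admin, lookups still fall through to the parents
print(child.maps[0])       # {'color': 'blue'}
```

Writes and deletions through a ChainMap only ever touch the first mapping, which is why neither defaults nor overrides change here.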
Counter
A Counter is a container that keeps track of how many times equivalent values are added.
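import collections

c = collections.Counter()
print('Initial :', c)
# updating with a string counts each character
c.update('apoorvakumarhseenvaramuk')
print('Sequence:', c)
# updating with a dict adds the given counts to the existing ones
c.update({'a': 1, 'd': 5})
print('Dict :', c)
print('Most common:')
for letter, count in c.most_common(3):
    print('{}: {}'.format(letter, count))
c1 = collections.Counter('aaaaaaassddbcpoerqw')
print(c1 + c)   # add the counts
print(c1 - c)   # subtract, keeping only positive counts
print(c1 & c)   # intersection: minimum of each count
print(c1 | c)   # union: maximum of each count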
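The four Counter operators are easier to follow on small inputs. A short sketch with two made-up counters (the strings are arbitrary):

```python
import collections

c1 = collections.Counter('aab')    # a: 2, b: 1
c2 = collections.Counter('abbc')   # a: 1, b: 2, c: 1

total = c1 + c2          # counts are added
difference = c1 - c2     # counts are subtracted; zero and negative results are dropped
intersection = c1 & c2   # minimum of the two counts for each key
union = c1 | c2          # maximum of the two counts for each key

print(total)          # a: 3, b: 3, c: 1
print(difference)     # a: 1
print(intersection)   # a: 1, b: 1
print(union)          # a: 2, b: 2, c: 1
```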
DefaultDict
The standard dictionary includes the method setdefault() for retrieving a value and establishing a default if the value does not exist. By contrast, defaultdict lets the caller specify the default up front when the container is initialized.
import collections


def default_factory():
    return 'default value'


d = collections.defaultdict(default_factory, foo='bar')
print('d:', d)
print('foo =>', d['foo'])
# 'bar' is missing, so default_factory() is called and its result is stored
print('bar =>', d['bar'])
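A very common use of defaultdict is grouping items, with list as the default factory. The sketch below groups words by their first letter; the word list and the name by_letter are invented for the example.

```python
import collections

words = ['apple', 'avocado', 'banana', 'blueberry', 'cherry']

# list() is called for every missing key, so append() always has a list to work on.
by_letter = collections.defaultdict(list)
for word in words:
    by_letter[word[0]].append(word)

print(dict(by_letter))
```

With a plain dict the loop body would need setdefault(word[0], []) or an explicit membership check.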
NamedTuple
The standard tuple uses numerical indexes to access its members. A namedtuple also gives each member a name, similar to a C structure.
import collections

Person = collections.namedtuple('Person', 'name age')
bob = Person(name='Bob', age=30)
print('\nRepresentation:', bob)
jane = Person(name='Jane', age=29)
print('\nField by name:', jane.name)
print('\nFields by index:')
for p in [bob, jane]:
    print('{} is {} years old'.format(*p))
Like a regular tuple, a namedtuple is immutable. It can be converted to a dictionary with _asdict().
import collections
Person = collections.namedtuple('Person', 'name age')
bob = Person(name='Bob', age=30)
print('Representation:', bob)
print('As Dictionary:', bob._asdict())
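Immutability means that assigning to a field fails; a changed copy is created with _replace() instead. A small sketch reusing the Person type from above (older_bob is just an illustrative name):

```python
import collections

Person = collections.namedtuple('Person', 'name age')
bob = Person(name='Bob', age=30)

try:
    bob.age = 31               # fields cannot be reassigned
except AttributeError as err:
    print('Cannot modify:', err)

# _replace() returns a new instance and leaves the original untouched.
older_bob = bob._replace(age=31)
print(bob)
print(older_bob)
```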
Some more details about collections
Itertools
chain()
The chain() function takes several iterables as arguments and returns a single iterator that produces the contents of all of the inputs, in order, as though they came from a single iterator.
from itertools import chain

for i in chain([1, 2, 3], ['a', 'b', 'c']):
    print(i, end=' ')
print()
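When the inputs are already collected in one iterable (a list of lists, for example), chain.from_iterable() does the same job without unpacking. A minimal sketch; nested and flat are illustrative names.

```python
from itertools import chain

nested = [[1, 2, 3], ['a', 'b', 'c']]

# from_iterable() takes one iterable of iterables and flattens it by one level.
flat = list(chain.from_iterable(nested))
print(flat)   # [1, 2, 3, 'a', 'b', 'c']
```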
zip_longest()
from itertools import zip_longest

r1 = range(3)
r2 = range(2)
print('\nzip_longest processes all of the values:')
print(list(zip_longest(r1, r2)))
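The difference from the built-in zip(), and the optional fillvalue argument, can be sketched like this (the variable names and the fill value -1 are chosen only for the example):

```python
from itertools import zip_longest

r1 = range(3)
r2 = range(2)

shortest = list(zip(r1, r2))                      # stops at the shorter input
padded = list(zip_longest(r1, r2))                # missing values become None
filled = list(zip_longest(r1, r2, fillvalue=-1))  # or a value of your choice

print(shortest)   # [(0, 0), (1, 1)]
print(padded)     # [(0, 0), (1, 1), (2, None)]
print(filled)     # [(0, 0), (1, 1), (2, -1)]
```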
islice()
The islice() function returns an iterator which returns selected items from the input iterator, by index.
from itertools import islice

print('Stop at 5:')
for i in islice(range(100), 5):
    print(i, end=' ')
print('\n')

print('Start at 5, Stop at 10:')
for i in islice(range(100), 5, 10):
    print(i, end=' ')
print('\n')

print('By tens to 100:')
for i in islice(range(100), 0, 100, 10):
    print(i, end=' ')
print('\n')
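islice() is most useful on iterators that cannot be sliced with [start:stop], such as generators. A sketch with a hypothetical infinite generator named squares:

```python
from itertools import islice


def squares():
    """Yield 0, 1, 4, 9, ... without ever stopping."""
    n = 0
    while True:
        yield n * n
        n += 1


# Only the requested items are ever computed.
first_five = list(islice(squares(), 5))
every_other = list(islice(squares(), 2, 8, 2))   # indexes 2, 4 and 6

print(first_five)    # [0, 1, 4, 9, 16]
print(every_other)   # [4, 16, 36]
```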
starmap()
The starmap() function is similar to map(), but instead of constructing a tuple from multiple iterators, it splits up the items in a single iterator as arguments to the mapping function using the * syntax.
from itertools import starmap

values = [(0, 5), (1, 6), (2, 7), (3, 8), (4, 9)]
# each tuple is unpacked into the x and y arguments of the lambda
for i in starmap(lambda x, y: (x, y, x * y), values):
    print('{} * {} = {}'.format(*i))
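To make the contrast with map() concrete, the sketch below computes the same products three ways. The names pairs, with_starmap, with_map and with_comprehension are made up for illustration.

```python
import operator
from itertools import starmap

pairs = [(0, 5), (1, 6), (2, 7)]

# starmap: the arguments are already grouped into tuples.
with_starmap = list(starmap(operator.mul, pairs))

# map: the arguments come from separate, parallel iterables.
with_map = list(map(operator.mul, [0, 1, 2], [5, 6, 7]))

# The same thing written with explicit * unpacking.
with_comprehension = [operator.mul(*pair) for pair in pairs]

print(with_starmap)   # [0, 6, 14]
```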
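count() and Fraction
The count() function returns an iterator that produces consecutive integers, indefinitely. The start and step arguments can be any numeric values that can be added together, such as Fraction.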
import fractions
from itertools import count

start = fractions.Fraction(1, 3)
step = fractions.Fraction(1, 3)
# count() never ends; zip() stops when the list of letters runs out
for i in zip(count(start, step), ['a', 'b', 'c']):
    print('{}: {}'.format(*i))
cycle()
The cycle() function returns an iterator that repeats the contents of the arguments it is given indefinitely. Since it has to remember the entire contents of the input iterator, it may consume quite a bit of memory if the iterator is long.
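The behaviour of cycle() described above can be sketched as follows. Because the iterator never ends, it has to be bounded by something else, here islice() and zip(). The colour, task and worker names are invented for the example.

```python
from itertools import cycle, islice

colors = cycle(['red', 'green', 'blue'])

# Take a fixed number of items from the endless iterator.
first_seven = list(islice(colors, 7))
print(first_seven)

# Round-robin assignment: zip() stops when the finite list of tasks runs out.
tasks = ['t1', 't2', 't3', 't4', 't5']
assignments = list(zip(tasks, cycle(['alice', 'bob'])))
print(assignments)
```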
accumulate()
The accumulate() function processes the input iterable, passing the accumulated result so far and the next item to a function and producing each intermediate return value. The default function adds the two values, so accumulate() can be used to produce the cumulative sum of a series of numerical inputs.
from itertools import accumulate

print(list(accumulate(range(5))))
# [0, 1, 3, 6, 10]
print(list(accumulate('abcde')))
# ['a', 'ab', 'abc', 'abcd', 'abcde']
It is possible to combine accumulate() with any other function that takes two input values to achieve different results.
from itertools import accumulate


def f(a, b):
    # a is the accumulated result so far, b is the next item
    print(a, b)
    return b + a + b


print(list(accumulate('abcde', f)))
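Two more practical combining functions are multiplication and max(), which give a running product and a running maximum. A short sketch with arbitrary sample values:

```python
import operator
from itertools import accumulate

values = [3, 1, 4, 1, 5]

running_sum = list(accumulate(values))                    # default: addition
running_product = list(accumulate(values, operator.mul))
running_max = list(accumulate(values, max))

print(running_sum)       # [3, 4, 8, 9, 14]
print(running_product)   # [3, 3, 12, 12, 60]
print(running_max)       # [3, 3, 4, 4, 5]
```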
permutations()
from itertools import permutations

# all ordered arrangements of 2 items out of 3
perm = permutations([1, 2, 3], 2)
for i in perm:
    print(i)
# Answer->(1, 2),(1, 3),(2, 1),(2, 3),(3, 1),(3, 2)
combinations()
from itertools import combinations

# all unordered selections of 2 items out of 3
comb = combinations([1, 2, 3], 2)
for i in comb:
    print(i)
# Answer->(1, 2),(1, 3),(2, 3)
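The difference between the two functions is whether order matters. The sketch below puts them side by side on the same input; items, perms and combs are illustrative names.

```python
from itertools import combinations, permutations

items = [1, 2, 3]

perms = list(permutations(items, 2))   # (1, 2) and (2, 1) are both produced
combs = list(combinations(items, 2))   # only (1, 2) is produced

print(len(perms), perms)
print(len(combs), combs)

# Every combination of 2 items can be ordered in 2 ways, hence twice as many permutations.
print(len(perms) == 2 * len(combs))
```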