﻿Problem Solving with Algorithms and Data Structures - Chapter 2 - Analysis - Terminal In A Simputer

# Simputer

## Problem Solving with Algorithms and Data Structures - Chapter 2 - Analysis

This post is part 2 of the "Problem Solving with Algorithms and Data Structures Reading" series:

This book is amazing, last section of the last chapter blew my mind1.

## 2.1. Objectives

• To understand why algorithm analysis is important.
• To be able to use “Big-O” to describe execution time.
• To understand the “Big-O” execution time of common operations on Python lists and dictionaries.
• To understand how the implementation of Python data impacts algorithm analysis.
• To understand how to benchmark simple Python programs.

## 2.2. What Is Algorithm Analysis

`perf.py`:

```import time

def perf_wrapper(method,num):
start = time.time()
theSum = method(*num)
end = time.time()
return theSum, end-start

def sumOfN(n):
theSum = 0
for i in range(1,n+1):
theSum = theSum + i
return theSum

def sumOfN_op(n):
theSum = n * (n + 1) / 2
return theSum

def sum_perf_check(method, num):
for i in range(5):
print("Sum is %d required %10.7f seconds" % perf_wrapper(method,num))

print("Before the algorithm is optimized:")
check_set=[10**n for n in range(2,8)]
for num in check_set:
sum_perf_check(sumOfN,[num])

print("After the algorithm is optimized:")
for num in check_set:
sum_perf_check(sumOfN_op,[num])
```

Output of `perf.py`:

```Before the algorithm is optimized:
Sum is 5050 required  0.0000110 seconds
Sum is 5050 required  0.0000076 seconds
Sum is 5050 required  0.0000069 seconds
Sum is 5050 required  0.0000069 seconds
Sum is 5050 required  0.0000069 seconds
Sum is 500500 required  0.0000796 seconds
Sum is 500500 required  0.0000989 seconds
Sum is 500500 required  0.0000939 seconds
Sum is 500500 required  0.0001440 seconds
Sum is 500500 required  0.0001280 seconds
Sum is 50005000 required  0.0013959 seconds
Sum is 50005000 required  0.0008447 seconds
Sum is 50005000 required  0.0008399 seconds
Sum is 50005000 required  0.0008893 seconds
Sum is 50005000 required  0.0013807 seconds
Sum is 5000050000 required  0.0081816 seconds
Sum is 5000050000 required  0.0075161 seconds
Sum is 5000050000 required  0.0074959 seconds
Sum is 5000050000 required  0.0074937 seconds
Sum is 5000050000 required  0.0074308 seconds
Sum is 500000500000 required  0.0759306 seconds
Sum is 500000500000 required  0.0759892 seconds
Sum is 500000500000 required  0.0759845 seconds
Sum is 500000500000 required  0.0772898 seconds
Sum is 500000500000 required  0.0759218 seconds
Sum is 50000005000000 required  0.7791717 seconds
Sum is 50000005000000 required  0.7657561 seconds
Sum is 50000005000000 required  0.7915492 seconds
Sum is 50000005000000 required  0.7761276 seconds
Sum is 50000005000000 required  0.7596998 seconds
After the algorithm is optimized:
Sum is 5050 required  0.0000050 seconds
Sum is 5050 required  0.0000010 seconds
Sum is 5050 required  0.0000010 seconds
Sum is 5050 required  0.0000007 seconds
Sum is 5050 required  0.0000005 seconds
Sum is 500500 required  0.0000012 seconds
Sum is 500500 required  0.0000007 seconds
Sum is 500500 required  0.0000007 seconds
Sum is 500500 required  0.0000007 seconds
Sum is 500500 required  0.0000005 seconds
Sum is 50005000 required  0.0000007 seconds
Sum is 50005000 required  0.0000005 seconds
Sum is 50005000 required  0.0000007 seconds
Sum is 50005000 required  0.0000005 seconds
Sum is 50005000 required  0.0000005 seconds
Sum is 5000050000 required  0.0000007 seconds
Sum is 5000050000 required  0.0000010 seconds
Sum is 5000050000 required  0.0000007 seconds
Sum is 5000050000 required  0.0000007 seconds
Sum is 5000050000 required  0.0000007 seconds
Sum is 500000500000 required  0.0000005 seconds
Sum is 500000500000 required  0.0000005 seconds
Sum is 500000500000 required  0.0000007 seconds
Sum is 500000500000 required  0.0000007 seconds
Sum is 500000500000 required  0.0000005 seconds
Sum is 50000005000000 required  0.0000005 seconds
Sum is 50000005000000 required  0.0000010 seconds
Sum is 50000005000000 required  0.0000005 seconds
Sum is 50000005000000 required  0.0000005 seconds
Sum is 50000005000000 required  0.0000005 seconds
```

## 2.3 Big-O Notation

It's a method to characterize an algorithm’s efficiency in terms of execution time.

Example:

```a=5
b=6
c=10
for i in range(n):
for j in range(n):
x = i * i
y = j * j
z = i * j
for k in range(n):
w = a*k + 45
v = b*b
d = 33
```

T(n) = 3 + 3n2 + 2n + 1 = 3n2 + 2n + 4

The fragment of the code: O(n2)

## 2.4. An Anagram Detection Example

• Solution 1: Checking Off O(n2)
• Solution 2: Sort and Compare O(nlogn) or O(n2) due to `sort`
• Solution 3: Brute Force (Never good)
• Solution 4: Count and Compare O(n)

## 2.5. Performance of Python Data Structures

In later chapters you will see some possible implementations of both lists and dictionaries and how the performance depends on the implementation.

## 2.6. Lists

`list_perf.py`:

```from timeit import Timer
def test1():
l = []
for i in range(1000):
l = l + [i]

def test2():
l = []
for i in range(1000):
l.append(i)

def test3():
l = [i for i in range(1000)]

def test4():
l = list(range(1000))

t1 = Timer("test1()", "from __main__ import test1")
print("concat ",t1.timeit(number=1000), "milliseconds")
t2 = Timer("test2()", "from __main__ import test2")
print("append ",t2.timeit(number=1000), "milliseconds")
t3 = Timer("test3()", "from __main__ import test3")
print("comprehension ",t3.timeit(number=1000), "milliseconds")
t4 = Timer("test4()", "from __main__ import test4")
print("list range ",t4.timeit(number=1000), "milliseconds")
```

Result of `list_perf.py`:

```concat  1.9500927650005906 milliseconds
append  0.11545232799835503 milliseconds
comprehension  0.062062365002930164 milliseconds
list range  0.0495129260016256 milliseconds
```
OperationBig-O Efficiency
index []O(1)
index assignmentO(1)
appendO(1)
pop()O(1)
pop(i)O(n)
insert(i,item)O(n)
del operatorO(n)
iterationO(n)
contains (in)O(n)
get slice [x:y]O(k)
del sliceO(n)
set sliceO(n+k)
reverseO(n)
concatenateO(k)
sortO(n log n)
multiplyO(nk)

`popperf.py`:

```import timeit
popzero = timeit.Timer("x.pop(0)",
"from __main__ import x")
popend = timeit.Timer("x.pop()",
"from __main__ import x")

x = list(range(2000000))
print("x.pop(0) execution time: %fs" % popzero.timeit(number=1000))

x = list(range(2000000))
print("x.pop() execution time: %fs" % popend.timeit(number=1000))
```

Result of `popperf.py`:

```x.pop(0) execution time: 2.256239s
x.pop() execution time: 0.000111s
```

`popperf_2.py`:

```from timeit import Timer

popzero = Timer("x.pop(0)",
"from __main__ import x")
popend = Timer("x.pop()",
"from __main__ import x")
print("pop(0)   pop()")
for i in range(1000000,100000001,1000000):
x = list(range(i))
pt = popend.timeit(number=1000)
x = list(range(i))
pz = popzero.timeit(number=1000)
print("%15.5f, %15.5f" %(pz,pt))
```

Result of `popperf_2.py`:

```pop(0)   pop()
0.94833,         0.00010
2.04151,         0.00011
3.24372,         0.00011
4.37107,         0.00011
5.42274,         0.00012
6.68484,         0.00011
7.64100,         0.00011
8.23122,         0.00011
9.99637,         0.00011
11.55539,         0.00012
12.92389,         0.00026
13.31022,         0.00012
14.33949,         0.00011
16.14405,         0.00011
17.22286,         0.00014
17.30630,         0.00013
18.30070,         0.00014
19.25390,         0.00013
20.32429,         0.00017
21.64347,         0.00014
22.62368,         0.00013
23.68417,         0.00013
24.95661,         0.00011
25.84395,         0.00011
27.02810,         0.00011
29.19760,         0.00011
30.58301,         0.00011
31.83124,         0.00011
32.58386,         0.00016
33.32165,         0.00011
35.25702,         0.00011
36.03567,         0.00011
36.56381,         0.00012
38.86707,         0.00011
39.97921,         0.00011
39.82716,         0.00011
41.63275,         0.00011
43.04904,         0.00011
......
```

## 2.7. Dictionaries

operationBig-O Efficiency
copyO(n)
get itemO(1)
set itemO(1)
delete itemO(1)
contains (in)O(1)
iterationO(n)

`dict_perf.py`:

```import timeit
import random

for i in range(10000,1000001,20000):
t = timeit.Timer("random.randrange(%d) in x"%i,
"from __main__ import random,x")
x = list(range(i))
lst_time = t.timeit(number=1000)
x = {j:None for j in range(i)}
d_time = t.timeit(number=1000)
print("%d,%10.3f,%10.3f" % (i, lst_time, d_time))
```

Result of `dict_perf.py`:

```10000,     0.119,     0.001
30000,     0.366,     0.002
50000,     0.612,     0.002
70000,     0.914,     0.003
90000,     1.193,     0.003
110000,     1.381,     0.003
130000,     1.708,     0.003
150000,     1.984,     0.002
170000,     2.283,     0.003
190000,     2.660,     0.003
210000,     2.880,     0.002
230000,     2.918,     0.002
250000,     1.782,     0.001
270000,     1.881,     0.001
290000,     2.034,     0.001
310000,     2.122,     0.001
330000,     2.254,     0.001
350000,     2.473,     0.002
370000,     2.502,     0.001
390000,     2.844,     0.002
410000,     3.044,     0.001
430000,     3.268,     0.001
450000,     3.170,     0.002
470000,     3.398,     0.002
490000,     3.607,     0.002
510000,     3.740,     0.001
530000,     3.844,     0.002
550000,     4.131,     0.001
570000,     4.102,     0.002
590000,     4.418,     0.002
610000,     4.413,     0.002
630000,     4.829,     0.002
650000,     4.717,     0.003
670000,     5.037,     0.001
690000,     5.114,     0.002
710000,     5.599,     0.002
730000,     5.634,     0.002
750000,     5.669,     0.002
770000,     5.701,     0.001
790000,     5.745,     0.002
810000,     5.985,     0.002
830000,     6.167,     0.001
850000,     6.496,     0.002
870000,     6.966,     0.002
890000,     6.906,     0.001
910000,     6.333,     0.001
930000,     7.055,     0.001
950000,     6.709,     0.001
970000,     7.176,     0.002
990000,     6.942,     0.001
```

Dictionary is consitently faster than list.

The only dictionary operations that are not O(1) are those that require iteration.