Space and Time Complexity

Space complexity refers to the amount of memory used by an algorithm to complete its execution, as a function of the size of the input. The space complexity of an algorithm can be affected by various factors such as the size of the input data, the data structures used in the algorithm, the number and size of temporary variables, and the recursion depth. Time complexity refers to the amount of time required by an algorithm to run as the input size grows. It is usually measured in terms of the "Big O" notation, which describes the upper bound of an algorithm's time complexity.

Why do you think a programmer should care about space and time complexity?

  • A programmer should care about space and time complexity because both are major factors that affect the efficiency and scalability of code. Code that takes the least time and uses the least space will be the most efficient when it comes to larger-scale projects.

Take a look at our lassen volcano example from the data compression tech talk. The first code block is the original image. In the second code block, change the baseWidth to rescale the image.

from IPython.display import Image, display
from pathlib import Path 

# prepares a series of images
def image_data(path=Path("images/"), images=None):  # path of static images is defaulted
    for image in images:
        # File to open
        image['filename'] = path / image['file']  # file with path
    return images

def image_display(images):
    for image in images:  
        display(Image(filename=image['filename']))

if __name__ == "__main__":
    lassen_volcano = image_data(images=[{'source': "Peter Carolin", 'label': "Lassen Volcano", 'file': "lassen-volcano.jpg"}])
    image_display(lassen_volcano)

from IPython.display import HTML, display
from pathlib import Path 
from PIL import Image as pilImage 
from io import BytesIO
import base64

# prepares a series of images
def image_data(path=Path("images/"), images=None):  # path of static images is defaulted
    for image in images:
        # File to open
        image['filename'] = path / image['file']  # file with path
    return images

def scale_image(img):
    #baseWidth = 625
    #baseWidth = 1250
    #baseWidth = 2500
    baseWidth = 5000 # see the effect of doubling or halving the baseWidth
    #baseWidth = 10000 
    #baseWidth = 20000
    #baseWidth = 40000
    scalePercent = (baseWidth/float(img.size[0]))
    scaleHeight = int((float(img.size[1])*float(scalePercent)))
    scale = (baseWidth, scaleHeight)
    return img.resize(scale)

def image_to_base64(img, format):
    with BytesIO() as buffer:
        img.save(buffer, format)
        return base64.b64encode(buffer.getvalue()).decode()
    
def image_management(image):  # gathers metadata, scales the image, and builds the display HTML
    # Image open return PIL image object
    img = pilImage.open(image['filename'])
    
    # Python Image Library operations
    image['format'] = img.format
    image['mode'] = img.mode
    image['size'] = img.size
    image['width'], image['height'] = img.size
    image['pixels'] = image['width'] * image['height']
    # Scale the Image
    img = scale_image(img)
    image['pil'] = img
    image['scaled_size'] = img.size
    image['scaled_width'], image['scaled_height'] = img.size
    image['scaled_pixels'] = image['scaled_width'] * image['scaled_height']
    # Scaled HTML
    image['html'] = '<img src="data:image/%s;base64,%s">' % (image['format'].lower(), image_to_base64(image['pil'], image['format']))


if __name__ == "__main__":
    # Prepare the image metadata
    images = image_data(images = [{'source': "Peter Carolin", 'label': "Lassen Volcano", 'file': "lassen-volcano.jpg"}])
    
    # Display meta data and scaled view for each image
    for image in images:
        image_management(image)
        print("---- meta data -----")
        print(image['label'])
        print(image['source'])
        print(image['format'])
        print(image['mode'])
        print("Original size: ", image['size'], " pixels: ", f"{image['pixels']:,}")
        print("Scaled size: ", image['scaled_size'], " pixels: ", f"{image['scaled_pixels']:,}")
        
        print("-- original image --")
        display(HTML(image['html']))

Big O Notation

  • Constant O(1)
  • Linear O(n)
  • Quadratic O(n^2)
  • Logarithmic O(log n)
  • Exponential O(2^n)

numbers = list(range(1000))

Constant O(1)

Time

An example of a constant time algorithm is accessing a specific element in an array. It does not matter how large the array is, accessing an element in the array takes the same amount of time. Therefore, the time complexity of this operation is constant, denoted by O(1).

print(numbers[263])
263

Space

This function takes two integer inputs and returns their sum. The function does not create any additional data structures or variables that are dependent on the input size, so its space complexity is constant, or O(1). Regardless of how large the input integers are, the function will always require the same amount of memory to execute.

def add(a, b):  # named add to avoid shadowing Python's built-in sum
    return a + b

Linear O(n)

Time

An example of a linear time algorithm is traversing a list or an array. When the size of the list or array increases, the time taken to traverse it also increases linearly with the size. Hence, the time complexity of this operation is O(n), where n is the size of the list or array being traversed.

for i in numbers:
    print(i)

Space

This function takes a list of elements arr as input and returns a new list with the elements in reverse order. The function creates a new list reversed_arr of the same size as arr to store the reversed elements. The size of reversed_arr depends on the size of the input arr, so the space complexity of this function is O(n). As the input size increases, the amount of memory required to execute the function also increases linearly.

def reverse_list(arr):
    n = len(arr)
    reversed_arr = [None] * n
    for i in range(n):
        reversed_arr[n-i-1] = arr[i]
    return reversed_arr

Quadratic O(n^2)

Time

An example of a quadratic time algorithm is nested loops. When there are two nested loops that both iterate over the same collection, the time taken to complete the algorithm grows quadratically with the size of the collection. Hence, the time complexity of this operation is O(n^2), where n is the size of the collection being iterated over.

for i in numbers:
    for j in numbers:
        print(i,j)

Space

This function takes two matrices matrix1 and matrix2 as input and returns their product as a new matrix. The function creates a new matrix result with dimensions m by n to store the product of the input matrices. The size of result depends on the sizes of the input matrices, so the space complexity of this function is O(m × n), which is O(n^2) for square n-by-n matrices. As the size of the input matrices increases, the amount of memory required to execute the function increases quadratically.

def multiply_matrices(matrix1, matrix2):
    m = len(matrix1)
    n = len(matrix2[0])
    result = [[0] * n for _ in range(m)] # build m independent rows; [[0] * n] * m would alias one row m times and corrupt the sums
    for i in range(m):
        for j in range(n):
            for k in range(len(matrix2)):
                result[i][j] += matrix1[i][k] * matrix2[k][j]
    return result

print(multiply_matrices([[1,2],[3,4]], [[3,4],[1,2]]))
[[5, 8], [13, 20]]

Logarithmic O(log n)

Time

An example of a log time algorithm is binary search. Binary search is an algorithm that searches for a specific element in a sorted list by repeatedly dividing the search interval in half. As a result, the time taken to complete the search grows logarithmically with the size of the list. Hence, the time complexity of this operation is O(log n), where n is the size of the list being searched.

def binary_search(arr, low, high, target):
    while low <= high:
        mid = (low + high) // 2
        if arr[mid] == target:
            return mid
        elif arr[mid] < target:
            low = mid + 1
        else:
            high = mid - 1

target = 263
result = binary_search(numbers, 0, len(numbers) - 1, target)

print(result)
263

Space

A recursive version of the algorithm above has O(log n) space complexity. (The iterative version shown, which loops instead of calling itself, only needs O(1) space.) The recursive function takes an array arr, its lower and upper bounds low and high, and a target value target, and searches for target by recursively dividing the search space in half until the target is found or the search space is empty. It does not create any new data structures that depend on the size of arr; instead, it uses the call stack to keep track of the recursive calls. Since the maximum depth of the recursive calls is O(log n), where n is the size of arr, the space complexity is O(log n). As the size of arr increases, the amount of memory required to execute the function grows logarithmically. A sketch of the recursive variant follows.
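
Here is a sketch of that recursive variant (the name binary_search_recursive is my own; it reuses the numbers list defined earlier). Each call halves the search interval, so the call stack never grows beyond about log2(n) frames.

def binary_search_recursive(arr, low, high, target):
    if low > high:
        return None  # search space is empty: target not found
    mid = (low + high) // 2
    if arr[mid] == target:
        return mid
    elif arr[mid] < target:
        # recurse on the upper half; adds one frame to the call stack
        return binary_search_recursive(arr, mid + 1, high, target)
    else:
        # recurse on the lower half; adds one frame to the call stack
        return binary_search_recursive(arr, low, mid - 1, target)

print(binary_search_recursive(numbers, 0, len(numbers) - 1, 263))  # prints 263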

Exponential O(2^n)

Time

An example of an O(2^n) algorithm is the recursive implementation of the Fibonacci sequence. The Fibonacci sequence is a series of numbers where each number is the sum of the two preceding ones, starting from 0 and 1. The recursive implementation calculates each number by recursively calling itself with the two preceding numbers until it reaches the base case (i.e., the first or second number in the sequence). The algorithm takes O(2^n) time because each call spawns two more recursive calls, so the total number of calls roughly doubles every time n increases by one.

def fibonacci(n):
    if n <= 1:
        return n
    else:
        return fibonacci(n-1) + fibonacci(n-2)

print(fibonacci(3))
2

Space

This function takes a set s as input and generates all possible subsets of s. The function does this by recursively generating the subsets of the set without the first element, and then adding the first element to each of those subsets to generate the subsets that include the first element. The function creates a new list for each recursive call that stores the subsets, and each element in the list is a new list that represents a subset. The number of subsets that can be generated from a set of size n is 2^n, so the space complexity of this function is O(2^n). As the size of the input set increases, the amount of memory required to execute the function grows exponentially.

def generate_subsets(s):
    if not s:
        return [[]]
    subsets = generate_subsets(s[1:])
    return [[s[0]] + subset for subset in subsets] + subsets

Using the time library, we are able to see the difference in time it takes to calculate the fibonacci function above.

  • Based on what is known about the other time complexities, hypothesize the resulting elapsed time if the function is replaced.
import time

start_time = time.time()
print(fibonacci(34))
end_time = time.time()

total_time = end_time - start_time
print("Time taken:", total_time, "seconds")

start_time = time.time()
print(fibonacci(35))
end_time = time.time()

total_time = end_time - start_time
print("Time taken:", total_time, "seconds")
5702887
Time taken: 1.314479112625122 seconds
9227465
Time taken: 2.0933010578155518 seconds

Hacks

  • Record your findings when testing the elapsed time of the different algorithms.

    • the linear algorithm stayed fast even as the data set grew
    • the quadratic and exponential ones slowed down dramatically as the data set grew; on small inputs the difference was barely noticeable (a timing sketch for reproducing these findings appears after this list)
  • Why is time and space complexity important when choosing an algorithm?

    • A programmer should care about space and time complexity because both are major factors that affect the efficiency and scalability of code. Code that takes the least time and uses the least space will be the most efficient when it comes to larger-scale projects.
  • Should you always use a constant time algorithm / Should you never use an exponential time algorithm? Explain.

    • No, it's not always necessary to use a constant time algorithm nor is it always bad to use an exponential time algorithm. The choice of algorithm depends on the specific problem you are trying to solve, the size of the input data, and the resources available to you. If you have a small input size and limited resources, a constant time algorithm might be the most efficient choice. But if your input size is large, it might not be possible to solve the problem with a constant time algorithm, and you might have to resort to an algorithm with higher time complexity, such as an exponential time algorithm.
    • For example, in cryptography, brute force algorithms that try every possible key are often exponential time algorithms. While these algorithms are slow, they are still useful because the only known way to break the encryption is to try all possible keys. Similarly, in certain scientific simulations, it might be necessary to use an exponential time algorithm to model complex physical systems accurately.
  • What are some general patterns that you noticed to determine each algorithm's time and space complexity?

    • Nested loops: If an algorithm contains one or more nested loops, the time complexity is typically O(n^2), O(n^3), or some other polynomial time complexity, where n is the size of the input data.
    • Recursive calls: If an algorithm is recursive, the time complexity is often expressed using a recurrence relation. The complexity of a recursive algorithm is usually related to the number of recursive calls and the size of the data being processed.
    • Sorting and searching: If an algorithm involves sorting or searching data, the time complexity is usually expressed in terms of the number of elements being sorted or searched. For example, quicksort and merge sort have a time complexity of O(n log n), while linear search has a time complexity of O(n).
    • Data structures: If an algorithm uses data structures like arrays, lists, or trees, the space complexity is usually proportional to the size of the data being stored. For example, an algorithm that creates an array of size n has a space complexity of O(n).
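
Here is the small timing harness mentioned above (a sketch; the helper names time_it, linear, and quadratic are made up for illustration). It compares a linear scan with a quadratic nested loop as the data set grows:

import time

def time_it(func, n):
    data = list(range(n))
    start = time.time()
    func(data)
    return time.time() - start

def linear(data):  # O(n): touches each element once
    total = 0
    for x in data:
        total += x

def quadratic(data):  # O(n^2): touches every pair of elements
    total = 0
    for x in data:
        for y in data:
            total += 1

for n in [100, 1000, 2000]:
    print(f"n={n}: linear {time_it(linear, n):.5f}s, quadratic {time_it(quadratic, n):.5f}s")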

Although we will go more in depth later, time complexity is a key concept that relates to the different sorting algorithms. Do some basic research on the different types of sorting algorithms and their time complexity.

  1. Bubble Sort: It is one of the simplest sorting algorithms, where each element is compared with its adjacent element and swapped if the adjacent element is greater. This process is repeated until the list is sorted. The time complexity of the bubble sort algorithm is O(n^2). (Sketches of bubble sort and merge sort appear after this list.)

  2. Selection Sort: In this algorithm, the smallest element in the list is found and swapped with the first element. Then, the smallest element in the remaining list is found and swapped with the second element, and so on. The time complexity of the selection sort algorithm is O(n^2).

  3. Insertion Sort: This algorithm works by iterating through the list and inserting each element into its proper position in the sorted sub-list. The time complexity of the insertion sort algorithm is also O(n^2).

  4. Merge Sort: This algorithm divides the list into two halves recursively until each sub-list contains only one element. Then, the sub-lists are merged back together in sorted order. The time complexity of the merge sort algorithm is O(n log n).

  5. Quick Sort: It is a divide-and-conquer algorithm that selects a pivot element and partitions the list into two sub-lists, one with elements smaller than the pivot and the other with elements greater than the pivot. This process is repeated recursively until the list is sorted. The average time complexity of the quick sort algorithm is O(n log n), though its worst case is O(n^2).

  6. Heap Sort: This algorithm sorts the elements by constructing a binary heap and repeatedly extracting the maximum element from the heap until the list is sorted. The time complexity of the heap sort algorithm is also O(n log n).
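
To make the O(n^2) versus O(n log n) contrast concrete, here are minimal sketches of bubble sort and merge sort (illustrative implementations, not the only way to write them):

def bubble_sort(arr):
    arr = arr[:]  # sort a copy
    n = len(arr)
    for i in range(n):
        for j in range(n - i - 1):
            if arr[j] > arr[j + 1]:  # swap adjacent out-of-order pairs
                arr[j], arr[j + 1] = arr[j + 1], arr[j]
    return arr

def merge_sort(arr):
    if len(arr) <= 1:
        return arr
    mid = len(arr) // 2
    left = merge_sort(arr[:mid])   # sort each half recursively...
    right = merge_sort(arr[mid:])
    merged = []                    # ...then merge the halves in order
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    return merged + left[i:] + right[j:]

print(bubble_sort([5, 1, 4, 2, 8]))  # [1, 2, 4, 5, 8]
print(merge_sort([5, 1, 4, 2, 8]))   # [1, 2, 4, 5, 8]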

Complete the Time and Space Complexity analysis questions linked below. Practice

  1. What is the time and space complexity of the following code:
    • O(N * M) time, O(1) space
    • O(N + M) time, O(N + M) space
    • O(N + M) time, O(1) space
    • O(N * M) time, O(N + M) space
from random import random  # import the random() function itself so the bare random() calls below work

a = 0
b = 0
for i in range(N):
  a = a + random()

for i in range(M):
  b = b + random()

the answer is option 3

  • this is because the first loop is O(N) and the second loop is O(M). Since N and M are independent variables, we can't say which one is the leading term, so the time complexity of the given problem is O(N + M).
  • Since the variables' sizes do not depend on the size of the input, the space complexity is constant, or O(1). The algorithm stores its running sums in just two variables, a and b, so the space used stays the same no matter how large N and M get.
  2. What is the time complexity of the following code:
    • O(N)
    • O(N*log(N))
    • O(N * Sqrt(N))
    • O(N*N)
a = 0
for i in range(N):
  for j in reversed(range(i, N)):
    a = a + i + j

the answer is option 4

  • there are two nested loops over the same range: for each i, the inner loop runs N − i times, so the total work is N + (N−1) + ... + 1 = N(N+1)/2, which grows quadratically with the size of the collection.
  3. What is the time complexity of the following code:
    • O(n)
    • O(N log N)
    • O(n^2)
    • O(n^2Logn)
k = 0
for i in range(n//2, n):  # outer loop iterates over the second half of the range, about n/2 times
    j = 2
    while j < n:  # j doubles each pass (2, 4, 8, ...), so this inner loop runs about log2(n) times
        k = k + n / 2
        j = j * 2  # a while loop is used because range's step is fixed and cannot double per iteration

the answer is option 2

  • the outer for loop runs about n/2 times, which is O(N). In the inner loop, j doubles each pass, so it keeps going until 2^j reaches n, i.e. after about log2(n) passes, which is O(log N). Multiplying the two gives a time complexity of O(N log N).
  4. What does it mean when we say that an algorithm X is asymptotically more efficient than Y?
    • X will always be a better choice for small inputs
    • X will always be a better choice for large inputs
    • Y will always be a better choice for small inputs
    • X will always be a better choice for all inputs

the answer is option 2

  • The worst-case time complexity of an algorithm is a measure of the largest amount of time it takes to run for any input size n. When we say that algorithm X is asymptotically more efficient than algorithm Y, we mean that X has a smaller worst-case time complexity than Y. This implies that as the input size increases towards infinity, X will eventually become faster than Y for large enough inputs.
  • For example, if the worst-case time complexity of algorithm X is O(n log n) and the worst-case time complexity of algorithm Y is O(n^2), we say that X is asymptotically more efficient than Y, because the growth rate of X is slower than that of Y as the input size increases. In practice, this means that as the input size gets larger, X will become faster than Y for large enough inputs.
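
A quick sketch makes that crossover visible: tabulating n*log2(n) against n^2 (illustrative values only) shows the quadratic cost pulling away as n grows.

import math

for n in [10, 100, 1000, 10000]:
    print(f"n={n:>6}: n*log2(n) = {n * math.log2(n):>12.0f}   n^2 = {n * n:>12}")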
  5. What is the time complexity of the following code:
    • O(N)
    • O(Sqrt(N))
    • O(N / 2)
    • O(log N)
a = 0
i = N
while (i > 0):
  a += i # adds the current value of "i" to the value of "a" and assigns the result to "a".
  i //= 2 #divides "i" by 2 and assigns result to "i". halves "i" in each iteration of the loop.

the answer is option 4

  • each iteration halves the value of "i" and adds it to "a". Since "i" starts at N and is halved until it reaches 0, the loop runs about log2(N) times, and that repeated-halving pattern is the signature of O(log N).
  6. Which of the following best describes the useful criterion for comparing the efficiency of algorithms?
    • Time
    • Memory
    • Both of the above
    • None of the above

the answer is option 3.

  • this is because both time and memory are factors programmers must weigh when judging the efficiency of algorithms: an algorithm should not take up too much memory, and the program should also run in a reasonable amount of time.
  7. How is time complexity measured?
    • By counting the number of algorithms in an algorithm.
    • By counting the number of primitive operations performed by the algorithm on a given input size.
    • By counting the size of data input to the algorithm.
    • None of the above

the answer is option 2

  • To determine the time complexity of an algorithm, we first identify the operations that the algorithm performs, and then we count the number of times each operation is executed as a function of the input size n.
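
For example, here is a sketch that instruments a simple linear search with an operation counter (count_ops is a made-up helper) so the count can be read off as a function of the input size:

def count_ops(arr, target):
    ops = 0
    for item in arr:  # one comparison per element examined
        ops += 1
        if item == target:
            break
    return ops

print(count_ops(list(range(10)), 9))    # 10 operations
print(count_ops(list(range(100)), 99))  # 100 operations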
  8. What will be the time complexity of the following code?
    • O(n)
    • O(k)
    • O(logk(n))
    • O(logn(k))
i = 1
while i < n:  # a while loop is needed: reassigning i inside "for i in range(n)" would not change the iteration count
  i = i * k   # i grows by a factor of k each pass

the answer is option 3

  • i is multiplied by k on every pass, so the loop exits once k^j reaches n, which happens after about log_k(n) iterations. Therefore the time complexity is O(logk(n)).
  9. What will be the time complexity of the following code?
    • n
    • (n+1)
    • n(n-1)
    • n(n+1)
value = 0;
for i in range(n): #iterates "n" times, with "i" taking on values from 0 to n-1.
  for j in range(i): # iterates "i" times, with "j" taking on values from 0 to i-1.
    value=value+1

the answer is option 3

  • the outer for loop runs n times, and for each i the inner loop runs i times (at most n−1). The total number of iterations is 0 + 1 + ... + (n−1) = n(n−1)/2, which is on the order of n(n−1). (The verification sketch below counts the iterations directly.)
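
Here is that quick verification sketch (count_iterations is a made-up helper); it counts the iterations directly and checks them against n(n−1)/2:

def count_iterations(n):
    count = 0
    for i in range(n):
        for j in range(i):
            count += 1
    return count

for n in [4, 10, 100]:
    print(n, count_iterations(n), n * (n - 1) // 2)  # the two counts match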
  10. Algorithms A and B have a worst-case running time of O(n) and O(log n), respectively. Therefore, algorithm B always runs faster than algorithm A.
    • True
    • False

the answer is False

  • a worst-case running time describes asymptotic behavior: as the input size grows towards infinity, algorithm B will eventually become faster than A for large enough inputs. For smaller inputs, however, algorithm A may well be faster because constant factors still matter, so B does not always run faster than A. (The sketch below illustrates this with made-up constants.)
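
As a closing illustration, here is a sketch with made-up constant factors (algorithm A costs 2n "steps", algorithm B costs 100*log2(n) steps) showing A winning on small inputs even though B is asymptotically better:

import math

for n in [2, 8, 64, 10000]:
    cost_a = 2 * n               # O(n) with a small constant
    cost_b = 100 * math.log2(n)  # O(log n) with a large constant
    print(f"n={n:>6}: A={cost_a:>8.0f}  B={cost_b:>8.0f}  faster: {'A' if cost_a < cost_b else 'B'}")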