Asymptotic analysis is the process of characterizing the running time (or space usage) of an algorithm as the input size grows to infinity, typically described using big O notation. Its most common goal is to bound the worst-case running time of an algorithm as a function of input size.
Worst-case analysis determines an upper bound on an algorithm's running time: the maximum amount of time or resources the algorithm can take to complete its task on any input of a given size. It is typically used to compare the efficiency of different algorithms and to choose the most suitable one for a given problem. The worst-case input is the one that forces the algorithm to perform the maximum number of steps or operations.
def bubble_sort(arr):
    n = len(arr)
    for i in range(n):
        for j in range(0, n - i - 1):
            if arr[j] > arr[j + 1]:
                arr[j], arr[j + 1] = arr[j + 1], arr[j]
    return arr

# worst case input: [5, 4, 3, 2, 1]
arr = [5, 4, 3, 2, 1]
print(bubble_sort(arr))
In this example, the worst-case scenario occurs when the input array is sorted in descending order. On the i-th pass of the outer loop, the inner loop performs n-i-1 comparisons (and, for this input, as many swaps), for a total of (n-1) + (n-2) + ... + 1 = n(n-1)/2 operations. Therefore, the worst-case time complexity of the bubble sort algorithm is O(n²).
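A quick way to check this count empirically is to instrument bubble sort with counters (the helper name below is illustrative, not part of the original example):

```python
def bubble_sort_counting(arr):
    """Bubble sort that also counts comparisons and swaps."""
    n = len(arr)
    comparisons = swaps = 0
    for i in range(n):
        for j in range(0, n - i - 1):
            comparisons += 1
            if arr[j] > arr[j + 1]:
                arr[j], arr[j + 1] = arr[j + 1], arr[j]
                swaps += 1
    return comparisons, swaps

# For the reversed input of size n = 5, both counts equal n(n-1)/2 = 10.
print(bubble_sort_counting([5, 4, 3, 2, 1]))  # (10, 10)
```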
Best-case analysis is a method used to determine the best-case running time of an algorithm, which is the minimum amount of time or resources the algorithm will take to complete its task for any input of a given size.
def bubble_sort(arr):
    n = len(arr)
    for i in range(n):
        swapped = False
        for j in range(0, n - i - 1):
            if arr[j] > arr[j + 1]:
                arr[j], arr[j + 1] = arr[j + 1], arr[j]
                swapped = True
        if not swapped:
            break  # no swaps in this pass: the array is sorted, exit early
    return arr
# best case input: [1,2,3,4,5]
arr = [1,2,3,4,5]
print(bubble_sort(arr))
In this example, the best-case scenario occurs when the input array is already sorted in ascending order, so no swaps occur. With an early-exit check (a flag that stops the outer loop as soon as a pass makes no swaps), the algorithm finishes after a single pass of n-1 comparisons, so the best-case time complexity is O(n). Without that check, every comparison is still performed and even the best case is O(n²).
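The single-pass best case can be verified by counting comparisons in a version with the early-exit flag (a sketch; the helper name is illustrative):

```python
def bubble_sort_counting(arr):
    """Bubble sort with early exit; returns the number of comparisons made."""
    n = len(arr)
    comparisons = 0
    for i in range(n):
        swapped = False
        for j in range(0, n - i - 1):
            comparisons += 1
            if arr[j] > arr[j + 1]:
                arr[j], arr[j + 1] = arr[j + 1], arr[j]
                swapped = True
        if not swapped:
            break  # sorted: no further passes needed
    return comparisons

# A sorted input of size n = 5 is detected after one pass of n-1 comparisons.
print(bubble_sort_counting([1, 2, 3, 4, 5]))  # 4
```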
Average-case analysis is used to estimate the performance of the algorithm for “typical” inputs, rather than the worst or best possible inputs.
import random

def quicksort(arr):
    if len(arr) <= 1:
        return arr
    pivot = arr[len(arr) // 2]
    left = [x for x in arr if x < pivot]
    middle = [x for x in arr if x == pivot]
    right = [x for x in arr if x > pivot]
    return quicksort(left) + middle + quicksort(right)

# Assume an input of size n is generated randomly with equal probability
n = 10
arr = random.sample(range(1, 100), n)
print(quicksort(arr))
In this example, we assume the input array is generated uniformly at random, meaning all possible input arrays of size n are equally likely to be chosen. Under this assumption, quicksort's average-case time complexity is O(n log n), which is considered very good and is why quicksort is often used in practice.
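One way to see the average case empirically is to count the element comparisons a counting variant of this quicksort makes on many random inputs and compare the average against the n·log₂(n) shape the analysis predicts (a sketch; quicksort_count is an illustrative helper, and the constant factor depends on the implementation):

```python
import math
import random

def quicksort_count(arr):
    """Quicksort as above, returning (sorted list, number of comparisons)."""
    if len(arr) <= 1:
        return arr, 0
    pivot = arr[len(arr) // 2]
    left = [x for x in arr if x < pivot]
    middle = [x for x in arr if x == pivot]
    right = [x for x in arr if x > pivot]
    sorted_left, c_left = quicksort_count(left)
    sorted_right, c_right = quicksort_count(right)
    # Each of the three list comprehensions compares every element to the pivot.
    return sorted_left + middle + sorted_right, 3 * len(arr) + c_left + c_right

random.seed(0)  # reproducible trials
n = 1000
trials = [quicksort_count(random.sample(range(10 * n), n))[1] for _ in range(20)]
avg = sum(trials) / len(trials)
# The average comparison count stays within a small constant multiple of n*log2(n).
print(f"average comparisons: {avg:.0f}, n*log2(n): {n * math.log2(n):.0f}")
```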
The problem size of an algorithm is the size of its input data, and it directly affects the running time: in general, as problem size increases, running time also increases. Simple O(n²) algorithms such as insertion sort, bubble sort, and selection sort are efficient enough on small data sets. Problem size can be measured in the number of elements, iterations, recursive calls, or a combination of these factors.
Merge sort has a time complexity of O(n log n) and performs well on large data sets. Insertion sort is simple and sorts in place with no additional memory, whereas merge sort requires O(n) auxiliary memory; both are stable sorting algorithms.
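For reference, here is a minimal merge sort sketch; the auxiliary lists allocated during merging are the O(n) extra memory mentioned above, and using `<=` when merging preserves the relative order of equal elements (stability):

```python
def merge_sort(arr):
    """Top-down merge sort; returns a new sorted list (O(n) auxiliary space)."""
    if len(arr) <= 1:
        return arr
    mid = len(arr) // 2
    left = merge_sort(arr[:mid])
    right = merge_sort(arr[mid:])
    # Merge the two sorted halves; <= keeps equal elements in order (stability).
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged

print(merge_sort([5, 2, 4, 1, 3]))  # [1, 2, 3, 4, 5]
```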
Constant factors are the multiplicative constants and lower-order terms that asymptotic notation hides in an algorithm's running time. They can come from extra per-element operations or from hardware-specific factors that affect the running time.
For example, consider the following two algorithms:
# Algorithm 1
def find_max(arr):
    max_val = arr[0]
    for i in range(1, len(arr)):
        if arr[i] > max_val:
            max_val = arr[i]
    return max_val

# Algorithm 2
def find_max(arr):
    max_val = arr[0]
    for i in range(1, len(arr)):
        if arr[i] > max_val:
            max_val = arr[i]
        arr[i] = arr[i] ** 2
    return max_val
Both algorithms have O(n) time complexity, but Algorithm 2 performs an additional squaring operation on every element; big O notation hides that difference as a constant factor.
Constant factors can also include hardware-specific factors, such as the speed of the CPU or the size of the cache, that can affect the running time of an algorithm on a specific machine. It’s important to note that constant factors can have a significant impact on the running time of an algorithm, especially for small input sizes.
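A rough timing sketch makes the constant-factor difference visible (the function names are illustrative, the absolute numbers are machine-dependent, and each call gets a fresh copy of the data so the squaring in the second version does not compound):

```python
import timeit

def find_max_v1(arr):
    """Plain linear scan: O(n)."""
    max_val = arr[0]
    for i in range(1, len(arr)):
        if arr[i] > max_val:
            max_val = arr[i]
    return max_val

def find_max_v2(arr):
    """Same O(n) scan, plus constant-factor extra work per element."""
    max_val = arr[0]
    for i in range(1, len(arr)):
        if arr[i] > max_val:
            max_val = arr[i]
        arr[i] = arr[i] ** 2  # extra per-element operation
    return max_val

data = list(range(50_000))
t1 = timeit.timeit(lambda: find_max_v1(list(data)), number=10)
t2 = timeit.timeit(lambda: find_max_v2(list(data)), number=10)
print(f"v1: {t1:.3f}s  v2: {t2:.3f}s")  # same big O, different constant factor
```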
Big Omega and Theta. Big O notation is often paired with Big Omega and Theta notation; all three describe bounds on the growth of an algorithm's running time, but they differ in direction and tightness.
Big O notation gives an upper bound: if the running time is O(f(n)), the algorithm never takes more than a constant multiple of f(n) operations for sufficiently large input sizes n. Big Omega notation gives the corresponding lower bound: a running time of Omega(f(n)) grows at least as fast as a constant multiple of f(n).
Theta notation combines the two and gives a tight bound: a running time is Theta(f(n)) exactly when it is both O(f(n)) and Omega(f(n)), so f(n) describes its growth rate up to constant factors. Note that these notations describe growth rates, not cases: worst-case, best-case, and average-case running times can each be bounded with O, Omega, or Theta.
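As a numeric illustration of a tight bound (a sketch, not a proof; the helper name is illustrative): the unoptimized bubble sort performs exactly n(n-1)/2 comparisons regardless of input order, so its comparison count T(n) satisfies c₁·n² ≤ T(n) ≤ c₂·n² for large n, which is precisely what Theta(n²) means.

```python
def bubble_sort_comparisons(arr):
    """Unoptimized bubble sort; returns its comparison count, n(n-1)/2."""
    n = len(arr)
    count = 0
    for i in range(n):
        for j in range(0, n - i - 1):
            count += 1
            if arr[j] > arr[j + 1]:
                arr[j], arr[j + 1] = arr[j + 1], arr[j]
    return count

for n in [10, 100, 1000]:
    t = bubble_sort_comparisons(list(range(n)))  # input order does not matter here
    # c1 = 1/4 and c2 = 1/2 witness the Theta(n^2) bound for these n.
    assert 0.25 * n**2 <= t <= 0.5 * n**2
    print(n, t)
```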
The notation was not invented by algorithm designers or computer scientists. It’s been in use in number theory since the nineteenth century.