Sorting Algorithms

Quick Start Guide

Sorting Algorithms :- A Sorting Algorithm is used to rearrange a given array or list elements according to a comparison operator on the elements. The comparison operator is used to decide the new order of element in the respective data structure.

Following are the basic sorting algorithms

  • Bubble Sort
    • Bubble sort is an algorithm that compares the adjacent elements and swaps their positions if they are not in the intended order. The order can be ascending or descending.

    • How Bubble Sort Works?
      • Compare two items.

      • If the one on the left is bigger, swap them.

      • Move one position right.

      • Linear Search Algorithm::
        bubbleSort(array)
        for i = 1 to indexOfLastUnsortedElement-1
        if leftElement > rightElement

        swap leftElement and rightElement

        end bubbleSort

    • Optimized Bubble Sort
      • In the simple bubble sort algorithm, all possible comparisons are made even if the array is already sorted which increases the execution time.

      • The code can be optimized by introducing an extra variable swapped so After each iteration, if there is no swapping taking place then, there is no need for performing further iterations.

      • In such a case, variable swapped is set false. Thus, we can prevent further iterations.

      • Optimized Algorithm::
        bubbleSort(array)

        swapped = false for i = 1 to indexOfLastUnsortedElement-1

        if leftElement > rightElement

        swap leftElement and rightElement swapped = true

        end bubbleSort

    • Efficiency
      • For 10 data items, this is 45 comparisons (9 + 8 + 7 + 6 + 5 + 4 + 3 + 2 + 1).

      • In general, where N is the number of items in the array, there are N-1 comparisons on the first pass, N-2 on the second, and so on. The formula for the sum of such a series is (N–1) + (N–2) + (N–3) + ... + 1 = N*(N–1)/2 N*(N–1)/2 is 45 (10*9/2) when N is 10.

  • Insertion Sort
    • Insertion sort works similarly as we sort cards in our hand in a card game.

    • We assume that the first card is already sorted then, we select an unsorted card. If the unsorted card is greater than the card in hand, it is placed on the right otherwise, to the left. In the same way, other unsorted cards are taken and put at their right place.

    • A similar approach is used by insertion sort.

    • Insertion sort is a sorting algorithm that places an unsorted element at its suitable place in each iteration.

    • In most cases the insertion sort is the best of the elementary sorts as described above . It still executes in O(N2) time, but it’s about twice as fast as the bubble sort and somewhat faster than the selection sort in normal situations. It’s also not too complex, although it’s slightly more involved than the bubble and selection sorts. It’s often used as the final stage of more sophisticated sorts, such as quicksort.

    • How Insertion Sort Works?
      • The first element in the array is assumed to be sorted. Take the second element and store it separately in key variable

      • Compare key variable with the first element. If the first element is greater than key, then key is placed in front of the first element

      • At this stage first two elements are sorted

      • Take the third element and compare it with the elements on the left of it. Placed it just behind the element smaller than it. If there is no element smaller than it, then place it at the beginning of the array

      • Place every unsorted element at its correct position till end

      • Insertion Algorithm::
        insertionSort(array)

        mark first element as sorted for each unsorted element X

        store the element X in a variable key for j = lastSortedIndex down to 0

        if current element j > X move sorted element to the right by 1

        break loop and insert X here

        end insertionSort

      • Efficiency
        • How many comparisons and copies does this algorithm require? On the first pass, it compares a maximum of one item. On the second pass, it’s a maximum of two items, and so on, up to a maximum of N-1 comparisons on the last pass. This is 1 + 2 + 3 + … + N-1 = N*(N-1)/2

        • However, because on each pass an average of only half of the maximum number of items are actually compared before the insertion point is found, we can divide by 2, which gives N*(N-1)/4

        • The number of copies is approximately the same as the number of comparisons. However, a copy isn’t as time-consuming as a swap, so for random data this algorithm runs twice as fast as the bubble sort and faster than the selection sort.

        • In any case, like the other sort routines, the insertion sort runs in O(N2) time for random data.

        • For data that is already sorted or almost sorted, the insertion sort does much better. When data is in order, the condition in the while loop is never true, so it becomes a simple statement in the outer loop, which executes N-1 times. In this case the algorithm runs in O(N) time. If the data is almost sorted, insertion sort runs in almost O(N) time, which makes it a simple and efficient way to order a file that is only slightly out of order.

  • Selection Sort
    • Selection sort is an algorithm that selects the smallest element from an unsorted list in each iteration and places that element at the beginning of the unsorted list.

    • The selection sort improves on the bubble sort by reducing the number of swaps necessary from O(N2) to O(N). Unfortunately, the number of comparisons remains O(N2). However, the selection sort can still offer a significant improvement for large records that must be physically moved around in memory, causing the swap time to be much more important than the comparison time.

    • How Selection Sort Works?
      • Set the first element as minimum

      • Compare minimum with the 2nd element

      • If the 2nd element is smaller than minimum, assign the second element as minimum

      • Compare minimum with the third element

      • Again, if the third element is smaller, then assign minimum to the third element otherwise do nothing,The process goes on until the last element

      • After each iteration, minimum is placed in the front of the unsorted list

      • For each iteration, indexing starts from the first unsorted element

      • So bascially the list is divided into two parts: the sublist of items already sorted, which is built up from left to right and is found at the beginning, and the sublist of items remaining to be sorted, occupying the remainder of the array

      • Selection Sort Algorithm::

        selectionSort(array, size) repeat (size - 1) times

        set the first unsorted element as the minimum for each of the unsorted elements

        if element < currentMinimum

        set element as new minimum

        swap minimum with first unsorted position

        end selectionSort

      • Efficiency
        • The selection sort performs the same number of comparisons as the bubble sort: N*(N-1)/2. For 10 data items, this is 45 comparisons

        • However, 10 items require fewer than 10 swaps. With 100 items, 4,950 comparisons are required, but fewer than 100 swaps

        • For large values of N, the comparison times will dominate, so we would have to say that the selection sort runs in O(N2) time, just as the bubble sort did

  • Counting Sort
    • Counting sort is a sorting algorithm that sorts the elements of an array by counting the number of occurrences of each unique element in the array. The count is stored in an auxiliary array and the sorting is done by mapping the count as an index of the auxiliary array

    • How Counting Sort Works?
      • Find out the maximum element (let it be max) from the given array

      • Initialize an array of length max+1 with all elements 0. This array is used for storing the count of the elements in the array

      • Store the count of each element at their respective index in count array

      • For example: if the count of element 3 is 2 then, 2 is stored in the 3rd position of count array. If element “5” is not present in the array, then 0 is stored in 5th position

      • Store cumulative sum of the elements of the count array. It helps in placing the elements into the correct index of the sorted array

      • Find the index of each element of the original array in the count array. This gives the cumulative count. Place the element at the index calculated

      • After placing each element at its correct position, decrease its count by one.

    • Counting Sort Algorithm::
      countingSort(array, size)

      max : find largest element in array initialize count array with all zeros for j <- 0 to size

      find the total count of each unique element and store the count at jth index in count array

      for i <- 1 to max

      find the cumulative sum and store it in count array itself

      for j <- size down to 1

      restore the elements to array decrease count of each element restored by 1

    • Efficiency
      • Counting sorts fail when there are large key values (the k in the O(n)). This means that if you have a large variety of key values, counting sort will be slow

      • Radix sort can help solve that problem but it does nothing for other issue

      • Both counting and radix sort are only valid for integer keys

      • While not a terribly serious limitation, it does mean that Radix Sort’s value for the number of digits in a key should not be considered constant

  • Merge Sort
    • mergesort is a much more efficient sorting technique than those we saw in above section, at least in terms of speed. While the bubble, insertion, and selection sorts take O(N2) time, the mergesort is O(N*logN)

    • For example, if N (the number of items to be sorted) is 10,000, then N2 is 100,000,000, while``N*logN`` is only 40,000

    • If sorting this many items required 40 seconds with the mergesort, it would take almost 28 hours for the insertion sort.

    • Merge Sort uses Divide and Conquer Strategy

    • Using the Divide and Conquer technique, we divide a problem into subproblems. When the solution to each subproblem is ready, we ‘combine’ the results from the subproblems to solve the main problem.

    • The mergesort is also fairly easy to implement. It’s conceptually easier than quicksort and the Shell short.

    • The heart of the mergesort algorithm is the merging of two already-sorted arrays. Merging two sorted arrays A and B creates a third array, C, that contains all the elements of A and B, also arranged in sorted order.

    • Similar to quicksort the list of element which should be sorted is divided into two lists. These lists are sorted independently and then combined. During the combination of the lists the elements are inserted (or merged) on the correct place in the list

    • You divide the half into two quarters, sort each of the quarters, and merge them to make a sorted half

    • How Mergesort Works?
      • Assume the size of the left array is k, the size of the right array is m and the size of the total array is n (=k+m).

      • Create a helper array with the size n

      • Copy the elements of the left array into the left part of the helper array. This is position 0 until k-1.

      • Copy the elements of the right array into the right part of the helper array. This is position k until m-1.

      • Create an index variable i=0; and j=k+1

      • Loop over the left and the right part of the array and copy always the smallest value back into the original array. Once i=k all values have been copied back the original array. The values of the right array are already in place.

    • Efficiency
      • As we noted, the mergesort runs in O(N*logN) time. There are 24 copies necessary to sort 8 items. Log28 is 3, so 8*log28 equals 24. This shows that, for the case of 8 items, the number of copies is proportional to N*log2N

      • In the mergesort algorithm, the number of comparisons is always somewhat less than the number of copies.

  • Quick Sort
    • Quicksort is an algorithm based on divide and conquer approach in which the array is split into subarrays and these sub-arrays are recursively called to sort the elements.

    • Quicksort is a divide-and-conquer algorithm that involves choosing a pivot value from a data-set and splitting the set into two subsets: a set that contains all values less than the pivot and a set that contains all values greater than or equal to the pivot. The pivot/split process is recursively applied to each subset until there are no more subsets to split. The results are combined to form the final sorted set.

    • The challenge of a quicksort is to determine a reasonable midpoint value for dividing the data into two groups. The efficiency of the algorithm is entirely dependent upon how successfully and accurate the midpoint value is selected

    • Quicksort is undoubtedly the most popular sorting algorithm, and for good reason: In the majority of situations, it’s the fastest, operating in O(N*logN) time. (This is only true for internal or in-memory sorting; for sorting data in disk files, other algorithms may be better)

    • To understand quicksort, you should be familiar with the partitioning algorithm.

    • How Quick Sort Works?
      • If the array contains only one element or zero elements then the array is sorted.

      • If the array contains more then one element then:

      • Select an element from the array. This element is called the “pivot element”. For example select the element in the middle of the array.

      • All elements which are smaller then the pivot element are placed in one array and all elements which are larger are placed in another array.

      • Sort both arrays by recursively applying Quicksort to them.

      • Combine the arrays.

      • Quicksort can be implemented to sort “in-place”. This means that the sorting takes place in the array and that no additional array need to be created.

    • Quick Sort Algorithm::
      quickSort(array, leftmostIndex, rightmostIndex)
      if (leftmostIndex < rightmostIndex)

      pivotIndex = partition(array,leftmostIndex, rightmostIndex) quickSort(array, leftmostIndex, pivotIndex) quickSort(array, pivotIndex + 1, rightmostIndex)

      partition(array, leftmostIndex, rightmostIndex) set rightmostIndex as pivotIndex storeIndex = leftmostIndex - 1 for i <- leftmostIndex + 1 to rightmostIndex if element[i] < pivotElement

      swap element[i] and element[storeIndex] storeIndex++

      swap pivotElement and element[storeIndex+1] return storeIndex + 1

    • Efficiency
      • Quicksort operates in O(N*logN) time. This is generally true of the divide-and-conquer algorithms, in which a recursive method divides a range of items into two groups and then calls itself to handle each group.

      • In this situation the logarithm actually has a base of 2: The running time is proportional to N*log2N

Time Complexity And Space Complexity

Sorting Algorithms

Time Complexity

Space Complexity

Sorting

Best

Average

Worst

Worst

Bubble Sort

O(n)

O(n^2)

O(n^2)

O(1)

Insertion Sort

O(n)

O(n^2)

O(n^2)

O(1)

Selection Sort

O(n^2)

O(n^2)

O(n^2)

O(1)

Counting Sort

O(n+k)

O(n+k)

O(n+k)

O(k)

Merge Sort

O(nLogn)

O(nLogn)

O(nLogn)

O(n)

Heap Sort

O(nLogn)

O(nLogn)

O(nLogn)

O(1)

Quick Sort

O(nLogn)

O(nLogn)

O(n^2)

O(nLogn)

ShellSort

O(n)

O((nLogn)^2)

O((nLogn)^2)

O(1)

# import required algorithm
>>> from package.pygostructures.data_structures.sorting import BubbleSort

# create array
>>> my_array = [4,1,77,2,3,0,99,100,65]

# pass the array as argument
>>> b = BubbleSort(my_array)

# call the function
>>> b.bubblesort()
Sorted Array :  [0, 1, 2, 3, 4, 65, 77, 99, 1

Sorting Algorithms

Author : Robin Singh

Programs List :

1 . Bubble Sort 2 . Counting Sort 3 . Insertion Sort 4 . Merge Sort 5 . Quick Sort 6 . Selection Sort 7 . Shell Sort 8 . Heap Sort

Bubble Sort

class pythorn.data_structures.sorting.BubbleSort(array=None)[source]

Bubble Sort Implementation

bubblesort()[source]
Returns

sorted values

static get_code()[source]

returns source code of the current class

Returns

source code

static time_complexity()[source]

returns time complexity of the functions

Returns

string

Counting Sort

class pythorn.data_structures.sorting.CountingSort(array=None)[source]

Counting Sort Implementation

counting_sort()[source]
Returns

sorted values

static get_code()[source]
Returns

source code

static time_complexity()[source]
Returns

time complexity

Insertion Sort

class pythorn.data_structures.sorting.InsertionSort(array=None)[source]

Insertion Sort Implementation

insertion_sort()[source]
Returns

sorted values

static get_code()[source]
Returns

source code

static time_complexity()[source]
Returns

time complexity

Merge Sort

class pythorn.data_structures.sorting.MergeSort(array)[source]

Merge Sort Implementation

merge_sort()[source]

function to sort an array using merge sort algorithm

Returns

sorted values

static get_code()[source]
Returns

source code

static time_complexity()[source]
Returns

time complexity

Quick Sort

Example :
# import required algorithm
>>> from pythorn.data_structures.sorting import QuickSort

# create array
>>> f = [2,6,9,10,100,20,3,6,9,78,98,2,1500]

# pass array , value zero and len(array)-2 as an argument
>>> b = QuickSort(f,0,len(f)-2)
>>> b.quick_sort()

# call the function
[2, 2, 3, 6, 6, 9, 9, 10, 20, 78, 98, 100, 1500]
class pythorn.data_structures.sorting.QuickSort(array=None, lower=None, upper=None)[source]

Quick Sort Implementation

static partition(a, lb, ub)[source]

function for partition

quick_sort()[source]
Returns

sorted values of the given array

static get_code()[source]
Returns

source code

static time_complexity()[source]
Returns

time complexity

Selection Sort

class pythorn.data_structures.sorting.SelectionSort(array=None)[source]

Selection Sort Implementation

selection_sort()[source]
Returns

sorted values of array

static get_code()[source]
Returns

source code

static time_complexity()[source]
Returns

time complexity

Shell Sort

class pythorn.data_structures.sorting.ShellSort(array=None)[source]

Shell Sort Implementation

shell_sort()[source]
Returns

sorted values of array

static get_code()[source]
Returns

source code

static time_complexity()[source]
Returns

time complexity

HeapSort

class pythorn.data_structures.sorting.HeapSort(array=None)[source]

Heap Sort Implementation

static heapify(a, n, i)[source]

funtion for converting a binary tree into a Heap data structure

heapSort()[source]
Returns

sorted values of array

static get_code()[source]
Returns

source code

static time_complexity()[source]
Returns

time complexity