Searching and Sorting Arrays

An array is a fixed-size, homogeneous data structure with random-access, where all elements can be selected at random and are equally accessible. An element of the array can be selected by giving its subscript, which is an integer indicating the position of the element in the array.

Arrays are represented in memory using sequential mapping.

A two dimensional array can be viewed as a one-dimensional array who elements are in turn one-dimensional arrays. We can view a two dimensional array as a single column of rows; this is called row-major representation.

An array can be viewed as a matrix. To add two matrices, we simply add the matching rows and columns to each other.

int * getArray(int row, int column);
void printArray(int *array, int row, int column);
void populateArray(int *array, int row, int column, char *name);
void addArray(int *arrOne, int *arrTwo, int row, int column);

int main()
{
    const int rSize = 3;
    const int cSize = 3;

    int *arrOne = getArray(rSize, cSize);
    int *arrTwo = getArray(rSize, cSize);

    populateArray(arrOne, rSize, cSize, "first");
    populateArray(arrTwo, rSize, cSize, "second");

    printArray(arrOne, rSize, cSize);
    printArray(arrTwo, rSize, cSize);

    addArray(arrOne, arrTwo, rSize, cSize);


    return 0;
}

void addArray(int *arrOne, int *arrTwo, int row, int column){
    int i = 0;
    int j = 0;
    int a, b;
    while(i < row){
        while(j < column){
            a = *(arrOne + (i*column) + j);
            b = *(arrTwo + (i*column) + j);
            printf("\n%d + %d = ", a, b);
            *(arrOne + (i*column) + j) += b;
            printf("%d\n", *(arrOne + (i * column) + j));
            j++;
        }
        j = 0;
        i++;
    }

    printArray(arrOne, row, column);

}

void populateArray(int *array, int row, int column, char *name){
    int i = 0;
    int j = 0;
    int val;
    printf("Please populate the %s array:\n", name);
    while(i < row){
        while(j < column){
            printf("Value for row %d, column %d\t", i, j);
            scanf("%d", &val);
            *(array + (i * column)+j) = val;
            j++;
        }
        j = 0;
        i++;
    }
}

void printArray(int *array, int row, int column){
    int i = 0;
    int j = 0;
    printf("\n");
    while(i < row){
        while(j < column){
            //use pointer arithmetic to get the array elements
            printf("%d\t", *(array + (i*column)+j));
            j++;
        }
        j = 0;
        i++;
        printf("\n");
    }
}

int * getArray(int row, int column){
    int *rArray = (int *)malloc(row * column * sizeof(int));
    return rArray;
}

The transpose of a matrix is a new matrix whose rows are the columns of the original, and whose columns are the rows of the original .


#define ROW_LENGTH 3
#define COL_LENGTH 3

void readArray(int *array, int rowLength, int colLength);
void displayArray(int *array, int rowLength, int colLength);
void transpose(int *matrix, int rowLength, int colLength);

int main()
{
    int iMatrix[ROW_LENGTH][COL_LENGTH];

    readArray(iMatrix, ROW_LENGTH, COL_LENGTH);
    displayArray(iMatrix, ROW_LENGTH, COL_LENGTH);
    transpose(iMatrix, ROW_LENGTH, COL_LENGTH);
    displayArray(iMatrix, ROW_LENGTH, COL_LENGTH);

    return 0;
}

void readArray(int *array, int rowLength, int colLength){
    int i, j;
    for(i = 0; i < rowLength; i++){
        for(j = 0; j < colLength; j++){
            printf("Enter value for element [%d, %d]: ", i, j);
            scanf("%d", (array + (i * colLength) + j));
        }
    }
}

void displayArray(int *array, int rowLength, int colLength){
    int i, j;
    for(i = 0; i < rowLength; i++){
        for(j = 0; j < colLength; j++){
            printf("%5d\t", *(array + (i * colLength) + j));
        }
        printf("\n");
    }
}

void transpose(int *matrix, int rowLength, int colLength){
    int i, j, temp;
    for(i = 0; i < rowLength; i++){
            for(j = i+1; j < colLength; j++){
                temp =  *(matrix + (i * colLength) + j);
                 *(matrix + (i * colLength) + j) = *(matrix + (j * colLength) + i);
                *(matrix + (j * colLength) + i) = temp;
            }
    }
}

A bubble sort is a simple sort in which we arrange the elements of the set by forming pairs of adjacent elements, where the pair is of the ith and (i+1)th element. If the order is ascending, we interchange the elements of the pair if the first element of the pair is greater than the second element.

void swap(int *x, int *y);
void bSort(int *list, int bounds);

int main()
{

    const int bounds = 10;

    int iArray[] = {1701, 73, 6174, 2048, 52, 203, 42, 3389, 1138, 1999};
    printArray(iArray, bounds);
    bSort(iArray, bounds);
     printArray(iArray, bounds);
    return 0;
}

void printArray(int *list, int bounds){
    int i = 0;
    for(; i < bounds; i++){
        printf("%d\t", *(list+i));
    }
    printf("\n");
}


void bSort(int *list, int bounds){
    int inner, outer;
    //go backwards through the array;
    for(inner = bounds - 1; inner > 0; inner--){
            for(outer = 0; outer < inner; outer++){                 
            if(*(list + outer) > *(list + outer + 1)){
                    swap((list + outer), (list + outer + 1));
                }
            }
    }

}


void swap(int *x, int *y){
    printf("swaping %d and %d\n", *x, *y);
    int tmp = *x;
    *x = *y;
    *y = tmp;
}

 

Introduction to Sorting and Searching in C

A lot of computation time is spent searching for values. Normally, the objects searched for are records that have a sort key chosen for the search – such as the time, a range of numeric values, name, etc. Searching thus involves comparing the given data objects to a given search key value. If the value can be matched, then a pointer to that object in memory must be returned.

The three most basic searches are linear, binary, and interpolation search. A linear search proceeds linearly through an array.

#include <stdio.h>
#include <time.h>
#include <stdlib.h>

typedef struct {
	int iValue;
	double dValue;
} record;

record * createRecords(int numRecords);
void initializeRecords(record recordSet[], int numRecords);
void printRecords(record recordSet[], int numRecords);
void printARecord(record recordSet[], int loc);
int main()
{
	time_t tme;

	int numRecords = 100;
	srand((unsigned)(time(&tme)));
	record *recordSet = createRecords(numRecords);
	initializeRecords(recordSet, numRecords);
	printRecords(recordSet, numRecords);

    return 0;
}

void printARecord(record recordSet[], int loc) {
	printf("Record %d contains values %d and %f\n", loc + 1, recordSet[loc].iValue, recordSet[loc].dValue);
}

void printRecords(record recordSet[], int numRecords) {
	for (int i = 0; i < numRecords; i++) {
		printf("%d %f \t", recordSet[i].iValue, recordSet[i].dValue);
	}
	printf("\n");
}

void initializeRecords(record recordSet[], int numRecords) {
	for (int i = 0; i < numRecords; i++) {
		recordSet[i].iValue = rand() % 50;
		recordSet[i].dValue = (double)(i + 1) / ((rand() % 20) + 1);
	}
}

//instantiate an array of records 
record * createRecords(int numRecords) {
	record *returnRecord = (record *)malloc(sizeof(record) * numRecords);
	return returnRecord;
}

The code above sets up an array of 100 structs that each contain an integer and a floating point value. We will use this array to look at some simple search algorithms.

The most elementary search is a linear search. A linear search is so named because it goes through an array in sequence till it finds the requested value.

//Use a linear search algorithm until a matching record is found
void tryLinearSearch(record recordSet[], int numRecords) {
	int req = 0;
	int found = -1;
	
	printf("Performing linear search\n");
	while (found < 0) {
		req = rand() % 50;
		printf("Searching for value %d...\n", req);
		found = linearSearch(recordSet, numRecords, req);
	}

	printf("Value %d found!\t", req);
	printARecord(recordSet, found);

}

int linearSearch(record recordSet[], int limit, int req) {
	int rLocation = 0;
	while (rLocation++ < limit) {
		//find first match
		if (recordSet[rLocation].iValue == req) {
			break;
		}
	}
	//no record found, return neg value
	if (rLocation == limit) {
		rLocation = -1;
	}

	return rLocation;
}

A linear search not only proceeds linearly through an array, its processing time increases linearly with the number of elements in the collection.  In the worst-case scenario, when not matching element is found, the search time is O(n), where n is the length of the collection being searched.

One way to save time in a linear search is to order the records in decreasing or increasing order. The linear search can then be modified by terminating the search as soon as a record is met whose value is greater than or less than the desired value.

The most elementary sort is the maximum entry sort. In the maximum entry sort organizes elements in an array into decreasing order, where the largest number is placed in element 1, the second largest in element 2, and so on.  In such an algorithm, when the k largest numbers have been put in the first k elements of the array, a variable, let’s call it next, will point to the element of the array into which the next largest element should be placed. The next variable must be updated to next + 1, so that it will contain the value k + 1.

This algorithm would need to be called n – 1 times, where n is the size of the array. In the algorithm, we will compare each element to a variable containing the largest element found so far.


//copy array 
int copyRecordArray(record sourceRecords[], record targetRecords[], int bounds) {
	int rValue = 0;

	for (int i = 0; i < bounds; i++) {
		targetRecords[i].dValue = sourceRecords[i].dValue;
		targetRecords[i].iValue = sourceRecords[i].iValue;
	}
	return rValue;

}

//return a new, sorted array
record * maxEntrySort(record recordSet[], int bounds) {
	int next, maxLocation;

	//copy the data to new array
	record *newArray = createRecords(bounds);
	copyRecordArray(recordSet, newArray, bounds);
	//sort the records copied into the new array
	for (next = 0; next < bounds; next++) {
		//find the maximum value in the array
		maxLocation = locateMax(newArray, next, bounds);
		//interchange record at current position with record at position of max
		switchRecords(newArray, next, maxLocation);
	}
	return newArray;

}

//this function locates the max value in an array
int locateMax(record recordSet[], int next, int bounds) {
	int largest, i, maxLocation;
	maxLocation = next;
	largest = recordSet[maxLocation].iValue;
	//go forward through array looking for the largest remaining value
	for (i = next + 1; i < bounds; i++) { 		
if (recordSet[i].iValue > largest) {
			maxLocation = i;
			largest = recordSet[i].iValue;
		}
	}
	return maxLocation;
}


//switch the locations of two records in the record array
void switchRecords(record recordSet[], int recordA, int recordB) {
	record tmpRecord;
	//copy recordA's data into a temp
	tmpRecord.dValue = recordSet[recordA].dValue;
	tmpRecord.iValue = recordSet[recordA].iValue;

	//copy recordB's data into recordA
	recordSet[recordA].dValue = recordSet[recordB].dValue;
	recordSet[recordA].iValue = recordSet[recordB].iValue;

	//copy the recordA data stored in the temp variable to recordB
	recordSet[recordB].dValue = tmpRecord.dValue;
	recordSet[recordB].iValue = tmpRecord.iValue;
}

 

How much time does this algorithm take? Well, one call is made to the locateMax() function n – 1 times, and each search for locateMax() takes n – k, where n is the upper bounds of the array and k is the value of the current position in the array. The total number of comparisons sums up to n(n-1)/n. The final result is O(n^2) time.

The bubble sort gets its name from the manner in which larger entries rise up to the top of the collection while smaller entries float down. To perform a bubble sort, we traverse the array, comparing the two adjacent elements, starting with the first two and ending with the last two. If the second element is larger than the first, we switch them. Each traversal of the array is considered one pass. After the first pass the smallest entry will be at the bottom of the array.

In general, after the kth pass, the k smallest entries will be in order in the bottom k elements in the array. The array will be sorted after at most n – 1 passes, though it may actually be sorted before then.

record * bubbleSort(record recordSet[], int bounds) {
	bool notDone = true;
	int i; 
	record * rArray = createRecords(bounds);
	copyRecordArray(recordSet, rArray, bounds);
	int numPasses = 0;
	while (notDone) {
		notDone = false;
		for (i = 0; i < bounds; i++) { 			if (rArray[i + 1].iValue > rArray[i].iValue) {
				switchRecords(rArray, i, i+1);
				notDone = true;
			}
		}
		numPasses++;
	}
	printf("%d passes made\n", numPasses);
	return rArray;
}

The worst cast scenario is O(n^2) for bubble sort, the same as for a linear sort. However, the best case scenario is O(n), with the more sorted the initial data, the shorter the time.

With the elements sorted, we are free to perform a binary search. In a binary search, every time a test is made in the search, there are two choices, to search either half of the remaining elements for the desired element.

With a binary search, at most n key comparisons are needed when searching among 2^n elements, as long as each comparison is able to split the remaining collection of elements into two parts that differ in size by at most one.

//be sure to use sorted array
void tryBinarySearch(record recordSet[], int numRecords) {
	int req = 0; 
	int found = -1;

	printf("Perfoming binary search\n");
	while (found < 0) {
		req = rand() % 50;
		printf("Searching for value %d...\n", req);
		found = binarySearch(recordSet, numRecords, req);
	}
	printf("Value %d found\t", req);
	printARecord(recordSet, found);
}

int binarySearch(record recordSet[], int bounds, int key) {
	int rValue = -1;
	int top, mid, bottom;
	bool found = false;
	top = 0; 
	bottom = bounds;
	while ((top <= bottom) && found == false) {
		mid = (top + bottom) / 2;
		if (recordSet[mid].iValue == key) {
			found = true;
			rValue = mid;
		}
		//if the value is less than the searched for value
		else if (recordSet[mid].iValue < key) {
			bottom = mid - 1;
		}
		else {
			top = mid + 1;
		}
	}
	return rValue;
}

Each pass through the loop in a binary search involves at most half the elements that were under consideration in the preceding pass; starting with n elements, after one pass through the loop at most 1/2n elements are left, after two passes, 1/2(1/2n)n elements, after three passes 1/2(1/2)(1/2) elements are left, and after k passes at most [1/2(1/2)^(k-1)]n elements are left, or (1/2^k)n. Therefore in the worst case the search time is O(lg n).