Author Topic: sorts ... template merge sort and quick sorts compared, template insert sort  (Read 40415 times)

Offline David

  • Hero Member
  • *****
  • Posts: 647
    • View Profile
Update!   FREE homework help NOW available via e-mail ... (or in person if in Toronto Ontario Canada region.)


Sorts Tutorial Part 1 (templated insertion sort - a very quick and 'stabile' sort for very small data sets) ...


Before we look at any code for an insertion sort of an array, you may like to see this text byte that favourably compares insertion sort with selection sort ...

(Taken from wikipedia some weeks ago re. selection sort)

Among simple average-case T(n^2) algorithms, selection sort almost always outperforms bubble sort and gnome sort, but is generally outperformed by INSERTION sort. Insertion sort is very similar in that after the kth iteration, the first k elements in the array are in sorted order. Insertion sort's advantage is that it only scans as many elements as it needs in order to place the k + 1st element, while selection sort must scan all remaining elements to find the k + 1st element.

Simple calculation shows that insertion sort will therefore usually perform about half as many comparisons as selection sort, although it can perform just as many or far fewer depending on the order the array was in prior to sorting. It can be seen as an advantage for some real-time applications that selection sort will perform identically regardless of the order of the array, while insertion sort's running time can vary considerably ...



It may help to think of insert sorting arrays of n elements like this:

1. with an array of just the first 2 elements ... 'insert' that 2nd element in the correct place (either position 2 or 1)

2. with an array of the first 3 elements ... 'insert' the 3rd element in the correct place in the sorted array in 1. above

3. with an array of the first 4 elements ... 'insert' the 4th element in the correct place in the sorted array in 2. above

...

(n-1). with an array of all n elements ... 'insert' the nth element in the correct place in the sorted array in (n-2). above


These insertions, (with the correct insertion position found and opened up by the inner-loop), are done once on each pass of the outer loop like this:

- get a copy of the new top index element on each outer loop from i = 1 to i = arySize-1 ... get val = ary[ i ]

- the element just to the left of this val has index j = i-1

- if val < ary[j] ... copy ary[j] up over ary[j+1] like this ... ary[j+1] = ary[j]

- now decrement j like this --j and repeat above, as long as j >= 0 and val < ary[j] for each new decremented j

- when the above test fails then insert val at the insertion index j+1 like this ... ary[j+i] = val

Note: if val is already in sorted position with respect to the first left element, then we don't really need to copy the value on top of itself, i.e. by setting val = val ... but why add a boolean test overhead here, to every outer pass?


Also ... you may like to see this comment from DIC macosxnerd101 ...
Quote
... What I would try to convey to the reader is that when we insert an element, we are basically pushing/sliding/moving all the elements at the spot of insertion to the right one spot to make room to insert our element.

But as noted above, if 'val' is already in the correct sorted position, we won't push/slide/move any elements ... Instead, in that case, we just copy that value on top of itself ... and then move to the next VALue in the outer loop, (if it exists), to insert sort.


Here's an example insertion sort you may try:

Code: [Select]
template < typename T >
void isort( T ary[], int size )
{
    for( int i = 1; i < size; ++i ) // start with an array of just the first 2 elements (if exists)
    {
        T cmp = ary[i]; // get copy of this new cmp element on each outer loop ...
        int j = i-1; // get index of element just to the left of the above 'cmp' to start comparisons
        while( j >= 0 && cmp < ary[j] )
        {
            ary[j+1] = ary[j]; // copy element 'up' ...
            --j; // decrement j in preparation for next inner loop ...
        }
        ary[j+1] = cmp; // insert (or copy) element at index j+1 (since j was decremented above)
    }
}


Here is a variant not handled by the above template (please note comments in snippet):

Code: [Select]
// isort array of C strings using strcmp ...
void isort( char* ary[], int size )
{
    for( int i = 1; i < size; ++i )
    {
        char* cmp = ary[i]; // get a copy of this cmp pointer ...
        int j = i-1;
        while( j >= 0 && strcmp(cmp, ary[j]) < 0 )
        {
            ary[j+1] = ary[j]; // copy pointer 'up' ...
            --j;
        }
        ary[j+1] = cmp; // insert pointer ...
    }
}


And an other overloaded variant of the above ... (not handled either by the template snippet above - note the comments in the snippet):

Code: [Select]
// isort array of C strings using strcmp ... (for arrays of constant pointers)
void isort( const char* ary[], int size )
{
    for( int i = 1; i < size; ++i )
    {
        const char* cmp = ary[i]; // get a copy of this cmp pointer ...
        int j = i-1;
        while( j >= 0 && strcmp(cmp, ary[j]) < 0 )
        {
            ary[j+1] = ary[j]; // copy pointer 'up' ...
            --j;
        }
        ary[j+1] = cmp; // insert pointer ...
    }
}


Here is a vector version of insert sort (please see comments in template snippet):

Code: [Select]
template < typename T >
void isort( vector < T >& v, bool a = true  ) // a = true/false gives ascending/descending order
{
    int size = (int)v.size();
    if( a )
        for( int i = 1; i < size; ++i ) // start with an array of just the first 2 elements (if exists)
        {
            T cmp = v[i]; // get copy of this new cmp element on each outer loop ...
            int j = i-1; // get index of element just to the left of the above 'cmp' to start comparisons
            while( j >= 0  &&  cmp < v[j] )
            {
                v[j+1] = v[j]; // copy element 'up' ...
                --j; // decrement j in preparation for next inner loop ...
            }
            v[j+1] = cmp; // insert element at index j+1 (since j was decremented above)
    }
    else
        for( int i = 1; i < size; ++i )
        {
            T cmp = v[i];
            int j = i-1;
            while( j >= 0  &&  v[j] < cmp )
            {
                v[j+1] = v[j];
                --j;
            }
            v[j+1] = cmp;
    }
}


Here is a mini insertion sort program, that shows the steps ...

Code: [Select]
// isortSteps.cpp // insertion sort demo // this version 2010-10-05 //

#include <iostream>
using namespace std;

#define show_steps 1

template < typename T >
void isort( T ary[], int size )
{
    for( int i = 1; i < size; ++i ) // start with an array of just the first 2 elements (if exists)
    {
        T cmp =  ary[i]; // get copy of this new cmp element on each outer loop ...
        int j = i-1; // get index of element just to the left of the above 'cmp' to start comparisons
        while( j >= 0 && cmp < ary[j] )
        {
            ary[j+1] = ary[j]; // copy element 'up' ...
            --j; // decrement j in preparation for next inner loop ...
        }
        ary[j+1] = cmp; // insert element at index j+1 (since j was decremented above)
#if show_steps
        cout << "step " << i << ": ";
        for( int j = 0; j <= i; ++j ) cout << ary[j] << " ";
        cout << endl;                   
#endif       
    }
}

// to show an array 'ary' with 'size' elements ....
template < typename T >
void show( T ary[], int size )
{
    if( !size ) return;
    int i;
    --size; // quit 'for loop' at one element before last ...
    for( i = 0; i < size; ++i )
        cout << ary[i] << ", ";
    cout << ary[i] << endl;
}



int main()
{
    int a[] = { 3, 1, 4, 8, 5, 9, 7, 0, 6, 2 };
    int size_a = sizeof a/sizeof(int);
    cout << "Unsorted ...\n";
    show( a, size_a );
    isort( a, size_a );
    cout << "Sorted ...\n";
    show( a, size_a );

    cout << "\nPress 'Enter' to continue ... " << flush;
    cin.get();
}

// my output
/*
Unsorted ...
3, 1, 4, 8, 5, 9, 7, 0, 6, 2
step 1: 1 3
step 2: 1 3 4
step 3: 1 3 4 8
step 4: 1 3 4 5 8
step 5: 1 3 4 5 8 9
step 6: 1 3 4 5 7 8 9
step 7: 0 1 3 4 5 7 8 9
step 8: 0 1 3 4 5 6 7 8 9
step 9: 0 1 2 3 4 5 6 7 8 9
Sorted ...
0, 1, 2, 3, 4, 5, 6, 7, 8, 9

Press 'Enter' to continue ...
*/


Finally ... here is a test program, that also demo's an insert sort of an array of struct's, to facilitate testing these out:

Code: [Select]
// isort_ary_vec_demo.cpp // insertion sort demo // this version 2010-10-08 //

#include <iostream>
#include <fstream>

// isort_ary_vec.h // insertion sort templates for arrays and vectors of objects
#include "isort_ary_vec.h"
/*
    above includes ... <vector>, <string> and using namespace std;
*/


// to show an array 'ary' with 'size' elements ....
template < typename T >
void show( T ary[], int size )
{
    if( !size ) return;
    int i;
    --size; // quit 'for loop' at one element before last ...
    for( i = 0; i < size; ++i )
        cout << ary[i] << ", ";
    cout << ary[i] << endl;
}

// to show a vector 'v' with 'v.size()' elements ....
template < typename T >
void show( const vector< T >& v )
{
    if( !v.size() ) return;
    int i, size = (int)v.size()-1; // quit 'for loop' at element before last ...
    for( i = 0; i < size; ++i )
        cout << v[i] << ", ";
    cout << v[i] << endl;
}

// an example struct to demo isort of array of struct's ...
struct Contact
{
    string name;
    int area;
    int phone;

    // define overloaded operator << ... so can then cout << a_struct_Contact
    // and use templated 'show(s)' above ...
    friend ostream& operator << ( ostream& os, const Contact& c )
    {
        return os << c.name << " (" << c.area << ")" << c.phone;
    }

    // define overloaded operator < ... so can then compare two struct's
    // using templated isort above ...
    bool operator < ( const Contact& c )
    {
        if( name == c.name  )
        {
            if( area == c.area ) return phone < c.phone;
            // else ...
            return area < c.area;
        }
        // else ...
        return name < c.name;
    }
};


int main()
{
    int a[] = { 10, 0, 9, 1, 8, 2, 7, 3, 6, 4, 5 };
    int size_a = sizeof a/sizeof(int);
    cout << "Unsorted ...\n";
    show( a, size_a );
    isort( a, size_a );
    cout << "Sorted ...\n";
    show( a, size_a );

    string b[] = { "mat", "cat", "hat", "bat", "sat", "fat", "rat" };
    int size_b = sizeof b/sizeof(string);
    cout << "\nUnsorted ...\n";
    show( b, size_b );
    isort( b, size_b );
    cout << "Sorted ...\n";
    show( b, size_b );

    vector < string > v( b, b+size_b );
    cout << "\nBefore sorting in descending order ...\n";
    show( v );
    isort( v , false ); // false implies ... descending order
    cout << "Sorted in descending order ...\n";
    show( v );

    Contact c[] = { {"Joe", 123, 4567890}, {"Joe", 456, 7890123}, {"Ann", 123, 3334455} };
    int size_c = sizeof c/sizeof(Contact);
    cout << "\nUnsorted ...\n";
    show( c, size_c );
    isort( c, size_c );
    cout << "Sorted ...\n";
    show( c, size_c );

    vector < Contact > vc( c, c+size_c );
    isort( vc, false );
    cout << "Vector<Contact> copy of above array of 'Contact' sorted in descending order ...\n";
    show( vc );

    char* d[] = { "Joe", "Sam", "Ann", "Fran" };
    int size_d = sizeof d/sizeof(char*);
    cout << "\nUnsorted ...\n";
    show( d, size_d );
    isort( d, size_d );
    cout << "Sorted ...\n";
    show( d, size_d );

    vector < string > vd( d, d+size_d );
    isort( vd, false );
    cout << "Vector<string> copy of above array of 'C string' sorted in descending order ...\n";
    show( vd );

    cout << "\nPress 'Enter' to continue ... " << flush;
    cin.get();
}


// my output compiled with Dev C++ ...
/*
Unsorted ...
10, 0, 9, 1, 8, 2, 7, 3, 6, 4, 5
Sorted ...
0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10

Unsorted ...
mat, cat, hat, bat, sat, fat, rat
Sorted ...
bat, cat, fat, hat, mat, rat, sat

Before sorting in descending order ...
bat, cat, fat, hat, mat, rat, sat
Sorted in descending order ...
sat, rat, mat, hat, fat, cat, bat

Unsorted ...
Joe (123)4567890, Joe (456)7890123, Ann (123)3334455
Sorted ...
Ann (123)3334455, Joe (123)4567890, Joe (456)7890123
Vector<Contact> copy of above array of 'Contact' sorted in descending order ...
Joe (456)7890123, Joe (123)4567890, Ann (123)3334455

Unsorted ...
Joe, Sam, Ann, Fran
Sorted ...
Ann, Fran, Joe, Sam
Vector<string> copy of above array of 'C string' sorted in descending order ...
Sam, Joe, Fran, Ann

Press 'Enter' to continue ...
*/


// my output ... compiled with VC++ 2008 (exact same as above compiled with Dev C++)
/*
Unsorted ...
10, 0, 9, 1, 8, 2, 7, 3, 6, 4, 5
Sorted ...
0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10

Unsorted ...
mat, cat, hat, bat, sat, fat, rat
Sorted ...
bat, cat, fat, hat, mat, rat, sat

Before sorting in descending order ...
bat, cat, fat, hat, mat, rat, sat
Sorted in descending order ...
sat, rat, mat, hat, fat, cat, bat

Unsorted ...
Joe (123)4567890, Joe (456)7890123, Ann (123)3334455
Sorted ...
Ann (123)3334455, Joe (123)4567890, Joe (456)7890123
Vector<Contact> copy of above array of 'Contact' sorted in descending order ...
Joe (456)7890123, Joe (123)4567890, Ann (123)3334455

Unsorted ...
Joe, Sam, Ann, Fran
Sorted ...
Ann, Fran, Joe, Sam
Vector<string> copy of above array of 'C string' sorted in descending order ...
Sam, Joe, Fran, Ann

Press 'Enter' to continue ...
*/
« Last Edit: April 17, 2012, 05:13:26 AM by David »

Offline David

  • Hero Member
  • *****
  • Posts: 647
    • View Profile
Re: sorts
« Reply #1 on: October 05, 2010, 11:44:04 PM »
Here is the file 'isort_ary_vec.h' used by the above program ...

(Note: attached files are not permitted here so I will plan to provide a link to the test file mentioned below so you can download it ... see right below ... )

http://www.4shared.com/document/l6MRMe9d/isortDemo.html

Code: [Select]
// isort_ary_vec.h // insertion sort templates for arrays and vectors of objects ...

#ifndef ISORT_ARY_VEC_H
#define ISORT_ARY_VEC_H

#include <vector>
#include <string>
#include <cstring> // re. strcmp for C strings ...
using namespace std;

template < typename T >
void isort( T ary[], int size )
{
    for( int i = 1; i < size; ++i ) // start with an array of just the first 2 elements (if exists)
    {
        T cmp = ary[i]; // get copy of this new cmp element on each outer loop ...
        int j = i-1; // get index of element just to the left of the above 'cmp' to start comparisons
        while( j >= 0 && cmp < ary[j] )
        {
            ary[j+1] = ary[j]; // copy element 'up' ...
            --j; // decrement j in preparation for next inner loop ...
        }
        ary[j+1] = cmp; // insert element at index j+1 (since j was decremented above)
    }
}

// vector version of above ... (with bool flag to handle ascending or descending order sort) ...
template < typename T >
void isort( vector < T >& v, bool a = true  ) // a = true/false gives ascending/descending order
{
    int size = (int)v.size();
    if( a )
        for( int i = 1; i < size; ++i ) // start with a vector of just the first 2 elements (if exists)
        {
            T cmp = v[i]; // get copy of this new cmp element on each outer loop ...
            int j = i-1; // get index of element just to the left of the above 'cmp' to start comparisons
            while( j >= 0  &&  cmp < v[j] )
            {
                v[j+1] = v[j]; // copy element 'up' ...
                --j; // decrement j in preparation for next inner loop ...
            }
            v[j+1] = cmp; // insert element at index j+1 (since j was decremented above)
    }

    else
        for( int i = 1; i < size; ++i )
        {
            T cmp = v[i];
            int j = i-1;
            while( j >= 0  &&  v[j] < cmp )
            {
                v[j+1] = v[j];
                --j;
            }
            v[j+1] = cmp;
    }
}


// isort array of C strings using strcmp ...
void isort( char* ary[], int size )
{
    for( int i = 1; i < size; ++i )
    {
        char* cmp = ary[i]; // get a copy of this cmp pointer ...
        int j = i-1;
        while( j >= 0 && strcmp(cmp, ary[j]) < 0 )
        {
            ary[j+1] = ary[j]; // copy pointer 'up' ...
            --j;
        }
        ary[j+1] = cmp; // insert pointer ...
    }
}
// and a 'const pointers to char' version of the above ,,,
void isort( const char* ary[], int size )
{
    for( int i = 1; i < size; ++i )
    {
        const char* cmp = ary[i]; // get a copy of this cmp pointer ...
        int j = i-1;
        while( j >= 0 && strcmp(cmp, ary[j]) < 0 )
        {
            ary[j+1] = ary[j]; // copy pointer 'up' ...
            --j;
        }
        ary[j+1] = cmp; // insert pointer ...
    }
}

#endif


You may have noticed above that sorting C++ strings might be less efficent than sorting C strings since a deep copy of the C++ string was made each time a string was moved ... rather than just making a copy of the C string pointer.  A recent student request seemed to hint at this by asking for a way to insert sort an array of C++ strings using the C strcmp function. 


Below, please see examples of a 'shallow' copy method used for an insert sort of C++ strings (including one example that uses strcmp ... and see the final demo below that shows some relative speeds/speedups of sorting C strings and C++ strings via 'shallow copies' being moved ...)

Code: [Select]
// isortCstring_demo.cpp // this version 2010-10-07 //

// insert sorts an array of C++ strings in a way similar to the
// insert sort of an array of C strings where only pointers are copied/moved ...
// recall ... C strings may be thought of as pointers to char (char*)

// Note: this 'shallow copy' method of insert sorting C++ strings would NOT be
// efficent if this code is compiled under VC++ 2008 where sizeof(string) = 32
// (unless your strings to be sorted have an average char length > 32 bytes ...)
// But whould be efficent for short strings with an average length > 4 bytes
// compiled under Dev C++ ... (where sizeof(string) = 4)

#include <iostream>
#include <string>
#include <cstring> // re. strcmp, memcpy
#include <cstdlib> // re. malloc, free, exit

using namespace std;

// passing in an array of fixed (constant) addresses ... and size of array
void isortCString( const char* ary[], int size )
{
    for( int i = 1; i < size; ++i ) // start isort with just an array of the 1st 2 strings (if exists)
    {
        const char* cmp = ary[i]; // get a copy of this address into 'cmp'
        int j = i-1; // j is index of first string to the left of the 'cmp' string
        while( j >= 0 && strcmp(cmp, ary[j]) < 0 )
        {
            ary[j+1] = ary[j]; // copy pointer 'up' ...
            --j;
        }
        ary[j+1] = cmp; // insert pointer
    }
}

void exitMessage( const char* msg )
{
    cerr << msg << flush;
    cin.get();
    exit(1);
}

void isortCppStringStrCmp( string ary[], int size )
{
    string* sp = (string*) malloc(sizeof(string)); // get memory to hold a 'shallow' copy of a string
    if( sp == NULL ) exitMessage( "malloc failed in isortCppStringStrCmp ... " );

    for( int i = 1; i < size; ++i )
    {
        memcpy( sp, &ary[i], sizeof(string) ); // get 'shallow' copy of this string
        const char* p = sp->c_str(); // get pointer to this 'comparison' C string

        int j = i-1; // j is index of first string to the left of this 'compare' string

        while( j >= 0  &&  strcmp(p, ary[j].c_str()) < 0 ) // pointers to C strings are passed to strcmp
        {
            memcpy( &ary[j+1], &ary[j], sizeof(string) ); // 'shallow' copy up ...
            --j;
        }
        memcpy( &ary[j+1], sp, sizeof(string) ); // insert 'shallow' copy that was stored in sp
    }
   
    free(sp);
}

// shallow copy ... but uses C++ overloaded operator < to compare 2 C++ strings
void isortCppString( string ary[], int size )
{
    string* sp = (string*) malloc(sizeof(string)); // get memory to hold a 'shallow' copy of a string
    if( sp == NULL ) exitMessage( "malloc failed in isortCppString ... " );

    for( int i = 1; i < size; ++i )
    {
        memcpy( sp, &ary[i], sizeof(string) ); // get 'shallow' copy of this string
        int j = i-1; // j is index of first string to the left of this 'compare' string

        while( j >= 0  &&  *sp < ary[j] )
        {
            memcpy( &ary[j+1], &ary[j], sizeof(string) ); // 'shallow' copy up ...
            --j;
        }
        memcpy( &ary[j+1], sp, sizeof(string) ); // insert 'shallow' copy that was stored in sp
    }

    free(sp);
}



template < typename T >
void show( T ary[], int size )
{
    if( !size ) return;
    cout << ary[0];
    for( int i = 1; i < size; ++i )
        cout << ", "<< ary[i];
    cout << endl;
}



int main()
{
    cout << "sizeof(string) = " << sizeof(string) << endl;

    // an array of constants (the constants are the addresses to each C string)
    const char* d[] = { "Joe", "Sam", "Ann", "Fran" };
    int size_d = sizeof d/sizeof(char*);
    cout << "\nUnsorted ...\n";
    show( d, size_d );
    isortCString( d, size_d );
    cout << "Sorted using isortCString ...\n";
    show( d, size_d );

    string e[] = { "Joe", "Sam", "Ann", "Fran" };
    int size_e = sizeof e/sizeof(string);
    cout << "\nUnsorted ...\n";
    show( e, size_e );
    isortCppStringStrCmp( e, size_e );
    cout << "Sorted using isortCppStringStrCmp ...\n";
    show( e, size_e );

    string f[] = { "Joe", "Sam", "Ann", "Fran" };
    int size_f = sizeof f/sizeof(string);
    cout << "\nUnsorted ...\n";
    show( f, size_f );
    isortCppString( f, size_f );
    cout << "Sorted using isortCppString ...\n";
    show( f, size_f );

    cout << "\nPress 'Enter' to continue ... " << flush;
    cin.get();
}

// my output compiled with Dev C++ 4.9.9.2 ...
sizeof(string) = 4

Unsorted ...
Joe, Sam, Ann, Fran
Sorted using isortCString ...
Ann, Fran, Joe, Sam

Unsorted ...
Joe, Sam, Ann, Fran
Sorted using isortCppStringStrCmp ...
Ann, Fran, Joe, Sam

Unsorted ...
Joe, Sam, Ann, Fran
Sorted using isortCppString ...
Ann, Fran, Joe, Sam

Press 'Enter' to continue ...
*/

// my output compiled with VC++ 2008 ...
sizeof(string) = 32

Unsorted ...
Joe, Sam, Ann, Fran
Sorted using isortCString ...
Ann, Fran, Joe, Sam

Unsorted ...
Joe, Sam, Ann, Fran
Sorted using isortCppStringStrCmp ...
Ann, Fran, Joe, Sam

Unsorted ...
Joe, Sam, Ann, Fran
Sorted using isortCppString ...
Ann, Fran, Joe, Sam

Press 'Enter' to continue ...
*/


Please note that all the above code compiled and ran ok using MS Visual C++ 2008 and Dev C++ 4.9.9.2 and that ... VC++ sizeof(string) was 32 while Dev C++ sizeof(string) was 4, so using a shallow string copy method compiled via VC++ would actually be a slower copy method unless the average length of your strings to be sorted was greater than 32 bytes ... and that while VC++ compiled and ran ok using 'new' and 'delete' ... in order for the code to compile and run ok under both compilers it was necessary to use 'malloc' and 'free' in the above 'shallow copy' code ... (that used C memcpy).


Below is a final added demo to show the relative speedups of using C strings vs C++ strings ... and with different copy methods ... (Please see the attached test file used for this demo of isort's of 'short' strings.) ,

Code: [Select]
// isort_speedups_demo_timed.cpp // this version 2010-10-08 //

#include <iostream>
#include <iomanip>
#include <fstream>
#include <vector>
#include <string>
#include <cstring> // re. strcmp, memcpy
#include <cstdlib> // re. malloc, free, exit
#include <ctime>   // re. clock

using namespace std;

void exitMessage( const char* msg )
{
    cerr << msg << flush;
    cin.get();
    exit(1);
}

// passing in an array of (dynamic memory) addresses ... and size of array
void isortCString( char* ary[], int size )
{
    for( int i = 1; i < size; ++i ) // start isort with just an array of the 1st 2 strings (if exists)
    {
        char* cmp = ary[i]; // get a copy of this address into 'cmp'
        int j = i-1; // j is index of first string to the left of the 'cmp' string
        while( j >= 0 && strcmp(cmp, ary[j]) < 0 )
        {
            ary[j+1] = ary[j]; // copy pointer 'up' ...
            --j;
        }
        ary[j+1] = cmp; // insert pointer
    }
}

void isortCppStringStrCmp( string ary[], int size )
{
    string* sp = (string*) malloc(sizeof(string)); // get memory to hold a 'shallow' copy of a string
    if( sp == NULL ) exitMessage( "malloc failed in isortCppStringStrCmp ... " );

    for( int i = 1; i < size; ++i )
    {
        memcpy( sp, &ary[i], sizeof(string) ); // get 'shallow' copy of this string
        const char* p = sp->c_str(); // get pointer to this 'comparison' C string

        int j = i-1; // j is index of first string to the left of this 'compare' string

        while( j >= 0  &&  strcmp(p, ary[j].c_str()) < 0 ) // pointers to C strings are passed to strcmp
        {
            memcpy( &ary[j+1], &ary[j], sizeof(string) ); // 'shallow' copy up ...
            --j;
        }
        memcpy( &ary[j+1], sp, sizeof(string) ); // insert 'shallow' copy that was stored in sp
    }
   
    free(sp);
}

// shallow copy ... but uses C++ overloaded operator < to compare 2 C++ strings
void isortCppString( string ary[], int size )
{
    string* sp = (string*) malloc(sizeof(string)); // get memory to hold a 'shallow' copy of a string
    if( sp == NULL ) exitMessage( "malloc failed in isortCppString ... " );

    for( int i = 1; i < size; ++i )
    {
        memcpy( sp, &ary[i], sizeof(string) ); // get 'shallow' copy of this string
        int j = i-1; // j is index of first string to the left of this 'compare' string

        while( j >= 0  &&  *sp < ary[j] )
        {
            memcpy( &ary[j+1], &ary[j], sizeof(string) ); // 'shallow' copy up ...
            --j;
        }
        memcpy( &ary[j+1], sp, sizeof(string) ); // insert 'shallow' copy that was stored in sp
    }

    free(sp);
}



template < typename T >
void show( T ary[], int size )
{
    for( int i = 0; i < size; ++i )
        cout << ary[i] << " ";
    cout << endl;
}

template< typename T >
string isSortedUp( T ary[], int size )
{
    for( int i = size-1; i > 0; --i )
        if( ary[i-1] > ary[i] ) return "FALSE";
    // else ...
    return "TRUE";
}
// overloaded variant not handled above ...
string isSortedUp( char* ary[], int size )
{
    for( int i = size-1; i > 0; --i )
        if( strcmp( ary[i-1], ary[i]) > 0 ) return "FALSE";
    // else ...
    return "TRUE";
}


// makes a new C string (deep) copy (in new memory) of a C++ string ...
char* newCopy( const string& s )
{
    char* p = (char*) malloc( (s.size()+1) * sizeof(char) );
    if( !p ) exitMessage( "malloc failed in newCopy ..." );
    strncpy( p, &s[0], s.size() );
    p[s.size()] = 0; // null terminate ...
    return p;
}



int main()
{
    cout << setprecision(3) << fixed; // show 3 decimal places ...
   
    // get test file into a vector <string> of words ...
    ifstream fin( "isortDemo.txt" );
    if( !fin ) exitMessage( "File 'isortDemo.txt' failed to be opened for input." );

    string word;
    vector < string > words;
    while( fin >> word ) words.push_back( word );
    fin.close();
   
    cout << "The number of words to insert sort is " << int(words.size()) << endl;

    double ti, t, st[3] = {0}; // to hold the time it takes to isort ...

    int loops = 3;
    do
    {
        // now get shallow copies of C++ into an array of shallow copies ...
        string* s = (string*) malloc( sizeof(string)*words.size() );
        if( !s ) exitMessage( "malloc failed to allocate C++ string array memory." );

        // and deep copies of (char's in) C++ strings into an array of new copies
        // of C strings ... (an array of pointers to pointers to char)
        char** cs = (char**) malloc( sizeof(char*)*words.size() );
        if( !cs ) exitMessage( "malloc failed to allocate C string array memory." );

        for( int i = int(words.size())-1; i >= 0; --i )
        {
            memcpy( &s[i], &words[i], sizeof(string) ); // shallow copy ...
            cs[i] = newCopy( words[i] ); // deep copy into a new C string (in new memory)
        }


        // test 1 ... isort of array of C strings ... (array of pointers to char)

        //cout << "\nUnsorted ...\n";
        //show( cs, int(words.size()) );
        ti = clock();
        isortCString( cs, int(words.size()) );
        t = clock() - ti;
        st[0] += t;
        cout << "\nSorted using isortCString ... secs =\t\t"
             << t/CLOCKS_PER_SEC
             << " is sorted UP = " << isSortedUp( cs, int(words.size()) ) << endl;
        //show( cs, int(words.size()) );

        for( int i = int(words.size())-1; i >= 0; --i )
            free( cs[i] ); // free copies of C strings in new memory ...
        free( cs ); // free memory to hold pointers ...



        // test 2 ... isort of array of C++ strings ... (with shallow copy and strcmp)

        //cout << "\nUnsorted ...\n";
        //show( s, int(words.size()) );
        ti = clock();
        isortCppStringStrCmp( s, int(words.size()) );
        t = clock() - ti;
        st[1] += t;
        cout << "Sorted using isortCppStringStrCmp ...secs =\t"
             << t/CLOCKS_PER_SEC
             << " is sorted UP = " << isSortedUp( s, int(words.size()) ) << endl;
        //show( s, int(words.size()) );



        // test 3 ... isort of array of C++ strings ... (with shallow copy and < to compare)

        for( int i = int(words.size())-1; i >= 0; --i )
            memcpy( &s[i], &words[i], sizeof(string) );

        //cout << "\nUnsorted ...\n";
        //show( s, int(words.size()) );
        ti = clock();
        isortCppString( s, int(words.size()) );
        t = clock() - ti;
        st[2] += t;
        cout << "Sorted using isortCppString ...secs =\t\t"
             << t/CLOCKS_PER_SEC
             << " is sorted UP = " << isSortedUp( s, int(words.size()) ) << endl;
        //show( s, int(words.size()) );
    }
    while( --loops );

    cout << "\nst[0] = " << st[0]/CLOCKS_PER_SEC/3
         << "\nst[1] = " << st[1]/CLOCKS_PER_SEC/3
         << "\nst[2] = " << st[2]/CLOCKS_PER_SEC/3 << endl;

    cout << "\nPress 'Enter' to continue ... " << flush;
    cin.get();
}

// my output compiled with Dev C++ 4.9.9.2 ...
/*
The number of words to insert sort is 10560

Sorted using isortCString ... secs =            0.454 is sorted UP = TRUE
Sorted using isortCppStringStrCmp ...secs =     0.828 is sorted UP = TRUE
Sorted using isortCppString ...secs =           2.328 is sorted UP = TRUE

Sorted using isortCString ... secs =            0.469 is sorted UP = TRUE
Sorted using isortCppStringStrCmp ...secs =     0.828 is sorted UP = TRUE
Sorted using isortCppString ...secs =           2.344 is sorted UP = TRUE

Sorted using isortCString ... secs =            0.469 is sorted UP = TRUE
Sorted using isortCppStringStrCmp ...secs =     0.828 is sorted UP = TRUE
Sorted using isortCppString ...secs =           2.328 is sorted UP = TRUE

st[0] = 0.464
st[1] = 0.828
st[2] = 2.333

Press 'Enter' to continue ...
*/

// my output compiled with VC++ 2008 ...
/*
The number of words to insert sort is 10560

Sorted using isortCString ... secs =            0.516 is sorted UP = TRUE
Sorted using isortCppStringStrCmp ...secs =     1.578 is sorted UP = TRUE
Sorted using isortCppString ...secs =           3.532 is sorted UP = TRUE

Sorted using isortCString ... secs =            0.563 is sorted UP = TRUE
Sorted using isortCppStringStrCmp ...secs =     1.703 is sorted UP = TRUE
Sorted using isortCppString ...secs =           3.859 is sorted UP = TRUE

Sorted using isortCString ... secs =            0.563 is sorted UP = TRUE
Sorted using isortCppStringStrCmp ...secs =     1.578 is sorted UP = TRUE
Sorted using isortCppString ...secs =           3.437 is sorted UP = TRUE

st[0] = 0.547
st[1] = 1.620
st[2] = 3.609

Press 'Enter' to continue ...
*/


From the above sort times, (and the sort times in the demos in Part 2, to follow, re. T( nlog(n) ) sorts ...), you can see why one might rather NOT use VC++ 2008 as a compiler for code to handle a large number of short C++ strings ... :)

« Last Edit: October 08, 2010, 10:38:47 PM by David »

Offline David

  • Hero Member
  • *****
  • Posts: 647
    • View Profile
Re: sorts
« Reply #2 on: October 05, 2010, 11:46:16 PM »
Sorts Tutorial Part 2: (templated quick sort and merge sort of arrays and vectors)


In my previous insertion sort of arrays and vectors tutorial, you may have noticed that isort is often preferred over bubble sort or selection sort for small data sets ... because it is quickest there ... (and also a 'stabile' sort).  Recall that the average time for these simple sorts is T(n^2) ... with isort equal to T(n) if the data set was already in sorted order.

For data sets that are greater then about 10 to 100 elements or so ... (recall from Part 1, that my best time example isort ... of about 10,000 short C strings ... took me about 0.5 sec's on my old PC running XP) ...So for data sets of any appreciable size ... an insert sort method may be too slow ... BUT here is where quick sort or merge sort shine ...

(You can use this templated 'msort' code below if you need a 'stabile' sort.) 

The average time for randomly pre-sorted data sets for quick sort and merge sort is T( nlog(n) ).

Quick sort often sorts randomly pre-sorted data sets in roughly half the time as merge sort.


These above considerations ... and the nature of the data to be sorted ... will help you select a sort method that is appropriate for your data stored in arrays or vectors.


Note also, that sorting vectors takes MUCH longer here ... than to sort arrays ... especially if the code was compiled with VC++ 2008 vs Dev C++ 4.9.9.2


The following snippets show two slightly different quick sorts and a merge sort (of arrays and vectors):

You might like to see the comments and outputs ... and note the relative speeds for sorting a large number of elements (the examples here use 3 Million random int's with values in the range 0..32767).


Let's start with the code for a merge sort of an array, since it may be easier to follow than a quick sort, and is a stabile sort ... but note that it often takes about twice as long to sort as quick sort.

Code: [Select]
// msort.cpp // this verson 2010-10-05

#include <iostream>
#include <string>
using namespace std;

#define show_steps 1
#if show_steps
    int num = 0;
#endif

// the following array merge sort was derived from an Horstmann vector merge sort
// example that was modified and then updated to templates ... and with an added
// 'wrapper' template function to reduce all the original tmp memory allocates
// to just one  ...

// merge two adjacent ranges in array ary
template< typename T >
void merge( T ary[], size_t bot, size_t mid, size_t top, T tmp[] )
{
    size_t sz = top - bot + 1;   // size of the range to be merged
    //T* tmp = new T[ sz ];      // merge both halves into array tmp
    size_t j = 0;                // next open index in tmp
    size_t h1 = bot;             // next index to consider in the 1st half
    size_t h2 = mid + 1;         // next index to consider in the 2nd half

    // while indexes h1 and h2 are not past their ends ...
    while( h1 <= mid && h2 <= top )
    {   // move the smaller element into the tmp array ...
        if( ary[h1] < ary[h2] ) tmp[j++] = ary[h1++];
        else tmp[j++] = ary[h2++];
    }

    // Note: only one of the two while loops below is executed ...
    while( h1 <= mid ) tmp[j++] = ary[h1++]; // copy any remaining entries (1st half)
    while( h2 <= top ) tmp[j++] = ary[h2++]; // copy any remaining entries (2nd half)

    for( j = 0; j < sz; ++j ) ary[bot+j] = tmp[j]; // copy back sorted tmp array

#if show_steps
    cout << "step " << ++num << ":\t\t";
    for( size_t i = 0; i < bot; ++i ) cout << "  ";
    for( size_t i = bot; i <= top; ++i ) cout << ary[i] << " ";
    cout << endl;
#endif
    //delete [] tmp;
}

// merge sort elements in array ary in range ...
template< typename T >
void my_msort( T ary[], size_t bot, size_t top, T tmp[] )
{
    if( bot == top ) return;
    size_t mid = ( bot + top ) / 2;

    my_msort( ary, bot, mid, tmp );      // sort the first ...
    my_msort( ary, mid + 1, top, tmp );  // and the second half ...
    merge( ary, bot, mid, top, tmp );    // now merge these 2 sorted chunks ...
}

// my_msort 'wrapper' speeds up merges (reduces all the new..deletes to just 1)
template< typename T >
void msort( T ary[], size_t bot, size_t top )
{
    T* tmp = new T[top-bot+1];
    my_msort( ary, bot, top, tmp );
    delete [] tmp;
}

template< typename T >
void print( const T ary[], int size )
{
    for( int i = 0; i < size; ++i ) cout << ary[i] << " ";
    cout << endl;
}


int main()
{
    int d[] = { 3, 1, 4, 8, 5, 9, 2, 6, 7 };
    int size_d = sizeof d / sizeof(int);
   
    cout << "Unsorted: \t";
    print( d, size_d );
    msort( d, 0, size_d-1 );
    cout << "Sorted: \t";
print( d, size_d );

    cout << "\nPress 'Enter' to continue ... " << flush;
    cin.get();
}

// my output ...
/*
Unsorted:       3 1 4 8 5 9 2 6 7
step 1:         1 3
step 2:         1 3 4
step 3:               5 8
step 4:         1 3 4 5 8
step 5:                   2 9
step 6:                       6 7
step 7:                   2 6 7 9
step 8:         1 2 3 4 5 6 7 8 9
Sorted:         1 2 3 4 5 6 7 8 9

Press 'Enter' to continue ...
*/


Now ... a fairly fast quick sort method ...

Code: [Select]
// qSort2.cpp // this version 2010-10-05

#include <iostream>
#include <string>

#define show_steps 1 // set '1' to '0' to NOT show steps ...

using namespace std;

// templated/edited from an old Comal qsort of an array example ...
template< typename T >
void qSort2( T ary[], int left, int right )
{
    int lft = left, rht = right;
    T tmp, comp = ary[(lft+rht)/2];
    do
    {
        while(ary[lft] < comp) ++lft;
        while(ary[rht] > comp) --rht;
        if(lft <= rht) // swap ...
        {
            tmp = ary[lft];
            ary[lft] = ary[rht];
            ary[rht] = tmp;
            ++lft; --rht;
        }
    }while( lft <= rht );
#if show_steps
    cout << "comp=" << comp << ": \t";
    for( int i = 0; i < left; ++i ) cout << "    ";
    for( int i = left; i <= right; ++i ) cout << ary[i] << " ";
    cout << endl;
#endif
    if( left < rht   ) qSort2( ary, left, rht  );
    if( lft  < right ) qSort2( ary, lft, right );
}


template< typename T >
void print( T ary[], int size )
{
    for( int i = 0; i < size; ++i ) cout << ary[i] << " ";
    cout << endl;
}



int main()
{
    int d[] = { 300, 100, 400, 800, 700, 500, 900, 600, 200 };
    int size_d = sizeof d / sizeof(int);
    cout << "Unsorted: \t";
    print( d, size_d );
    qSort2( d, 0, size_d-1 );
    cout << "Sorted: \t";
    print( d, size_d );

    string e[] = { "bob", "sam", "Tam", "tam", "job", "Tir", "Sam", "Job", "Bob" };
    int size_e = sizeof e / sizeof(string);
    cout << "\nUnsorted: \t";
    print( e, size_e );
    qSort2( e, 0, size_e-1 );
    cout << "Sorted: \t";
    print( e, size_e );

    cout << "\nPress 'Enter' to continue ... " << flush;
    cin.get();
}

// my output ...
/*
Unsorted:       300 100 400 800 700 500 900 600 200
comp=700:       300 100 400 200 600 500 900 700 800
comp=400:       300 100 200 400 600 500
comp=100:       100 300 200
comp=300:           200 300
comp=600:                   400 500 600
comp=400:                   400 500
comp=700:                               700 900 800
comp=900:                                   800 900
Sorted:         100 200 300 400 500 600 700 800 900

Unsorted:       bob sam Tam tam job Tir Sam Job Bob
comp=job:       bob Bob Tam Job Sam Tir job tam sam
comp=Tam:       Sam Bob Job Tam bob Tir
comp=Bob:       Bob Sam Job
comp=Sam:           Job Sam
comp=bob:                   Tam Tir bob
comp=Tam:                   Tam Tir
comp=tam:                               job sam tam
comp=job:                               job sam
Sorted:         Bob Job Sam Tam Tir bob job sam tam

Press 'Enter' to continue ...
*/


Now ... compare this fast qSort2 above with an other slightly different and slower qSort method ...

Code: [Select]
// qsorts2Compared.cpp

#include <iostream>
#include <fstream>
#include <string>
#include <cstdlib>
#include <ctime>
using namespace std;

#define show_steps 1

const int BIGNUM = 3000000;

int c1 = 0, c2 = 0; // to count swaps ...

template< typename T > // this example code was taken from DIC KYA ...
int partition( T data[], int left, int right, int pivot )
{
    int storeIndex = left; //first run starts at index 0, used to store moved elements
    T tmp, pivotVal = data[pivot];
    data[pivot] = data[right]; // set the "end" of the partition to the pivot location
    data[right] = pivotVal;  //assign the pivot to the "end" of this partition

    //This is where the O(n) comes from
    for( int i = left; i < right; i++ )
    {
        if( data[i] <= pivotVal ) //if it's less then the pivot shift it
        {
            ++c1; // count these swaps with global c1 var...
            tmp = data[i]; //swap accordingly
            data[i] = data[storeIndex];
            data[storeIndex] = tmp;
            storeIndex++; //increment our storage location
        }
    }
    // now that we're done, get the index from our storage location
    // and move the pivot back
    // this pivot is now in its final resting place
    ++c1; // count these swaps with global c1 var...
    tmp = data[storeIndex];
    data[storeIndex] = data[right];
    data[right] = tmp;
#if show_steps
    cout << "comp = " << pivotVal << ", ary = ";
    for( int i = 0; i < left; ++i ) cout << "  ";
    for( int i = left; i <= right; ++ i ) cout << data[i] << " ";
    cout << endl;
#endif
    return storeIndex; //return our new pivot
}

template< typename T >
void qSort( T data[], int left, int right )
{
    if( right > left )
    {
        int mid = ( left + right ) / 2; //usually a good place to start ...
        //this is the O(n) portion
        int newPivot = partition(data, left, right, mid); //arrange based on the pivot
        //this is the O(log n) portion
        qSort(data, left, newPivot-1); //repeat for the "left hand side" of the pivot
        qSort(data, newPivot+1, right);//repeat for the "right hand side" of the pivot
    }
}

// this version was templated/edited from an old Comal version ...
template< typename T >
void qSort2( T ary[], int left, int right )
{
    int lft = left, rht = right;
    T tmp, comp = ary[( lft + rht )/2]; // using the middle value for 'comp'
    do
    {
        while( ary[lft] < comp ) ++lft; // get the next index to swap val up ...
        while( ary[rht] > comp ) --rht; // get the next index to swap val down ...
        if( lft <= rht ) // then swap these two values ...
        {
            ++c2; // count these swaps with global c2 var...
            tmp = ary[lft];
            ary[lft] = ary[rht];
            ary[rht] = tmp;
            ++lft; --rht; // advance to get ready for next set of comparisons
        }
    }while( lft <= rht ); // when cross over ... finished with this 'comp' val

#if show_steps // show the results of this 'comparision' set of swaps ...
    cout << "comp = " << comp << ", ary = ";
    for( int i = 0; i < left; ++i ) cout << "  ";
    for( int i = left; i <= right; ++ i ) cout << ary[i] << " ";
    cout << endl;
#endif

    // note both recursive calls ... 'divide' their new array 'chunk' about a 'comp'
    if( left < rht   ) qSort2( ary, left, rht); // get new comp and swaps on left
    if( lft  < right ) qSort2( ary, lft, right); // do above ... but on the right
}

template< typename T >
void print( const T ary[], int size )
{
    for(int i = 0; i < size; ++i ) cout << ary[i] << " ";
    cout << endl;
}

template< typename T >
string isSortedUp( const T ary[], int size )
{
    for( int i = size-1; i > 0; --i )
        if( ary[i-1] > ary[i] ) return "false";
    // else ...
    return "true";
}

bool fillFromFile( int b[], int e[], int size_d )
{
    ifstream fin( "randInt3M.txt" );
    if( !fin ) return false;
    int i =0;
    while( i < size_d && fin >> b[i] ) { e[i] = b[i]; ++i; }
    fin.close();
    if( i == size_d ) return true;
    else { cout << "Error filling int array's ..." << endl; return false; }
}

void writeFile( int b[], int size_d )
{
    ofstream fout( "randInt3M.txt" );
    for( int i = 0, j =0; i < size_d; ++i )
    {
        fout << b[i] << " ";
        if( ++j == 16 ) { j = 0; fout << endl; }
    }
    fout << endl;
    fout.close();
}


int main()
{
    int d[] = { 3, 1, 8, 6, 5, 9, 2, 0, 7, 4 };
    int size_d = sizeof d / sizeof d[0];
    int* e = new int[size_d];

    //string d[] = { "bob", "job", "sam", "Sam", "Job", "Bob" };
    //int size_d = sizeof d / sizeof d[0];
    //string* e = new string[size_d];

    for( int i = 0; i < size_d; ++i ) e[i] = d[i]; // copy d into e ...

    cout << "Unsorted d: \t";
    print( d, size_d );
    qSort( d, 0, size_d-1 );
    cout << "Sorted d: \t";
    print( d, size_d );

    cout << "\nUnsorted e: \t";
    print( e, size_d );
    qSort2( e, 0, size_d-1 );
    cout << "Sorted e: \t";
    print( e, size_d );

    delete [] e;
    cout << "swaps in d=" << c1 << " and ...\nswaps in e=" << c2 << endl;

#if ! show_steps
    c1 = c2 = 0; // re-set counters to zero ...
    size_d = BIGNUM;
    int* b = new int[ size_d ];
    e = new int[ size_d ];
   
    if( !fillFromFile( b, e, size_d ) )
    {
        srand( (unsigned) time(0) );
        for( int i = 0; i < size_d; ++i ) b[i] = e[i] = rand(); // fill both arrays
        writeFile( b, size_d );
    }
   
    cout << "\nUnsorted b: \t"; cout << size_d << " random int's ... ";
    double ti = clock();
    qSort( b, 0, size_d-1 );
    double tb = (clock() - ti)/CLOCKS_PER_SEC;
    cout << "isSortedUp( b, size_d ) = " << isSortedUp( b, size_d ) << endl;

    cout << "Unsorted e: \t"; cout << size_d << " random int's ... ";
    ti = clock();
    qSort2( e, 0, size_d-1 );
    double te = (clock() - ti)/CLOCKS_PER_SEC;
    cout << "isSortedUp( e, size_d ) = " << isSortedUp( e, size_d ) << endl;

    delete [] b;
    delete [] e;
    cout << "swaps in b=" << c1 << " and ...\nswaps in e=" << c2 << endl;
    cout << "secs for b=" << tb << " and ...\nsecs for e=" << te << endl;
#endif

    cout << "\nPress 'Enter' to continue ... " << flush;
    cin.get();
}


// compiled under Dev C++ 4.9.9.2 ... my output ...
/*
Unsorted d:     3 1 8 6 5 9 2 0 7 4
comp = 5, ary = 3 1 4 2 0 5 6 8 7 9
comp = 4, ary = 3 1 0 2 4
comp = 1, ary = 0 1 3 2
comp = 3, ary =     2 3
comp = 8, ary =             6 7 8 9
comp = 6, ary =             6 7
Sorted d:       0 1 2 3 4 5 6 7 8 9

Unsorted e:     3 1 8 6 5 9 2 0 7 4
comp = 5, ary = 3 1 4 0 2 9 5 6 7 8
comp = 4, ary = 3 1 2 0 4
comp = 1, ary = 0 1 2 3
comp = 2, ary =     2 3
comp = 6, ary =           6 5 9 7 8
comp = 6, ary =           5 6
comp = 7, ary =               7 9 8
comp = 9, ary =                 8 9
Sorted e:       0 1 2 3 4 5 6 7 8 9
swaps in d=19 and ...
swaps in e=11

Press 'Enter' to continue ...
*/

/*
Unsorted d:     3 1 8 6 5 9 2 0 7 4
Sorted d:       0 1 2 3 4 5 6 7 8 9

Unsorted e:     3 1 8 6 5 9 2 0 7 4
Sorted e:       0 1 2 3 4 5 6 7 8 9
swaps in d=19 and ...
swaps in e=11

Unsorted b:     3000000 random int's ... isSortedUp( b, size_d ) = true
Unsorted e:     3000000 random int's ... isSortedUp( e, size_d ) = true
swaps in b=171121445 and ...
swaps in e=19480616
secs for b=2.937 and ...
secs for e=0.875

Press 'Enter' to continue ...
*/

// compiled under VC++ 2008 ... my output ...
/*
Unsorted d:     3 1 8 6 5 9 2 0 7 4
comp = 5, ary = 3 1 4 2 0 5 6 8 7 9
comp = 4, ary = 3 1 0 2 4
comp = 1, ary = 0 1 3 2
comp = 3, ary =     2 3
comp = 8, ary =             6 7 8 9
comp = 6, ary =             6 7
Sorted d:       0 1 2 3 4 5 6 7 8 9

Unsorted e:     3 1 8 6 5 9 2 0 7 4
comp = 5, ary = 3 1 4 0 2 9 5 6 7 8
comp = 4, ary = 3 1 2 0 4
comp = 1, ary = 0 1 2 3
comp = 2, ary =     2 3
comp = 6, ary =           6 5 9 7 8
comp = 6, ary =           5 6
comp = 7, ary =               7 9 8
comp = 9, ary =                 8 9
Sorted e:       0 1 2 3 4 5 6 7 8 9
swaps in d=19 and ...
swaps in e=11

Press 'Enter' to continue ...
*/

/*
Unsorted d:     3 1 8 6 5 9 2 0 7 4
Sorted d:       0 1 2 3 4 5 6 7 8 9

Unsorted e:     3 1 8 6 5 9 2 0 7 4
Sorted e:       0 1 2 3 4 5 6 7 8 9
swaps in d=19 and ...
swaps in e=11

Unsorted b:     3000000 random int's ... isSortedUp( b, size_d ) = true
Unsorted e:     3000000 random int's ... isSortedUp( e, size_d ) = true
swaps in b=171121445 and ...
swaps in e=19480616
secs for b=3.078 and ...
secs for e=0.89

Press 'Enter' to continue ...
*/


« Last Edit: October 08, 2010, 09:58:03 PM by David »

Offline David

  • Hero Member
  • *****
  • Posts: 647
    • View Profile
Re: sorts
« Reply #3 on: October 05, 2010, 11:48:16 PM »
And comparing the speed of msort with both quick sorts ...

Code: [Select]
// qsorts_msort_ary_timed.cpp // note: msort is a 'stabile sort'

#include <iostream>
#include <iomanip>
#include <string>
#include <cstdlib>
#include <ctime>
using namespace std;

int c1 = 0, c2 = 0; // globals ... for count of swaps ...

template< typename T > // from DIC KYA  ...
int partition( T data[], int left, int right, int pivot )
{
    int storeIndex = left;
    T tmp, pivotVal = data[pivot];
    data[pivot] = data[right];
    data[right] = pivotVal;

    for( int i = left; i < right; ++i )
    {
        if( data[i] <= pivotVal )
        {
            ++c1; // count swaps ... (c1 is a global variable)
            tmp = data[i];
            data[i] = data[storeIndex];
            data[storeIndex] = tmp;
            storeIndex++;
        }
    }

    ++c1; // count swaps ...
    tmp = data[storeIndex];
    data[storeIndex] = data[right];
    data[right] = tmp;
    return storeIndex;
}

template< typename T > // from KYA at DIC ...
void qSort( T data[], int left, int right )
{
    if( right > left )
    {
        int mid = ( left + right )/2;
        int newPivot = partition( data, left, right, mid );
        qSort( data, left, newPivot-1 );
        qSort( data, newPivot+1, right );
    }
}

// my quick sort ... (edited from a Comal version)
template< typename T >
void qSort2( T ary[], int left, int right )
{
    int lft = left, rht = right;
    T tmp, comp = ary[(lft+rht)/2];
    do
    {
        while( ary[lft] < comp ) ++lft;
        while( ary[rht] > comp ) --rht;
        if( lft <= rht ) // swap ...
        {
            ++c2; // count swaps ... (c2 is a global variable)
            tmp = ary[lft];
            ary[lft] = ary[rht];
            ary[rht] = tmp;
            ++lft; --rht;
        }
    }while( lft <= rht );
    if( left < rht   ) qSort2( ary, left, rht);
    if( lft  < right ) qSort2( ary, lft, right);
}

// merge two adjacent ranges in array ary ...
template< typename T >
void merge( T ary[], size_t bot, size_t mid, size_t top, T tmp[] )
{
    size_t sz = top - bot + 1;   // size of the range to be merged
    //T* tmp = new T[ sz ];        // merge both halves into array tmp
    size_t j = 0;                // next open index in tmp
    size_t h1 = bot;             // next index to consider in the 1st half
    size_t h2 = mid + 1;         // next index to consider in the 2nd half

    // while indexes h1 and h2 are not past their ends ...
    while( h1 <= mid && h2 <= top )
    {   // move the smaller element into the tmp array ...
        if( ary[h1] < ary[h2] ) tmp[j++] = ary[h1++];
        else tmp[j++] = ary[h2++];
    }

    // Note: only one of the two while loops below is executed ...
    while( h1 <= mid ) tmp[j++] = ary[h1++]; // copy any remaining entries (1st half)
    while( h2 <= top ) tmp[j++] = ary[h2++]; // copy any remaining entries (2nd half)

    for( j = 0; j < sz; ++j ) ary[bot+j] = tmp[j]; // copy back sorted tmp array
    //delete [] tmp;
}

// merge sort elements in array ary in range ...
template< typename T >
void my_msort( T ary[], size_t bot, size_t top, T tmp[] )
{
    if( bot == top ) return;
    size_t mid = ( bot + top ) / 2;

    my_msort( ary, bot, mid, tmp );      // sort the first ...
    my_msort( ary, mid + 1, top, tmp );  // and the second half ...
    merge( ary, bot, mid, top, tmp ); // now merge these 2 sorted chunks ...
}

// my_msort 'wrapper' speeds up merges (reduces all the new..deletes to just 1)
template< typename T >
void msort( T ary[], size_t bot, size_t top )
{
    T* tmp = new T[top-bot+1];
    my_msort( ary, bot, top, tmp );
    delete [] tmp;
}



template< typename T >
void print( T ary[], int size )
{
    for( int i = 0; i < size; ++i ) cout << setw(8) << ary[i];
    cout << endl;
}

template< typename T >
string isSortedUp( T ary[], int size )
{
    for( int i = size-1; i > 0; --i )
        if( ary[i-1] > ary[i] ) return "Array is NOT sorted UP!";
    // else ...
    return "Array is sorted UP!";
}

int myIntCompare( const void* x, const void* y )
{
    return *(int*)x - *(int*)y;
}

const int NUM = 3;

int main()
{
    cout << setprecision(3) << fixed;
    srand( unsigned(time(0)) );     // seed random number generator ...
    double st1 =0, st2 =0, st3 =0;  // to hold sumTime1 and sumTime2 ...
    int num = NUM;                  // number of loops ...to count down
    do
    {
        int size_d = 3000000;       // size of array of random int's
        int *d = new int[size_d];
        int *e = new int[size_d];
        int *f = new int[size_d];   // 3 copies of an unsorted array of int's
        for( int i = 0; i < size_d; ++i ) d[i] = e[i] = f[i] = rand();

        cout << "\nUnsorted: \t";
        //print( d, size_d );
        double t1 = clock();
        qSort( d, 0, size_d-1 );
        double t2 = clock() - t1;
        st1 += t2;
        cout << "\ttime to qSort was " << t2/CLOCKS_PER_SEC << " seconds "
             << isSortedUp( d, size_d ) << endl;
        //print( d, size_d );

        cout << "Unsorted: \t";
        //print( e, size_d );
        t1 = clock();
        qSort2( e, 0, size_d-1 );
        t2 = clock() - t1;
        st2 += t2;
        cout << "\ttime to qSort2 was " << t2/CLOCKS_PER_SEC << " seconds "
             << isSortedUp( e, size_d ) << endl;
        //print( e, size_d );

        cout << "Unsorted: \t";
        //print( f, size_d );
        t1 = clock();
        msort( f, 0, size_d-1 );
        t2 = clock() - t1;
        st3 += t2;
        cout << "\ttime to msort was " << t2/CLOCKS_PER_SEC << " seconds "
             << isSortedUp( f, size_d ) << endl;
        //print( f, size_d );

        delete [] f;
        delete [] e;
        delete [] d;
    }
    while( --num );
   
    cout << "\nqSort  swaps = " << c1
         << "\nqSort2 swaps = " << c2 << endl
         
         << "\nqSort  time  = " << st1/CLOCKS_PER_SEC/NUM
         << "\nqSort2 time  = " << st2/CLOCKS_PER_SEC/NUM
         << "\nmsort  time  = " << st3/CLOCKS_PER_SEC/NUM << endl;

    cout << "\nPress 'Enter' to continue ... " << flush;
    cin.get();
}

// compiled under Dev C++ 4.9.9.2 ... my output ...
/*

Unsorted:               time to qSort was 3.031 seconds Array is sorted UP!
Unsorted:               time to qSort2 was 0.906 seconds Array is sorted UP!
Unsorted:               time to msort was 1.531 seconds Array is sorted UP!

Unsorted:               time to qSort was 2.922 seconds Array is sorted UP!
Unsorted:               time to qSort2 was 0.875 seconds Array is sorted UP!
Unsorted:               time to msort was 1.781 seconds Array is sorted UP!

Unsorted:               time to qSort was 2.938 seconds Array is sorted UP!
Unsorted:               time to qSort2 was 0.875 seconds Array is sorted UP!
Unsorted:               time to msort was 1.562 seconds Array is sorted UP!

qSort  swaps = 512520766
qSort2 swaps = 58308594

qSort  time  = 2.964
qSort2 time  = 0.885
msort  time  = 1.625

Press 'Enter' to continue ...
*/


// compiled under VC++ 2008 ... my output ...

Unsorted:               time to qSort was 3.343 seconds Array is sorted UP!
Unsorted:               time to qSort2 was 1.000 seconds Array is sorted UP!
Unsorted:               time to msort was 2.610 seconds Array is sorted UP!

Unsorted:               time to qSort was 3.532 seconds Array is sorted UP!
Unsorted:               time to qSort2 was 0.907 seconds Array is sorted UP!
Unsorted:               time to msort was 2.063 seconds Array is sorted UP!

Unsorted:               time to qSort was 3.015 seconds Array is sorted UP!
Unsorted:               time to qSort2 was 0.891 seconds Array is sorted UP!
Unsorted:               time to msort was 2.078 seconds Array is sorted UP!

qSort  swaps = 510006044
qSort2 swaps = 58236424

qSort  time  = 3.297
qSort2 time  = 0.933
msort  time  = 2.250

Press 'Enter' to continue ...
*/


Now ... comparing the speeds of 'qSort2', C's qsort and 'msort' ...
(of arrays that are random generated or pre-sorted ascending or descending)

Code: [Select]
// qSort2_qsort_msort_ary_timed.cpp

#include <iostream>
#include <iomanip>
#include <fstream>
#include <cstdlib> // re. rand ...
#include <ctime> // re. time ...
using namespace std;


// my quick sort ... (edited from an old Comal version)
template< typename T >
void qSort2( T ary[], int left, int right )
{
    int lft = left, rht = right;
    T tmp, comp = ary[(lft+rht)/2];
    do
    {
        while( ary[lft] < comp ) ++lft;
        while( ary[rht] > comp ) --rht;
        if( lft <= rht ) // swap ...
        {
            tmp = ary[lft];
            ary[lft] = ary[rht];
            ary[rht] = tmp;
            ++lft; --rht;
        }
    }while( lft <= rht );
    if( left < rht   ) qSort2( ary, left, rht);
    if( lft  < right ) qSort2( ary, lft, right);
}

// merge two adjacent ranges in array ary ...
template< typename T >
void merge( T ary[], size_t bot, size_t mid, size_t top, T tmp[] )
{
    size_t sz = top - bot + 1;   // size of the range to be merged
    size_t j = 0;                // next open index in tmp
    size_t h1 = bot;             // next index to consider in the 1st half
    size_t h2 = mid + 1;         // next index to consider in the 2nd half

    // while indexes h1 and h2 are not past their ends ...
    while( h1 <= mid && h2 <= top )
    {   // move the smaller element into the tmp array ...
        if( ary[h1] < ary[h2] ) tmp[j++] = ary[h1++];
        else tmp[j++] = ary[h2++];
    }

    // Note: only one of the two while loops below is executed ...
    while( h1 <= mid ) tmp[j++] = ary[h1++]; // copy any remaining entries (1st half)
    while( h2 <= top ) tmp[j++] = ary[h2++]; // copy any remaining entries (2nd half)

    for( j = 0; j < sz; ++j ) ary[bot+j] = tmp[j]; // copy back sorted tmp array
}

// merge sort elements in array ary in range ...
template< typename T >
void my_msort( T ary[], size_t bot, size_t top, T tmp[] )
{
    if( bot == top ) return;
    size_t mid = ( bot + top ) / 2;

    my_msort( ary, bot, mid, tmp );      // sort the first ...
    my_msort( ary, mid + 1, top, tmp );  // and the second half ...
    merge( ary, bot, mid, top, tmp ); // now merge these 2 sorted chunks ...
}

// my_msort 'wrapper' speeds up merges (reduces all the new..deletes to just 1)
template< typename T >
void msort( T ary[], size_t bot, size_t top )
{
    T* tmp = new T[top-bot+1];
    my_msort( ary, bot, top, tmp );
    delete [] tmp;
}


template< typename T >
bool isSortedUp( T ary[], int size )
{
    for( int i = size-1; i > 0; --i )
        if( ary[i-1] > ary[i] ) return false;
    // else ...
    return true;
}
template< typename T >
bool isSortedDn( T ary[], int size )
{
    for( int i = size-1; i > 0; --i )
        if( ary[i-1] < ary[i] ) return false;
    // else ...
    return true;
}


// for C's qsort ...
int myIntCompare( const void* x, const void* y )
{
    return *(int*)x - *(int*)y;
}
// for C's qsort ... descending order ...
int myIntCompareRev( const void* x, const void* y )
{
    return *(int*)y - *(int*)x;
}

bool fillFromFile( int d[], int e[], int f[], int size_d )
{
    ifstream fin( "randInt3M.txt" );
    if( !fin ) return false;
    int i =0;
    while( i < size_d && fin >> d[i] ) { f[i] = e[i] = d[i]; ++i; }
    fin.close();
    if( i == size_d ) return true;
    else { cout << "Error filling int array's ..." << endl; return false; }
}
void writeFile( int ary[], int size )
{
    ofstream fout( "randInt3M.txt" );
    for( int i = 0, j = 0; i < size; ++i )
    {
        fout << ary[i] << " ";
        if( ++j == 16 ) { j = 0; fout << endl; }
    }
    fout << endl;
    fout.close();
}



const int BIGNUM = 3000000;
const int NUM = 3;

int main()
{
    cout << setprecision(3) << fixed; // show 3 decimal places ...
    srand( unsigned( time(0) ));    // seed random number generator ...
    double st[7] = {0};             // to hold sumTime[0]..sumTime[6]
    int num = NUM;                  // number of loops ...to count down
    do
    {
        int size_d = BIGNUM;        // size of array of random int's
        int *d = new int[size_d];
        int *e = new int[size_d];
        int *f = new int[size_d];   // get 3 copies of an unsorted array of int's
       
        if( !fillFromFile( d, e, f, size_d ) )
        {
            srand( (unsigned) time(0) );
            for( int i = 0; i < size_d; ++i ) f[i]=e[i]=d[i]=rand(); // fill arrays
            writeFile( d, size_d );
        }

        cout << "\nUnsorted: \t";
        double t1 = clock();
        qSort2( d, 0, size_d-1 );
        double t2 = clock() - t1;
        st[0] += t2;
        cout << " time to qSort2 was " << t2/CLOCKS_PER_SEC << " seconds ";
        cout << "isSortedUp( d, size_d ) = " << isSortedUp( d, size_d ) << endl;


        cout << "Unsorted: \t";
        t1 = clock();
        qsort( e, size_d, sizeof(int), myIntCompare );
        t2 = clock() - t1;
        st[1] += t2;
        cout << " time to qsort was " << t2/CLOCKS_PER_SEC << " seconds ";
        cout << "isSortedUp( e, size_d ) = " << isSortedUp( e, size_d ) << endl;

        cout << "Pre. sorted: \t"; // was pre-sorted ...
        t1 = clock();
        qsort( e, size_d, sizeof(int), myIntCompare );
        t2 = clock() - t1;
        st[2] += t2;
        cout << " time to qsort was " << t2/CLOCKS_PER_SEC << " seconds ";
        cout << "isSortedUp( e, size_d ) = " << isSortedUp( e, size_d ) << endl;

        cout << "Rev. sorted: \t"; // e was pre-sorted ... now will rev. sort
        t1 = clock();
        qsort( e, size_d, sizeof(int), myIntCompareRev );
        t2 = clock() - t1;
        st[3] += t2;
        cout << " time to qsort was " << t2/CLOCKS_PER_SEC << " seconds ";
        cout << "isSortedDn( e, size_d ) = " << isSortedDn( e, size_d ) << endl;


        cout << "Unsorted: \t";
        t1 = clock();
        msort( f, 0, size_d-1 );
        t2 = clock() - t1;
        st[4] += t2;
        cout << " time to msort was " << t2/CLOCKS_PER_SEC << " seconds ";
        cout << "isSortedUp( f, size_d ) = " << isSortedUp( f, size_d ) << endl;

        cout << "Pre. sorted: \t"; // was pre-sorted ...
        t1 = clock();
        msort( f, 0, size_d-1 );
        t2 = clock() - t1;
        st[5] += t2;
        cout << " time to msort was " << t2/CLOCKS_PER_SEC << " seconds ";
        cout << "isSortedUp( f, size_d ) = " << isSortedUp( f, size_d ) << endl;

        cout << "Rev. sorted: \t"; // e was pre-sorted descending, now will sort
        t1 = clock();
        msort( e, 0, size_d-1 );
        t2 = clock() - t1;
        st[6] += t2;
        cout << " time to msort was " << t2/CLOCKS_PER_SEC << " seconds ";
        cout << "isSortedUp( e, size_d ) = " << isSortedUp( e, size_d ) << endl;

        delete [] f;
        delete [] e;
        delete [] d;
    }
    while( --num );

    cout << "\nqSort2 time = " << st[0]/CLOCKS_PER_SEC/NUM << endl
         << "\nqsort  time = " << st[1]/CLOCKS_PER_SEC/NUM
         << "\nqsortPrtime = " << st[2]/CLOCKS_PER_SEC/NUM
         << "\nqsortRvtime = " << st[3]/CLOCKS_PER_SEC/NUM << endl
         << "\nmsort  time = " << st[4]/CLOCKS_PER_SEC/NUM
         << "\nmsortPrtime = " << st[5]/CLOCKS_PER_SEC/NUM
         << "\nmsortRvtime = " << st[6]/CLOCKS_PER_SEC/NUM << endl;

    cout << "\nPress 'Enter' to continue ... " << flush;
    cin.get();
}

// compiled under Dev C++ 4.9.9.2 ... my output ...
/*

Unsorted:        time to qSort2 was 0.844 seconds isSortedUp( d, size_d ) = 1
Unsorted:        time to qsort was 0.890 seconds isSortedUp( e, size_d ) = 1
Pre. sorted:     time to qsort was 0.344 seconds isSortedUp( e, size_d ) = 1
Rev. sorted:     time to qsort was 0.391 seconds isSortedDn( e, size_d ) = 1
Unsorted:        time to msort was 1.546 seconds isSortedUp( f, size_d ) = 1
Pre. sorted:     time to msort was 1.219 seconds isSortedUp( f, size_d ) = 1
Rev. sorted:     time to msort was 1.172 seconds isSortedUp( e, size_d ) = 1

Unsorted:        time to qSort2 was 0.969 seconds isSortedUp( d, size_d ) = 1
Unsorted:        time to qsort was 0.969 seconds isSortedUp( e, size_d ) = 1
Pre. sorted:     time to qsort was 0.343 seconds isSortedUp( e, size_d ) = 1
Rev. sorted:     time to qsort was 0.500 seconds isSortedDn( e, size_d ) = 1
Unsorted:        time to msort was 1.578 seconds isSortedUp( f, size_d ) = 1
Pre. sorted:     time to msort was 1.235 seconds isSortedUp( f, size_d ) = 1
Rev. sorted:     time to msort was 1.203 seconds isSortedUp( e, size_d ) = 1

Unsorted:        time to qSort2 was 0.843 seconds isSortedUp( d, size_d ) = 1
Unsorted:        time to qsort was 0.859 seconds isSortedUp( e, size_d ) = 1
Pre. sorted:     time to qsort was 0.359 seconds isSortedUp( e, size_d ) = 1
Rev. sorted:     time to qsort was 0.375 seconds isSortedDn( e, size_d ) = 1
Unsorted:        time to msort was 1.546 seconds isSortedUp( f, size_d ) = 1
Pre. sorted:     time to msort was 1.234 seconds isSortedUp( f, size_d ) = 1
Rev. sorted:     time to msort was 1.188 seconds isSortedUp( e, size_d ) = 1

qSort2 time = 0.885

qsort  time = 0.906
qsortPrtime = 0.349
qsortRvtime = 0.422

msort  time = 1.557
msortPrtime = 1.229
msortRvtime = 1.188

Press 'Enter' to continue ...
*/

// compiled under VC++ 2008 ... my output ...
/*

Unsorted:        time to qSort2 was 0.875 seconds isSortedUp( d, size_d ) = 1
Unsorted:        time to qsort was 4.422 seconds isSortedUp( e, size_d ) = 1
Pre. sorted:     time to qsort was 2.859 seconds isSortedUp( e, size_d ) = 1
Rev. sorted:     time to qsort was 2.922 seconds isSortedDn( e, size_d ) = 1
Unsorted:        time to msort was 2.078 seconds isSortedUp( f, size_d ) = 1
Pre. sorted:     time to msort was 1.703 seconds isSortedUp( f, size_d ) = 1
Rev. sorted:     time to msort was 1.641 seconds isSortedUp( e, size_d ) = 1

Unsorted:        time to qSort2 was 0.875 seconds isSortedUp( d, size_d ) = 1
Unsorted:        time to qsort was 4.406 seconds isSortedUp( e, size_d ) = 1
Pre. sorted:     time to qsort was 2.843 seconds isSortedUp( e, size_d ) = 1
Rev. sorted:     time to qsort was 2.921 seconds isSortedDn( e, size_d ) = 1
Unsorted:        time to msort was 2.063 seconds isSortedUp( f, size_d ) = 1
Pre. sorted:     time to msort was 1.703 seconds isSortedUp( f, size_d ) = 1
Rev. sorted:     time to msort was 1.625 seconds isSortedUp( e, size_d ) = 1

Unsorted:        time to qSort2 was 0.875 seconds isSortedUp( d, size_d ) = 1
Unsorted:        time to qsort was 4.453 seconds isSortedUp( e, size_d ) = 1
Pre. sorted:     time to qsort was 2.860 seconds isSortedUp( e, size_d ) = 1
Rev. sorted:     time to qsort was 2.922 seconds isSortedDn( e, size_d ) = 1
Unsorted:        time to msort was 2.063 seconds isSortedUp( f, size_d ) = 1
Pre. sorted:     time to msort was 1.688 seconds isSortedUp( f, size_d ) = 1
Rev. sorted:     time to msort was 1.641 seconds isSortedUp( e, size_d ) = 1

qSort2 time = 0.875

qsort  time = 4.427
qsortPrtime = 2.854
qsortRvtime = 2.922

msort  time = 2.068
msortPrtime = 1.698
msortRvtime = 1.636

Press 'Enter' to continue ...
*/
« Last Edit: October 08, 2010, 10:04:21 PM by David »

Offline David

  • Hero Member
  • *****
  • Posts: 647
    • View Profile
Re: sorts
« Reply #4 on: October 08, 2010, 10:07:06 PM »



Now ... comparing the speed to sort an array vs the speed to sort a vector  ...
(and the speed of code from different compilers)

Code: [Select]
// qSort2_msort_vec_ary_timed.cpp

#include <iostream>
#include <iomanip>
#include <vector>
#include <string>
#include <cstdlib>
#include <ctime>
using namespace std;


// my array quick sort ... (edited from an old Comal version)
template< typename T >
void qSort2( T ary[], int left, int right )
{
    int lft = left, rht = right;
    T tmp, comp = ary[(lft+rht)/2];
    do
    {
        while( ary[lft] < comp ) ++lft;
        while( ary[rht] > comp ) --rht;
        if( lft <= rht ) // swap ...
        {
            tmp = ary[lft];
            ary[lft] = ary[rht];
            ary[rht] = tmp;
            ++lft; --rht;
        }
    }while( lft <= rht );
    if( left < rht   ) qSort2( ary, left, rht);
    if( lft  < right ) qSort2( ary, lft, right);
}

// vector version ...
template< typename T >
void qSort2( vector< T >& ary, int left, int right )
{
    int lft = left, rht = right;
    T tmp, comp = ary[(lft+rht)/2];
    do
    {
        while( ary[lft] < comp ) ++lft;
        while( ary[rht] > comp ) --rht;
        if( lft <= rht ) // swap ...
        {
            tmp = ary[lft];
            ary[lft] = ary[rht];
            ary[rht] = tmp;
            ++lft; --rht;
        }
    }while( lft <= rht );
    if( left < rht   ) qSort2( ary, left, rht);
    if( lft  < right ) qSort2( ary, lft, right);
}



// the following array merge sort was derived from an Horstmann vector merge sort
// example that was modified and then updated to templates ... and with an added
// 'wrapper' template function to reduce all the original tmp memory allocates
// to just one  ...

// merge two adjacent ranges in array ary
template< typename T >
void merge( T ary[], size_t bot, size_t mid, size_t top, T tmp[] )
{
    size_t sz = top - bot + 1;   // size of the range to be merged
    //T* tmp = new T[ sz ];      // merge both halves into array tmp
    size_t j = 0;                // next open index in tmp
    size_t h1 = bot;             // next index to consider in the 1st half
    size_t h2 = mid + 1;         // next index to consider in the 2nd half

    // while indexes h1 and h2 are not past their ends ...
    while( h1 <= mid && h2 <= top )
    {   // move the smaller element into the tmp array ...
        if( ary[h1] < ary[h2] ) tmp[j++] = ary[h1++];
        else tmp[j++] = ary[h2++];
    }

    // Note: only one of the two while loops below is executed ...
    while( h1 <= mid ) tmp[j++] = ary[h1++]; // copy any remaining entries (1st half)
    while( h2 <= top ) tmp[j++] = ary[h2++]; // copy any remaining entries (2nd half)

    for( j = 0; j < sz; ++j ) ary[bot+j] = tmp[j]; // copy back sorted tmp array
    //delete [] tmp;
}

// merge sort elements in array ary in range ...
template< typename T >
void my_msort( T ary[], size_t bot, size_t top, T tmp[] )
{
    if( bot == top ) return;
    size_t mid = ( bot + top ) / 2;

    my_msort( ary, bot, mid, tmp );      // sort the first ...
    my_msort( ary, mid + 1, top, tmp );  // and the second half ...
    merge( ary, bot, mid, top, tmp );    // now merge these 2 sorted chunks ...
}

// my_msort 'wrapper' speeds up merges (reduces all the new..deletes to just 1)
template< typename T >
void msort( T ary[], size_t bot, size_t top )
{
    T* tmp = new T[top-bot+1];
    my_msort( ary, bot, top, tmp );
    delete [] tmp;
}

// vector versions ...

template< typename T >
void merge( vector< T >& ary, size_t bot, size_t mid, size_t top, vector< T >& tmp )
{
    size_t sz = top - bot + 1;   // size of the range to be merged
    size_t j = 0;                // next open index in tmp
    size_t h1 = bot;             // next index to consider in the 1st half
    size_t h2 = mid + 1;         // next index to consider in the 2nd half

    // while indexes h1 and h2 are not past their ends ...
    while( h1 <= mid && h2 <= top )
    {   // move the smaller element into the tmp vec...
        if( ary[h1] < ary[h2] ) tmp[j++] = ary[h1++];
        else tmp[j++] = ary[h2++];
    }

    // Note: only one of the two while loops below is executed ...
    while( h1 <= mid ) tmp[j++] = ary[h1++]; // copy any remaining entries (1st half)
    while( h2 <= top ) tmp[j++] = ary[h2++]; // copy any remaining entries (2nd half)

    for( j = 0; j < sz; ++j ) ary[bot+j] = tmp[j]; // copy back sorted tmp vec...
}
template< typename T >
void my_msort( vector< T >& ary, size_t bot, size_t top, vector< T >& tmp )
{
    if( bot == top ) return;
    size_t mid = ( bot + top ) / 2;

    my_msort( ary, bot, mid, tmp );      // sort the first ...
    my_msort( ary, mid + 1, top, tmp );  // and the second half ...
    merge( ary, bot, mid, top, tmp );    // now merge these 2 sorted chunks ...
}
template< typename T >
void msort( vector < T >& ary, size_t bot, size_t top )
{
    vector < T > tmp(top-bot+1);
    my_msort( ary, bot, top, tmp );
}


template< typename T >
void print( const T ary[], int size )
{
    for( int i = 0; i < size; ++i ) cout << setw(8) << ary[i];
    cout << endl;
}
template< typename T >
void print( const vector < T >& ary, int size )
{
    for( int i = 0; i < size; ++i ) cout << setw(8) << ary[i];
    cout << endl;
}

template< typename T >
bool isSortedUp( const T ary[], int size )
{
    for( int i = size-1; i > 0; --i )
        if( ary[i-1] > ary[i] ) return false;
    // else ...
    return true;
}
template< typename T >
bool isSortedUp( const vector< T >& ary, int size )
{
    for( int i = size-1; i > 0; --i )
        if( ary[i-1] > ary[i] ) return false;
    // else ...
    return true;
}


int main()
{
    srand( unsigned(time(0)) );           // seed random number generator ...
    double st =0, st1 =0, st2 =0, st3 =0; // to hold sumTime..sumTime3
    int num = 3;                          // number of loops ... to count down
    do
    {
        int size_d = 3000000;      // size of array of random int's
        int *d = new int[size_d];
        int *e = new int[size_d];  // get 2 copies of an unsorted array of int's
        for( int i = 0; i < size_d; ++i ) d[i] = e[i] = rand();

        cout << "\nUnsorted: \n";
        vector< int > vd( d, d+size_d ); // get a copy of array d into a vector vd
        //print( vd, size_d );
        double t1 = clock();
        msort( vd, 0, size_d-1 );
        double t2 = clock() - t1;
        st += t2;
        cout << "...time to msort was " << t2/CLOCKS_PER_SEC << " seconds ";
        cout << "isSortedUp( vd, size_d ) = " << isSortedUp( vd, size_d ) << endl;
        //print( vd, size_d );

        //print( d, size_d );
        t1 = clock();
        msort( d, 0, size_d-1 );
        t2 = clock() - t1;
        st1 += t2;
        cout << "...time to msort was " << t2/CLOCKS_PER_SEC << " seconds ";
        cout << "isSortedUp( d, size_d ) = " << isSortedUp( d, size_d ) << endl;
        //print( d, size_d );


        cout << "Unsorted: \n";
        vector< int > ve( e, e+size_d );
        //print( ve, size_d );
        t1 = clock();
        qSort2( ve, 0, size_d-1 );
        t2 = clock() - t1;
        st2 += t2;
        cout << "...time to qSort2 was " << t2/CLOCKS_PER_SEC << " seconds ";
        cout << "isSortedUp( ve, size_d ) = " << isSortedUp( ve, size_d ) << endl;
        //print( ve, size_d );

        //print( e, size_d );
        t1 = clock();
        qSort2( e, 0, size_d-1 );
        t2 = clock() - t1;
        st3 += t2;
        cout << "...time to qSort2 was " << t2/CLOCKS_PER_SEC << " seconds ";
        cout << "isSortedUp( e, size_d ) = " << isSortedUp( e, size_d ) << endl;
        //print( e, size_d );


        cout << "msort  vec time = " << st/CLOCKS_PER_SEC
        << "\nmsort  ary time = " << st1/CLOCKS_PER_SEC
        << "\nqSort2 vec time = " << st2/CLOCKS_PER_SEC
        << "\nqSort2 ary time = " << st3/CLOCKS_PER_SEC
        << endl;

        delete [] e;
        delete [] d;
    }
    while( --num );

    cout << "\nPress 'Enter' to continue ... " << flush;
    cin.get();
}

// compiled under Dev C++ 4.9.9.2 ... my output ...
/*

Unsorted:
...time to msort was 13.188 seconds isSortedUp( vd, size_d ) = 1
...time to msort was 1.641 seconds isSortedUp( d, size_d ) = 1
Unsorted:
...time to qSort2 was 5.453 seconds isSortedUp( ve, size_d ) = 1
...time to qSort2 was 0.86 seconds isSortedUp( e, size_d ) = 1
msort  vec time = 13.188
msort  ary time = 1.641
qSort2 vec time = 5.453
qSort2 ary time = 0.86

Unsorted:
...time to msort was 12.672 seconds isSortedUp( vd, size_d ) = 1
...time to msort was 1.562 seconds isSortedUp( d, size_d ) = 1
Unsorted:
...time to qSort2 was 5.453 seconds isSortedUp( ve, size_d ) = 1
...time to qSort2 was 0.86 seconds isSortedUp( e, size_d ) = 1
msort  vec time = 25.86
msort  ary time = 3.203
qSort2 vec time = 10.906
qSort2 ary time = 1.72

Unsorted:
...time to msort was 12.641 seconds isSortedUp( vd, size_d ) = 1
...time to msort was 1.563 seconds isSortedUp( d, size_d ) = 1
Unsorted:
...time to qSort2 was 5.391 seconds isSortedUp( ve, size_d ) = 1
...time to qSort2 was 0.86 seconds isSortedUp( e, size_d ) = 1
msort  vec time = 38.501
msort  ary time = 4.766
qSort2 vec time = 16.297
qSort2 ary time = 2.58

Press 'Enter' to continue ...
*/


// compiled under VC++ 2008 ... my output ...
/*

Unsorted:
...time to msort was 78.921 seconds isSortedUp( vd, size_d ) = 1
...time to msort was 2.047 seconds isSortedUp( d, size_d ) = 1
Unsorted:
...time to qSort2 was 31 seconds isSortedUp( ve, size_d ) = 1
...time to qSort2 was 0.875 seconds isSortedUp( e, size_d ) = 1
msort  vec time = 78.921
msort  ary time = 2.047
qSort2 vec time = 31
qSort2 ary time = 0.875

Unsorted:
...time to msort was 75.844 seconds isSortedUp( vd, size_d ) = 1
...time to msort was 2.062 seconds isSortedUp( d, size_d ) = 1
Unsorted:
...time to qSort2 was 30.953 seconds isSortedUp( ve, size_d ) = 1
...time to qSort2 was 0.875 seconds isSortedUp( e, size_d ) = 1
msort  vec time = 154.765
msort  ary time = 4.109
qSort2 vec time = 61.953
qSort2 ary time = 1.75

Unsorted:
...time to msort was 75.75 seconds isSortedUp( vd, size_d ) = 1
...time to msort was 2.047 seconds isSortedUp( d, size_d ) = 1
Unsorted:
...time to qSort2 was 32.797 seconds isSortedUp( ve, size_d ) = 1
...time to qSort2 was 0.891 seconds isSortedUp( e, size_d ) = 1
msort  vec time = 230.515
msort  ary time = 6.156
qSort2 vec time = 94.75
qSort2 ary time = 2.641

Press 'Enter' to continue ...
*/


You may like to note the much slower code (especially for vectors) generated above by the MS VC++ 2008 compiler compared to the much older Dev C++ compiler ... :)

Offline David

  • Hero Member
  • *****
  • Posts: 647
    • View Profile
More sorts ... this time all have a 'compare function/functor' option ...

For vector and array containers

* merge sort

* quick sort

* quick sort that uses an insert sort for data chunks less than 10 elements so that record order is preserved

* insert sort that is used for the above quick sort, but can also be used alone

* shell sort

A medium-large (about a 75 k word) data file is also provided see the next link below:

https://www.4shared.com/office/3jCtehLvee/Psalms119times32.html

to use with the little test program that follows:

Code: [Select]
// test_msort_vs_other_sorts.cpp //  // 2018-10-09 //

/*
Merge-Sort times compared to quick sort times
*/

// sorts and counts word duplicates ...
// comparing TWO different methods
// 1. using a map to hold in insert sorted order, the (key.value) pairs -> (word, count)
// 2. using a vector to hold all the words, then after sorting it, count any duplicates. //



#define myDebug 1 // set to 0 to turn off, to 1 to turn on //

#include "msort.h" // NOTE!!!  ALSO has: typedef unsigned int uns; //
#include "qSort2.H"  // a NON 'record order preserving' qsort //
#include "qsort_isort_vector.h" // an 'record order preserving' qsort //

#include <iostream>
#include <fstream> // re. ifstream
#include <iomanip> // re. std::setw(..)
#include <sstream> // re. std::istringstream
#include <string>
#include <cctype> // re. isalpha, toupper

#include <vector>
#include <map>
#include <ctime> // re. running timimg via clock()


using std::cin; using std::cout; using std::endl; using std::flush;
using std::ifstream;
using std::setw;
using std::istringstream;
using std::string;
using std::vector; using std::map;


const char* IN_FNAME = "Psalms119times32.txt";


template< typename T >
void print( const T ary[], size_t );

template< typename T >
void print( const vector < T >& );


template< typename T >
bool isSorted( const T* ary, size_t, bool a = true );

template< typename T >
bool isSorted( const vector< T >&, bool a = true );


typedef std::vector<  string > Words; // this is used in next function qnd following ... //

bool fillFromFile( const char* fname, Words& wds, std::map< string, int >& myPairs );

void toAllCaps( string& str );
void pause( const string& msg = "Press 'Enter' to continue ... " );


struct MyCmp // re. demo sorting strings... using functors ... //
{
    int operator () ( const string& a, const string& b )
    {
        if( a < b ) return -1;
        if( a > b ) return  1;
        return 0;
    }
} ;
struct MyCmpRev // re. demo sorting strings... using functors ... //
{
    int operator () ( const string& a, const string& b )
    {
        if( b < a ) return -1;
        if( b > a ) return  1;
        return 0;
    }
} ;



int main() /////////// MAIN BEGINS /////////////////////////
{
    vector< string > myWords;
    map < string, int > myPairs; // to hold sorted (word, count) pairs //
   
    if( fillFromFile( IN_FNAME, myWords, myPairs ) )
    {

        // NOW ... GET/HOLD a copy of the 'unsorted'
        // vector in a 'new array' ...
        string* ary = new string[myWords.size()];
        for( unsigned i = 0; i < myWords.size(); ++ i )
             ary[i] = myWords[i];
           
             
        double t1 = clock();
        msort( myWords, 0, myWords.size()-1 );
        double t2 = clock();
       
       
        cout << "There were " << myWords.size()
             << " words read from file: '" <<  IN_FNAME << "'\n"
             << myPairs.size()
             << " of which were unique ... \n\n"
             
             <<"Re. the vector 'myWords' and the array 'ary' ...\n "
             
             << "\nTime to msort vector was "
             << (t2-t1)/CLOCKS_PER_SEC << " seconds.\n";
        isSorted( myWords ); // test if sorted ... report only if NOT //


        // re-inital vector ... //
        for( unsigned i = 0; i < myWords.size(); ++ i )
             myWords[i] = ary[i];
             
             
        t1 = clock();
        MyCmp ascending;
        msort( myWords, 0, myWords.size()-1, ascending );
        t2 = clock();

        cout << "Time to msort vector (passing in functor) was "
             << (t2-t1)/CLOCKS_PER_SEC << " seconds.\n";
        isSorted( myWords ); // test if sorted //


        t1 = clock();
        MyCmpRev descending;
        msort( myWords, 0, myWords.size()-1, descending );
        t2 = clock();

        cout << "Time to msort vector (down, passing in functor) was "
             << (t2-t1)/CLOCKS_PER_SEC << " seconds.\n";
        isSorted( myWords, false ); // test if rev. sorted  //
       

             
             
        // re-inital vector ... //
        for( unsigned i = 0; i < myWords.size(); ++ i )
             myWords[i] = ary[i];
             
             
        t1 = clock();
        msort( ary, 0, myWords.size()-1 );
        t2 = clock();
        cout << "\nTime to msort array was "
             << (t2-t1)/CLOCKS_PER_SEC << " seconds.\n";
        isSorted( ary, myWords.size() ); // test if sorted //

        // re-inital ary... //
        for( unsigned i = 0; i < myWords.size(); ++ i )
             ary[i] = myWords[i];
             
        t1 = clock();
        msort( ary, 0, myWords.size()-1, ascending );
        t2 = clock();
        cout << "Time to msort array (passing in functor) was "
             << (t2-t1)/CLOCKS_PER_SEC << " seconds.\n";
        isSorted( ary, myWords.size() ); // test if sorted //
       

        t1 = clock();
        msort( ary, 0, myWords.size()-1, descending );
        t2 = clock();
        cout << "Time to msort array (down, passing in functor) was "
             << (t2-t1)/CLOCKS_PER_SEC << " seconds.\n";
        isSorted( ary, myWords.size(), false ); // test if rev. sorted //



             
        t1 = clock();
        qSort2( myWords, 0, myWords.size()-1 );
        t2 = clock();
        cout << "\nTime to qSort2 vector was "
             << (t2-t1)/CLOCKS_PER_SEC << " seconds.\n";
        isSorted( myWords ); // test if sorted //
             
             
        // re-inital vector ... //
        for( unsigned i = 0; i < myWords.size(); ++ i )
             myWords[i] = ary[i];
             
        t1 = clock();
        qSort2( myWords, 0, myWords.size()-1, ascending );
        t2 = clock();
        cout << "Time to qSort2 vector (passing in functor) was "
             << (t2-t1)/CLOCKS_PER_SEC << " seconds.\n";
        isSorted( myWords ); // test if sorted //
       
       
        t1 = clock();
        qSort2( myWords, 0, myWords.size()-1, descending );
        t2 = clock();
        cout << "Time to qSort2 vector (down, passing in functor) was "
             << (t2-t1)/CLOCKS_PER_SEC << " seconds.\n";
        isSorted( myWords, false ); // test if sorted //
       

             
             
        // re-inital vector ... //
        for( unsigned i = 0; i < myWords.size(); ++ i )
             myWords[i] = ary[i];
             
        t1 = clock();
        qSort2( ary, 0, myWords.size()-1 );
        t2 = clock();
        cout << "\nTime to qSort2 array was "
             << (t2-t1)/CLOCKS_PER_SEC << " seconds.\n";
        isSorted( ary, myWords.size() ); // test if sorted //


        // re-inital vector ... //
        for( unsigned i = 0; i < myWords.size(); ++ i )
             myWords[i] = ary[i];
             
        t1 = clock();
        qSort2( ary, 0, myWords.size()-1, ascending );
        t2 = clock();
        cout << "Time to qSort2 array (passing in functor) was "
             << (t2-t1)/CLOCKS_PER_SEC << " seconds.\n";
        isSorted( ary, myWords.size() ); // test if sorted //

        t1 = clock();
        qSort2( ary, 0, myWords.size()-1, descending );
        t2 = clock();
        cout << "Time to qSort2 array (down, passing in functor) was "
             << (t2-t1)/CLOCKS_PER_SEC << " seconds.\n";
        isSorted( ary, myWords.size(), false ); // test if rev. sorted //



        // re-inital ary... //
        for( unsigned i = 0; i < myWords.size(); ++ i )
             ary[i] = myWords[i];

        t1 = clock();
        qSort( myWords, 0, myWords.size()-1 );
        t2 = clock();
        cout << "\nTime to qSort vector was "
             << (t2-t1)/CLOCKS_PER_SEC << " seconds.\n";
        isSorted( myWords ); // test if sorted //

             
        // re-inital vector ... //
        for( unsigned i = 0; i < myWords.size(); ++ i )
             myWords[i] = ary[i];

        t1 = clock();
        qSort( myWords, 0, myWords.size()-1, ascending );
        t2 = clock();
        cout << "Time to qSort vector (passing in functor) was "
             << (t2-t1)/CLOCKS_PER_SEC << " seconds.\n";
        isSorted( myWords ); // test if sorted //
       
        t1 = clock();
        qSort( myWords, 0, myWords.size()-1, descending );
        t2 = clock();
        cout << "Time to qSort vector (down, passing in functor) was "
             << (t2-t1)/CLOCKS_PER_SEC << " seconds.\n";
        isSorted( myWords, false ); // test if rev. sorted //


             
        // re-inital vector ... //
        for( unsigned i = 0; i < myWords.size(); ++ i )
             myWords[i] = ary[i];

        t1 = clock();
        shellsort( myWords );
        t2 = clock();
        cout << "\nTime to shellsort vector was "
             << (t2-t1)/CLOCKS_PER_SEC << " seconds.\n";
        isSorted( myWords ); // test if sorted //


         // re-inital vector ... //
        for( unsigned i = 0; i < myWords.size(); ++ i )
             myWords[i] = ary[i];
             
        t1 = clock();
        shellsort( myWords, ascending );
        t2 = clock();
        cout << "Time to shellsort vector (passing in functor) was "
             << (t2-t1)/CLOCKS_PER_SEC << " seconds.\n";
        isSorted( myWords ); // test if sorted //
       
        t1 = clock();
        shellsort( myWords, descending );
        t2 = clock();
        cout << "Time to shellsort vector (down, passing in functor) was "
             << (t2-t1)/CLOCKS_PER_SEC << " seconds.\n";
        isSorted( myWords, false ); // test if sorted //


qSort( myWords, 0, myWords.size()-1, ascending ); // resort up to get ready for next part //
             
        pause("Press 'Enter' to continue and see the (word, count) pairs ... " );
       
        // Ok ... show the words and count, 20 at a time ... //
       
             
        unsigned i = 0, count = 0, stop = 0;
        map< string, int >::const_iterator it; // to traverse the map ... //
       
        for( it = myPairs.begin(); it != myPairs.end(); ++ it )
        {
            ++stop; // pause every 20 words //
           
while( (i < myWords.size())  &&  (myWords[i] == it->first) )
{ ++count; ++i; }
            cout << "The word: '" << it->first << "' occured "
                 << it->second << " times =? " << count << " times.\n";
            if( stop % 20 == 0 )
                pause();

            count = 0; // get ready to count the next duplicate words in the vector //
        }


        cout << "All done?  Now i = " << i
             << " and words in myWords = " << myWords.size() << endl;
             
        delete [] ary;
    }

    pause( "\nPress 'Enter' to continue/exit ..." );

} //////////////////// MAIN ENDS ///////////////////////////




template< typename T >
void print( const T ary[], size_t size )
{
    for( size_t i = 0; i < size; ++i ) cout << setw(8) << ary[i];
    if( size ) std::cout << std::endl;
}
template< typename T >
void print( const vector < T >& v )
{
    for( size_t i = 0; i < v.size(); ++i ) cout << setw(8) << v[i];
    if( v.size() ) cout << endl;
}

template< typename T >
bool isSorted( const T* ary, size_t size, bool a )
{
    if( a ) // ascending
    {
        for( int i = size-1; i > 0; --i )
        {
            if( ary[i-1] > ary[i] )
            {
#if myDebug
                cout << ary[i-1] << ", " << ary[i] << endl;
                pause();
#endif
                return false;
            }
        }
    }
    else // descending //
    {
        for( int i = size-1; i > 0; --i )
        {
            if( ary[i-1] < ary[i] )
            {
#if myDebug
                cout << ary[i-1] << ", " << ary[i] << endl;
                pause();
#endif
                return false;
            }
        }
    }
    // else ...
    return true;
}
template< typename T >
bool isSorted( const vector< T >& ary, bool a )
{
    if( a )
    {
        for( int i = ary.size()-1 ; i > 0 ; --i )
        {
            if( ary[i-1] > ary[i] )
            {
#if myDebug
                cout << ary[i-1] << ", " << ary[i] << endl;
                pause();
#endif
                return false;
            }
        }
    }
    else
    {
        for( int i = ary.size()-1 ; i > 0 ; --i )
        {
            if( ary[i-1] < ary[i] )
            {
#if myDebug
                cout << ary[i-1] << ", " << ary[i] << endl;
                pause();
#endif
                return false;
            }
        }
    }
    // else ...
    return true;
}

bool fillFromFile( const char* fname, Words& wds, std::map<  string, int >& myPairs )
{
    ifstream fin( fname );
    if( fin.is_open() )
    {
        string line, word;
        while( getline( fin, line ) )
        {
            istringstream iss(line);
            while( iss >> word )
            {
               
                if( word[0] == '[' ) continue;

toAllCaps( word );

unsigned i = 0, len = word.size();
                while( i < len  &&  ('A' <= word[i] && word[i] <= 'Z') )
                       ++i ;

                if( i < len )
                    word.erase(i);
                   
                if( !word.size() ) continue;
                   
               
                if( myPairs.count( word ) ) ++ myPairs[word] ;
                else myPairs[word]  = 1;

                wds.push_back( word );
            }
        }
        fin.close();
        return true;
    }
    // else
    std::cout << "\nThere was a problem opening file "
              << fname << "\n\n";
    return false;
}

void toAllCaps( string& str )
{
    unsigned i = 0, len = str.size();
    while( i < len ) { str[i] = toupper(str[i]); ++ i; }
}

void pause( const string& msg )
{
    cout << msg << flush;
    string dummyLine;
    getline( cin, dummyLine );
}


// typical output to first part ... //
/*
There were 75040 words read from file: 'Psalms119times32.txt'
495 of which were unique ...

Re. the vector 'myWords' and the array 'ary' ...

Time to msort vector was 0.107 seconds.
Time to msort vector (passing in functor) was 0.112 seconds.
Time to msort vector (down, passing in functor) was 0.087 seconds.

Time to msort array was 0.114 seconds.
Time to msort array (passing in functor) was 0.102 seconds.
Time to msort array (down, passing in functor) was 0.087 seconds.

Time to qSort2 vector was 0.067 seconds.
Time to qSort2 vector (passing in functor) was 0.055 seconds.
Time to qSort2 vector (down, passing in functor) was 0.061 seconds.

Time to qSort2 array was 0.055 seconds.
Time to qSort2 array (passing in functor) was 0.052 seconds.
Time to qSort2 array (down, passing in functor) was 0.058 seconds.

Time to qSort vector was 0.033 seconds.
Time to qSort vector (passing in functor) was 0.032 seconds.
Time to qSort vector (down, passing in functor) was 0.038 seconds.

Time to shellsort vector was 0.059 seconds.
Time to shellsort vector (passing in functor) was 0.059 seconds.
Time to shellsort vector (down, passing in functor) was 0.069 seconds.
Press 'Enter' to continue and see the (word, count) pairs ...
The word: 'A' occured 384 times =? 384 times.
The word: 'ABHOR' occured 32 times =? 32 times.
The word: 'ABIDETH' occured 32 times =? 32 times.
The word: 'ABOVE' occured 64 times =? 64 times.
The word: 'ACCEPT' occured 32 times =? 32 times.
The word: 'ACCORDING' occured 576 times =? 576 times.
The word: 'AFFLICTED' occured 128 times =? 128 times.
The word: 'AFFLICTION' occured 96 times =? 96 times.
The word: 'AFRAID' occured 32 times =? 32 times.
The word: 'AFTER' occured 128 times =? 128 times.
The word: 'AGAINST' occured 96 times =? 96 times.
The word: 'AIN' occured 32 times =? 32 times.
The word: 'ALEPH' occured 32 times =? 32 times.
The word: 'ALL' occured 576 times =? 576 times.
The word: 'ALMOST' occured 32 times =? 32 times.
The word: 'ALSO' occured 192 times =? 192 times.
The word: 'ALWAY' occured 32 times =? 32 times.
The word: 'AM' occured 96 times =? 96 times.
The word: 'AN' occured 96 times =? 96 times.
The word: 'ANCIENTS' occured 32 times =? 32 times.
Press 'Enter' to continue ...
The word: 'AND' occured 2016 times =? 2016 times.
The word: 'ANGUISH' occured 32 times =? 32 times.
The word: 'ANSWER' occured 32 times =? 32 times.
The word: 'ANY' occured 32 times =? 32 times.
The word: 'ARE' occured 96 times =? 96 times.
The word: 'ART' occured 32 times =? 32 times.
The word: 'AS' occured 224 times =? 224 times.
The word: 'ASHAMED' occured 160 times =? 160 times.
The word: 'ASTRAY' occured 64 times =? 64 times.
The word: 'AT' occured 128 times =? 128 times.
The word: 'AWAY' occured 96 times =? 96 times.
The word: 'AWE' occured 32 times =? 32 times.
The word: 'BANDS' occured 32 times =? 32 times.
The word: 'BE' occured 416 times =? 416 times.
The word: 'BECAUSE' occured 288 times =? 288 times.
The word: 'BECOME' occured 32 times =? 32 times.
The word: 'BEEN' occured 96 times =? 96 times.
The word: 'BEFORE' occured 160 times =? 160 times.
The word: 'BEGINNING' occured 32 times =? 32 times.
The word: 'BEHELD' occured 32 times =? 32 times.
Press 'Enter' to continue ...
*/


Three .h files follow that hold the template sorts ...
« Last Edit: November 09, 2018, 09:40:10 PM by David »

Offline David

  • Hero Member
  • *****
  • Posts: 647
    • View Profile
Firstly, the merge sort:

Code: [Select]
// msort.h //  // 2011-07-11 //  // this version 2015-04-24 //


// Note: msort DOES preserve  previous record order //


#ifndef MSORT_H
#define MSORT_H

#include <vector>

// Note: this next typedef was used to show how a typing short-cut ...
// could ... introduce a conflict/problem ... in other parts
// of a multi-file program that may use this file //
typedef unsigned int uns;


// an example compare functor ... here it is for C++ string //
/*
struct MyCmp
{
    int operator () ( const string& a, const string& b )
    {
        if( a < b ) return -1;
        if( a > b ) return  1;
        return 0;
    }
} ;

// ....
// Then use/call like this ...

    MyCmp ascending;
    msort( myWords, 0, myWords.size()-1, ascending
*/


// The following  merge sort was improved/speeded up ...
// from an old 'Horstmann text' vector merge sort example ...
// After modified to handle arrays ...'twas changed to templates,
// with an added 'wrapper' template function, to reduce all
// the original many tmp memory allocates ... to just one.
// Note use of a 'start pointer' in the next to last line
// in the merge function itself ...
// to save one 'pointer arithmetic adding step' on each pass //


// merge two adjacent ranges in array ary //
template< typename T >
void merge( T ary[], uns bot, uns mid, uns top, T tmp[] )
{
    uns sz = top - bot + 1;   // size of the range to be merged
    uns j = 0;                // next open index in tmp
    uns h1 = bot;             // next index to consider in the 1st half
    uns h2 = mid + 1;         // next index to consider in the 2nd half //

    // while indexes h1 and h2 are not past their ends ...
    while( h1 <= mid && h2 <= top )
    {   // move the smaller element into the tmp array ...
        if( ary[h1] <= ary[h2] ) tmp[j++] = ary[h1++];
        else tmp[j++] = ary[h2++];
    }

    // Note: only one of the two while loops below is executed ...
    while( h1 <= mid ) tmp[j++] = ary[h1++]; // copy any remaining entries (1st half)
    while( h2 <= top ) tmp[j++] = ary[h2++]; // copy any remaining entries (2nd half) //
   
    T* pBot = &ary[bot];

    for( j = 0; j < sz; ++j ) pBot[j] = tmp[j]; // copy back sorted tmp array //
}

// merge sort elements in array ary in range ... //
template< typename T >
void my_msort( T ary[], uns bot, uns top, T tmp[] )
{
    if( bot == top ) return;
    uns mid = ( bot + top ) / 2;

    my_msort( ary, bot, mid, tmp );      // sort the first ...
    my_msort( ary, mid + 1, top, tmp );  // and the second half ...
    merge( ary, bot, mid, top, tmp );    // now merge these 2 sorted chunks ... //
}

// my_msort 'wrapper' speeds up merges (reduces all the new..deletes to just 1) //
template< typename T >
void msort( T ary[], uns bot, uns top )
{
    T* tmp = new T[top-bot+1];
    my_msort( ary, bot, top, tmp );
    delete [] tmp;
}




// merge two adjacent ranges in array ary //
template< typename T, typename Compare >
void merge( T ary[], uns bot, uns mid, uns top, T tmp[], Compare myCmp )
{
    uns sz = top - bot + 1;   // size of the range to be merged
    uns j = 0;                // next open index in tmp
    uns h1 = bot;             // next index to consider in the 1st half
    uns h2 = mid + 1;         // next index to consider in the 2nd half //

    // while indexes h1 and h2 are not past their ends ...
    while( h1 <= mid && h2 <= top )
    {   // move the smaller element into the tmp array ...;
        if( myCmp( ary[h1], ary[h2] ) <= 0 ) tmp[j++] = ary[h1++];
        else tmp[j++] = ary[h2++];
    }

    // Note: only one of the two while loops below is executed ...
    while( h1 <= mid ) tmp[j++] = ary[h1++]; // copy any remaining entries (1st half)
    while( h2 <= top ) tmp[j++] = ary[h2++]; // copy any remaining entries (2nd half) //

    T* pBot = &ary[bot];

    for( j = 0; j < sz; ++j ) pBot[j] = tmp[j]; // copy back sorted tmp array //
}

// merge sort elements in array ary in range ... //
template< typename T, typename Compare >
void my_msort( T ary[], uns bot, uns top, T tmp[], Compare myCmp )
{
    if( bot == top ) return;
    uns mid = ( bot + top ) / 2;

    my_msort( ary, bot, mid, tmp, myCmp );      // sort the first ...
    my_msort( ary, mid + 1, top, tmp, myCmp );  // and the second half ...
    merge( ary, bot, mid, top, tmp, myCmp );    // now merge these 2 sorted chunks ... //
}

// my_msort 'wrapper' speeds up merges (reduces all the new..deletes to just 1) //
template< typename T, typename Compare >
void msort( T ary[], uns bot, uns top, Compare myCmp )
{
    T* tmp = new T[top-bot+1];
    my_msort( ary, bot, top, tmp, myCmp );
    delete [] tmp;
}




// vector versions ...

template< typename T >
void merge( std::vector< T >& ary, uns bot, uns mid, uns top, std::vector< T >& tmp )
{
    uns sz = top - bot + 1;   // size of the range to be merged
    uns j = 0;                // next open index in tmp
    uns h1 = bot;             // next index to consider in the 1st half
    uns h2 = mid + 1;         // next index to consider in the 2nd half

    // while indexes h1 and h2 are not past their ends ...
    while( h1 <= mid && h2 <= top )
    {   // move the smaller element into the tmp vec...
        if( ary[h1] <= ary[h2] ) tmp[j++] = ary[h1++];
        else tmp[j++] = ary[h2++];
    }

    // Note: only one of the two while loops below is executed ...
    while( h1 <= mid ) tmp[j++] = ary[h1++]; // copy any remaining entries (1st half)
    while( h2 <= top ) tmp[j++] = ary[h2++]; // copy any remaining entries (2nd half)
   
    T* pBot = &ary[bot];

    for( j = 0; j < sz; ++j ) pBot[j] = tmp[j]; // copy back sorted tmp vec...
}

template< typename T >
void my_msort( std::vector< T >& ary, uns bot, uns top, std::vector< T >& tmp )
{
    if( bot == top ) return;
    uns mid = ( bot + top ) / 2;

    my_msort( ary, bot, mid, tmp );      // sort the first ...
    my_msort( ary, mid + 1, top, tmp );  // and the second half ...
    merge( ary, bot, mid, top, tmp );    // now merge these 2 sorted chunks ...
}

template< typename T >
void msort( std::vector < T >& ary, uns bot, uns top )
{
    std::vector < T > tmp(top-bot+1);
    my_msort( ary, bot, top, tmp );
}




template< typename T, typename Compare >
void merge( std::vector< T >& ary, uns bot, uns mid, uns top, std::vector< T >& tmp, Compare myCmp )
{
    uns sz = top - bot + 1;   // size of the range to be merged
    uns j = 0;                // next open index in tmp
    uns h1 = bot;             // next index to consider in the 1st half
    uns h2 = mid + 1;         // next index to consider in the 2nd half

    // while indexes h1 and h2 are not past their ends ...
    while( h1 <= mid && h2 <= top )
    {   // move the smaller element into the tmp vec...
        if( myCmp(ary[h1], ary[h2]) <= 0 ) tmp[j++] = ary[h1++];
        else tmp[j++] = ary[h2++];
    }

    // Note: only one of the two while loops below is executed ...
    while( h1 <= mid ) tmp[j++] = ary[h1++]; // copy any remaining entries (1st half)
    while( h2 <= top ) tmp[j++] = ary[h2++]; // copy any remaining entries (2nd half)

    T* pBot = &ary[bot];

    for( j = 0; j < sz; ++j ) pBot[j] = tmp[j]; // copy back sorted tmp vec...
}

template< typename T, typename Compare >
void my_msort( std::vector< T >& ary, uns bot, uns top, std::vector< T >& tmp, Compare myCmp )
{
    if( bot == top ) return;
    uns mid = ( bot + top ) / 2;

    my_msort( ary, bot, mid, tmp, myCmp );      // sort the first ...
    my_msort( ary, mid + 1, top, tmp, myCmp );  // and the second half ...
    merge( ary, bot, mid, top, tmp, myCmp );    // now merge these 2 sorted chunks ...
}

template< typename T, typename Compare >
void msort( std::vector < T >& ary, uns bot, uns top, Compare myCmp )
{
    std::vector < T > tmp(top-bot+1);
    my_msort( ary, bot, top, tmp, myCmp );
}


#endif


Offline David

  • Hero Member
  • *****
  • Posts: 647
    • View Profile
Then the (plain) quick sort:

Code: [Select]
// qSort2.h //  // this version 2015-04-24 //


// Note: this qSort2 does NOT preserve  previous record order //


#ifndef QSORT2_H
#define QSORT2_H

#include <vector>

// an example compare functor ... here it is for C++ string //
/*
struct MyCmp
{
    int operator () ( const string& a, const string& b )
    {
        if( a < b ) return -1;
        if( a > b ) return  1;
        return 0;
    }
} ;

// ....
// Then use/call like this ...

    MyCmp ascending;
    msort( myWords, 0, myWords.size()-1, ascending );
*/


// my array quick sort ... (edited from an old Comal version)
// updated to template... //
template< typename T >
void qSort2( T ary[], int left, int right )
{
    int lft = left, rht = right;
    T tmp, comp = ary[(lft+rht)/2];
    do
    {
        while( ary[lft] < comp ) ++lft;
        while( ary[rht] > comp ) --rht;
        if( lft <= rht ) // swap ...
        {
            tmp = ary[lft];
            ary[lft] = ary[rht];
            ary[rht] = tmp;
            ++lft; --rht;
        }
    } while( lft <= rht );
   
    if( left < rht   ) qSort2( ary, left, rht  );
    if( lft  < right ) qSort2( ary, lft, right );
}

// vector version ...
template< typename T >
void qSort2( std::vector< T >& ary, int left, int right )
{
    int lft = left, rht = right;
    T tmp, comp = ary[(lft+rht)/2];
    do
    {
        while( ary[lft] < comp ) ++lft;
        while( ary[rht] > comp ) --rht;
        if( lft <= rht ) // swap ...
        {
            tmp = ary[lft];
            ary[lft] = ary[rht];
            ary[rht] = tmp;
            ++lft; --rht;
        }
    } while( lft <= rht );
   
    if( left < rht   ) qSort2( ary, left, rht);
    if( lft  < right ) qSort2( ary, lft, right);
}



// Compare versions ... //

template< typename T, typename Compare >
void qSort2( T ary[], int left, int right, Compare myCmp )
{
    int lft = left, rht = right;
    T tmp, comp = ary[(lft+rht)/2];
    do
    {
        while( myCmp( ary[lft], comp ) < 0 ) ++lft;
        while( myCmp( ary[rht], comp ) > 0 ) --rht;
        if( lft <= rht ) // swap ...
        {
            tmp = ary[lft];
            ary[lft] = ary[rht];
            ary[rht] = tmp;
            ++lft; --rht;
        }
    } while( lft <= rht );
   
    if( left < rht   ) qSort2( ary, left, rht, myCmp  ) ;
    if( lft  < right ) qSort2( ary, lft, right, myCmp ) ;
}

// vector version ...
template< typename T, typename Compare >
void qSort2( std::vector< T >& ary, int left, int right, Compare myCmp )
{
    int lft = left, rht = right;
    T tmp, comp = ary[(lft+rht)/2];
    do
    {
        while( myCmp( ary[lft], comp ) < 0 ) ++lft;
        while( myCmp( ary[rht], comp ) > 0 ) --rht;
        if( lft <= rht ) // swap ...
        {
            tmp = ary[lft];
            ary[lft] = ary[rht];
            ary[rht] = tmp;
            ++lft; --rht;
        }
    } while( lft <= rht );
   
    if( left < rht   ) qSort2( ary, left, rht, myCmp  ) ;
    if( lft  < right ) qSort2( ary, lft, right, myCmp ) ;
}

#endif


Offline David

  • Hero Member
  • *****
  • Posts: 647
    • View Profile
And finally (here's) the quick sort that uses also an insert sort .... and the shell sort:

Code: [Select]
// qsort_isort_vector.h //  // this version 2016-03-11 //


#ifndef QSORT_ISORT_VECTOR_H
#define QSORT_ISORT_VECTOR_H

// an example compare functor ... here it is for C++ string //
/*
struct MyCmp
{
    int operator () ( const string& a, const string& b )
    {
        if( a < b ) return -1;
        if( a > b ) return  1;
        return 0;
    }
} ;

// ....
// Then use/call like this ...

    MyCmp ascending;
    msort( myWords, 0, myWords.size()-1, ascending );
*/

const int SIZE_TO_START_ISORT = 32;

#include <vector>

using std::vector;



template < typename T >
void shellsort( vector < T >& a )
{
    int len = (int)a.size();
    for
    (
        int gap = len / 2;
        gap > 0;
        gap = gap == 2 ? 1 : static_cast < int > ( gap / 2.2 )
    )
    for ( int i = gap; i < len; ++i )
    {
        T tmp = a[i];
        int j = i;
        for ( ; j >= gap && tmp < a[j-gap] ; j -= gap )
            a[j]  = a[j-gap];
        a[j]  = tmp;
    }
}

template < typename T, typename Compare >
void shellsort( vector < T >& a, Compare myCmp )
{
    int len = (int)a.size();
    for
    (
        int gap = len / 2;
        gap > 0;
        gap = gap == 2 ? 1 : static_cast < int > ( gap / 2.2 )
    )
    for ( int i = gap; i < len; ++i )
    {
        T tmp = a[i];
        int j = i;
        for ( ; j >= gap && myCmp(tmp, a[j-gap]) < 0 ; j -= gap )
            a[j]  = a[j-gap];
        a[j]  = tmp;
    }
}



template < typename T >
void iSort( vector < T >& v, int low, int high )
{
    int top= high +1;

    // start with a vector of just the first 2 elements (if exists)
    for( int i = low+1; i < top; ++i )
    {
        T cmp = v[i]; // get copy of this new cmp element on each outer loop ...
        int j = i-1; // get index of element just to the left of the above 'cmp'
                     // to start comparisons
        while( j >= 0  &&  cmp < v[j] )
        {
            v[j+1] = v[j]; // copy element 'up' ...
            --j; // decrement j in preparation for next inner loop ...
        }
        v[j+1] = cmp; // insert element at index j+1 (since j was decremented above)
}
}

template < typename T, typename Compare >
void iSort( vector < T >& v, int low, int high, Compare myCmp )
{
    int top= high +1;

    // start with a vector of just the first 2 elements (if exists)
    for( int i = low+1; i < top; ++i )
    {
        T cmp = v[i]; // get copy of this new cmp element on each outer loop ...
        int j = i-1; // get index of element just to the left of the above 'cmp'
                     // to start comparisons
        while( j >= 0  &&  myCmp(cmp, v[j]) < 0 )
        {
            v[j+1] = v[j]; // copy element 'up' ...
            --j; // decrement j in preparation for next inner loop ...
        }
        v[j+1] = cmp; // insert element at index j+1 (since j was decremented above)
}
}


template< typename T >
void qSort( vector < T >& ary, int old_lft, int old_rht )
{
    if( old_lft + SIZE_TO_START_ISORT > old_rht )
       iSort( ary, old_lft, old_rht );

    else
    {
        int lft = old_lft, rht = old_rht;
        T tmp, comp = ary[ ( lft + rht ) / 2 ];
        do
        {
            while( ary[lft] < comp ) ++lft;
            while( comp < ary[rht]) --rht;
           
            if( lft < rht ) // swap ...
            {
                tmp = ary[lft];
                ary[lft] = ary[rht];
                ary[rht] = tmp;
                ++lft; --rht;
            }
        } while( lft < rht );

        if( old_lft < rht  ) qSort( ary, old_lft, rht );
        if( lft  < old_rht ) qSort( ary, lft, old_rht );
    }
}

template < typename T >
void qSort( vector< T >& a )
{
    int size  = (int)a.size();
    qSort( a, 0, size - 1 );
}




template< typename T, typename Compare >
void qSort( vector < T >& ary, int old_lft, int old_rht, Compare myCmp )
{
    if( old_lft + SIZE_TO_START_ISORT > old_rht )
       iSort( ary, old_lft, old_rht, myCmp );

    else
    {
        int lft = old_lft, rht = old_rht;
        T tmp, comp = ary[ ( lft + rht ) / 2 ];
        do
        {
            while( myCmp(ary[lft], comp) < 0 ) ++lft;
            while( myCmp(comp, ary[rht]) < 0 ) --rht;

            if( lft < rht ) // swap ...
            {
                tmp = ary[lft];
                ary[lft] = ary[rht];
                ary[rht] = tmp;
                ++lft; --rht;
            }
        } while( lft < rht );

        if( old_lft < rht  ) qSort( ary, old_lft, rht, myCmp );
        if( lft  < old_rht ) qSort( ary, lft, old_rht, myCmp );
    }
}

template < typename T, typename Compare  >
void qSort( vector< T >& a, Compare myCmp )
{
    int size  = (int)a.size();
    qSort( a, 0, size - 1, myCmp );
}


/*
template < typename T >
bool isSorted( const vector < T >& v )
{
    for( int i = v.size()-2; i >= 0; -- i )
    if( v[i+1] < v[i] ) return false;
return true;
}

*/

#endif
« Last Edit: March 11, 2016, 07:55:50 AM by David »

Offline David

  • Hero Member
  • *****
  • Posts: 647
    • View Profile
Here is a text file that I used in the times sorts above:


https://drive.google.com/file/d/15IyD5UD8bCp69qeE1IbOzweiNzivbm4A/view?usp=sharing


There were 75040 words read from file: 'Psalms119times32.txt'
495 of which were unique ...


Try this also with C++ library sort added ... also C qsort added with C strings

This next link has the data text file used in the program that follows ...

https://drive.google.com/file/d/1gzwjG_lJj46eEbOcI23ynIqk79WhTArY/view?usp=sharing

Code: [Select]
// test_msort_vs_qsort_etc.etc.cpp //  // 2018-10-10 //

/*
Merge-Sort times compared to various quick sort times and shell sort time //
*/

// sorts and counts word duplicates ...
// comparing TWO different methods
// 1. using a map to hold in insert sorted order, the (key.value) pairs -> (word, count)
// 2. using a vector to hold all the words, then after sorting it, count any duplicates. //



#define myDebug 1 // set to 0 to turn off, to 1 to turn on //

#include "msort.h" // NOTE!!!  ALSO has: typedef unsigned int uns; //
#include "qSort2.H"  // a NON 'record order preserving' qsort //
#include "qsort_isort_vector.h" // an 'record order preserving' qsort //

#include <iostream>
#include <fstream> // re. ifstream
#include <iomanip> // re. std::setw(..)
#include <sstream> // re. std::istringstream
#include <string>
#include <cctype> // re. isalpha, toupper

#include <vector>
#include <map>
#include <ctime> // re. running timimg via clock()
#include <cstdlib> // re. qsort //
#include <cstring> // re. strcmp //
#include <algorithm> // re. vector sorts using STL library sort //


using std::cin;
using std::cout;
using std::endl;
using std::flush;
using std::ifstream;
using std::setw;
using std::istringstream;
using std::string;
using std::vector;
using std::map;


const char* IN_FNAME = "Psalms119times128.txt";


template< typename T >
void print( const T ary[], size_t );

template< typename T >
void print( const vector < T >& );


template< typename T >
bool isSorted( const T* ary, size_t, bool a = true );

bool isSortedCstrs( char** ary, size_t, bool a = true );

template< typename T >
bool isSorted( const vector< T >&, bool a = true );


typedef std::vector<  string > Words; // this is used in next function qnd following ... //

bool fillFromFile( const char* fname, Words& wds, std::map< string, int >& myPairs );

void toAllCaps( string& str );
void pause( const string& msg = "Press 'Enter' to continue ... " );


char** toCstrings( const string* ary, size_t size )
{
    char** pp = (char**)malloc( size*sizeof(char*) );
    if( pp )
    {
        for( size_t i = 0; i < size; ++ i )
        {
            pp[i] = (char*)malloc( sizeof(char)*(ary[i].size() + 1) );
            if(pp[i])
            {
                size_t j = 0;
                for( ; j < ary[i].size(); ++ j ) pp[i][j] = ary[i][j];
                pp[i][j] = 0;
               
            }
            // else  print error and exit //
        }
    }
    // else print srror and exit //
    return pp;
}


struct MyCmp // re. demo sorting strings... using functors ... //
{
    int operator () ( const string& a, const string& b )
    {
        if( a < b ) return -1;
        if( a > b ) return  1;
        return 0;
    }
} ;
bool myStrCmp( const string& a, const string& b )
{
    return a < b;
}

struct MyCmpRev // re. demo sorting strings... using functors ... //
{
    int operator () ( const string& a, const string& b )
    {
        if( b < a ) return -1;
        if( b > a ) return  1;
        return 0;
    }
} ;

int myCmpCstr( const void* a, const void* b )
{
    //return strcmp( *(char* const*)a, *(char* const*)b ); // or //
    return strcmp( *(const char**)a, *(const char**)b );
}
bool myStrCmpRev( const string& a, const string& b )
{
    return b < a;
}

int myCmpCstrRev( const void* a, const void* b )
{
    //return strcmp( *(char* const*)b, *(char* const*)a ); // or //
    return strcmp( *(const char**)b, *(const char**)a );
}


int myqStrCmp( const void* a, const void* b )
{
    return ( (*(string*)a) ).compare( *(string*)b );
}
int myqStrCmpRev( const void* a, const void* b )
{
    return ( (*(string*)b) ).compare( *(string*)a );
}



int main() /////////// MAIN BEGINS /////////////////////////
{
    vector< string > myWords;
    map < string, int > myPairs; // to hold sorted (word, count) pairs //

    if( fillFromFile( IN_FNAME, myWords, myPairs ) )
    {

        // NOW ... GET/HOLD a copy of the 'unsorted'
        // vector in a 'new array' ...
        string* ary = new string[myWords.size()];
        string* aryCpy = new string[myWords.size()];
        for( unsigned i = 0; i < myWords.size(); ++ i )
            aryCpy[i] = ary[i] = myWords[i];


        double t1 = clock();
        msort( myWords, 0, myWords.size()-1 );
        double t2 = clock();


        cout << "There were " << myWords.size()
             << " words read from file: '" <<  IN_FNAME << "'\n"
             << myPairs.size()
             << " of which were unique ... \n\n"

             <<"Re. the vector 'myWords' and the array 'ary' ...\n "

             << "\nTime to msort vector was "
             << (t2-t1)/CLOCKS_PER_SEC << " seconds.\n";
        isSorted( myWords ); // test if sorted ... report only if NOT //


        // re-inital vector ... //
        for( unsigned i = 0; i < myWords.size(); ++ i )
            myWords[i] = aryCpy[i];

        t1 = clock();
        MyCmp ascending;
        msort( myWords, 0, myWords.size()-1, ascending );
        t2 = clock();

        cout << "Time to msort vector (passing in functor) was "
             << (t2-t1)/CLOCKS_PER_SEC << " seconds.\n";
        isSorted( myWords ); // test if sorted //

        t1 = clock();
        MyCmpRev descending;
        msort( myWords, 0, myWords.size()-1, descending );
        t2 = clock();

        cout << "Time to msort vector (down, passing in functor) was "
             << (t2-t1)/CLOCKS_PER_SEC << " seconds.\n";
        isSorted( myWords, false ); // test if rev. sorted  //
       
       
        ////////////////////////////////////////////////////////////
        t1 = clock();
        msort( ary, 0, myWords.size()-1 );
        t2 = clock();
        cout << "\nTime to msort array was "
             << (t2-t1)/CLOCKS_PER_SEC << " seconds.\n";
        isSorted( ary, myWords.size() ); // test if sorted //

        // re-inital ary... //
        for( unsigned i = 0; i < myWords.size(); ++ i )
            ary[i] = aryCpy[i];

        t1 = clock();
        msort( ary, 0, myWords.size()-1, ascending );
        t2 = clock();
        cout << "Time to msort array (passing in functor) was "
             << (t2-t1)/CLOCKS_PER_SEC << " seconds.\n";
        isSorted( ary, myWords.size() ); // test if sorted //

        t1 = clock();
        msort( ary, 0, myWords.size()-1, descending );
        t2 = clock();
        cout << "Time to msort array (down, passing in functor) was "
             << (t2-t1)/CLOCKS_PER_SEC << " seconds.\n";
        isSorted( ary, myWords.size(), false ); // test if rev. sorted //
       
       
        ////////////////////////////////////////////////////////////
        // re-inital ary... //
        for( unsigned i = 0; i < myWords.size(); ++ i )
            myWords[i] = aryCpy[i];

        t1 = clock();
        qSort2( myWords, 0, myWords.size()-1 );
        t2 = clock();
        cout << "\nTime to qSort2 vector was "
             << (t2-t1)/CLOCKS_PER_SEC << " seconds.\n";
        isSorted( myWords ); // test if sorted //

        // re-inital vector ... //
        for( unsigned i = 0; i < myWords.size(); ++ i )
            myWords[i] = aryCpy[i];

        t1 = clock();
        qSort2( myWords, 0, myWords.size()-1, ascending );
        t2 = clock();
        cout << "Time to qSort2 vector (passing in functor) was "
             << (t2-t1)/CLOCKS_PER_SEC << " seconds.\n";
        isSorted( myWords ); // test if sorted //

        t1 = clock();
        qSort2( myWords, 0, myWords.size()-1, descending );
        t2 = clock();
        cout << "Time to qSort2 vector (down, passing in functor) was "
             << (t2-t1)/CLOCKS_PER_SEC << " seconds.\n";
        isSorted( myWords, false ); // test if sorted //
       
       
        ////////////////////////////////////////////////////////////
        // re-inital ary ... //
        for( unsigned i = 0; i < myWords.size(); ++ i )
            ary[i] = aryCpy[i];
        t1 = clock();
        qSort2( ary, 0, myWords.size()-1 );
        t2 = clock();
        cout << "\nTime to qSort2 array was "
             << (t2-t1)/CLOCKS_PER_SEC << " seconds.\n";
        isSorted( ary, myWords.size() ); // test if sorted //

        // re-inital ary ... //
        for( unsigned i = 0; i < myWords.size(); ++ i )
            ary[i] = aryCpy[i];

        t1 = clock();
        qSort2( ary, 0, myWords.size()-1, ascending );
        t2 = clock();
        cout << "Time to qSort2 array (passing in functor) was "
             << (t2-t1)/CLOCKS_PER_SEC << " seconds.\n";
        isSorted( ary, myWords.size() ); // test if sorted //

        t1 = clock();
        qSort2( ary, 0, myWords.size()-1, descending );
        t2 = clock();
        cout << "Time to qSort2 array (down, passing in functor) was "
             << (t2-t1)/CLOCKS_PER_SEC << " seconds.\n";
        isSorted( ary, myWords.size(), false ); // test if rev. sorted //
       
       
        ////////////////////////////////////////////////////////////
        // re-inital vector... //
        for( unsigned i = 0; i < myWords.size(); ++ i )
            myWords[i] = aryCpy[i];

        t1 = clock();
        qSort( myWords, 0, myWords.size()-1 );
        t2 = clock();
        cout << "\nTime to qSort vector was "
             << (t2-t1)/CLOCKS_PER_SEC << " seconds.\n";
        isSorted( myWords ); // test if sorted //

        // re-inital vector ... //
        for( unsigned i = 0; i < myWords.size(); ++ i )
            myWords[i] = aryCpy[i];

        t1 = clock();
        qSort( myWords, 0, myWords.size()-1, ascending );
        t2 = clock();
        cout << "Time to qSort vector (passing in functor) was "
             << (t2-t1)/CLOCKS_PER_SEC << " seconds.\n";
        isSorted( myWords ); // test if sorted //

        t1 = clock();
        qSort( myWords, 0, myWords.size()-1, descending );
        t2 = clock();
        cout << "Time to qSort vector (down, passing in functor) was "
             << (t2-t1)/CLOCKS_PER_SEC << " seconds.\n";
        isSorted( myWords, false ); // test if rev. sorted //
       

        ////////////////////////////////////////////////////////////
        // re-inital ary... //
        for( unsigned i = 0; i < myWords.size(); ++ i )
            ary[i] = aryCpy[i];

        t1 = clock();
        qsort( ary, myWords.size(), sizeof(string), myqStrCmp );
        t2 = clock();
        cout << "\nTime to qsort array of string was "
             << (t2-t1)/CLOCKS_PER_SEC << " seconds.\n";
        isSorted( ary, myWords.size() ); // test if sorted //

        t1 = clock();
        qsort( ary, myWords.size(), sizeof(string), myqStrCmpRev );
        t2 = clock();
        cout << "Time to qsort array of string (down) was "
             << (t2-t1)/CLOCKS_PER_SEC << " seconds.\n";
        isSorted( ary, myWords.size(), false ); // test if rev. sorted //
       
       
        ////////////////////////////////////////////////////////////
        // get ragged ary of C strings ... //
        char** rAryStrs = toCstrings( aryCpy, myWords.size() );
       
        // debugging ...
        // for( size_t k = 0; k < 10; ++ k ) cout << rAryStrs[k] << '\n';

        t1 = clock();
        qsort( rAryStrs, myWords.size(), sizeof(char*), myCmpCstr );
        t2 = clock();
        cout << "\nTime to qsort ragged array was "
             << (t2-t1)/CLOCKS_PER_SEC << " seconds.\n";
        isSortedCstrs( rAryStrs, myWords.size() ); // test if sorted //

        t1 = clock();
        qsort( rAryStrs, myWords.size(), sizeof(char*), myCmpCstrRev );
        t2 = clock();
        cout << "Time to qsort ragged array (down) was "
             << (t2-t1)/CLOCKS_PER_SEC << " seconds.\n";
        isSortedCstrs( rAryStrs, myWords.size(), false ); // test if rev. sorted //
       
       
       
        ////////////////////////////////////////////////////////////
        // re-inital ary... //
        for( unsigned i = 0; i < myWords.size(); ++ i )
            myWords[i] = aryCpy[i];

        t1 = clock();
        sort( myWords.begin(), myWords.end() );
        t2 = clock();
        cout << "\nTime to C++ library sort vector was "
             << (t2-t1)/CLOCKS_PER_SEC << " seconds.\n";
        isSorted( myWords ); // test if sorted //

        // re-inital vector ... //
        for( unsigned i = 0; i < myWords.size(); ++ i )
            myWords[i] = aryCpy[i];

        t1 = clock();
        sort( myWords.begin(), myWords.end(), myStrCmp );
        t2 = clock();
        cout << "Time to C++ library sort vector (passing in compare function) was "
             << (t2-t1)/CLOCKS_PER_SEC << " seconds.\n";
        isSorted( myWords ); // test if sorted //

        t1 = clock();
        sort( myWords.begin(), myWords.end(), myStrCmpRev );
        t2 = clock();
        cout << "Time to C++ library sort vector (down, passing in compare function) was "
             << (t2-t1)/CLOCKS_PER_SEC << " seconds.\n";
        isSorted( myWords, false ); // test if sorted //
       
       
       
        ////////////////////////////////////////////////////////////
        // re-inital vector ... //
        for( unsigned i = 0; i < myWords.size(); ++ i )
            myWords[i] = aryCpy[i];

        t1 = clock();
        shellsort( myWords );
        t2 = clock();
        cout << "\nTime to shellsort vector was "
             << (t2-t1)/CLOCKS_PER_SEC << " seconds.\n";
        isSorted( myWords ); // test if sorted //

        // re-inital vector ... //
        for( unsigned i = 0; i < myWords.size(); ++ i )
            myWords[i] = aryCpy[i];

        t1 = clock();
        shellsort( myWords, ascending );
        t2 = clock();
        cout << "Time to shellsort vector (passing in functor) was "
             << (t2-t1)/CLOCKS_PER_SEC << " seconds.\n";
        isSorted( myWords ); // test if sorted //

        t1 = clock();
        shellsort( myWords, descending );
        t2 = clock();
        cout << "Time to shellsort vector (down, passing in functor) was "
             << (t2-t1)/CLOCKS_PER_SEC << " seconds.\n";
        isSorted( myWords, false ); // test if sorted //


        qSort( myWords, 0, myWords.size()-1, ascending ); // resort up to get ready for next part //

        pause("Press 'Enter' to continue and see the (word, count) pairs ... " );



        // Ok ... show the words and count, 20 at a time ... //
       
        unsigned i = 0, count = 0, stop = 0;
        map< string, int >::const_iterator it; // to traverse the map ... //

        for( it = myPairs.begin(); it != myPairs.end(); ++ it )
        {
            ++stop; // pause every 20 words //

            while( (i < myWords.size())  &&  (myWords[i] == it->first) )
            {
                ++count;
                ++i;
            }
            cout << "The word: '" << it->first << "' occured "
                 << it->second << " times =? " << count << " times.\n";
            if( stop % 20 == 0 )
                pause();

            count = 0; // get ready to count the next duplicate words in the vector //
        }


        cout << "All done?  Now i = " << i
             << " and words in myWords = " << myWords.size() << endl;
             
        delete [] aryCpy;
        delete [] ary;
        // can you code up a clear function for the ragges array? //
    }

    pause( "\nPress 'Enter' to continue/exit ..." );

} //////////////////// MAIN ENDS ///////////////////////////

(CODE CONTINUES on next PAGE ... )
« Last Edit: November 11, 2018, 03:42:09 AM by David »

Offline David

  • Hero Member
  • *****
  • Posts: 647
    • View Profile
Code: [Select]
// several 'definitions' follow ... //

template< typename T >
void print( const T ary[], size_t size )
{
    for( size_t i = 0; i < size; ++i ) cout << setw(8) << ary[i];
    if( size ) std::cout << std::endl;
}
template< typename T >
void print( const vector < T >& v )
{
    for( size_t i = 0; i < v.size(); ++i ) cout << setw(8) << v[i];
    if( v.size() ) cout << endl;
}

template< typename T >
bool isSorted( const T* ary, size_t size, bool a )
{
    if( a ) // ascending
    {
        for( int i = size-1; i > 0; --i )
        {
            if( ary[i-1] > ary[i] )
            {
#if myDebug
                cout << ary[i-1] << ", " << ary[i] << endl;
                pause();
#endif
                return false;
            }
        }
    }
    else // descending //
    {
        for( int i = size-1; i > 0; --i )
        {
            if( ary[i-1] < ary[i] )
            {
#if myDebug
                cout << ary[i-1] << ", " << ary[i] << endl;
                pause();
#endif
                return false;
            }
        }
    }
    // else ...
    return true;
}
template< typename T >
bool isSorted( const vector< T >& ary, bool a )
{
    if( a )
    {
        for( int i = ary.size()-1 ; i > 0 ; --i )
        {
            if( ary[i-1] > ary[i] )
            {
#if myDebug
                cout << ary[i-1] << ", " << ary[i] << endl;
                pause();
#endif
                return false;
            }
        }
    }
    else
    {
        for( int i = ary.size()-1 ; i > 0 ; --i )
        {
            if( ary[i-1] < ary[i] )
            {
#if myDebug
                cout << ary[i-1] << ", " << ary[i] << endl;
                pause();
#endif
                return false;
            }
        }
    }
    // else ...
    return true;
}

bool isSortedCstrs( char** ary, size_t size, bool a )
{
    if( a ) // ascending
    {
        for( int i = size-1; i > 0; --i )
        {
            if( strcmp( ary[i-1], ary[i] ) > 0 )
            {
#if myDebug
                cout << "'" << ary[i-1] << "', '" << ary[i] << "'\n";
                pause();
#endif
                return false;
            }
        }
    }
    else // descending //
    {
        for( int i = size-1; i > 0; --i )
        {
            if( strcmp( ary[i-1], ary[i] ) < 0 )
            {
#if myDebug
                cout << "'" << ary[i-1] << "', '" << ary[i] << "'\n";
                pause();
#endif
                return false;
            }
        }
    }
    // else ...
    return true;
}

bool fillFromFile( const char* fname, Words& wds, std::map<  string, int >& myPairs )
{
    ifstream fin( fname );
    if( fin.is_open() )
    {
        string line, word;
        while( getline( fin, line ) )
        {
            istringstream iss(line);
            while( iss >> word )
            {

                if( word[0] == '[' ) continue;

                toAllCaps( word );

                unsigned i = 0, len = word.size();
                while( i < len  &&  ('A' <= word[i] && word[i] <= 'Z') )
                    ++i ;

                if( i < len )
                    word.erase(i);

                if( !word.size() ) continue;


                if( myPairs.count( word ) ) ++ myPairs[word] ;
                else myPairs[word]  = 1;

                wds.push_back( word );
            }
        }
        fin.close();
        return true;
    }
    // else
    std::cout << "\nThere was a problem opening file "
              << fname << "\n\n";
    return false;
}

void toAllCaps( string& str )
{
    unsigned i = 0, len = str.size();
    while( i < len ) {
        str[i] = toupper(str[i]);
        ++ i;
    }
}

void pause( const string& msg )
{
    cout << msg << flush;
    string dummyLine;
    getline( cin, dummyLine );
}



// typical output to first part ... //
/*
There were 300160 words read from file: 'Psalms119times128.txt'
495 of which were unique ...

Re. the vector 'myWords' and the array 'ary' ...

Time to msort vector was 0.499 seconds.
Time to msort vector (passing in functor) was 0.499 seconds.
Time to msort vector (down, passing in functor) was 0.374 seconds.

Time to msort array was 0.515 seconds.
Time to msort array (passing in functor) was 0.523 seconds.
Time to msort array (down, passing in functor) was 0.43 seconds.

Time to qSort2 vector was 0.327 seconds.
Time to qSort2 vector (passing in functor) was 0.358 seconds.
Time to qSort2 vector (down, passing in functor) was 0.297 seconds.

Time to qSort2 array was 0.296 seconds.
Time to qSort2 array (passing in functor) was 0.327 seconds.
Time to qSort2 array (down, passing in functor) was 0.281 seconds.

Time to qSort vector was 0.265 seconds.
Time to qSort vector (passing in functor) was 0.265 seconds.
Time to qSort vector (down, passing in functor) was 0.218 seconds.

Time to qsort array of string was 0.124 seconds.
Time to qsort array of string (down) was 0.094 seconds.

Time to qsort ragged array was 0.093 seconds.
Time to qsort ragged array (down) was 0.031 seconds.

Time to C++ library sort vector was 0.156 seconds.
Time to C++ library sort vector (passing in compare function) was 0.171 seconds.

Time to C++ library sort vector (down, passing in compare function) was 0.125 se
conds.

Time to shellsort vector was 0.608 seconds.
Time to shellsort vector (passing in functor) was 0.558 seconds.
Time to shellsort vector (down, passing in functor) was 0.421 seconds.
Press 'Enter' to continue and see the (word, count) pairs ...
The word: 'A' occured 1536 times =? 1536 times.
The word: 'ABHOR' occured 128 times =? 128 times.
The word: 'ABIDETH' occured 128 times =? 128 times.
The word: 'ABOVE' occured 256 times =? 256 times.
The word: 'ACCEPT' occured 128 times =? 128 times.
The word: 'ACCORDING' occured 2304 times =? 2304 times.
The word: 'AFFLICTED' occured 512 times =? 512 times.
The word: 'AFFLICTION' occured 384 times =? 384 times.
The word: 'AFRAID' occured 128 times =? 128 times.
The word: 'AFTER' occured 512 times =? 512 times.
The word: 'AGAINST' occured 384 times =? 384 times.
The word: 'AIN' occured 128 times =? 128 times.
The word: 'ALEPH' occured 128 times =? 128 times.
The word: 'ALL' occured 2304 times =? 2304 times.
The word: 'ALMOST' occured 128 times =? 128 times.
The word: 'ALSO' occured 768 times =? 768 times.
The word: 'ALWAY' occured 128 times =? 128 times.
The word: 'AM' occured 384 times =? 384 times.
The word: 'AN' occured 384 times =? 384 times.
The word: 'ANCIENTS' occured 128 times =? 128 times.
Press 'Enter' to continue ...
The word: 'AND' occured 8064 times =? 8064 times.
The word: 'ANGUISH' occured 128 times =? 128 times.
The word: 'ANSWER' occured 128 times =? 128 times.
The word: 'ANY' occured 128 times =? 128 times.
The word: 'ARE' occured 384 times =? 384 times.
The word: 'ART' occured 128 times =? 128 times.
The word: 'AS' occured 896 times =? 896 times.
The word: 'ASHAMED' occured 640 times =? 640 times.
The word: 'ASTRAY' occured 256 times =? 256 times.
The word: 'AT' occured 512 times =? 512 times.
The word: 'AWAY' occured 384 times =? 384 times.
The word: 'AWE' occured 128 times =? 128 times.
The word: 'BANDS' occured 128 times =? 128 times.
The word: 'BE' occured 1664 times =? 1664 times.
The word: 'BECAUSE' occured 1152 times =? 1152 times.
The word: 'BECOME' occured 128 times =? 128 times.
The word: 'BEEN' occured 384 times =? 384 times.
The word: 'BEFORE' occured 640 times =? 640 times.
The word: 'BEGINNING' occured 128 times =? 128 times.
The word: 'BEHELD' occured 128 times =? 128 times.
Press 'Enter' to continue ...
*/
« Last Edit: November 11, 2018, 01:50:39 AM by David »