1

I was trying to solve a very simple coding question:

Consider an array of numeric strings where each string is a positive number with anywhere from 1 to 10^6 digits. Sort the array's elements in non-decreasing, or ascending order of their integer values and print each element of the sorted array on a new line.

The first line contains an integer n denoting the number of strings Each of the n subsequent lines contain an INTEGER STRING.

My code is:

#include <bits/stdc++.h>

using namespace std;

int main()
{
    int i; string s;
    cin >> i;
    int j = i;
    string arr[i];
    int cntr = 0;
    while (i--){
        cin >> s;
        arr[cntr] = s;
        cntr++;
    }
    sort(arr, arr+j);

    for (auto c: arr)
        cout << c << endl;

}

The input is

6
31415926535897932384626433832795
1
3
10
3
5

And my output turns out to be:

1
10
3
3
31415926535897932384626433832795
5

If I make an array and add integer strings to it manually, the above code works fine. Then why is it producing wrong result when it takes input from the website?

PS: Here's the link to the problem:https://www.hackerrank.com/challenges/big-sorting/problem

Chif
  • 830
  • 1
  • 7
  • 20
R dove
  • 35
  • 5
  • 6
    Three things (unrelated to your problem): C++ doesn't have [variable-length arrays](https://en.wikipedia.org/wiki/Variable-length_array), use [`std::vector`](https://en.cppreference.com/w/cpp/container/vector) instead. Also [don't include ``](https://stackoverflow.com/questions/31816095/why-should-i-not-include-bits-stdc-h). Lastly, [don't do `using namespace std;`](https://stackoverflow.com/questions/1452721/why-is-using-namespace-std-considered-bad-practice). To summarize: Don't use such sites to learn C++, they're infamously bad for it. Take a course or a class or read books. – Some programmer dude Oct 03 '19 at 06:50
  • 1
    As for your problem: String comparison is [*lexiographical*](https://en.wikipedia.org/wiki/Lexicographical_order) not numerical. – Some programmer dude Oct 03 '19 at 06:52
  • I am not using variable-length arrays! – R dove Oct 03 '19 at 06:53
  • 1
    `int i;` followed by `string arr[i];`. That makes `arr` a variable-length array. Array sizes in C++ must be compile-time constants. – Some programmer dude Oct 03 '19 at 06:54
  • @Rdove The array `arr` is variable-length – James Picone Oct 03 '19 at 06:55
  • @Rdove `arr[i]` is a variable length array. Its length is only known at runtime. Set your compiler to strict standard conformance (e.g. `-std=c++17` in GCC) and it will indicate an error. – Erlkoenig Oct 03 '19 at 06:55
  • What do you mean by "takes input from the website"? How are you doing that? It looks to me like it is taking input from standard input. What are you doing differently when it works? It does indeed look like it is sorting the strings correctly as strings (see @Someprogrammerdude's comment); I don't understand how it will work when you get the strings from a different place. – Basya Oct 03 '19 at 06:56
  • @Basya , yes it is taking input from standard input. I will try to use a vector and see if it changes anything! – R dove Oct 03 '19 at 06:59
  • @Someprogrammerdude bool b = "1" > "31415926535897932384626433832795"; makes b = 0! Why isn't this comparing them lexiographically? – R dove Oct 03 '19 at 07:08
  • 1
    Because it is text! Nobody knows that you want compare numbers represented as text. – Klaus Oct 03 '19 at 07:10
  • @Rdove _Why isn't this comparing them lexicographically?_ It does. You likely don't know what lexicographical comparison is. "1" is lexicographically less then "314...", since '1' is less then '3'. – Daniel Langr Oct 03 '19 at 07:12
  • @Rdove : I don't understand your last question to Some programmer dude. b=0 in your example because the first character of the first string, '1', is 'smaller' alphabetically than the first character of the second string, '3'. This is comparing them lexicographically. (Which of course explains why "10" is sorted before "3", in your program) – Basya Oct 03 '19 at 07:15
  • Some unrelated improvements: always test input streams when you read with `>>`; there's no need for a separate `sort()` pass if you just insert into a sorted container such as `std::multiset` instead of an array. – Toby Speight Oct 03 '19 at 15:53
  • When using `std::string` or `strcmp`, then `"1" > "31415926535897932384626433832795"` is false basically because `'1' < '3'`. – Some programmer dude Oct 03 '19 at 20:00

3 Answers3

5

Firstly use a vector of strings instead of a variable size array of strings (which is not allowed in C++).

The STL sort function uses lexicographical search to sort strings by default. You need to pass your own comparison function in order to sort the integer strings numerically.

Assuming the integer strings don't have leading 0's;

sort(arr.begin(), arr.end(), [] (const string& s1, const string& s2) {
    return s1.size() < s2.size() || (s1.size() == s2.size() && s1 < s2);
});
srt1104
  • 959
  • 7
  • 11
0

I will give you an alternative solution as C++ already have sorted containers.

Some hints: Please do not use "using namespace std;"

C++ did not have variable length arrays! So it is much easier to use a container type! In the example, we use a std::multimap which can have elements sorted and allows duplicates.

#include <iostream>
#include <map>

// we want to use our own sorting algorithm for std::multimap
// this needs a functional object with operator() which do the compare
//
// works only for non negative numbers without leading '0's
struct compare
{
    bool operator()( const std::string& s1, const std::string& s2 ) const
    {
        // if the string contains a number as text, we can asume that
        // a number which has less characters is lesser
        if ( s1.size() < s2.size() ) return true;

        // if the size is bigger, the numerical value is bigger
        if ( s1.size() > s2.size() ) return false;

        // if both are equal length
        // so we simply compare lexigraphical
        // this works because a bigger diggit char always means a
        // bigger numerical value
        return s1 < s2;
    }
};

int main()
{
    // instead of later sort, we use a sorted container, multiset because we can use duplicates
    std::multiset<std::string, compare> data;

    // read data as in your code 
    int numberOfElementsToRead;
    std::string tmpInput;

    std::cin >> numberOfElementsToRead;

    while (numberOfElementsToRead--)
    {
        std::cin >> tmpInput;
        data.insert( tmpInput ); // insert automatically sorts
    }

    // print out the container
    for ( auto& s: data )
    {
        std::cout << s << std::endl;
    }
}
Klaus
  • 24,205
  • 7
  • 58
  • 113
  • _Adding data to already sorted containers is faster than collecting all data unsorted and sort them later._ I would be very careful with this statement. For example, [this benchmark](http://quick-bench.com/FrMAGSaAO-oaUZPU2ZnSAQlweZ8) illustrates the exact opposite; namely, insterting into a vector and its subsequent sorting is more than twice as fast. Of course, it's a benchmark, but it shows that we shouldn't make such generic statements. – Daniel Langr Oct 03 '19 at 08:11
  • Relevant answers: https://stackoverflow.com/a/15638063/580083 and https://stackoverflow.com/a/24968324/580083. Other migth be interesting as well. A rule of thumb is basically that if one only needs to insert elements and **then** have them sorted at some moment without future updates, vector + sort would be very likely faster. – Daniel Langr Oct 03 '19 at 08:20
  • sets (and multisets) are usually implemented as balanced binary trees (typically red-black trees). Insertion of element into a red-black tree may violate the properties of the red-black tree and to restore these, there is a requirement of color changes and rotations. So you can imagine why storing the strings in a vector and then sorting it is usually faster than sorting the strings using STL set (and multiset). – srt1104 Oct 03 '19 at 08:48
  • @DanielLangr: Removed the "speed" part from my answer. Thanks for the correction! – Klaus Oct 03 '19 at 15:28
-4

This won't work:

    int i;
    cin >> i;

    string arr[i];

If you need to dynamically resize an array, use std::vector (or new/delete if you really must).

    int i;
    cin >> i;

    std::vector<std::string> arr(i);

In terms of why the sorting fails, you are sorting the numbers alphabetically, which means anything with a '1' at the front will appear first. Sort based on the numeric value instead:

auto compareStringsNumerically = [](const std::string& a, const std::string& b) {
  return std::stoi(a) < std::stoi(b); //< compare integer values
};
std::sort(arr, arr + j, compareStringsNumerically);
robthebloke
  • 9,331
  • 9
  • 12
  • 3
    your sort comparator wont work as the numbers have up to 10^6 digits which is much larger than an `int` – Alan Birtles Oct 03 '19 at 06:58
  • Why use more complicated lambda combined with an auto variable instead a good old function. This is abuse of lambda for me! – Klaus Oct 03 '19 at 07:03
  • @Klaus Maybe, because we cannot define a function inside a function? – Daniel Langr Oct 03 '19 at 07:21
  • @DanielLangr: In this case there is no need to define a function inside a function. Lambdas made for 1) write function in place ( not the case here ), or create functional objects which can store data ( also not the case here ). I can't see any benefit here, it is only more complicated. – Klaus Oct 03 '19 at 07:25
  • @Klaus Agree, there is no need. IMO, it's simply a matter of coding style. Discussions about coding styles are usually meaningless. I don't think the generated assembly would be different: https://godbolt.org/z/Co8VF9. I wouldn't use this approach either, but can see its advantage, i.e., by naming the lambda, a less experienced code reader may find out what it does more easily. (Sure, we have comments, but it's again a matter of coding style.) – Daniel Langr Oct 03 '19 at 07:35