
We can use `getchar_unlocked` to read integers from stdin quickly by processing the characters ourselves, like this:

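#include <cstdio>

// Reads the next decimal integer from stdin; returns 0 if EOF comes first.
int scan_d()
{
    int ip = getchar_unlocked(), ret = 0, flag = 1;

    // Skip everything up to the first digit, remembering a leading '-'.
    // The EOF check keeps this loop from spinning forever at end of input.
    for(; ip != EOF && (ip < '0' || ip > '9'); ip = getchar_unlocked())
    {
        if(ip == '-')
        {
            flag = -1;
            ip = getchar_unlocked();
            break;
        }
    }

    // Accumulate consecutive digits.
    for(; ip >= '0' && ip <= '9'; ip = getchar_unlocked())
        ret = ret * 10 + (ip - '0');
    return flag * ret;
}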

Is there a way to read strings from stdin in a fast manner using something like above? gets is faster than cin/scanf but at the same time possesses critical handling of whitespace. I thought of modifying the above code for strings but faced problems with whitespace. Further, it seems reading every character of a string one by one will be slower.

By stdin I mean the string is entered by the user (no file handling).

user1052842
CPPCoder
  • This smells of premature optimization. – Zac Howland Feb 19 '14 at 19:19
  • `getchar_unlocked` also looks to be a non-standard function. – crashmstr Feb 19 '14 at 19:20
  • There's nothing like a `getchar_unlocked()` method in [tag:c++], what are you talking about? – πάντα ῥεῖ Feb 19 '14 at 19:21
  • You write about modifying it to read strings. I know from experience that the string allocation interface is much slower than reading into a char buffer. – user2672165 Feb 19 '14 at 19:22
  • @πάνταῥεῖ `getchar_unlocked()` is in `stdio.h` – John Odom Feb 19 '14 at 19:22
  • _'gets is faster than cin/scanf'_ `cin` which method?? – πάντα ῥεῖ Feb 19 '14 at 19:23
  • getchar_unlocked is similar to getchar. It's not defined for windows OS – CPPCoder Feb 19 '14 at 19:23
  • std::cin (console input) – CPPCoder Feb 19 '14 at 19:26
  • @CPPCoder If it's not standard, it can't be available using c++ standard headers. But well, I see no reason why e.g. `std::istream::read()` implementation should be slower, than this. BTW: I meant which input method actually, `std::cin` is console input, I know. – πάντα ῥεῖ Feb 19 '14 at 19:28
  • @πάνταῥεῖ If I read a string using gets, it's relatively faster than reading the string using scanf or std::cin. – CPPCoder Feb 19 '14 at 19:33
  • @CPPCoder Again, which method of `std::istream` exactly? Did you mean the overloaded `std::operator>>(std::istream&,std::string)`? The latter might be slower, yes. To give us some insight, how you are measuring exactly would also be helpful. – πάντα ῥεῖ Feb 19 '14 at 19:39
  • @πάνταῥεῖ Since he's referring to `getchar_unlocked()`, I'd guess that means that it's a version of `getchar` with no synchronization. `std::cin` must synchronize across threads, and also by default synchronizes with cstdio. When you add the per-character virtual call overhead, then it doesn't really matter which method of `std::cin` is used, they're all going to be slower than `getchar_unlocked`. OTOH, this is pure premature optimization. – Mooing Duck Feb 19 '14 at 20:03
  • @MooingDuck You're right of course. _'OTOH, this is pure premature optimization'_ That's agreed for sure!! That's why I've been asking for the measurement methods. – πάντα ῥεῖ Feb 19 '14 at 20:11
  • _"Is there a way to read strings from stdin in a fast manner using something like above?"_ and due to _"By stdin I mean string is entered by the user (no file handling)"_, the answer is clear: take a faster user. ;) – CouchDeveloper Feb 19 '14 at 21:02

2 Answers


_"Is there a way to read strings from stdin in a fast manner using something like above?"_

Certainly. If you were clearer how you expected it to act, we could even provide code.

_"gets is faster than cin/scanf but at the same time possesses critical handling of whitespace."_

`cin` and `scanf` can do that as well; see "How to cin Space in c++?" and "How do you allow spaces to be entered using scanf?"

_"I thought of modifying the above code for strings but faced problems with whitespace."_

What problems? https://meta.stackexchange.com/questions/66377/what-is-the-xy-problem

_"Further, it seems reading every character of a string one by one will be slower."_

Slower than a block read? Sure. But then, reading one character at a time is really the only way to tell where the end of the string is. You could block-read into a buffer and scan through that to find the end of the string; this is called buffering. But stdin is already buffered, so buffering it again will make it slower, not faster. There isn't going to be a faster way of reading a space-separated string than using `getchar_unlocked` one character at a time.
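As a rough illustration of the character-at-a-time approach, here is a minimal sketch that reads one whitespace-delimited token. It assumes a POSIX system, where `getc_unlocked(stdin)` is equivalent to `getchar_unlocked()`; the helper name `read_token` and the `FILE*` parameter are assumptions made for this example so that it can be pointed at stdin or at a test file.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: reads the next whitespace-delimited token from `in`.
// Returns false if EOF is reached before any non-whitespace character.
bool read_token(FILE* in, std::string& out)
{
    out.clear();
    int c = getc_unlocked(in);   // int, so EOF stays distinguishable

    // Skip leading spaces, tabs and line breaks.
    while (c == ' ' || c == '\t' || c == '\n' || c == '\r')
        c = getc_unlocked(in);

    // Collect characters until the next whitespace character or EOF.
    while (c != EOF && c != ' ' && c != '\t' && c != '\n' && c != '\r')
    {
        out += static_cast<char>(c);
        c = getc_unlocked(in);
    }
    return !out.empty();
}
```

Calling `read_token(stdin, word)` in a loop would then yield one word per call. To read whole lines instead, only the second loop's stop condition would need to change to `'\n'`.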

Mooing Duck

I'm not sure my benchmarks are correct at all; this is really my first time testing speed.

Anyway, here goes:

http://ideone.com/KruGD2

#include <chrono>
#include <cstdint>
#include <cstdio>
#include <iostream>
#include <sstream>
#include <string>

std::chrono::time_point<std::chrono::steady_clock> hClock()
{
    return std::chrono::steady_clock::now();
}

std::uint32_t TimeDuration(std::chrono::time_point<std::chrono::steady_clock> Time)
{
    return std::chrono::duration_cast<std::chrono::nanoseconds>(hClock() - Time).count();
}


void Benchmark(const char* Name, std::string &str, void(*func)(std::string &str))
{
    auto time = hClock();
    for (int i = 0; i < 100; ++i)
    {
        func(str);
        str.clear();
    }
    std::cout<<Name<<" took: "<<TimeDuration(time) / 100<<" nano-seconds.\n";
}

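void unlocked_bench(std::string &str)
{
    int c; // int, not char, so EOF can be told apart from real characters
    while((c = getchar_unlocked()) != EOF && c != '\n' && c != '\r')
    {
        str += static_cast<char>(c);
    }
}

void getchar_bench(std::string &str)
{
    int c; // int, not char, so EOF can be told apart from real characters
    while((c = getchar()) != EOF && c != '\n' && c != '\r')
    {
        str += static_cast<char>(c);
    }
}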

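void getline_bench(std::string &str)
{
    // Caveat: reserve() does not change size(), so str.size() is 0 here
    // and getline is handed a zero-length buffer.
    std::cin.getline(&str[0], str.size());
}

void scanf_bench(std::string &str)
{
    // Caveat: a field width belongs before the scanset ("%100[^\n]");
    // as written, the scanset is unbounded and "100s" is matched literally.
    scanf("%[^\n]100s", &str[0]);
}

void fgets_bench(std::string &str)
{
    // Caveat: str.size() is 0 here as well, so fgets has no room to store a line.
    fgets(&str[0], str.size(), stdin);
}

void cinread_bench(std::string &str)
{
    // Caveat: with str.size() == 0 this asks read() for zero characters.
    std::cin.read(&str[0], str.size());
}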

int main()
{
    std::string str;
    str.reserve(100);

    Benchmark("getchar_unlocked", str, unlocked_bench);
    Benchmark("getchar", str, getchar_bench);
    Benchmark("getline", str, getline_bench);
    Benchmark("scanf", str, scanf_bench);
    Benchmark("fgets", str, fgets_bench);
    Benchmark("cinread", str, cinread_bench);

    return 0;
}
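Note that `str.reserve(100)` leaves `str.size()` at 0, so the buffer-based readers above are handed a zero-length buffer. One possible way to give them a real buffer is sketched below. It is not the code that produced the timings in this answer; the names `read_line_fgets` and `read_line_scanf`, the 128-byte chunk size, and the `FILE*` parameter are all assumptions made for this example.

```cpp
#include <cstdio>
#include <cstring>
#include <string>

// Hypothetical helper: reads one line (without the '\n') using fgets.
// Returns false only when EOF is hit before any character is read.
bool read_line_fgets(FILE* in, std::string& out)
{
    char buf[128];
    out.clear();
    while (std::fgets(buf, sizeof buf, in))
    {
        const std::size_t len = std::strlen(buf);
        if (len > 0 && buf[len - 1] == '\n')
        {
            out.append(buf, len - 1);   // drop the newline and stop
            return true;
        }
        out.append(buf, len);           // long line: keep reading chunks
    }
    return !out.empty();
}

// Hypothetical helper: reads up to 100 characters of one line using fscanf.
// The width goes before the scanset, and the newline is consumed separately.
bool read_line_scanf(FILE* in, char (&buf)[101])
{
    buf[0] = '\0';
    if (std::fscanf(in, "%100[^\n]", buf) == EOF)
        return false;
    const int c = std::getc(in);
    if (c != '\n' && c != EOF)
        std::ungetc(c, in);             // line was longer than 100 characters
    return true;
}
```

With stdin, a call such as `read_line_fgets(stdin, line)` could stand in for the body of `fgets_bench`. Timings taken with these helpers would not be comparable to the figures reported in this answer.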

Input:

Hello There
Hello There
Hello There
Hello There
Hello There
Hello There

Output:

getchar_unlocked took: 436 nano-seconds.
getchar took: 330 nano-seconds.
getline took: 619 nano-seconds.
scanf took: 522 nano-seconds.
fgets took: 44 nano-seconds.
cinread took: 67 nano-seconds.
Brandon
  • I read the docs on fgets, you did fine there. Also, I think ideone runs these on a separate server for accurate timings (it's part of SPOJ) Therefore, I tried to fix your tests, but I'm having a problem with gets. http://ideone.com/IbfD2E It does clearly demonstrate that `getchar_unlocked` is fastest, and istreams slowest _by far_. – Mooing Duck Feb 19 '14 at 20:57
  • Ok how about now.. I changed the benchmark.. Is it fine now? – Brandon Feb 19 '14 at 21:06
  • Added `cin.read`. @MooingDuck I used ideone because I don't have linux system to access `getchar_unlocked` atm. It doesn't compile in Mingw 4.8.1 on Windows. Doesn't work in MSVC 2013 either. Ideone was my only option. – Brandon Feb 19 '14 at 21:15
  • @ThomasMatthews: cin.read doesn't exhibit the same behavior as the others. The function that does is cin.getline, which is already in my tests – Mooing Duck Feb 19 '14 at 21:22
  • @CantChooseUsernames: You're doing a good job with this answer actually, except you're reading 500 lines and only providing 12. – Mooing Duck Feb 19 '14 at 21:25
  • @MooingDuck: All I'm saying is that `cin.read` is another method to read from the standard input that bypasses formatting. Although some compiler implementers have `cin` call the C language input functions. – Thomas Matthews Feb 19 '14 at 21:27
  • @MooingDuck, I provided 500 but ideone deleted my post and I had to remake the link with less input. So in order for the link to stay valid, I had to trim it down to a certain amount. – Brandon Feb 19 '14 at 21:28
  • @ThomasMatthews: Yes, but it reads a fixed number of bytes, which is very different from every other test here which read (up to a max) until they find a newline. (A) the results wouldn't be comparable, and (B) it's likely to be closer to `cin.getline` than anything else. – Mooing Duck Feb 19 '14 at 21:41
  • I added a validator to verify that inputs were being read accurately, and used lambdas to avoid function pointer overhead: http://ideone.com/t7EJkn. A warmup also made a big difference. @CantChooseUsernames: the fact that `getchar` is slower than `getchar_unlocked` in your tests points out that there's a problem. – Mooing Duck Feb 19 '14 at 22:20
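The suggestions in the last comment (a warm-up pass, and a callable instead of a function pointer) could be sketched roughly as follows. This is not the code behind the linked ideone pastes; `time_per_call` and its parameters are hypothetical names chosen for this example, and `iters` is assumed to be greater than zero.

```cpp
#include <chrono>
#include <cstdint>

// Hypothetical helper: average nanoseconds per call of `func` over `iters`
// timed runs, after `warmup` untimed runs. Assumes iters > 0.
template <typename Func>
std::int64_t time_per_call(Func&& func, int iters, int warmup)
{
    // Untimed warm-up so one-off costs (cache misses, first buffer fill)
    // do not land in the measurement.
    for (int i = 0; i < warmup; ++i)
        func();

    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < iters; ++i)
        func();
    const auto elapsed = std::chrono::steady_clock::now() - start;

    return std::chrono::duration_cast<std::chrono::nanoseconds>(elapsed).count() / iters;
}
```

Because `Func` is a template parameter, a lambda passed here can be inlined by the compiler, which a `void(*)(std::string&)` function pointer generally prevents.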