I currently use a loop with scanf("%d", &value), but I need it to go faster. The input can contain as many as 2 000 000 values. Is there any way to speed this up? I read about strtok and strtol, but I do not know how to use them, or whether they would even achieve the speed-up I need.
-
You should use `strtol` not because it's faster, but because, unlike `scanf`, it will tell you when you hit numeric overflow or invalid input. (What does your program do when you feed it `123cheesesandwich`? It crashes, doesn't it? See.) – zwol May 26 '14 at 01:25
-
I don't really get the hostility here. It sounds like he's asking how to use `strtol`, which is valid because it requires buffering the file manually to some degree. (`strtok` is totally unrelated; you don't need that.) – Potatoswatter May 26 '14 at 01:38
-
@Zack why would that make it crash? – M.M May 26 '14 at 06:18
-
scanf is pretty much as fast as you can get for parsing ints from strings. 2M values should be read in a fraction of a sec. Please post some code for more details. – vz0 May 26 '14 at 06:53
-
@MattMcNabb It's not guaranteed to crash, but the way `scanf` works, you can easily get stuck in an infinite loop or worse on invalid input. – zwol May 26 '14 at 13:48
-
@Zack you only get stuck in an infinite loop if you wrote an infinite loop in your code. scanf's behaviour on 123cheesesandwich is well-defined. – M.M May 26 '14 at 21:09
-
@Simon Request to accept any of the answers to close the question. – Anmol Singh Jaggi Mar 25 '16 at 11:40
2 Answers
7
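If you want only speed and no error checking, you can write your own function that reads input and parses it as an integer using `getchar()`.

    #include <stdio.h>

    /* Reads the next non-negative decimal integer from stdin.
       No sign handling and no overflow checking. */
    void fast_input(int *int_input)
    {
        int next_char = getchar();   /* int, not char, so EOF can be detected */
        *int_input = 0;
        /* Skip non-digits, but stop at EOF instead of looping forever */
        while (next_char != EOF && (next_char < '0' || next_char > '9'))
            next_char = getchar();
        while (next_char >= '0' && next_char <= '9')
        {
            /* (x << 1) + (x << 3) is x * 10 */
            *int_input = (*int_input << 1) + (*int_input << 3) + next_char - '0';
            next_char = getchar();
        }
    }

    int main(void)
    {
        int x;
        fast_input(&x);
        printf("%d\n", x);
    }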
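To read all 2 000 000 values in a loop, the caller also needs to know when the input has run out, and the data may contain negative numbers. The following is only a sketch of one way to extend the idea above; `read_int` and `sum_ints` are hypothetical names, not part of the answer, and there is still no overflow checking.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: reads the next decimal integer from `in`.
 * Returns 1 if a value was stored in *out, 0 if the input ended first. */
static int read_int(FILE *in, int *out)
{
    int c, negative = 0, value = 0;

    assert(in != NULL && out != NULL);
    c = getc(in);

    /* Skip to the first digit, remembering whether a '-' sits directly before it. */
    while (c != EOF && (c < '0' || c > '9')) {
        negative = (c == '-');
        c = getc(in);
    }
    if (c == EOF)
        return 0;

    while (c >= '0' && c <= '9') {
        value = value * 10 + (c - '0');
        c = getc(in);
    }
    if (c != EOF)
        ungetc(c, in);   /* leave the delimiter (possibly a '-') for the next call */

    *out = negative ? -value : value;
    return 1;
}

/* Example driver: sums every integer in `in` and counts how many were read. */
static long sum_ints(FILE *in, long *count)
{
    long sum = 0;
    int v;

    *count = 0;
    while (read_int(in, &v)) {
        sum += v;
        ++*count;
    }
    return sum;
}
```

Calling `sum_ints(stdin, &n)` would process standard input until EOF. Taking a `FILE *` instead of calling `getchar()` directly is a design choice made here so the same code can read from a file.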

Anmol Singh Jaggi
-
If locking is not a concern and the platform is POSIX compliant, one can use `getchar_unlocked`. Magic numbers like 47, 48, 57, 58 [look bad](http://stackoverflow.com/questions/47882/what-is-a-magic-number-and-why-is-it-bad). Better to replace them with '0', '9' etc. Also, the solution would work for ASCII input. – Mohit Jain May 26 '14 at 07:07
-
Also, making `fast_input` inline can give you some speed if the compiler honours the inline request. – Mohit Jain May 26 '14 at 07:09
-
@MohitJain and what is going to magically make the IO faster when inlined? – sehe May 26 '14 at 07:14
-
@sehe It won't make IO faster, but benchmark results may improve because the function call cost might be saved. Moreover, there is no harm in inlining such a utility. – Mohit Jain May 26 '14 at 07:38
-
@MohitJain Firstly, "better" is very unclear (did the OP say he wants to optimize for CPU/power usage?). In _principle_ there is harm in blindly applying micro-optimizations. In this particular case, though, I will agree, for the simple reason that there will likely not be a difference at all: the compiler will inline that function on your behalf, and may simply ignore your `inline` suggestion just the same. It can even do this across translation units (most compilers are capable of LTO these days). – sehe May 26 '14 at 07:47
-
Edited to replace ASCII codes with their character equivalents. – Anmol Singh Jaggi May 26 '14 at 13:13
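The `getchar_unlocked` suggestion from the comments can be sketched as follows. This is an illustration under assumptions, not the answerer's code: it assumes a POSIX platform, uses the stream variant `getc_unlocked` together with `flockfile`/`funlockfile` so the lock is taken once rather than per character, and `sum_unlocked` is a hypothetical name. Like the answer, it handles only non-negative numbers and does no overflow checking.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: sums all non-negative decimal integers found in `in`. */
static unsigned long sum_unlocked(FILE *in)
{
    unsigned long sum = 0;
    int c;

    assert(in != NULL);
    flockfile(in);                        /* take the stream lock once */
    c = getc_unlocked(in);
    while (c != EOF) {
        if (c >= '0' && c <= '9') {
            unsigned long value = 0;
            do {                          /* accumulate one number */
                value = value * 10 + (unsigned long)(c - '0');
                c = getc_unlocked(in);
            } while (c >= '0' && c <= '9');
            sum += value;
        } else {
            c = getc_unlocked(in);        /* skip a non-digit */
        }
    }
    funlockfile(in);                      /* release the lock */
    return sum;
}
```

Whether this is measurably faster than plain `getchar()` depends on the C library, so it is worth benchmarking on the target system before adopting it.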
5
In my experience, memory-mapped access is much faster for reading a large amount of content from a file.
This can be achieved by
    #include <sys/mman.h>
    void *mmap(void *addr, size_t length, int prot, int flags,
               int fd, off_t offset);
    int munmap(void *addr, size_t length);
... on *Nix and some combination of
CreateFileMapping
OpenFileMapping
MapViewOfFile
MapViewOfFileEx
UnmapViewOfFile
FlushViewOfFile
CloseHandle
... on Windows (refer to the link here).
Basically you want something like:
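    // Needs <fcntl.h>, <stdlib.h>, <sys/mman.h> and <sys/stat.h>; error checks omitted.
    int fd = open("filename", O_RDONLY);
    struct stat st;
    fstat(fd, &st);   // use the real file size instead of a guessed maximum
    char *ptr = mmap(0, st.st_size, PROT_WRITE | PROT_READ, MAP_PRIVATE, fd, 0 /* offset */);
    // NOW READ AS IF ptr IS THE HEAD OF SOME STRING
    // Caveat: strtol needs a terminator. The mapping is zero-filled from the end of the
    // file up to the next page boundary, so this breaks if the file size is an exact
    // multiple of the page size.
    char *thisp = ptr;
    while (thisp < ptr + st.st_size && *thisp) {
        char *end;
        int some_int_you_want = strtol(thisp, &end, 10);
        if (end == thisp) break;   // nothing parsed (trailing whitespace or junk): stop
        thisp = end;
    }
    munmap(ptr, st.st_size);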
I'm not very confident that the code above is correct but it should have the correct idea....
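As the comments on the question point out, the main benefit of `strtol` is that it can report invalid input and overflow. A minimal sketch of that checking follows, assuming the text is already in memory as a NUL-terminated string (for example after reading the whole file into a buffer). `parse_ints` is a hypothetical helper introduced for illustration, not part of this answer.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: parses whitespace-separated decimal integers from the
 * NUL-terminated string `text` into out[0..max-1].
 * Returns the number of values stored, or -1 on invalid input, on overflow,
 * or when the text holds more than `max` values. */
static long parse_ints(const char *text, int *out, size_t max)
{
    const char *p = text;
    size_t n = 0;

    assert(text != NULL && out != NULL);
    for (;;) {
        char *end;
        long v;

        while (isspace((unsigned char)*p))   /* skip separators */
            ++p;
        if (*p == '\0')
            return (long)n;                  /* clean end of input */
        if (n == max)
            return -1;                       /* no room for another value */

        errno = 0;
        v = strtol(p, &end, 10);
        if (end == p)
            return -1;                       /* no digits here, e.g. "cheese" */
        if (errno == ERANGE || v < INT_MIN || v > INT_MAX)
            return -1;                       /* value does not fit in an int */

        out[n++] = (int)v;
        p = end;                             /* continue after the parsed number */
    }
}
```

With this sketch, an input such as `123cheesesandwich` yields -1 rather than silently producing 123, because the second call to `strtol` finds no digits at `cheesesandwich`.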

phoeagon
-
Though memory mapping [doesn't always generate the highest speeds](http://stackoverflow.com/questions/17925051/fast-textfile-reading-in-c/17925143#17925143), also see [How to parse space-separated floats in C++ quickly?](http://stackoverflow.com/questions/17465061/how-to-parse-space-separated-floats-in-c-quickly/17479702#17479702) – sehe May 26 '14 at 07:51