c89
gcc (GCC) 4.7.2

Hello,

I am looking at some string functions as I need to search for different words in a sentence.

I am just wondering whether the C standard library functions are fully optimized.

For example, functions like memchr, strstr, strspn, strchr, etc.

I am asking in terms of high performance, as that is what I need. Is there anything better?
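
Something like this is what I have in mind (a minimal illustration, not my real code):

```c
#include <stdio.h>
#include <string.h>

int main(void)
{
    const char *sentence = "the quick brown fox jumps over the lazy dog";
    const char *word = "fox";
    const char *hit = strstr(sentence, word); /* find first occurrence */

    if (hit != NULL)
        printf("found \"%s\" at offset %ld\n", word, (long)(hit - sentence));
    else
        printf("\"%s\" not found\n", word);
    return 0;
}
```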

Regards,

ant2009
  • If searching for the same words in many sentences, there are far faster methods. See, e.g., http://stackoverflow.com/questions/3183582/what-is-the-fastest-substring-search-algorithm – Jim Balter Nov 15 '12 at 08:26

3 Answers


You will almost certainly find that the standard library functions have been optimised as much as they can be, and they will probably outdo anything you code up in C.

Note that this is for the general case. If there is some restriction you're able to put on the functions, or some extra piece of information you have on the data itself, you may be able to get your code to run faster, since you have the advantage of that restriction or information.

For example, I've written C code for malloc that blew a library-supplied malloc away because I knew the application would never ask for more than 256 bytes, so I just handed out 256 bytes on every request. Yes, that wasted memory, but it allowed speed improvements beyond the general case.
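
A rough sketch of that approach (illustrative names and pool size, not the original code):

```c
#include <stddef.h>

#define BLOCK_SIZE  256   /* every request gets exactly this much */
#define POOL_BLOCKS 1024  /* illustrative pool size */

union block {
    union block *next;                 /* free-list link while free */
    unsigned char payload[BLOCK_SIZE]; /* user data while allocated */
};

static union block pool[POOL_BLOCKS];
static union block *free_list = NULL;
static size_t next_unused = 0;

/* Hand out a fixed 256-byte block; no size bookkeeping needed. */
static void *fixed_malloc(size_t size)
{
    union block *b;

    if (size > BLOCK_SIZE)
        return NULL;                   /* the app never asks for more */
    if (free_list != NULL) {
        b = free_list;
        free_list = b->next;
        return b->payload;
    }
    if (next_unused < POOL_BLOCKS)
        return pool[next_unused++].payload;
    return NULL;                       /* pool exhausted */
}

/* Freeing is just pushing the block back onto the free list. */
static void fixed_free(void *p)
{
    union block *b = (union block *)p;

    b->next = free_list;
    free_list = b;
}
```

Because every block is the same size, allocation and freeing are a couple of pointer operations each, with no searching, splitting, or coalescing.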

But, for the general case, you're better off sticking with the supplied stuff.

paxdiablo
  • This misses an important point ... just because those functions are optimized for what they do does not mean that they are the optimal tools for the problem, in this case text search. The only function that is even relevant is `strstr`, and it's considerably less efficient than other methods if searching for the same words in many sentences or for many words in the same sentence. – Jim Balter Nov 15 '12 at 08:21
  • Jim, I thought that would be covered by the "some restriction" part of the answer, though more as "extra information", so I'll clarify: if you _know_ you won't be using the general case, you can optimise beyond the standard library functions. – paxdiablo Nov 15 '12 at 08:33
  • I would not expect someone to be able to infer from that general statement that there are algorithms such as Boyer-Moore etc. ... see my comment/link above. It really isn't about restrictions or extra information, it's about the fact that significant improvements in speed are available when operating on multiple strings, where costly constant setup time is amortized. – Jim Balter Nov 15 '12 at 08:54
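
To make the comments above concrete, here is a minimal Boyer-Moore-Horspool sketch (illustrative code, not from the thread). The shift table is built once per word, so its cost is amortized when the same word is searched for in many sentences:

```c
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* Build the bad-character shift table once per search word. */
static void build_shifts(const char *word, size_t wlen,
                         size_t shifts[UCHAR_MAX + 1])
{
    size_t i;

    for (i = 0; i <= UCHAR_MAX; i++)
        shifts[i] = wlen;              /* default: skip the whole word */
    for (i = 0; i + 1 < wlen; i++)
        shifts[(unsigned char)word[i]] = wlen - 1 - i;
}

/* Search one sentence, reusing the precomputed table. */
static const char *horspool_search(const char *text, size_t tlen,
                                   const char *word, size_t wlen,
                                   const size_t shifts[UCHAR_MAX + 1])
{
    size_t pos = 0;

    if (wlen == 0 || wlen > tlen)
        return NULL;
    while (pos <= tlen - wlen) {
        if (memcmp(text + pos, word, wlen) == 0)
            return text + pos;
        /* Shift by the text character aligned with the word's last char. */
        pos += shifts[(unsigned char)text[pos + wlen - 1]];
    }
    return NULL;
}
```

Call build_shifts once per word, then horspool_search on each sentence; the per-sentence work is typically sub-linear in the text length, which is where the speedup over repeated `strstr` calls comes from.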

Fully optimized? Optimized for what?

Yes, the C standard library functions are written to be very efficient and have been tested and debugged for years, so you definitely shouldn't worry about most of them.

Oleksandr Kravchuk

Assuming that you always align your data to 16-byte boundaries and allocate about 16 bytes of extra space each time, it's definitely possible to speed up most stdlib routines.

But if, e.g., the string length is not known in advance, or reading even one byte too many can cause a segmentation fault, I wouldn't bother.
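
For concreteness, here is the sort of routine those assumptions enable: a sketch of a 16-bytes-at-a-time strlen using SSE2 intrinsics (illustrative only; `__builtin_ctz` is a GCC extension):

```c
#include <emmintrin.h> /* SSE2 intrinsics; on gcc, compile with -msse2 */
#include <stddef.h>

/* Sketch only.  Assumes s is 16-byte aligned, so each aligned load
 * stays within one page and cannot fault, although it may read bytes
 * past the terminator (hence the "16 bytes extra" of allocation). */
static size_t strlen16(const char *s)
{
    const __m128i zero = _mm_setzero_si128();
    size_t i = 0;

    for (;;) {
        __m128i chunk = _mm_load_si128((const __m128i *)(s + i));
        int mask = _mm_movemask_epi8(_mm_cmpeq_epi8(chunk, zero));

        if (mask != 0)
            return i + (size_t)__builtin_ctz(mask); /* first NUL byte */
        i += 16;
    }
}
```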

Aki Suihkonen