1

I saw this question but none of the answers were quite what I was looking for. I've tried strstr but it returns a pointer instead of an integer index.

I need to find if string a contains string b and if so, where it's located, kind of like the index returned by strcmp. Is there a function or easy way to do this in C?

For example, if a is "foobar" and b is "bar", then this function/method would return 3 because "bar" is at index 3 of "foobar".

Any help is appreciated!

MD XF
  • `strstr` does just that - if it doesn't work as you expect it to, you should post your code here and explain the problem you have with the behavior of `strstr` in it. – DUman Nov 04 '16 at 20:48
  • 3
    `"bar"` is at index `3` in `"foobar"`, not `2`. – mch Nov 04 '16 at 20:49

3 Answers

4

You can use strstr for this, along with some pointer arithmetic.

char *result = strstr(a, b);
if (result != NULL) {
    printf("index = %tu\n", result - a);
}

Here, result points a particular number of bytes ahead of a. So if you subtract the two, that's your index.
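
The subtraction can also be wrapped in a small helper. The following is only a minimal sketch: the name `index_of` and the choice of -1 as the "not found" value are assumptions made for illustration, not anything provided by `<string.h>`.

```c
#include <assert.h>
#include <stddef.h>   /* ptrdiff_t */
#include <string.h>   /* strstr */

/* Hypothetical helper, not part of the C library: returns the index of the
 * first occurrence of needle in haystack, or -1 if needle does not occur. */
ptrdiff_t index_of(const char *haystack, const char *needle)
{
    assert(haystack != NULL && needle != NULL);

    const char *match = strstr(haystack, needle);
    if (match == NULL) {
        return -1;
    }
    /* match points into haystack, so the pointer difference is well defined. */
    return match - haystack;
}
```

With this sketch, `index_of("foobar", "bar")` evaluates to 3 and `index_of("foobar", "baz")` to -1. An empty `needle` yields 0, because `strstr` returns the start of the haystack in that case.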

dbush
  • 1
    Correct. Note that Microsoft's C library and some others might not support the `%tu` conversion specifier. On such systems, use `printf("index = %llu\n", (unsigned long long)(result - a));` – chqrlie Nov 04 '16 at 21:39
1

You can easily convert the returned pointer into an index by subtracting the pointer to the beginning of the string, a:

char *p = strstr(a, b);
int i = p ? p - a : -1;

(Also, strcmp doesn't return an index.)
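
For contrast, a function that does report where two strings first differ would have to be written by hand. Here is a rough sketch; the name `first_mismatch` and the -1 return value for equal strings are made up for illustration, and nothing like it exists in the standard library.

```c
#include <assert.h>
#include <stddef.h>   /* ptrdiff_t */

/* Hypothetical helper: returns the index of the first position at which
 * a and b differ, or -1 if the two strings are equal. */
ptrdiff_t first_mismatch(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);

    ptrdiff_t i = 0;
    /* Advance while both strings hold the same non-terminator character. */
    while (a[i] != '\0' && a[i] == b[i]) {
        i++;
    }
    /* The loop stopped either at a difference or at the end of both strings. */
    return (a[i] == b[i]) ? -1 : i;
}
```

For example, `first_mismatch("foobar", "foobaz")` evaluates to 5, and `first_mismatch("foo", "foobar")` to 3, since the shorter string ends there.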

melpomene
  • Does `strcmp` not return the first point at which the first argument and the second one differ? – MD XF Nov 04 '16 at 20:51
  • 1
    @MD XF - no it does not. `strcmp()` can only detect the nature of the difference (is one lexicographically less than, equal to, or greater than the other). It returns no information about where the difference is. – Peter Nov 04 '16 at 21:01
  • 3
    @melpomene - `i` would be better with type `ptrdiff_t`. That is the type of the result of subtracting pointers, and the result is not necessarily able to be stored in an `int`. Otherwise, your approach is okay: `strstr(a,b)`, if not `NULL`, cannot give a result less than `a`, so the subtraction `p-a` will not give a negative value. – Peter Nov 04 '16 at 21:08
  • @MDXF - `strcmp()` compares the strings in lexical order. Consider your example `foobar` and `bar`. It will compare `f` and `b` and will return 4 because `b` is 4 orders ahead of `f`. If you compare `foobar` and `Foobar`, it will return 32 because `f` and `F` are 32 orders apart. If the first bytes are equal it will proceed to compare the next bytes until it finds a difference or the end of either string – alvits Nov 04 '16 at 21:08
  • 1
    @melpomene `/tmp/test foobar Foobar` - result: `The difference between foobar and Foobar is 32`. Here's another `The difference between foobar and bar is 4`. The statement: `printf("The difference between %s and %s is %d\n", argv[1], argv[2], strcmp(argv[1], argv[2]));`. So what's wrong? Care to explain? When in doubt, test the code. – alvits Nov 04 '16 at 21:16
  • @melpomene http://www.programiz.com/c-programming/library-function/string.h/strcmp - _The first unmatched character between string str1 and str2 is third character. The ASCII value of 'c' is 99 and the ASCII value of 'C' is 67. Hence, when strings str1 and str2 are compared, the return value is 32._ – alvits Nov 04 '16 at 21:27
  • 2
    @alvits: The tutorial from programiz.com is misleading: the value `32` could be any int value greater than `0`. Nothing can be assumed about the return value of `strcmp(a,b);` beyond the fact that it is `0` if the strings are the same, negative if `a` is lexicographically less than `b` and positive otherwise. – chqrlie Nov 04 '16 at 21:50
  • @chqrlie - Here's NetBSD source code for `strcmp()` http://ftp.netbsd.org/pub/NetBSD/NetBSD-current/src/common/lib/libc/string/strcmp.c and Microsoft's http://research.microsoft.com/en-us/um/redmond/projects/invisible/src/crt/strcmp.c.htm. In both cases, they return the difference between the characters that differ. You can still argue but all sources show they return the difference of the character that differs. – alvits Nov 04 '16 at 22:11
  • @chqrlie - and here's from GNU http://ask.systutorials.com/71/strcmp-and-strncmp-implementation-in-glibc as posted in systutorials.com – alvits Nov 04 '16 at 22:14
  • 1
    @alvits: Read the description in programiz.com, the C Standard, the cppreference... All will tell you exactly what I wrote in my comment. The fact that multiple implementations of `strcmp()` happen to return the difference of the mismatched characters is irrelevant, it is just an efficient way to implement `strcmp`. If your code relies on that, it is bogus. Comparing the return value of `strcmp()` to anything else than `0` is meaningless. – chqrlie Nov 04 '16 at 22:14
  • @chqrlie - _Comparing the return value of strcmp() to anything else than 0 is meaningless_ - I don't argue with that. You are absolutely correct. I am defending my previous comment based on the actual implementations in glibc regarding the first comment by MD XF in this answer. – alvits Nov 04 '16 at 22:20
  • @chqrlie - _The tutorial from programiz.com is misleading: the value 32 could be any int value greater than 0_ - the source code I posted is to back up my claim and to show that the returned value is not just any integer. – alvits Nov 04 '16 at 22:23
  • 1
    @alvits: The problem with your wording is that it may lead other programmers to believe that the return value must be the difference of the mismatched characters. Given the OP's original misconception about `strcmp()`, it is important to be concise and unambiguous. BTW did you notice the bug in MSVC's implementation? – chqrlie Nov 04 '16 at 22:25
  • 1
    @alvits: this comment is incorrect and misleading: *strcmp() compares the strings in lexical order. Consider your example foobar and bar. It will compare f and b and will return 4 because b is 4 orders ahead of f. If you compare foobar and Foobar, it will return 32 because f and F are 32 orders apart. If the first bytes are equal it will proceed to compare the next bytes until it finds a difference or the end of either string* – chqrlie Nov 04 '16 at 22:27
  • @chqrlie - that statement is to explain to MD XF why he is seeing return values that he thought would've been the location of the mismatch. It is based on the current implementations in GNU glibc, NetBSD, and Microsoft's library. If you find a better wording to describe the current implementation, please let me know and I'll reword. – alvits Nov 04 '16 at 22:32
  • @chqrlie - you can still argue that the current implementation is not standard and may change in the future. I will agree with that, and no one should rely on the value beyond checking whether it is equal to 0, less than 0, or greater than 0, which will be future-proof. – alvits Nov 04 '16 at 22:34
  • 2
    The comment in the MSVC source code is rather clear: *Compare strings pointed by pSrt1, pSrt2. Return 0 if same, < 0 if pStr1 is less then pStr2, > 0 otherwise.* Explaining to the OP how a particular library computes the different values returned is counterproductive: many casual readers will believe that you are describing what it must do instead of what it happens to do. Such misconceptions are difficult to dislodge. – chqrlie Nov 04 '16 at 23:22
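
To make the point of this thread concrete: only the sign of the value returned by `strcmp` is specified. One way to keep calling code from depending on the magnitude is to reduce the result to -1, 0 or 1 right away. This is a sketch only, and the wrapper name `compare_sign` is hypothetical.

```c
#include <assert.h>
#include <string.h>   /* strcmp */

/* Hypothetical wrapper: maps the result of strcmp to -1, 0 or 1, so callers
 * see only the sign, which is the part the C standard specifies. */
int compare_sign(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);

    int r = strcmp(a, b);
    /* (r > 0) and (r < 0) each evaluate to 0 or 1. */
    return (r > 0) - (r < 0);
}
```

Whatever a given library returns for `strcmp("foobar", "bar")`, be it 4, 1 or some other positive number, `compare_sign("foobar", "bar")` is 1.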
0

Here you go:

#include <stdio.h>
#include <string.h>

int main(void)
{
    char str1[]="foobar";
    char str2[]="bar";
    char *ptr;
    if((ptr = strstr(str1, str2)) != NULL) {
        printf("Start offset: %td\n", ptr - str1);
    }

    return 0;
}
Kamyar Souri
  • Testing will not save you from UB. – melpomene Nov 04 '16 at 21:03
  • My understanding is ptrdiff_t is a signed int which is what %d in printf expects. Is that true? – Kamyar Souri Nov 04 '16 at 21:25
  • No: `ptrdiff_t` is a signed type that can be `int` or larger. On 64-bit Linux, it is `long`; on 64-bit Windows, it is `long long`. Either use `printf("Start offset: %td\n", ptr - str1);` or `printf("Start offset: %lld\n", (long long)(ptr - str1));` – chqrlie Nov 04 '16 at 21:41
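
The two options from the last comment can be put side by side. The sketch below assumes a hypothetical helper, `format_offset`, that writes into a caller-supplied buffer. `%td` is the C99 conversion for `ptrdiff_t`, and the cast to `long long` with `%lld` is the fallback for libraries that do not support the `t` length modifier.

```c
#include <assert.h>
#include <stddef.h>   /* ptrdiff_t, size_t */
#include <stdio.h>    /* snprintf */

/* Hypothetical helper: formats a pointer difference as decimal text.
 * use_cast selects the "%lld" fallback instead of "%td".
 * Returns the value reported by snprintf. */
int format_offset(char *buf, size_t size, ptrdiff_t offset, int use_cast)
{
    assert(buf != NULL && size > 0);

    if (use_cast) {
        /* Fallback: convert explicitly to a type with a matching specifier. */
        return snprintf(buf, size, "%lld", (long long)offset);
    }
    /* C99 and later: the 't' length modifier matches ptrdiff_t. */
    return snprintf(buf, size, "%td", offset);
}
```

Both paths produce the same text for the same offset; what matters is that the conversion specifier and the argument type agree, which plain `%d` does not guarantee for a `ptrdiff_t`.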