10

I tried using strncmp but it only works if I give it a specific number of bytes I want to extract.

char line[256] = "This \"is\" an example.";            // I want to extract "is"
char line[256] = "This is \"also\" an example.";       // I want to extract "also"
char line[256] = "This is the final \"example\".";     // I want to extract "example"
char substring[256];

How would I extract everything between the double quotes and put it in the variable substring?

ShadyBears
  • Tokenize the string, using `"` as the delimiter. Take the second result. See documentation for `strtok`. It's all you need... – Floris Oct 24 '13 at 01:48

5 Answers

15

Note: I edited this answer after I realized that as written the code would cause a problem as strtok doesn't like to operate on const char* variables. This was more an artifact of how I wrote the example than a problem with the underlying principle - but apparently it deserved a double downvote. So I fixed it.

The following works (tested on Mac OS 10.7 using gcc):

#include <stdio.h>
#include <string.h>

int main(void) {
    const char* lineConst = "This \"is\" an example"; // the "input string"
    char line[256];  // where we will put a copy of the input
    char *subString; // the "result"

    strcpy(line, lineConst); // strtok modifies its input, so work on a copy

    subString = strtok(line, "\""); // text before the first double quote
    subString = strtok(NULL, "\""); // text between the first and second double quotes

    printf("the thing in between quotes is '%s'\n", subString);
    return 0;
}

Here is how it works: strtok looks for "delimiters" (second argument) - in this case, the first ". Internally, it knows "how far it got", and if you call it again with NULL as the first argument (instead of a char*), it will start again from there. Thus, on the second call it returns "exactly the string between the first and second double quote". Which is what you wanted.
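To make the "how far it got" behaviour concrete, here is a minimal sketch that keeps calling strtok until it runs out of tokens. The helper name split_on_quotes and its parameters are made up for illustration; they are not part of the answer's code.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: splits buf in place on double quotes and stores up
   to max token pointers in tokens. Returns the number of tokens stored. */
size_t split_on_quotes(char *buf, char *tokens[], size_t max)
{
    size_t n = 0;
    char *tok = strtok(buf, "\"");   /* first call: pass the buffer itself */

    while (tok != NULL && n < max) {
        tokens[n++] = tok;
        tok = strtok(NULL, "\"");    /* later calls: NULL resumes where it stopped */
    }
    return n;
}
```

For a writable buffer holding This "is" an example, this should give three tokens, with tokens[1] pointing at is. Note that buf is modified and the token pointers point into it.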

Warning: strtok replaces delimiters with '\0' as it "eats" the input, so you must count on your input string being modified by this approach. If that is not acceptable, you have to make a local copy first. In essence, I do that in the above when I copy the string constant to a variable. It would be cleaner to do this with a call to line=malloc(strlen(lineConst)+1); and a free(line); afterwards.

If you intend to wrap this inside a function, you have to consider that the return value has to remain valid after the function returns: strtok returns a pointer to the right place inside the string, it doesn't make a copy of the token. Passing a pointer to the space where you want the result to end up, and creating that space inside the function (with the correct size), then copying the result into it, would be the right thing to do. All this is quite subtle. Let me know if this is not clear!
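One way to wrap this in a function might look like the sketch below. It is only an illustration under some assumptions: the name copy_quoted, the caller-supplied destination buffer, and the 0 / -1 return convention are all invented here. It works on a malloc'd copy so the caller's string is left untouched, and copies the token out before freeing that copy.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copies the text between the first pair of double
   quotes in src into dest (capacity destSize). Returns 0 on success, or -1
   if no token was found, it does not fit, or allocation failed. dest is
   left unchanged on failure. */
int copy_quoted(const char *src, char *dest, size_t destSize)
{
    char *copy = malloc(strlen(src) + 1);
    char *token;
    int result = -1;

    if (copy == NULL)
        return -1;
    strcpy(copy, src);               /* strtok will modify this copy */

    /* strtok skips leading delimiters, so if the input starts with a
       quote, the first token is already the quoted text. */
    token = strtok(copy, "\"");
    if (token != NULL && src[0] != '"')
        token = strtok(NULL, "\"");

    if (token != NULL && strlen(token) < destSize) {
        strcpy(dest, token);         /* copy out before the buffer is freed */
        result = 0;
    }
    free(copy);
    return result;
}
```

Called as copy_quoted(line, substring, sizeof substring). Because it relies on strtok, this sketch does not notice a missing closing quote and cannot return an empty string for "".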

Floris
  • Would somebody care to explain what is so unlikeable about my solution? – Floris Oct 24 '13 at 03:13
  • Much cleaner code than the other one above. Thank you. I didn't accept it originally because it didn't compile. I don't think you needed to const char* and then strcpy. I was looking for a simple function call and couldn't figure it out. Runtime is extremely important in a Data Structures class. Thank you! – ShadyBears Oct 24 '13 at 04:29
  • Glad you like it now. The reason for the copy is that a `const char*` may not be modified, but that is what `strtok` does - see e.g. http://stackoverflow.com/questions/9406475/why-is-strtok-changing-its-input-like-this – Floris Oct 24 '13 at 04:47
  • See also important considerations at the end about what happens to your input string, persistence of variables, etc. – Floris Oct 24 '13 at 12:00
  • 1
    I adapted your answer to work for me to get everything after a "=" ... I did this by specifying my delimiter "=" in the first strtok and then specifying "\0" in the second strtok to get the rest of the line. Thank you. – Vince K Jan 24 '20 at 01:41
  • For some reason it doesn't work if the first character in the string is the character that you want to use as the delimiter, e.g. the string "(TESTING)" with parentheses as the delimiters. – Marcell Monteiro Cruz Dec 14 '21 at 11:04
  • @MarcellMonteiroCruz I don’t understand what you are saying. Perhaps it would be best if you posted a separate question with your code snippet, showing what you want to happen and what is actually happening. – Floris Dec 14 '21 at 12:23
2

If you want to do it with no library support...

void extract_between_quotes(char* s, char* dest)
{
   int in_quotes = 0;
   *dest = 0;
   while(*s != 0)
   {
      if(in_quotes)
      {
         if(*s == '"') return;
         dest[0]=*s;
         dest[1]=0;
         dest++;
      }
      else if(*s == '"') in_quotes=1;
      s++;
   }
}

then call it

extract_between_quotes(line, substring);

Keith Nicholas
1

Here is a long way to do this, assuming the string to be extracted is in quotation marks. (Updated with the error check suggested by Keith in the comments below.)

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(){

    char input[100];
    char extract[100];
    int i=0,j=0,k=0,endFlag=0;

    printf("Input string: ");
    fgets(input,sizeof(input),stdin);
    input[strcspn(input, "\n")] = '\0'; // strip the trailing newline, if there is one

    for(i=0;i<strlen(input);i++){
        if(input[i] == '"'){

                j =i+1;
                while(input[j]!='"'){
                     if(input[j] == '\0'){
                         endFlag++;
                         break;
                     }
                     extract[k] = input[j];
                     k++;
                     j++;
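                }
                break; // stop after the first quoted section; otherwise the closing quote starts a second extraction
        }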
    }
    extract[k] = '\0';

    if(endFlag==1){
        printf("1.Your code only had one quotation mark.\n");
        printf("2.So the code extracted everything after that quotation mark\n");
        printf("3.To make sure buffer overflow doesn't happen in this case:\n");
        printf("4.Modify the extract buffer size to be the same as input buffer size\n");

        printf("\nextracted string: %s\n",extract);
    }else{ 
       printf("Extract = %s\n",extract);
    }

    return 0;
}

Output(1):

$ ./test
Input string: extract "this" from this string
Extract = this

Output(2):

$ ./test
Input string: Another example to extract "this gibberish" from this string
Extract = this gibberish

Output(3):(Error check suggested by Kieth)

$ ./test

Input string: are you "happy now Kieth ?
1.Your code only had one quotation mark.
2.So the code extracted everything after that quotation mark
3.To make sure buffer overflow doesn't happen in this case:
4.Modify the extract buffer size to be the same as input buffer size

extracted string: happy now Kieth ?

--------------------------------------------------------------------------------------------------------------------------------

Although it was not asked for, the following code extracts multiple words from the input string as long as they are in quotation marks:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(){

    char input[100];
    char extract[100]; // same size as input, so the extracted text always fits
    int i=0,j=0,k=0,endFlag=0;

    printf("Input string: ");
    fgets(input,sizeof(input),stdin);
    input[strcspn(input, "\n")] = '\0'; // strip the trailing newline, if there is one

    for(i=0;i<strlen(input);i++){
        if(input[i] == '"'){
            if(endFlag==0){
                j =i+1;
                while(input[j]!='"' && input[j]!='\0'){ // also stop at the end of the string if a closing quote is missing
                     extract[k] = input[j];
                     k++;
                     j++;
                }
                endFlag = 1;
            }else{
               endFlag =0;
            }

            //break;
        }
    }

    extract[k] = '\0';

    printf("Extract = %s\n",extract);

    return 0;
}

Output:

$ ./test
Input string: extract "multiple" words "from" this "string"
Extract = multiplefromstring
sukhvir
  • This answer pre-supposes that you know everything about your string except the bit in quotes. Not very likely, is it. Especially with three examples given. – Floris Oct 24 '13 at 02:05
  • this has some serious problems if there's a single quote in a string – Keith Nicholas Oct 24 '13 at 02:17
  • 1
    @KeithNicholas op didn't ask for error checks .. my solution is exactly what was asked by op. I leave the rest to op for error check, Unless they request it – sukhvir Oct 24 '13 at 02:22
  • when dealing with strings you should always respect 0 termination, whether the op asked for it, or even knew to ask for it. It's just the basic concept of zero terminated strings. – Keith Nicholas Oct 24 '13 at 02:34
  • Thanks for all the effort you put into the answer. Much appreciated. – ShadyBears Oct 24 '13 at 02:45
  • @KeithNicholas - if you don't like clunky, then maybe you would like the solution I posted (which initially got some downvotes because of a bug that I believe I subsequently fixed)? Just two calls to `strtok`. – Floris Oct 24 '13 at 03:12
  • yeah, strtok is ok, I was actually expecting a clean strtok solution to be done and accepted, which is why I just did a non lib based solution. – Keith Nicholas Oct 24 '13 at 03:15
  • I am guessing you don't consider my answer (the one that got all the downvotes) to be a "clean strtok solution" then? What's wrong with it? – Floris Oct 24 '13 at 03:41
1
#include <string.h>
...        
substring[0] = '\0';
const char *start = strchr(line, '"') + 1;
strncat(substring, start, strcspn(start, "\""));

Bounds and error checking omitted. Avoid strtok because it has side effects.
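For completeness, here is a sketch of the same strchr / strcspn / strncat idea with the bounds and error checking put back in. The function name extract_quoted, the size parameter and the return codes are assumptions made for this example, not part of the snippet above.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies the text between the first two double quotes
   of line into substring (capacity size). Returns 0 on success, -1 if a
   quote is missing or the text does not fit. */
int extract_quoted(const char *line, char *substring, size_t size)
{
    const char *start;
    size_t len;

    if (size == 0)
        return -1;
    substring[0] = '\0';

    start = strchr(line, '"');
    if (start == NULL)
        return -1;               /* no opening quote */
    start++;                     /* step past the opening quote */

    len = strcspn(start, "\"");  /* characters before the next quote or the end */
    if (start[len] != '"')
        return -1;               /* no closing quote */
    if (len >= size)
        return -1;               /* result plus terminator would not fit */

    strncat(substring, start, len);
    return 0;
}
```

Unlike the strtok approach, this leaves line unmodified and handles an empty pair of quotes by producing an empty string.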

fizzer
0

Have you tried looking at the strchr function? You should be able to call that function twice to get pointers to the first and second instances of the " character and use a combination of memcpy and pointer arithmetic to get what you want.
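A minimal sketch of that suggestion, assuming a caller-supplied destination buffer; the name copy_between_quotes and the return convention are invented for illustration:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: finds the first two double quotes with strchr and
   copies the bytes between them into dest (capacity destSize) with memcpy.
   Returns 0 on success, -1 if a quote is missing or the text does not fit. */
int copy_between_quotes(const char *line, char *dest, size_t destSize)
{
    const char *first = strchr(line, '"');
    const char *second;
    size_t len;

    if (first == NULL)
        return -1;                        /* no opening quote */
    second = strchr(first + 1, '"');
    if (second == NULL)
        return -1;                        /* no closing quote */

    len = (size_t)(second - first - 1);   /* bytes strictly between the quotes */
    if (len >= destSize)
        return -1;                        /* no room for text plus terminator */

    memcpy(dest, first + 1, len);
    dest[len] = '\0';                     /* memcpy does not terminate the string */
    return 0;
}
```

Called as copy_between_quotes(line, substring, sizeof substring).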

godel9