3

Is it possible to extract the data that sits between two delimiters, where each delimiter is a string rather than a single character?

For example, the original string looks like this:

<message%20type%3D"info"%20code%3D"20005">%20<text>Conference%20successfully%20modified</text>%20<data>0117246</data>%20%20</message>%20

and I want the data between the <text> tags. The string I need the data from can vary. It can also look like this:

<message%20type%3D"info"%20code%3D"20001">%20<text>Conference%20deleted</text%20%20<vanity>0116976</vanity>%20</message>%20<message%20type%3D"info"%20code%3D"20002">%20<text>Number%20of%20conferences%20deleted</text>%20<data>1</data>%20%20</message>%20  

but I always need the data present between the <text> tags.

Is this possible in C, or is there an alternative?

tripleee
k.dev
  • possible duplicate of [parsing the value in between two XML tags](http://stackoverflow.com/questions/3493714/parsing-the-value-in-between-two-xml-tags) – tripleee Feb 11 '15 at 10:06
  • Is the malformed close tag ` – tripleee Feb 11 '15 at 10:08
  • I see your question has changed completely .. Please don't change the question..so that makes the answer look wrong – Gopi Feb 11 '15 at 10:37
  • @Gopi I have not changed the question. – k.dev Feb 11 '15 at 10:57
  • I mean the string shown was different when you posted initially it didn't had ` – Gopi Feb 11 '15 at 10:59
  • @tripleee It is similar to "parsing the value in between two XML tags", but the difference is that in this case there could be multiple tags, so the solution given there won't work – k.dev Feb 11 '15 at 13:13
  • @Gopi I fixed the formatting so you're right that the tag wasn't originally *visible* but it was there all along. The StackOverflow markup really sucks in that unsupported tags in body text simply disappear. – tripleee Feb 11 '15 at 14:09
  • XML can certainly contain multiple occurrences of the same tag in the same document or even on the same line. – tripleee Feb 11 '15 at 14:10
  • @tripleee That malformed close tag that you are talking of is actually . I think it got left out while i was copying the string. So in a nut shell the close tag is not – k.dev Feb 12 '15 at 06:19
  • I need to escape the double quotes in my string. Is there a way to escape them dynamically? – k.dev Feb 19 '15 at 09:31

2 Answers

7

I'd go with strstr().

For example:

#include <stdio.h>
#include <string.h>

int main(void) {
    char data[] = "<message%20type%3D\"info\"%20code"
                  "%3D\"20005\">%20<text>Conference%"
                  "20successfully%20modified</text>%"
                  "20<data>0117246</data>%20%20</mes"
                  "sage>%20";
    char *p1, *p2;
    p1 = strstr(data, "<text>");
    if (p1) {
        p2 = strstr(p1, "</text>");
        /* 6 == strlen("<text>"); "%.*s" takes its precision as an int */
        if (p2) printf("%.*s\n", (int)(p2 - p1 - 6), p1 + 6);
    }
    return 0;
}
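
Since the question mentions strings with more than one <text> element, the same strstr() idea can be put in a loop. The following is only a sketch: `next_between` and `collect_text` are hypothetical helper names, and the 64-byte row size is an arbitrary assumption, not something from the original post.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: finds the next span enclosed by open_tag and
 * close_tag at or after s. On success returns a pointer to the first
 * byte after open_tag and stores the span length in *len; returns NULL
 * if no complete pair remains. */
static const char *next_between(const char *s, const char *open_tag,
                                const char *close_tag, size_t *len)
{
    assert(len != NULL);

    const char *start = strstr(s, open_tag);
    if (start == NULL)
        return NULL;
    start += strlen(open_tag);

    const char *end = strstr(start, close_tag);
    if (end == NULL)
        return NULL;

    *len = (size_t)(end - start);
    return start;
}

/* Hypothetical helper: copies every <text>...</text> payload found in s
 * into rows of out (truncating each to 63 chars) and returns how many
 * rows were stored. The row size of 64 is an assumption for the sketch. */
static size_t collect_text(const char *s, char out[][64], size_t max)
{
    size_t count = 0, len;
    const char *p;

    while (count < max &&
           (p = next_between(s, "<text>", "</text>", &len)) != NULL) {
        size_t n = len < 63 ? len : 63;
        memcpy(out[count], p, n);
        out[count][n] = '\0';
        count++;
        s = p + len + strlen("</text>");  /* resume after the close tag */
    }
    return count;
}
```

Calling `collect_text(data, out, 4)` with a `char out[4][64]` buffer would fill one row per <text> element. Note that this matches the literal string "</text>", so with the second sample in the question, where the first close tag is malformed, the first match would run on until the next well-formed </text>.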
pmg
4

There are functions strtok() and strtok_r() which can be used to extract the data based on the delimiters.

char a[100] = "%20Conference%20successfully%20modified%200117246%20%20%20";
char *p = strtok(a,"%");
while(p != NULL)
{
  // Save the value in pointer p
  p = strtok(NULL,"%");
}

If you want the string a to stay unmodified, declare a separate array, char b[100], copy a into it, and pass b to strtok() instead:

strcpy(b,a);

Code and output:

#include <stdio.h>
#include <string.h>  /* strtok(), strcpy() */

int main(void) {
    char a[100] = "%20Conference%20successfully%20modified%200117246%20%20%20";
    char *p = strtok(a,"%");
    char n[20];
    while(p != NULL)
    {
      strcpy(n,p);
      p = strtok(NULL,"%");
      printf("%s\n",n);
    }
    return 0;
}

Output:

20Conference
20successfully
20modified
200117246
20
20
20

PS: strtok() modifies the string passed to it. See the man page: http://linux.die.net/man/3/strtok_r
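
To illustrate the copy-then-tokenize advice with strtok_r(), here is a minimal sketch. `split_copy` is a hypothetical helper name, and the 256-byte scratch buffer and 32-byte token size are assumptions chosen for the example. Keep in mind that strtok() and strtok_r() treat the delimiter argument as a set of single characters, so they cannot match a multi-character delimiter such as "<text>".

```c
#define _POSIX_C_SOURCE 200809L  /* exposes strtok_r under strict -std=c99/c11 */
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: splits src on any character in delims without
 * modifying src, by tokenizing a private copy. Stores up to max tokens
 * (each truncated to 31 chars) and returns the number stored.
 * Returns 0 if src does not fit in the scratch buffer. */
static size_t split_copy(const char *src, const char *delims,
                         char tokens[][32], size_t max)
{
    char scratch[256];  /* assumed upper bound on the input length */
    char *save = NULL;
    size_t count = 0;

    assert(src != NULL && delims != NULL);
    if (strlen(src) >= sizeof scratch)
        return 0;
    strcpy(scratch, src);

    for (char *tok = strtok_r(scratch, delims, &save);
         tok != NULL && count < max;
         tok = strtok_r(NULL, delims, &save)) {
        size_t n = strlen(tok);
        if (n > 31)
            n = 31;
        memcpy(tokens[count], tok, n);
        tokens[count][n] = '\0';
        count++;
    }
    return count;
}
```

Because the tokenizing happens on the local copy, the caller's string is left intact, and the saved state lives in `save` rather than in a hidden static variable.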

Gopi
  • "I always need the data present between the `<text>` tag" You're solving a different problem. – weston Feb 11 '15 at 10:29
  • @weston I have added a `PS`, please check it. If that is the case, then the initial string needs to be saved and a different string should be passed to strtok(). How am I solving some other issue? This is the solution the OP wants; the only addition is that the initial string shouldn't be changed – Gopi Feb 11 '15 at 10:31
  • OP wants to extract the text in between the tags but your example starts with what they want to end with then breaks it up by `%`. – weston Feb 11 '15 at 10:32
  • Oh, I did not realize question was changed. That explains it. – weston Feb 11 '15 at 10:39