
I have read a large XML file into R as a character vector and wondered whether it is possible to search for records that contain specific words. Most of the information I have found assumes the data is in a data frame; mine, however, is many rows of characters, each row being one entry in the XML file.

Unfortunately, I am unsure how to post XML text here: when I type it in XML format, the formatting disappears, so although I know I should not post images, I am unsure how else to convey it. If a row of characters contains, among other things, a category called models, as in the example below, and I want to search only for models that are Sud, how would I do this? I am relatively new to R.

example XML

zx8754
Seq76
  • Why not parse the xml: https://stackoverflow.com/q/17198658/680068 – zx8754 Nov 24 '22 at 12:42
  • I'm looking for something relatively simple, along the lines of a function similar to SELECT * FROM DATA WHERE COLUMN = VALUE from SQL. Is there any way to do this in R when reading/parsing an XML file? – Seq76 Nov 24 '22 at 19:06
  • I would parse it. Alternatively, you could try to use: `myVector[ grepl(">Sud<", myVector, fixed = TRUE) ]` – zx8754 Nov 24 '22 at 20:25
  • Provide example data, `dput(head(myData))` – zx8754 Nov 24 '22 at 20:26
  • Moved my comment into an answer, see below. – zx8754 Nov 25 '22 at 08:33

1 Answer


I would parse it using a dedicated package; see this post: How to parse an XML file to an R data frame? (https://stackoverflow.com/q/17198658/680068).
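As a rough sketch of what parsing could look like, here is one way to do it with the xml2 package, which is just one of several XML packages for R. The element names (`records`, `record`, `id`, `models`) are made up for illustration, since the real file's structure is only shown in the image. Once the records are in a data frame, `subset()` behaves much like the SQL `WHERE` clause mentioned in the comments.

```r
library(xml2)

# Hypothetical XML: the tag names below are assumptions, not the asker's real schema.
doc <- read_xml('<records>
  <record><id>1</id><models>Sud</models></record>
  <record><id>2</id><models>Nord</models></record>
  <record><id>3</id><models>Sud</models></record>
</records>')

# One node per record
records <- xml_find_all(doc, "//record")

# Pull one child element per record into a column
df <- data.frame(
  id     = xml_text(xml_find_first(records, "./id")),
  models = xml_text(xml_find_first(records, "./models"))
)

# Roughly: SELECT * FROM df WHERE models = 'Sud'
sud <- subset(df, models == "Sud")
sud
```

With a real file, `read_xml()` would take the file path instead of a literal string, and the XPath expressions would need to match the actual tag names.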

Alternatively, you could try to use grepl:

myVector[ grepl(">Sud<", myVector, fixed = TRUE) ]
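To make the one-liner above concrete, here is a small self-contained sketch in base R. The contents of `myVector` are invented stand-ins (one XML record per element, with assumed tag names), not the asker's data.

```r
# Hypothetical stand-in for the character vector from the question:
# one XML record per element. The tag names are assumptions.
myVector <- c(
  "<record><id>1</id><models>Sud</models></record>",
  "<record><id>2</id><models>Nord</models></record>",
  "<record><id>3</id><models>Sud-Est</models></record>",
  "<record><id>4</id><models>Sud</models></record>"
)

# fixed = TRUE treats ">Sud<" as a literal string, not a regular expression.
# The surrounding ">" and "<" mean only an element whose whole text is "Sud"
# matches, so "Sud-Est" is not selected.
is_sud <- grepl(">Sud<", myVector, fixed = TRUE)
sud_records <- myVector[is_sud]

# A stricter variant that also names the tag, in case some other element
# could contain the text "Sud".
sud_models <- myVector[grepl("<models>Sud</models>", myVector, fixed = TRUE)]
```

Note that a literal match like this is sensitive to formatting: it would miss a record written as `<models> Sud </models>` or one where the tag carries attributes, which is one reason parsing tends to be more robust.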
zx8754
  • Thank you so much, the grepl option worked wonders for me, thanks so much for your help! – Seq76 Nov 25 '22 at 10:44