For whatever stupid reason, the API I'm querying returns file paths as XPath expressions. I'm trying to figure out a way to convert them back into file paths. For example:

/content/folder[@name='Dropbox']
----> content/Dropbox

/content/folder[@name='whatever']/folder[@name='something else']/report[@name='why']
----> content/whatever/something else/why

/content/folder[@name='A/B']
----> content/A/B

The best idea I have right now is to split the XPath on the / delimiter and then interpret each step individually, but that will break on the third test case above, where the name itself contains a /. Alternatively, I could write a pushdown automaton and do it myself. Is there an XPath reverse parser that I can use to turn these strings into plain paths?

Jakob Lovern

2 Answers

Perhaps use a regular expression find and replace. Look for substrings matching this regular expression:

[^\/]*\[@name='([^']+)']

... and replace each match with its captured group. For example: https://regex101.com/r/n42SqR/1
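The find and replace described above can be sketched in Python as follows. This is only an illustration under a few assumptions: the function name `xpath_to_path` is made up for this example, the names are always single-quoted, and the leading slash is stripped afterwards so the result matches the outputs in the question.

```python
import re

# Pattern from the answer above: an element name followed by [@name='...'].
# The single capture group holds the value of the name attribute.
STEP = re.compile(r"[^\/]*\[@name='([^']+)']")


def xpath_to_path(xpath: str) -> str:
    """Replace each step such as folder[@name='X'] with X (hypothetical helper)."""
    # A function is used as the replacement so the captured text is inserted verbatim.
    replaced = STEP.sub(lambda m: m.group(1), xpath)
    # The regex leaves the leading "/" in place; drop it to match the question's examples.
    return replaced.lstrip("/")


print(xpath_to_path("/content/folder[@name='A/B']"))  # content/A/B
```

Because the name is taken from between the quotes rather than by splitting on /, a slash inside a name survives intact.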

Conal Tuohy
  • I don't know C#, but I think this is the functionality I'm thinking of: https://learn.microsoft.com/en-us/dotnet/standard/base-types/substitutions-in-regular-expressions – Conal Tuohy Aug 06 '22 at 12:43

As it turns out, XPath allows strings to be either single-quoted or double-quoted, and characters within the string can be escaped. Building off this answer, I created the following regex (each "" stands for a single ", as in a C# verbatim string):

/\w+\[@name=(?:""(?<name>[^""\\]*(?:\\.[^""\\]*)*)""|'(?<name>[^'\\]*(?:\\.[^'\\]*)*)')\]

And a breakdown:

/\w+\[@name=                            # Match the element name and the start of the @name predicate.
(?:                                     # Non-capturing group
    "(?<name>[^"\\]*(?:\\.[^"\\]*)*)"   # Match a double-quoted string with escapes. Save it to the <name> capture group.
   |'(?<name>[^'\\]*(?:\\.[^'\\]*)*)'   # OR match a single-quoted string with escapes. Save it to the <name> capture group.
)\]                                     # End the non-capturing group and match a literal ]
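
For anyone who wants to try the breakdown above outside .NET, here is a rough Python adaptation. It is a sketch, not a drop-in equivalent: Python's re module does not accept two groups with the same name, so the hypothetical group names `dq` and `sq` replace the shared <name> group, and the helper `xpath_to_path` is made up for this example. The unescaping step assumes a backslash escapes exactly the one character after it, which is what the regex implies but is not confirmed for the API in the question.

```python
import re

# Same structure as the breakdown above, with one named group per quote style.
STEP = re.compile(r"""
    /\w+\[@name=                            # element name and start of the predicate
    (?:
        "(?P<dq>[^"\\]*(?:\\.[^"\\]*)*)"    # double-quoted value, backslash escapes allowed
      | '(?P<sq>[^'\\]*(?:\\.[^'\\]*)*)'    # single-quoted value, backslash escapes allowed
    )\]
""", re.VERBOSE)


def xpath_to_path(xpath: str) -> str:
    """Turn /content/folder[@name='X'] into content/X (hypothetical helper)."""
    def name_of(match: re.Match) -> str:
        raw = match.group("dq") if match.group("dq") is not None else match.group("sq")
        # Assumption: "\c" stands for the literal character c.
        unescaped = re.sub(r"\\(.)", r"\1", raw)
        # The pattern consumed the step's leading "/", so put it back.
        return "/" + unescaped

    return STEP.sub(name_of, xpath).lstrip("/")
```

Called on /content/folder[@name="A/B"], this sketch returns content/A/B, the same result as for the single-quoted form.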
Jakob Lovern