I have XML requests coming in two formats:

<?xml version="1.0" encoding="UTF-8"?>
<Request xmlns="urn:x-facebook-com:DEF.plan.services.test">
  <OneRequest>
    <page_number>1</page_number>
    <page_size>25</page_size>
    <origin>TEST</origin>
    <item_name/>
  </OneRequest>
</Request>

<?xml version="1.0" encoding="UTF-8"?>
<Request xmlns="urn:x-google-com:ABC.plan.services.plans">
 <SecondRequest/>
</Request>

In both cases I want to extract the name of the first tag after <Request>, i.e. OneRequest and SecondRequest (these are dynamic and there are hundreds of them). I tried using regex but did not get exactly what I wanted. Any input or suggestions will be greatly appreciated.

I did see posts about XML parsers, but that seems like overkill when all I want is the first tag after <Request>.

My Attempt

String[] requestTags = requestBody.split("</");
String requestName = requestTags[requestTags.length-2].replaceAll("[^a-zA-Z0-9]",

It's not great: it kind of works on the first format but completely breaks on the second.

Praveen
  • I appreciate the downvote, but unless I know why you are downvoting I can't really fix it. – Praveen Sep 30 '19 at 21:58
  • please share your attempt –  Sep 30 '19 at 22:00
  • _Also did see posts about xml parsers but it seems an overkill_ - I disagree (but did not downvote). Regex just isn't the right tool for the job. I'd parse it and use a simple XPath: `name(/*/*)` ([See here](https://stackoverflow.com/questions/7984508/getting-elements-name-in-xpath/7984537#7984537) for examples of both XPath 1.0 and 2.0.) – Daniel Haley Sep 30 '19 at 22:02
  • @DanielHaley I can try an XML parser, but I had two concerns: 1) all I want is the top-level tag after Request, and 2) the requests are really large, so parsing could start affecting performance and might not go unnoticed. I will give it a try nonetheless and see how it works. – Praveen Sep 30 '19 at 22:05
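
On the performance concern raised in the comments: a streaming (StAX) parser does not need to read the whole document, so it can stop as soon as it sees the second opening tag. The following is a minimal sketch of that idea, not something posted in this thread; the class name `FirstChildName` and the method `firstChildOfRoot` are made up for illustration, and it returns the same name that the XPath `name(/*/*)` would give for these inputs.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper: names are illustrative, not from the question.
class FirstChildName {

    // Returns the local name of the first child element of the root,
    // or null if the root element has no child elements.
    static String firstChildOfRoot(String xml) {
        try {
            XMLInputFactory factory = XMLInputFactory.newInstance();
            // Requests come from outside, so DTDs and external entities are switched off.
            factory.setProperty(XMLInputFactory.SUPPORT_DTD, false);
            factory.setProperty(XMLInputFactory.IS_SUPPORTING_EXTERNAL_ENTITIES, false);

            XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
            try {
                int depth = 0;
                while (reader.hasNext()) {
                    int event = reader.next();
                    if (event == XMLStreamConstants.START_ELEMENT) {
                        depth++;
                        if (depth == 2) {
                            // First element nested directly under the root: stop reading here.
                            return reader.getLocalName();
                        }
                    } else if (event == XMLStreamConstants.END_ELEMENT) {
                        depth--;
                    }
                }
                return null;
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Request body is not well-formed XML", e);
        }
    }
}
```

Because the loop returns at the first element of depth 2, the rest of a large request body is never parsed, and the varying `xmlns` value does not matter since only the local name is read.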

1 Answer


You basically only need the \s whitespace class from regex to achieve this.

Use this regex, and get the value from the tagname group:

<Request .*?>\s*<(?<tagname>.*?)>

see regex101 working example

  • The request comes in as a single string. I added newlines here for readability; apologies for the confusion. – Praveen Sep 30 '19 at 22:13
  • Oh, I updated the link then. It's not good-looking but does the job :) –  Sep 30 '19 at 22:15
  • Sweet, will take a look. Just a quick question: can we use the tagname named group in Java? I am a novice with regex in Java and not sure I have seen it used like that before. – Praveen Sep 30 '19 at 22:17
  • Yes you can, see [here](https://stackoverflow.com/questions/415580/regex-named-groups-in-java) –  Sep 30 '19 at 22:19
  • Also, I see you're relying on the xmlns. I have updated the question, but the values won't be the same. There will be an xmlns, but it can vary from request to request. Apologies @Niklas for the confusion. – Praveen Sep 30 '19 at 22:19
  • I've updated the regex again; it should finally be working, although it's not bulletproof, given that it's such a loose selector. –  Sep 30 '19 at 22:21
  • I get that. It is mostly for logging, so it should not be too bad if it gets it wrong. Also, the one I see now again relies on the tag being on a new line, which it won't be. – Praveen Sep 30 '19 at 22:23
  • Thanks @Niklas, I think this will be perfect. I just need to read up on named groups and this will do it. – Praveen Sep 30 '19 at 22:25
  • Sorry to bother you again, but in the first example it catches every tag name if they are all on the same line. Maybe there is a way to limit it to the first one only. https://regex101.com/r/pu3z1Y/6 – Praveen Sep 30 '19 at 22:36
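
For reference, here is one way the named-group approach might look in Java, as a sketch rather than the answerer's final pattern. The pattern is a tightened variant of the one in the answer (an assumption on my part): `[^>]*` keeps the match inside the `<Request ...>` tag whatever the xmlns value is, and `[^\s/>]+` stops the name before whitespace, `/` or `>`, so a self-closing `<SecondRequest/>` does not include the slash. The class and method names are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: names are illustrative, not from the thread.
class RequestNameRegex {

    // <Request ...>, optional whitespace, then the name of the next opening tag.
    // Named groups such as (?<tagname>...) are supported since Java 7.
    private static final Pattern FIRST_CHILD =
            Pattern.compile("<Request\\b[^>]*>\\s*<(?<tagname>[^\\s/>]+)");

    // Returns the first tag name after <Request>, or null if nothing matches.
    static String firstTagAfterRequest(String requestBody) {
        Matcher m = FIRST_CHILD.matcher(requestBody);
        // A single find() call returns only the first match, even when the
        // whole request is on one line.
        return m.find() ? m.group("tagname") : null;
    }
}
```

Calling `find()` once is what limits the result to the first tag; the "catches all" behaviour on regex101 typically comes from the global flag. This remains a loose selector: for example, an XML comment placed directly after `<Request>` would be picked up as the "tag", which may be acceptable for logging but not for anything stricter.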