I am trying to write a script that searches a log file for a raw XML request spanning multiple XML tags and copies the result to an external file or a custom property in SoapUI.

Currently I am trying this:

// read the file from the given path
def file = new File('PathToLogFile.log')

def data= file.filterLine { 
    it =~ /(?ms)(<OpeningRequestTag">[\s\S]*?<\/ClosingRequestTag)/
}

The problem is that it does not find the blocks delimited by these opening and closing tags, which is a bit strange, since I have checked the regular expression in regex101 and it finds what I need there.

I have also tried:

def data= file.filterLine { 
        it =~ /(?ms)(<OpeningRequesTag">[\s\S]*?<\/ClosingRequestTag)/
}

but again, no luck :(. Can you tell me what I should change in order to select the set of XML tags that I want? Note that the opening and closing tags are not identical: the opening tag contains additional information. It looks like this:

<RequestTag 343.75676.76.767>
.
.
.
<RequestTag>

Thank you!

1 Answer

Let's say your sample raw request is:

 POST https://www.udzial.com HTTP/1.1
 Accept-Encoding: gzip,deflate
 Content-Type: text/xml;charset=UTF-8
 SOAPAction: http://www.udzial.com
 Content-Length: 69476
 Host: www.udzial.com
 Connection: Keep-Alive
 User-Agent: Apache-HttpClient/4.5.2 (Java/1.7.0_162)
<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>

Let us say that it is part of the RawRequest of Request1, where Request1 is the name of the request in SoapUI.

Then the code below can extract the XML between a start tag and an end tag:

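// SoapUI helper; "context" is provided by the Groovy test step
def groovyUtils = new com.eviware.soapui.support.GroovyUtils(context)
// Load the raw request of the test step named "Request1"
def xml = groovyUtils.getXmlHolder("Request1#RawRequest")
String string = xml.getXml()
String starttag = "to"
String endtag = "heading"
// log.info string

// (?s) lets . match line breaks; .*? is lazy, so the match stops at the first end tag
def extract = (string =~ /(?s)<${starttag}.*?${endtag}>/)

// extract[0] is the first match
log.info extract[0]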

The output of the above code is:

Mon Jul 16 17:14:39 IST 2018: INFO: <to>Tove</to>
<from>Jani</from>
<heading>

There could be two problems in your code:

  1. The `"` in the regular expression is not required.
  2. Variables inside the regular expression should be referenced with `$`, as in `${starttag}`.
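
One further point that may be worth checking (an observation, not something stated in the answer): `filterLine` hands the closure one line at a time, so a pattern that has to span several lines cannot match there, whatever flags are set. The sketch below uses a made-up log snippet, with tag names borrowed from the question's example and an assumed `</RequestTag>` closing tag, and compares line-by-line filtering with matching against the whole text. It uses `println` so it can run outside SoapUI; inside a test step `log.info` would be the equivalent.

```groovy
// Hypothetical log content; tag names and values are invented for illustration.
def logText = '''INFO start
<RequestTag 343.75676.76.767>
<id>1</id>
</RequestTag>
INFO end
'''

// (?s) lets . match line breaks; .*? stops at the first closing tag.
def blockPattern = /(?s)<RequestTag .*?<\/RequestTag>/

// filterLine tests each line separately, so a block spread over
// several lines is never seen as a whole and no line is kept.
def perLine = new StringReader(logText).filterLine { it =~ blockPattern }.toString()

// Matching against the full text lets the pattern cross line breaks.
def matcher = (logText =~ blockPattern)
def block = matcher.find() ? matcher.group() : null

println "filterLine kept: '${perLine.trim()}'"
println "whole-text match:\n${block}"
```

For a real log file, the whole text can be obtained with `new File(path).text`, as suggested in the comments below.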
Gaurav Khurana
  • Hi, the file from which I am trying to extract the XML request is a **.log** file that is updated each time an application is submitted. The format of the request is always the same; the unique identification is contained in some of the tags in the request, so eventually we will have tons of requests that start and end with the same tags. With `def xml=groovyUtils.getXmlHolder("Request1#RawRequest")` I am unable to point to the file, so can you tell me how I can do it for this .log file? – Kristiyan Damyanov Jul 16 '18 at 14:02
  • `String fileContents = new File('D:\\automation\\sample.log').text` (from https://stackoverflow.com/questions/7729302/how-to-read-a-file-in-groovy-into-a-string). Use this line and replace `string` with `fileContents` in the above code; then it works, I have tried it. If it solves your problem, you can accept it as the solution. – Gaurav Khurana Jul 17 '18 at 03:12
  • Hi Gaurav, thank you very much for the help - it seems that the magic works! One more quick question, do you know how to configure the regular expression so that it will find a specific block out of many blocks that contain the same start and end tags. Should it be something like `def extract= (fileContents =~ /(?s)<${starttag}.*?value${endtag}>/)` ? Or can we configure it so that it will take the latest block from the whole document? – Kristiyan Damyanov Jul 23 '18 at 13:19
  • Thank you. Can you accept this as the solution by ticking the check mark next to my answer, under the upvote button? Is the existing code not helping? I mean `def extract= (string =~ /(?s)<${starttag}.*?${endtag}>/)` should work, as it returns a particular block. You can post a fresh question and share a link, as I think I do not correctly understand what you are asking. – Gaurav Khurana Jul 24 '18 at 02:43
  • Hi, I have accepted this as the solution, it works fine :) The current problem is that the block I want to extract is present more than once in the log. Since this log is always filled with similar requests, I don't know the exact count, so I want to find a way to extract one specific block out of all of them. I was wondering whether I can put, between the opening and closing tags in the regular expression, a specific tag related to the application's unique ID, which would tell the system to extract this specific block. If it is easier for you, I can post a new question :) – Kristiyan Damyanov Jul 24 '18 at 07:28
  • OK, so basically you want the text between the first and last occurrence of the tag. If that is the case, you can drop the `?` from the expression. The `?` after `.*` makes the quantifier non-greedy, so it stops at the first occurrence of the tag. If you remove the `?`, the match becomes greedy and stops at the last tag. – Gaurav Khurana Jul 24 '18 at 10:43
  • Ok, so if it stops at the last tag I will have extracted only the last one or all of the tags encountered? – Kristiyan Damyanov Jul 24 '18 at 11:44
  • It will pick the last one. For example, suppose 4324554353453454889 is your sample and you write `3.*4`; then it will give you 324554353453454 (from the first 3 to the last 4). But with the non-greedy expression `3.*?4` you will get 324, i.e. up to the first occurrence of 4. – Gaurav Khurana Jul 25 '18 at 03:00
  • I see it now :) But if I want to take it *from the last 3 to the last 4* i.e. in your example 324554353453454 (last 3 and last 4) to be 3454 how should I do it? – Kristiyan Damyanov Jul 25 '18 at 06:49
  • `def a = "324554353453454"; def pattern = (a =~ /3.*?4/); log.info pattern.size(); log.info pattern[2]` So all the matches are captured, and you can then point to the last one, as they are all in that array. – Gaurav Khurana Jul 25 '18 at 08:58
  • I think this should work, but how can I define it so that it brings the last block from the log? I have tried `log.info extract[pattern.size().toString()-1]` but it is not working. – Kristiyan Damyanov Jul 26 '18 at 14:21
  • Post it as a new question. It would be easier to look at. You can then give the link here. – Gaurav Khurana Sep 20 '18 at 11:35
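
The follow-up about picking one block out of many is left open in the comments. A minimal sketch of one possible approach follows, under stated assumptions: the log text, the `</RequestTag>` closing tag, the `<appId>` tag and the ID values are all hypothetical. It collects every lazy match with `findAll`, then takes either the last block or the block containing a given ID. The digit string from the comments is included to show greedy versus lazy matching and the "last 3 to last 4" case.

```groovy
// Hypothetical log with three request blocks; tag names and IDs are invented.
def logText = '''
<RequestTag 1.1>
<appId>A-100</appId>
</RequestTag>
noise between requests
<RequestTag 2.2>
<appId>B-200</appId>
</RequestTag>
<RequestTag 3.3>
<appId>C-300</appId>
</RequestTag>
'''

String startTag = 'RequestTag'
String endTag = 'RequestTag'

// The lazy quantifier (.*?) stops at the first closing tag, so every block
// becomes its own match; findAll returns all matches as a list of strings.
List<String> blocks = logText.findAll(/(?s)<${startTag} .*?<\/${endTag}>/)

// Latest block in the log (null if nothing matched).
String lastBlock = blocks ? blocks.last() : null

// Block that mentions a particular application ID (hypothetical <appId> tag).
String wantedId = 'B-200'
String wantedBlock = blocks.find { it.contains("<appId>${wantedId}</appId>") }

println "found ${blocks.size()} blocks"
println "last block:\n${lastBlock}"
println "block for ${wantedId}:\n${wantedBlock}"

// The digit example from the comments.
def digits = '324554353453454'
def lazyMatches = digits.findAll(/3.*?4/)          // [324, 3534, 34]
def lastLazy = lazyMatches.last()                  // 34
// A greedy prefix (.*) pushes the captured group to the last 3.
def lastThreeToLastFour = (digits =~ /.*(3.*4)/)[0][1]   // 3454

println lazyMatches
println lastLazy
println lastThreeToLastFour
```

Filtering the list after matching keeps the regular expression simple; embedding the ID in the pattern itself is also possible, but a lazy `.*?` can then run across block boundaries from an earlier opening tag to the wanted ID, which is why this sketch avoids it.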