I have a log file that contains, among other things, XML messages. I would like to extract an XML message and write it to a file if that message contains a specific string (the transID).
For example, I want to search the file below for the string 'TODPG201412041625130415' and, once I find it, grab everything between:
<?xml version = "1.0" encoding = "ISO-8859-1" ?>
<SalesOrderAcknowledgement>
<HeaderData>
<TransID>TODPG201412041625130415</TransID>
and:
</SalesOrderAcknowledgement>
File:
05/12/2014 15:07:53 INFO [Search.java 445] - The Trans ID: TODPG201412041625130370 has already been processed.
05/12/2014 15:07:53 INFO [Search.java 316] - The message for Trans ID TODPG201412041625130370 was ALREADY CONSUMED. Consumed Original Message: <?xml version = "1.0" encoding = "ISO-8859-1" ?>
<SalesOrderAcknowledgement>
<HeaderData>
<TransID>TODPG201412041625130415</TransID>
<Description>Estimate</Description>
<SiteQueueName>TODPG</SiteQueueName>
<LineItems>5</LineItems>
<TimeStamp>201412041625130370</TimeStamp>
</HeaderData>
<SalesOrderDetail>
<SalesID>2002726862</SalesID>
</SalesOrderDetail>
<SalesOrderLineItems>
<LineItem>
<SalesLineNum>20</SalesLineNum>
<UnitPrice>0.4300</UnitPrice>
<BurdenRate>0.0000</BurdenRate>
<ExtendedPrice>0.00</ExtendedPrice>
<RecordStatus>A</RecordStatus>
<ErrorMessage1>Sales Order 2002726862 modified</ErrorMessage1>
<ErrorMessage2></ErrorMessage2>
<ErrorMessage3></ErrorMessage3>
</LineItem>
<LineItem>
<SalesLineNum>30</SalesLineNum>
<UnitPrice>3.6500</UnitPrice>
<BurdenRate>0.0000</BurdenRate>
<ExtendedPrice>0.00</ExtendedPrice>
<RecordStatus>A</RecordStatus>
<ErrorMessage1>Sales Order 2002726862 modified</ErrorMessage1>
<ErrorMessage2></ErrorMessage2>
<ErrorMessage3></ErrorMessage3>
</LineItem>
</SalesOrderLineItems>
</SalesOrderAcknowledgement>
05/12/2014 15:07:55 INFO [Search.java 232] - **** XML Message:
<?xml version = "1.0" encoding = "ISO-8859-1" ?>
<SalesOrderAcknowledgement>
<HeaderData>
<TransID>TODPG201412041635120944</TransID>
<Description>Estimate</Description>
<SiteQueueName>TODPG</SiteQueueName>
<LineItems>5</LineItems>
<TimeStamp>201412041635120944</TimeStamp>
</HeaderData>
<SalesOrderDetail>
<SalesID>2002720443</SalesID>
</SalesOrderDetail>
<SalesOrderLineItems>
<LineItem>
<SalesLineNum>10</SalesLineNum>
<UnitPrice>0.0870</UnitPrice>
<BurdenRate>0.0000</BurdenRate>
<ExtendedPrice>0.00</ExtendedPrice>
<RecordStatus>A</RecordStatus>
<ErrorMessage1>Sales Order 2002720443 modified</ErrorMessage1>
<ErrorMessage2></ErrorMessage2>
<ErrorMessage3></ErrorMessage3>
</LineItem>
</SalesOrderLineItems>
</SalesOrderAcknowledgement>
The transID will always be different, and there can be multiple transIDs in the same file.
I have got as far as printing the line number where the string is found, but I don't know how to capture the whole message, from <?xml version = "1.0" ... through to the closing tag. This is my code so far:
import java.io.File;
import java.io.FileNotFoundException;
import java.util.Scanner;

public class installation
{
    public static String searchString = "TODPG201412041625130415";

    public static void main(String args[])
    {
        final File folder = new File("C:/Users/Administrator/Desktop/Estimated_Acualized/LogBackup/2014");
        listFilesForFolder(folder);
    }

    // Search every regular file directly inside the folder.
    public static void listFilesForFolder(final File folder)
    {
        final File[] entries = folder.listFiles();
        if (entries == null)
        {
            // listFiles() returns null when the path is not a readable directory
            System.err.println("Cannot list " + folder);
            return;
        }
        for (final File fileEntry : entries)
        {
            if (fileEntry.isFile())
            {
                findWord(searchString, fileEntry);
            }
        }
    }

    // Print the number and text of every line in the file that contains the word.
    public static void findWord(String word, File file)
    {
        // try-with-resources closes the Scanner even if reading fails
        try (Scanner scanner = new Scanner(file))
        {
            int lineNum = 0;
            while (scanner.hasNextLine())
            {
                String line = scanner.nextLine();
                lineNum++;
                if (line.contains(word))
                {
                    System.out.println("found string on line " + lineNum);
                    System.out.println(line);
                }
            }
        }
        catch (FileNotFoundException ex)
        {
            ex.printStackTrace();
        }
    }
}
Can someone please shed some light on this? I am stuck.
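To make the goal concrete, here is a minimal sketch of the kind of extraction I am after. It is not a tested solution. It assumes every message starts at `<?xml` (possibly in the middle of a log line), ends at `</SalesOrderAcknowledgement>`, and that messages are never nested. The class name `XmlBlockExtractor` and the method `extract` are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class XmlBlockExtractor {
    // Markers taken from the sample log above; adjust them if other messages differ.
    static final String START = "<?xml";
    static final String END = "</SalesOrderAcknowledgement>";

    /** Returns each XML message found in the lines that contains <TransID>transId</TransID>. */
    static List<String> extract(List<String> lines, String transId) {
        String wanted = "<TransID>" + transId + "</TransID>";
        List<String> matches = new ArrayList<>();
        StringBuilder block = null; // non-null only while inside a message
        for (String line : lines) {
            int start = line.indexOf(START);
            if (start >= 0) {
                // The declaration may follow log text on the same line: drop that prefix.
                block = new StringBuilder();
                line = line.substring(start);
            }
            if (block == null) {
                continue; // ordinary log line outside any message
            }
            int end = line.indexOf(END);
            if (end < 0) {
                block.append(line).append('\n');
                continue;
            }
            // Closing tag reached: finish the block and keep it only if it matches.
            block.append(line, 0, end + END.length()).append('\n');
            if (block.indexOf(wanted) >= 0) {
                matches.add(block.toString());
            }
            block = null;
        }
        return matches;
    }

    public static void main(String[] args) {
        // Shortened stand-in for the log file shown above.
        List<String> lines = Arrays.asList(
            "05/12/2014 15:07:53 INFO [Search.java 316] - Consumed Original Message: <?xml version = \"1.0\" encoding = \"ISO-8859-1\" ?>",
            "<SalesOrderAcknowledgement>",
            "<HeaderData>",
            "<TransID>TODPG201412041625130415</TransID>",
            "</HeaderData>",
            "</SalesOrderAcknowledgement>",
            "05/12/2014 15:07:55 INFO [Search.java 232] - **** XML Message:",
            "<?xml version = \"1.0\" encoding = \"ISO-8859-1\" ?>",
            "<SalesOrderAcknowledgement>",
            "<HeaderData>",
            "<TransID>TODPG201412041635120944</TransID>",
            "</HeaderData>",
            "</SalesOrderAcknowledgement>");
        for (String xml : extract(lines, "TODPG201412041625130415")) {
            System.out.print(xml);
        }
    }
}
```

For a real log, the lines could presumably come from `Files.readAllLines(path, charset)` if the file fits in memory, and each returned block could be written out with `Files.write`. The charset of the log file itself is something I would still need to confirm.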