I need to parse log files into an ArrayList of ArrayLists. The regexes are working, and I can get the correct results into a variable or .csv output. The problem is that I then need to manipulate the output: add a value to entries where a condition is not true, and append additional values wherever index 0 (the file name) of an original row matches index 0 of a to-be-appended row.
Each log file can have anywhere from 1 to about 200 entries, depending on the number of field-collected inputs. Log file entries are multiline and variable but structured, so all variations are known (n=18 regexes, not all relevant to the snippet below). I need to be able to manipulate row content based on some of those variations.
This means I need to loop through individual, potentially unequal-length rows (i.e., across the table) to edit and append, and loop over each of the rows (i.e., down the table). So simple arrays won't work as well as ArrayLists.
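To make the "across" and "down" loops concrete, here is a minimal sketch of that kind of post-processing on a List<List<String>>. The class name, method names, the "FALSE" placeholder and the `extras` map are all hypothetical, chosen only for illustration; the sketch assumes the first element of every row is the file name and that "TRUE" marks a row where no base data was found.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class RowEditSketch {

    // Inserts "FALSE" right after the file name in every row that has no "TRUE" flag.
    // ("FALSE" is an assumed placeholder, not something taken from the log files.)
    static void markMissingFlag(List<List<String>> rows) {
        for (List<String> row : rows) {                 // down the table
            if (!row.contains("TRUE")) {                // across one row
                row.add(Math.min(1, row.size()), "FALSE");
            }
        }
    }

    // Appends extra values to each row whose first element (the file name)
    // is a key in the 'extras' map.
    static void appendByFileName(List<List<String>> rows, Map<String, List<String>> extras) {
        for (List<String> row : rows) {
            if (row.isEmpty()) {
                continue;                               // nothing to match on
            }
            List<String> toAppend = extras.get(row.get(0));
            if (toAppend != null) {
                row.addAll(toAppend);
            }
        }
    }

    public static void main(String[] args) {
        List<List<String>> coverage = new ArrayList<>();
        coverage.add(new ArrayList<>(Arrays.asList("AA-123-12345-SP1.SSF", "TRUE")));
        coverage.add(new ArrayList<>(Arrays.asList("AA-123-12345-SP3.SSF", "100", "100", "guug04914044.zip")));

        Map<String, List<String>> extras = new HashMap<>();
        extras.put("AA-123-12345-SP3.SSF", Arrays.asList("extraValue"));

        markMissingFlag(coverage);
        appendByFileName(coverage, extras);
        System.out.println(coverage);
    }
}
```

Because each row is its own mutable ArrayList, rows of different lengths can be edited in place without touching the others.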
So far I'm only creating an ArrayList that contains a single ArrayList: everything that should go into individual rows is put into one ArrayList, which then goes into the parent ArrayList.
Trying to get individual ArrayLists by moving 'covArrayList = new ArrayList<String>(covArrayList);' between the 'while ((corrLine...)' and 'for (String..)' loops, or into the 'if(file1Matcher.find())' block, returns multiple outputs per regex match and changes the order, so values can't each be linked to a specific 'fileName1'.
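For reference, a small sketch of what that statement does, independent of the parsing code. The class and method names are hypothetical. The ArrayList copy constructor starts the new list with every element of the list passed in, whereas the no-argument constructor starts empty.

```java
import java.util.ArrayList;
import java.util.List;

class CopyVersusFresh {

    // Copy constructor: the new list begins with all elements of the previous row.
    static List<String> startRowByCopy(List<String> previousRow) {
        return new ArrayList<>(previousRow);
    }

    // No-argument constructor: the new list begins empty.
    static List<String> startEmptyRow() {
        return new ArrayList<>();
    }

    public static void main(String[] args) {
        List<String> row = new ArrayList<>();
        row.add("AA-123-12345-SP1.SSF");

        System.out.println(startRowByCopy(row).size()); // prints 1
        System.out.println(startEmptyRow().size());     // prints 0
    }
}
```

So a row created with 'new ArrayList<String>(covArrayList)' still carries everything collected so far, which would be consistent with all values ending up in one long row.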
FYI: I'm using JDK 10. I'll have to refactor down so that JRE 8 can run the program, but I want to do that later for development reasons.
This is a subset of my code, which is all within the main method:
//arraylist of covArrayLists init:
List<List<String>> coverage = new ArrayList<>();
//coverage arrayList init:
List<String> covArrayList = new ArrayList<String>();
//log file Reader init:
File corrFile = new File("D:\\Utilities\\Development\\Java\\HPGPSLogParser\\Correct_2015-10-13_10-51.txt");
BufferedReader corrReader = new BufferedReader(new InputStreamReader(new FileInputStream(corrFile),"UTF-16LE"));
//NOTE: PFO differential correction log files are encoded in UTF-16 LE
String corrText = "";
String corrLine = "";
//corrWriter init:
File stateCSV = new File("D:\\Utilities\\Development\\Java\\HPGPSLogParser\\tcov.csv");
BufferedWriter corrWriter = new BufferedWriter(new FileWriter(stateCSV, true));
String coverageOutput = "";
String processingOutput = "";
//regex variables:
//Coverage Details regex
Pattern fileName1 = Pattern.compile("Rover file: (?<fileName1>[A-Z]{2}-\\d{3}-\\d{5}-SP\\d\\.SSF)+");
String firstFileName = "";
Pattern noBase = Pattern.compile("(?<noBase>No matching base data found)");
String noBaseText = "";
Pattern totalCoverage = Pattern.compile("(?<totalCoverage>[\\d]{1,3})\\% total coverage");
String totalCovText = "";
Pattern coverageBy = Pattern.compile("(?<coverageBy>[\\d]{1,3})+\\% coverage by (?<baseStation>\\b\\w+\\b\\.[zZ].*)+", Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE);
String covByPct = "";
String covByProvider = "";
try(corrReader)
{
    while ((corrLine = corrReader.readLine())!=null)
    {
        corrText = corrLine.trim();
        String delim = " ";
        String[] words = corrLine.split(delim);
        covArrayList = new ArrayList<String>(covArrayList);
        for (String s : words)
        {
            //Coverage details regex search begin - write to coverageOutput
            Matcher file1Matcher = fileName1.matcher(corrText);
            if(file1Matcher.find())
            {
                firstFileName = file1Matcher.group("fileName1");
                covArrayList.add(firstFileName);
            } //end if(file1Matcher)
            Matcher baseMatcher = noBase.matcher(corrText);
            if (baseMatcher.find())
            {
                noBaseText = baseMatcher.group("noBase");
                covArrayList.add("TRUE");
            } //end if(baseMatcher)
            Matcher totCovMatcher = totalCoverage.matcher(corrText);
            if(totCovMatcher.matches())
            {
                totalCovText = totCovMatcher.group("totalCoverage");
                covArrayList.add(totalCovText);
            } //end if(totCovMatcher)
            Matcher covByMatcher = coverageBy.matcher(corrText);
            if(covByMatcher.matches())
            {
                covByPct = covByMatcher.group("coverageBy");
                covArrayList.add(covByPct);
                covByProvider = covByMatcher.group("baseStation");
                covArrayList.add(covByProvider);
            } //end if(covByMatcher)
        } //end for(String)
    } //end while loop - regex searches & initial output file end
    coverage.add(covArrayList);
    processing.add(procArrayList);
    corrWriter.write(coverage.toString());
    corrWriter.flush();
    outWriter.write(processing.toString());
    outWriter.flush();
The catch/finally blocks are in the code, not in the snippet.
Here's a snippet of a log file with the three potential variations in this section:
--------Coverage Details:--------------------
Rover file: AA-123-12345-SP1.SSF
Local time: 2/3/2015 4:06:14 PM to 2/3/2015 4:06:44 PM
0% total coverage. No matching base data found.
Rover file: AA-123-12345-SP2.SSF
Local time: 2/17/2014 5:51:01 PM to 2/17/2014 6:18:57 PM
100% total coverage
4% coverage by guug04914003.zip
100% coverage by guug04914022.zip
Rover file: AA-123-12345-SP3.SSF
Local time: 2/17/2014 9:53:40 PM to 2/17/2014 10:45:59 PM
100% total coverage
100% coverage by guug04914044.zip
NOTE: The line endings weren't being recognized when I pasted the log; in the actual log file, each item above is on its own line.
The closest match I can get to the log file encoding is UTF-16LE; no other option gets close to the charset/formatting of the log files.
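As a point of comparison, here is a minimal sketch of reading a UTF-16LE file line by line with java.nio. The class name, method name and temporary sample file are hypothetical. It declares the reader inside the try header, which also compiles on Java 8 (a bare 'try(corrReader)' on an existing variable needs Java 9 or later), and it drops a leading byte-order mark, since the UTF-16LE decoder passes a BOM through as the character U+FEFF rather than consuming it.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

class Utf16LogReader {

    // Reads every line of a UTF-16LE text file and returns the trimmed lines.
    static List<String> readTrimmedLines(Path logFile) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = Files.newBufferedReader(logFile, StandardCharsets.UTF_16LE)) {
            String line;
            while ((line = reader.readLine()) != null) {
                // Drop a byte-order mark if the decoder handed it through as U+FEFF.
                if (!line.isEmpty() && line.charAt(0) == '\uFEFF') {
                    line = line.substring(1);
                }
                lines.add(line.trim());
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical sample written to a temp file so the sketch is self-contained.
        Path sample = Files.createTempFile("sample-log", ".txt");
        String text = "Rover file: AA-123-12345-SP1.SSF\r\n  100% total coverage\r\n";
        Files.write(sample, text.getBytes(StandardCharsets.UTF_16LE));
        System.out.println(readTrimmedLines(sample));
        Files.delete(sample);
    }
}
```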
The output I need should look like:
NOTE: Please pretend there isn't an extra line between entries (the algorithm eliminating whitespace is really screwing with the formatting I need to illustrate).
NOTE: When "noBase" is matched, no subsequent regexes will be matched (from this block).
NOTE: "covByPct" and "baseStation" may not occur, or will occur once or twice.
[["fileName1", "totalCoverage", "covByPct", "baseStation"]
["fileName1", "noBase"]
["fileName1", "totalCoverage", "covByPct", "baseStation", "covByPct", "baseStation"]]
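The grouping described above could be sketched roughly as follows: start a fresh row each time a "Rover file:" line is found, add it to the parent list right away, and let later matches append to that row until the next file name appears. This is only a sketch under assumptions: the class and method names are hypothetical, the patterns are simplified versions of the ones in the snippet, each log item is assumed to sit on its own line, and every line is tested once (with no inner loop over words).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CoverageGrouper {
    // Simplified patterns (assumptions), based on the ones in the question.
    private static final Pattern FILE_NAME =
            Pattern.compile("Rover file: (?<fileName1>[A-Z]{2}-\\d{3}-\\d{5}-SP\\d\\.SSF)");
    private static final Pattern NO_BASE = Pattern.compile("No matching base data found");
    private static final Pattern TOTAL_COVERAGE =
            Pattern.compile("(?<totalCoverage>\\d{1,3})% total coverage");
    private static final Pattern COVERAGE_BY = Pattern.compile(
            "(?<coverageBy>\\d{1,3})% coverage by (?<baseStation>\\w+\\.zip)",
            Pattern.CASE_INSENSITIVE);

    static List<List<String>> group(List<String> lines) {
        List<List<String>> coverage = new ArrayList<>();
        List<String> row = null;                       // no entry started yet
        for (String rawLine : lines) {
            String line = rawLine.trim();
            Matcher file = FILE_NAME.matcher(line);
            if (file.find()) {
                row = new ArrayList<>();               // new, empty row per entry
                row.add(file.group("fileName1"));
                coverage.add(row);                     // later adds go to this same row
                continue;
            }
            if (row == null) {
                continue;                              // skip lines before the first entry
            }
            if (NO_BASE.matcher(line).find()) {
                row.add("TRUE");
                continue;
            }
            Matcher total = TOTAL_COVERAGE.matcher(line);
            if (total.matches()) {
                row.add(total.group("totalCoverage"));
                continue;
            }
            Matcher by = COVERAGE_BY.matcher(line);
            if (by.matches()) {
                row.add(by.group("coverageBy"));
                row.add(by.group("baseStation"));
            }
        }
        return coverage;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "--------Coverage Details:--------------------",
                "Rover file: AA-123-12345-SP1.SSF",
                "0% total coverage. No matching base data found.",
                "Rover file: AA-123-12345-SP2.SSF",
                "100% total coverage",
                "4% coverage by guug04914003.zip",
                "100% coverage by guug04914022.zip");
        System.out.println(group(lines));
    }
}
```

The point of the design is that the row variable is reassigned to a brand-new list only when a file name is seen, so every value that follows lands in the row that belongs to that file.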
The output closest to what I need is:
[["fileName1", "totalCoverage", "covByPct", "baseStation", "fileName1", "noBase", "fileName1", "totalCoverage", "covByPct", "baseStation", "covByPct", "baseStation"]]
I'm a beginner, and I'm working on a project for work that's way above my current skill level. :(
Can someone help me correct my code so that the group of regex matches gets put into a new ArrayList for each entry in a log file?
Thanks so much!!