I have a huge log file containing hundreds of exceptions. Each exception has a timestamp, but the timestamp is not necessarily in the same place in the stack trace for each exception. The only common denominator I can see is that the timestamps are always between square brackets.

I know the dates in the log file will always have the form [25/05/21 10:28:41:231 BST]. Is there a method to extract only the characters between "[" and "]" from a List<String>, or from a single String if I can convert the log to one? I was then thinking of writing some logic to test whether a string such as 25/05/21 10:28:41:231 BST is a date, to avoid unwanted results if other data also appears between brackets.

So far I have tried splitting on "[", but I've run into a few issues. There are other "[" characters in the data file, which means the date and time isn't always the first element in the resulting String array, so I couldn't just select them all.

Any input or suggestions on how to fix this, or other ways to do it, would be much appreciated!

  • Use a regex for the date format, assign it into a group and extract that group only. – m0skit0 May 26 '21 at 08:35
  • How would I use a regex? I've never used one. If you could give me a brief overview or point me in the direction of a useful article, that would be brilliant :) – Connor Gill May 26 '21 at 08:39
  • Do you have the `List – Zeus Almighty May 26 '21 at 08:50
  • @ConnorGill Then it's time to learn regular expressions :) – m0skit0 May 26 '21 at 09:00
  • I have a `List` with hundreds of stack traces (Each one has the date/time). I'm happy to have the dates in a `List` or of any other form. – Connor Gill May 26 '21 at 09:06
  • Does this answer your question? [Regular Expression to match valid dates](https://stackoverflow.com/questions/51224/regular-expression-to-match-valid-dates) – CinchBlue May 26 '21 at 09:21
  • @ConnorGill This is a standard domain problem for regexes. In many languages, they often just go beyond regexes and provide helper classes to parse standard date formats. For example, see [Java 7's SimpleDateFormat](https://docs.oracle.com/javase/7/docs/api/java/text/SimpleDateFormat.html). I will also mention that regexes and finite state machines (FSMs) are really useful for string parsing. Regular expressions are often either FSMs or pushdown automata in terms of expressive power, and are usually good enough to parse simple data formats (non-programming languages). – CinchBlue May 26 '21 at 09:22
  • What you've all said looks very helpful! I guess I could split the whole file up into individual words in a `List` and then run a regex to find everything with the correct format? – Connor Gill May 26 '21 at 09:36

2 Answers

Something of the form

"\[(\d\d/\d\d/\d\d \d\d:\d\d:\d\d:\d\d\d [A-Z]{3})\]"

E.g. in Scala:

scala> val pattern = "\\[(\\d\\d/\\d\\d/\\d\\d \\d\\d:\\d\\d:\\d\\d:\\d\\d\\d [A-Z]{3})\\]".r
val pattern: scala.util.matching.Regex = \[(\d\d/\d\d/\d\d \d\d:\d\d:\d\d:\d\d\d [A-Z]{3})\]

scala> pattern.matches("[25/05/21 10:28:41:231 BST]")
val res5: Boolean = true
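
Since the question is about a Java `List<String>`, here is a minimal Java sketch of the same idea, using the capture group to pull out only the text between the brackets. The class name `TimestampExtractor`, the method name `extractTimestamps`, and the sample log lines are made up for illustration; only the pattern comes from above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TimestampExtractor {
    // Same pattern as above. Group 1 is the text between the brackets.
    private static final Pattern TIMESTAMP = Pattern.compile(
            "\\[(\\d\\d/\\d\\d/\\d\\d \\d\\d:\\d\\d:\\d\\d:\\d\\d\\d [A-Z]{3})\\]");

    /** Returns every timestamp found in the given lines, in order of appearance. */
    public static List<String> extractTimestamps(List<String> lines) {
        List<String> found = new ArrayList<>();
        for (String line : lines) {
            Matcher matcher = TIMESTAMP.matcher(line);
            // find() scans the whole line, so the timestamp may appear anywhere in it.
            while (matcher.find()) {
                found.add(matcher.group(1));
            }
        }
        return found;
    }

    public static void main(String[] args) {
        // Hypothetical log lines, not taken from a real log file.
        List<String> log = Arrays.asList(
                "[25/05/21 10:28:41:231 BST] java.lang.NullPointerException",
                "    at com.example.Foo.bar(Foo.java:42) [some-jar.jar:1.0]",
                "Caused by: x [25/05/21 10:29:03:007 BST]");
        System.out.println(extractTimestamps(log));
    }
}
```

Other bracketed text, such as `[some-jar.jar:1.0]`, does not fit the pattern and is skipped, so no separate "is this a date?" check is needed for the shape of the string.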
Dragonborn

You can try this getBetween() method. You may find it useful for other things as well:

/**
 * Retrieves any string data located between the supplied string leftString
 * parameter and the supplied string rightString parameter.<br><br>
 * <p>
 * This method will return all instances of a substring located between the
 * supplied Left String and the supplied Right String which may be found
 * within the supplied Input String.<br>
 *
 * @param inputString (String) The string to look for substring(s) in.<br>
 *
 * @param leftString  (String) What may be to the Left side of the substring
 *                    we want within the main input string. Sometimes the
 *                    substring you want may be contained at the very
 *                    beginning of a string and therefore there is no
 *                    Left-String available. In this case you would simply
 *                    pass an empty String ("") to this parameter which
 *                    basically informs the method of this fact. Null can
 *                    not be supplied and will ultimately generate a
 *                    NullPointerException.<br>
 *
 * @param rightString (String) What may be to the Right side of the
 *                    substring we want within the main input string.
 *                    Sometimes the substring you want may be contained at
 *                    the very end of a string and therefore there is no
 *                    Right-String available. In this case you would simply
 *                    pass an empty String ("") to this parameter which
 *                    basically informs the method of this fact. Null can
 *                    not be supplied and will ultimately generate a
 *                    NullPointerException.<br>
 *
 * @param options     (Optional - Boolean - 2 Parameters):<pre>
 *
 *      ignoreLetterCase    - Default is false. This option works against the
 *                            string supplied within the leftString parameter
 *                            and the string supplied within the rightString
 *                            parameter. If set to true then letter case is
 *                            ignored when searching for strings supplied in
 *                            these two parameters. If left at default false
 *                            then letter case is not ignored.
 *
 *      trimFound           - Default is true. By default this method will trim
 *                            off leading and trailing white-spaces from found
 *                            sub-string items. General sentences which obviously
 *                            contain spaces will almost always give you a white-
 *                            space within an extracted sub-string. By setting
 *                            this parameter to false, leading and trailing white-
 *                            spaces are not trimmed off before they are placed
 *                            into the returned Array.</pre>
 *
 * @return (String[] Array) Returns a Single Dimensional String Array of all 
 *         the sub-strings found within the supplied Input String which are 
 *         between the supplied Left-String and supplied Right-String.
 */
public static String[] getBetween(String inputString, String leftString, String rightString, boolean... options) {
    // Return null if nothing was supplied.
    if (inputString.isEmpty() || (leftString.isEmpty() && rightString.isEmpty())) {
        return null;
    }

    // Prepare optional parameters if any supplied.
    // If none supplied then use Defaults...
    boolean ignoreCase = false;      // Default.
    boolean trimFound = true;        // Default.
    if (options.length >= 1) {
        ignoreCase = options[0];
    }
    if (options.length >= 2) {
        trimFound = options[1];
    }

    // Remove any control characters from the
    // supplied string (if they exist).
    String modString = inputString.replaceAll("\\p{Cntrl}", "");

    // Establish a List String Array Object to hold
    // our found substrings between the supplied Left
    // String and supplied Right String.
    java.util.List<String> list = new java.util.ArrayList<>();

    // Use Pattern Matching to locate our possible
    // substrings within the supplied Input String.
    String regEx = java.util.regex.Pattern.quote(leftString) + "{1,}"
            + (!rightString.isEmpty() ? "(.*?)" : "(.*)?")
            + java.util.regex.Pattern.quote(rightString);
    if (ignoreCase) {
        regEx = "(?i)" + regEx;
    }

    java.util.regex.Pattern pattern = java.util.regex.Pattern.compile(regEx);
    java.util.regex.Matcher matcher = pattern.matcher(modString);
    while (matcher.find()) {
        // Add the found substrings into the List.
        String found = matcher.group(1);
        if (trimFound) {
            found = found.trim();
        }
        list.add(found);
    }
    return list.toArray(new String[list.size()]);
}

To use it you might do something like this:

String[] res = getBetween(logString, "[", "]");
System.out.println(res[0]);

String[] parts = res[0].split("\\s+");
String date = parts[0];
String time = parts[1] + " " + parts[2];
System.out.println("Date: --> " + date);
System.out.println("Time: --> " + time);
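
Because other bracketed text can appear in a stack trace, `res[0]` is not guaranteed to be the timestamp. A rough sketch of the date check the question mentions might look like the following. The class `TimestampValidator` and method `parseTimestamp` are hypothetical names, and the sketch assumes the format is always `dd/MM/yy HH:mm:ss:SSS` followed by a three-letter zone abbreviation. The abbreviation is only checked for shape, not interpreted, since abbreviations such as BST are ambiguous.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.time.format.ResolverStyle;
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

public class TimestampValidator {
    // Date and time part only. "uu" is a two-digit year, resolved to 2000-2099.
    // STRICT rejects impossible dates such as 31/02 instead of adjusting them.
    private static final DateTimeFormatter FORMAT = DateTimeFormatter
            .ofPattern("dd/MM/uu HH:mm:ss:SSS")
            .withResolverStyle(ResolverStyle.STRICT);

    /** Returns the parsed date-time, or empty if the candidate is not a timestamp. */
    public static Optional<LocalDateTime> parseTimestamp(String candidate) {
        int lastSpace = candidate.lastIndexOf(' ');
        // The last token must look like a zone abbreviation, e.g. "BST" (assumption).
        if (lastSpace < 0 || !candidate.substring(lastSpace + 1).matches("[A-Z]{3}")) {
            return Optional.empty();
        }
        try {
            return Optional.of(LocalDateTime.parse(candidate.substring(0, lastSpace), FORMAT));
        } catch (DateTimeParseException e) {
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        // Hypothetical bracket contents, e.g. as returned by getBetween(line, "[", "]").
        List<String> candidates = Arrays.asList(
                "25/05/21 10:28:41:231 BST", "some-jar.jar:1.0", "31/02/21 10:28:41:231 BST");
        for (String candidate : candidates) {
            System.out.println(candidate + " -> " + parseTimestamp(candidate));
        }
    }
}
```

Here only the first candidate parses. The second has no zone token, and the third names a day that does not exist. To process a whole `List<String>`, you could call `getBetween()` on each line in a loop and keep only the results for which `parseTimestamp` returns a value.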
DevilsHnd - 退職した
  • Perfect thank you! This is working on individual dates, I just need to do a little manipulation to change my `List` to one large string or run it in a loop. Thank you – Connor Gill May 26 '21 at 10:44