I'm building a small tool to help me filter data in a massive logfile (about 4.5 million lines). To filter the results, I'm using search parameters that must either be present in a matching line or absent from it.

I'm reading the logfile line by line due to memory restrictions. For each line I check whether the necessary conditions are met, so it looks something like this:

if (line.Contains(parameterToInclude1) && line.Contains(parameterToInclude2) && !line.Contains(parameterToExclude1) && !line.Contains(parameterToExclude2))

Hard-coded, this works fine. However, I'm trying to make it dynamic by allowing the user to add parameters to include and exclude.

For this I'm using a class called SearchParametersClass:

public class SearchParametersClass
{
    public List<string> included { get; set; }
    public List<string> excluded { get; set; }
}

My question is: how can I write the check so that a line matches only if it contains every parameter in the included list and none of the parameters in the excluded list?

Thanks

Cainnech
  • If there is some structure to the log file, you can improve performance quite a bit by searching only the substring where you expect the value to be. E.g. a log with lines like `2022-03-05 http foobarbaz` and `2023-03-05 ssh quuzbar` has a protocol (http or ssh) in a fixed position. Checking only that fixed position instead of the entire line makes a big difference. – Eric J. Mar 06 '23 at 00:21
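
The fixed-position idea from this comment could be sketched roughly as follows. The class name `FixedPositionFilter`, the method `HasProtocol`, and the offset of 11 are assumptions made for illustration: they presume every line starts with a 10-character date followed by a single space, as in the two sample lines. A real log may need a different offset.

```csharp
using System;

public static class FixedPositionFilter
{
    // Assumption: "yyyy-MM-dd" (10 chars) plus one space, so the protocol starts at index 11.
    private const int ProtocolStart = 11;

    public static bool HasProtocol(string line, string protocol)
    {
        // Short or malformed lines cannot contain the field at the expected position.
        if (line.Length < ProtocolStart + protocol.Length)
            return false;

        // Compare only the expected slice; no substring is allocated.
        // Note: this is a prefix match, so "http" would also match a field reading "https".
        return string.CompareOrdinal(line, ProtocolStart, protocol, 0, protocol.Length) == 0;
    }
}
```

Called as `FixedPositionFilter.HasProtocol(line, "ssh")`, this inspects a handful of characters per line rather than scanning the whole line.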

1 Answer

A simple (but maybe not the fastest) way to do it with LINQ would look like this:

if (searchParams.included.All(x => line.Contains(x)) && 
  searchParams.excluded.All(x => !line.Contains(x)))
{
...
}
Sergey Kudriavtsev
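
For context, here is a minimal sketch of how that condition might sit inside a line-by-line read. `LogFilter`, `Matches`, and `FilterFile` are hypothetical names, not part of the question or the answer, and the sketch takes the two lists as plain sequences so it does not depend on how `SearchParametersClass` is initialised. It relies on `File.ReadLines`, which streams the file lazily rather than loading it all into memory.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class LogFilter
{
    // True when the line contains every included term and none of the excluded ones.
    // An empty list imposes no constraint (All on an empty sequence is true, Any is false).
    public static bool Matches(string line, IEnumerable<string> included, IEnumerable<string> excluded) =>
        included.All(term => line.Contains(term)) &&
        !excluded.Any(term => line.Contains(term));

    // Lazily yields matching lines; only one line is held in memory at a time.
    public static IEnumerable<string> FilterFile(string path, IList<string> included, IList<string> excluded) =>
        File.ReadLines(path).Where(line => Matches(line, included, excluded));
}
```

With the class from the question this might be called as `LogFilter.FilterFile(path, searchParams.included, searchParams.excluded)`, assuming both lists have been assigned.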
  • Hey Sergey, that seems to be getting close. However, as a precaution, I would like the search to be case-insensitive, or to convert both the line and the params to upper case, so I'm sure the correct value is found. Is there any way to implement this in this solution? – Cainnech Mar 06 '23 at 00:26
  • Oh actually `if (searchParams.included.All(x => line.ToUpper().Contains(x.ToUpper())) && searchParams.excluded.All(x => !line.ToUpper().Contains(x.ToUpper())))` seems to do the trick. Thanks Sergey! – Cainnech Mar 06 '23 at 00:29
  • @Cainnech Don't use `ToUpper` (or `ToLower`). You are looking for speed; that's about double the work, and it spins out garbage strings, making the GC do a lot of cleanup. Instead, use `string.IndexOf()`. It supports case-insensitive searching: https://stackoverflow.com/questions/444798/case-insensitive-containsstring – Flydog57 Mar 06 '23 at 01:45
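
The `IndexOf` suggestion in the last comment could look roughly like this. `CaseInsensitiveFilter` and `ContainsIgnoreCase` are hypothetical names, and `StringComparison.OrdinalIgnoreCase` is an assumption on my part: the comment does not say which comparison to use, and a culture-aware one may suit some logs better.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CaseInsensitiveFilter
{
    // Case-insensitive containment without allocating upper-cased copies of the line.
    public static bool ContainsIgnoreCase(string line, string term) =>
        line.IndexOf(term, StringComparison.OrdinalIgnoreCase) >= 0;

    // Same include/exclude logic as the answer, using the case-insensitive check.
    public static bool Matches(string line, IEnumerable<string> included, IEnumerable<string> excluded) =>
        included.All(term => ContainsIgnoreCase(line, term)) &&
        !excluded.Any(term => ContainsIgnoreCase(line, term));
}
```

Compared with the `ToUpper` version, this avoids creating a new string per term per line, which matters across millions of lines.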