I have a lot of log .txt files that look roughly like this:
2021-04-01T12:54:38.156Z START RequestId: 123 Version: $LATEST
2021-04-01T12:54:42.356Z END RequestId: 123
2021-04-01T12:54:42.356Z REPORT RequestId: 123 Duration: 4194.14 ms Billed Duration: 4195 ms Memory Size: 2048 MB Max Memory Used: 608 MB
I need to create a pandas DataFrame from this data with the following columns, where each row represents one log line:
DateTime, Keyword (START/END/REPORT), RequestId, Duration, BilledDuration, MemorySize, MaxMemoryUsed
The problem is that each file has a different length and contains different types of log lines, so not every line looks the same, although there are clear patterns. I've never used regular expressions, but I think that is what I need here. Is there a way to parse these lines into a DataFrame?
(My goal is to perform anomaly detection on memory usage.)
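
To make the question concrete, here is a minimal sketch of the kind of parsing I have in mind. It assumes every line of interest starts with a timestamp, then one of START/END/REPORT, then "RequestId: ...", and that the numeric fields appear only on REPORT lines. The names LINE_RE, REPORT_FIELDS, parse_lines and to_dataframe are my own placeholders, not part of any library, and lines that don't fit the pattern are simply skipped.

```python
import re

COLUMNS = ["DateTime", "Keyword", "RequestId", "Duration",
           "BilledDuration", "MemorySize", "MaxMemoryUsed"]

# Common prefix shared by the three line types in the sample (assumption).
LINE_RE = re.compile(
    r"^(?P<DateTime>\S+)\s+"
    r"(?P<Keyword>START|END|REPORT)\s+"
    r"RequestId:\s+(?P<RequestId>\S+)"
    r"(?P<rest>.*)$"
)

# Optional numeric fields, searched for in the remainder of the line.
# The negative lookbehind keeps "Duration" from matching "Billed Duration".
REPORT_FIELDS = {
    "Duration": re.compile(r"(?<!Billed )Duration:\s+([\d.]+)\s+ms"),
    "BilledDuration": re.compile(r"Billed Duration:\s+([\d.]+)\s+ms"),
    "MemorySize": re.compile(r"Memory Size:\s+([\d.]+)\s+MB"),
    "MaxMemoryUsed": re.compile(r"Max Memory Used:\s+([\d.]+)\s+MB"),
}


def parse_lines(lines):
    """Turn an iterable of raw log lines into a list of row dicts."""
    records = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # unknown line type: skip it
        row = {
            "DateTime": match["DateTime"],
            "Keyword": match["Keyword"],
            "RequestId": match["RequestId"],
        }
        for name, pattern in REPORT_FIELDS.items():
            found = pattern.search(match["rest"])
            row[name] = float(found.group(1)) if found else None
        records.append(row)
    return records


def to_dataframe(records):
    """Build the DataFrame; pandas is imported here so parsing alone needs only the stdlib."""
    import pandas as pd

    df = pd.DataFrame(records, columns=COLUMNS)
    df["DateTime"] = pd.to_datetime(df["DateTime"], utc=True)
    return df


SAMPLE = [
    "2021-04-01T12:54:38.156Z START RequestId: 123 Version: $LATEST",
    "2021-04-01T12:54:42.356Z END RequestId: 123",
    "2021-04-01T12:54:42.356Z REPORT RequestId: 123 Duration: 4194.14 ms "
    "Billed Duration: 4195 ms Memory Size: 2048 MB Max Memory Used: 608 MB",
]
records = parse_lines(SAMPLE)
```

Calling to_dataframe(records) should then give one row per log line, with the four numeric columns empty for START and END rows. For many files, I assume I could loop over them (for example with glob), pass each open file to parse_lines, and concatenate the resulting records before building the DataFrame.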