Please help me optimize this bash script; it takes too long to execute.
Requirements:
The log file I am working with has some rows that start with a date and some rows that do not.
I need to insert the date from the previous row when a row does not start with one.
I work in MinGW64 under Windows 10.
The date is in this format: 2022-06-09 17:47:08,371
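For reference, lines that already start with such a timestamp can be matched with an anchored extended regex (this is the same pattern my script below relies on):

grep -E '^[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2},[0-9]{3}' "$file"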
Given file:
date1 string1
string2 date (a date inside the log message, not a date at the beginning of the row)
, string3
date2 string4
string5
]string6
date3 string7
date4 string8
date5 string9
Example of given file:
2022-06-09 10:00:01,000 string1
string2 2022-06-09 10:00:01,000 string2 2022 string2
, string3 string3 string3
2022-06-09 10:00:02,000 string4
string5
]string6 string6 string6
}
2022-06-09 10:00:03,000 string7 string7
2022-06-09 10:00:04,000 string8 string8
2022-06-09 10:00:05,000 string9
Expected file:
date1 string1
date1 string2 date
date1 , string3
date2 string4
date2 string5
date2 ]string6
date3 string7
date4 string8
date5 string9
Example of expected file:
2022-06-09 10:00:01,000 string1
2022-06-09 10:00:01,000 string2 2022-06-09 10:00:01,000 string2 2022 string2
2022-06-09 10:00:01,000 , string3 string3 string3
2022-06-09 10:00:02,000 string4
2022-06-09 10:00:02,000 string5
2022-06-09 10:00:02,000 ]string6 string6 string6
2022-06-09 10:00:02,000 }
2022-06-09 10:00:03,000 string7 string7
2022-06-09 10:00:04,000 string8 string8
2022-06-09 10:00:05,000 string9
Here is my script that needs optimization.
I did it with a loop, and it is very slow:
# Collect the numbers of all lines that do NOT start with a timestamp
nn_lines_to_replace=$(grep -Evn "^[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2},[0-9]{3}" "$file" | cut -d ":" -f1)
for nn_line in $nn_lines_to_replace ; do
    # Take the date and time (fields 1-2) from the line above
    replace=$(sed -n "$((nn_line - 1))p" "$file" | cut -d " " -f1-2)
    # Prepend them (plus a space) to the current line; this rewrites the whole file on every iteration
    sed -i "${nn_line} s/^/${replace} /" "$file"
done
Maybe it could be done with sed or awk.
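For example, something like this single-pass awk sketch is what I have in mind (assuming the gawk shipped with MinGW64 understands interval expressions such as {4}, and that the first line of the file always starts with a timestamp):

awk '
    /^[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2},[0-9]{3}/ {
        last = $1 " " $2   # remember this line's timestamp
        print
        next
    }
    { print last, $0 }     # dateless line: prepend the remembered timestamp
' "$file" > "$file.tmp" && mv "$file.tmp" "$file"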
If you have ideas on how to optimize it, or a better approach, please share; I would really appreciate any help.
Update: I have made the conditions of this issue more complicated: link