Continuation of the question
The log file that I am working with has some rows that start with a date and some that do not.
I am already able to insert the date from the row above when a row does not start with one.
Now I need to mark each existing date with 00000 and each newly inserted date with an incrementing sequence number.

I work in MinGW64 under Windows 10.
The date format is: 2022-06-09 17:47:08,371

Given file:
date1 string1
(some spaces or tabs) string2 date (just a date in log, not the date at the beginning of the row; leading spaces)
, string3
date2 string4
string5
]string6
date3 string7
date4 string8
date5 string9

Expected file:
date1:00000 string1
date1:00001 (some spaces or tabs) string2 date (just a date in log, not the date at the beginning of the row; leading spaces) - increment of mark
date1:00002 , string3
date2:00000 string4 - new date starts with 00000
date2:00001 string5
date2:00002 ]string6
date3:00000 string7 - new date starts with 00000
date4:00000 string8 - new date starts with 00000
date5:00000 string9 - new date starts with 00000

Example of given file:

2022-06-09 10:00:01,000:00000 string1
       string2
string3
2022-06-09 10:00:02,000:00000 string4  
string5
string6
2022-06-09 10:00:03,000:00000 string7
2022-06-09 10:00:04,000:00000 string8
2022-06-09 10:00:05,000:00000 string9

Example of expected file:

2022-06-09 10:00:01,000:00000:00000 string1  
2022-06-09 10:00:01,000:00000:00001        string2
2022-06-09 10:00:01,000:00000:00002 string3  
2022-06-09 10:00:02,000:00000:00000 string4
2022-06-09 10:00:02,000:00000:00001 string5
2022-06-09 10:00:02,000:00000:00002 string6
2022-06-09 10:00:03,000:00000:00000 string7
2022-06-09 10:00:04,000:00000:00000 string8
2022-06-09 10:00:05,000:00000:00000 string9 

A script from @Walter A that can be used as a template:

awk '
  /^[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2},[0-9]{3}/ {last_d_t=$1 " " $2; print; next}
  {print last_d_t $0}
' test.txt

Or one from @Daweo:

awk '/^20[0-9][0-9]-[0-9][0-9]-[0-9][0-9]/{d=substr($0, 1, 24);print;next}{print d $0}' test.txt
Andlas

1 Answer

I simplified the regex a little.
It should work as long as your "strings" cannot start with YYYY-MM-DD.

awk '
    match($0,/^[0-9]{4}(-[0-9]{2}){2} [0-9:,]+/) {
        n = 0
        prefix = substr($0,1,RLENGTH)
        $0 = substr($0,1+RLENGTH)
    }
    { printf("%s:%05d%s%s\n",prefix,n,(n++?" ":""),$0) }
' file

Output:

2022-06-09 10:00:01,000:00000:00000 string1
2022-06-09 10:00:01,000:00000:00001        string2
2022-06-09 10:00:01,000:00000:00002 string3
2022-06-09 10:00:02,000:00000:00000 string4  
2022-06-09 10:00:02,000:00000:00001 string5
2022-06-09 10:00:02,000:00000:00002 string6
2022-06-09 10:00:03,000:00000:00000 string7
2022-06-09 10:00:04,000:00000:00000 string8
2022-06-09 10:00:05,000:00000:00000 string9
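
To try this without a log file at hand, here is a minimal sketch that feeds a few rows from the question's example through the same logic. The `number_rows` wrapper and the `sample` variable are names made up for this illustration. The pattern is spelled out without `{n}` interval expressions because some awk builds (older mawk, for instance) do not support them, and the counter is incremented in its own statement so the result does not depend on the order in which printf arguments are evaluated.

```shell
#!/bin/sh
# number_rows: hypothetical wrapper name, reads log rows on stdin.
number_rows() {
    awk '
        # A row starting with a timestamp resets the counter. The greedy
        # [0-9:,]+ also swallows an already present ":00000" suffix.
        match($0, /^[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9] [0-9:,]+/) {
            n = 0
            prefix = substr($0, 1, RLENGTH)
            $0 = substr($0, 1 + RLENGTH)
        }
        {
            # Dated rows keep their own separator; others get one space.
            sep = (n ? " " : "")
            printf("%s:%05d%s%s\n", prefix, n, sep, $0)
            n++
        }
    '
}

# A few rows taken from the example input in the question.
sample='2022-06-09 10:00:01,000:00000 string1
       string2
string3
2022-06-09 10:00:02,000:00000 string4'

result=$(printf '%s\n' "$sample" | number_rows)
printf '%s\n' "$result"
```

Rows whose date has no `:00000` suffix yet go through the same path: `2022-06-09 17:47:08,371 string1` becomes `2022-06-09 17:47:08,371:00000 string1`. Note that if the very first row has no date, `prefix` is still empty and that row is printed starting with `:00000`.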
Fravadona