0

For example:

I have a huge file with the pattern below. I need to find each date line and prefix that date to each of the rows that follow it. A date line recurs at a fixed interval, and the number of lines that follow each date is also fixed.

date 1  
line 1  
line 2  
line 3  
date 2  
line 4  
line 5  
line 6  
date 3  
line 7  
line 8  
line 9  

The above pattern should be transformed to look like this:

date 1 line 1  
date 1 line 2  
date 1 line 3  
date 2 line 4  
date 2 line 5  
date 2 line 6  
date 3 line 7  
date 3 line 8  
date 3 line 9  

Is there a simple sed/awk command that would do this, or should I go ahead and write a bash/Python script?

Austin

4 Answers

2

With awk:

awk -v pattern="date" '$0~pattern{p=$0;next}{print p,$0}' file

Change the pattern variable to whatever matches your file.
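
For comparison, the same carry-forward idea can be sketched in Python. This is only an illustration and not part of the awk answer: the function name `prefix_dates` and its `pattern` parameter are hypothetical, and `re.search` is used to approximate awk's unanchored `$0 ~ pattern` match. Unlike the awk one-liner, this sketch also strips trailing whitespace from each line.

```python
import re

def prefix_dates(lines, pattern="date"):
    """Yield each non-matching line prefixed with the last matching line."""
    current = ""  # most recent line that matched the pattern
    for raw in lines:
        line = raw.rstrip()  # drop the newline and any trailing spaces
        if re.search(pattern, line):
            current = line  # remember it and emit nothing, like awk's `next`
            continue
        yield f"{current} {line}"

# Hypothetical sample input, shaped like the data in the question.
sample = ["date 1  \n", "line 1  \n", "line 2  \n", "date 2  \n", "line 4  \n"]
for out in prefix_dates(sample):
    print(out)
```

Because the function accepts any iterable of lines, an open file object can be passed in directly, so a huge file is processed one line at a time.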

oliv
0

This is one way in Python:

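date = ''
with open('file.txt') as f:
    for line in f:
        line = line.rstrip()  # drop the newline and trailing spaces
        if line.startswith('date'):
            date = line  # remember the current date line
            continue
        print(date, line)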

Output:

date 1 line 1
date 1 line 2
date 1 line 3
date 2 line 4
date 2 line 5
date 2 line 6
date 3 line 7
date 3 line 8
date 3 line 9
Austin
  • Downvoter, please leave a comment. – Austin Jun 07 '18 at 12:32
  • Any time you see bulk downvoting like this you can be sure it was @jww. I'm upvoting all answers and the question again to compensate. – Ed Morton Jun 07 '18 at 13:23
  • https://stackoverflow.com/users/608639/jww. Any time he doesn't like something about a question he downvotes everyone who answered it, e.g. see https://stackoverflow.com/questions/50635955/sed-to-split-based-on-quotes-handling-comma-within-quotes-along-with-data-withou#comment88310460_50635955, https://stackoverflow.com/questions/50559897/replace-line-with-space-and-backslash-with-a-string-containing-spaces/50559951#comment88145649_50559951 – Ed Morton Jun 07 '18 at 13:33
0

A simple Python script that could do this:

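k = 4  # a date line occurs every k lines (1 date + 3 data lines in the example)
with open("the_file.txt") as f:
    for line_k, line in enumerate(f):
        line = line.rstrip()  # drop the newline and trailing spaces
        if line_k % k == 0:
            date_line = line  # every k-th line is a date line
        else:
            print(date_line, line)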

not tested
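
Since the snippet above is marked as untested, one way to check the position-based idea without a file on disk is to wrap it in a generator and feed it a small list. This is a minimal sketch under assumptions: the name `prefix_every_k` is hypothetical, and `k` is taken to be the distance between consecutive date lines (4 in the question's example: one date plus three data lines).

```python
def prefix_every_k(lines, k):
    """Treat lines 0, k, 2k, ... as dates and prefix each to the lines after it."""
    date_line = ""
    for index, raw in enumerate(lines):
        line = raw.rstrip()  # drop the newline and any trailing spaces
        if index % k == 0:
            date_line = line  # position says this is a date line
        else:
            yield f"{date_line} {line}"

# Hypothetical sample input with a date on every 4th line.
sample = ["date 1\n", "line 1\n", "line 2\n", "line 3\n", "date 2\n", "line 4\n"]
for out in prefix_every_k(sample, k=4):
    print(out)
# prints:
# date 1 line 1
# date 1 line 2
# date 1 line 3
# date 2 line 4
```

Relying on position alone means a single missing or extra line shifts every later group, so matching on the date text is more robust when the layout is not guaranteed.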

Cezar Cobuz
0

Here is an awk solution:

awk '$1 == "date" {a_date = $0} $1 == "line" {print a_date, $0}' file

Explanation: if the first column of a line equals "date", store that line. If the first column equals "line", print the stored value followed by the current line.

Luk