I have data like this:

Cell A
function (A+B)
Cell B
function (A^B)
function (A+B)
function (A1A2)
Cell C
function (A1A2)
function ((B1+B2)A2)

I want the output to be:

Cell A
function (A+B)
Cell B
function (A^B)
Cell C
function (A1A2)

When a cell has several function lines, I want only the first one printed.

I tried

awk "/function/ && !a[$0]++{print;next} !/function/{delete a;print}" file

But the output is identical to the input.

Shreya
  • `awk '/Cell/{print; getline; print}' file`? – Cyrus Feb 23 '21 at 11:15
  • If you're using double quotes because you're running on Windows then you should add that tag to the question. If that's not the reason then - don't do that. In Unix always quote strings (including scripts) with single quotes until if/when you **need** double quotes to get the shell involved to interpret the string. – Ed Morton Feb 23 '21 at 14:56

4 Answers

Like @Cyrus in the comments, my first thought was to print the record at Cell and then the next line, but if you need it the other way around:

$ awk '/function/&&f{print p ORS $0;f=0}{p=$0}/Cell/{f=1}' file

Output:

Cell A
function (A+B)
Cell B
function (A^B)
Cell C
function (A1A2)

Explained:

$ awk '
/function/ && f {   # seeing "function" when the f flag is up
    print p ORS $0  # print stored previous and current records
    f=0             # flag down
}
{
    p=$0            # store current as previous for next round
}
/Cell/ {            # at "Cell"
    f=1             # flag up
}' file 

You could also store the Cell line itself as the value of f and print it when f is set:

$ awk '/function/&&f{print f ORS $0;f=""}/Cell/{f=$0}' file
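One case where triggering on Cell and triggering on function give different results is a Cell line with no function line under it. The sketch below uses a made-up input that is not in the question; `gap.txt`, `print_at_cell` and `print_at_function` are hypothetical names chosen for illustration.

```shell
# Hypothetical input: "Cell A" has no function line of its own.
cat > gap.txt <<'EOF'
Cell A
Cell B
function (A^B)
function (A+B)
EOF

# Trigger on Cell (the approach from the comments): print it plus the next line.
print_at_cell() {
    awk '/Cell/{print; getline; print}' "$1"
}

# Trigger on function (the one-liner from this answer): print the stored
# previous line and the current one, but only once per Cell.
print_at_function() {
    awk '/function/&&f{print p ORS $0;f=0}{p=$0}/Cell/{f=1}' "$1"
}

print_at_cell gap.txt
echo ---
print_at_function gap.txt
```

On this input the first command prints Cell A and Cell B, because getline consumes the Cell B line, while the second prints Cell B and function (A^B). Which behavior is wanted depends on whether empty cells can occur in the real data.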

James Brown

This might work for you (GNU sed):

sed -n 'N;/^Cell.*\nfunction/p;D' file

Turn on explicit printing by setting the option -n.

Append the next line.

If the first line begins with Cell and the second line begins with function, print both.

Delete the first line and repeat.
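To try the steps above against the sample data, the same commands can be written one per line with comments. This is only a sketch: `cells.txt` and `first_function_sed` are hypothetical names, and the commands are otherwise those of the one-liner.

```shell
# Recreate the sample input from the question (cells.txt is a hypothetical name).
cat > cells.txt <<'EOF'
Cell A
function (A+B)
Cell B
function (A^B)
function (A+B)
function (A1A2)
Cell C
function (A1A2)
function ((B1+B2)A2)
EOF

# Same commands as the one-liner, one per line.
first_function_sed() {
    sed -n '
# N: append the next input line, so the pattern space holds two lines
N
# p: print both lines when line 1 starts with Cell and line 2 with function
/^Cell.*\nfunction/p
# D: drop line 1 and restart the cycle with line 2 still in the pattern space
D
' "$1"
}

first_function_sed cells.txt
```

Because D restarts the cycle without reading new input, every line gets one turn as the first line of the two-line window.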

potong
$ awk '/Cell/{c=2} c&&c--' file
Cell A
function (A+B)
Cell B
function (A^B)
Cell C
function (A1A2)

or if "Cell" isn't always the text in the non-function block:

$ awk '!/function/{c=2} c&&c--' file
Cell A
function (A+B)
Cell B
function (A^B)
Cell C
function (A1A2)

See "Printing with sed or awk a line following a matching pattern" for details.
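The `c&&c--` condition is a countdown: it is true while c is nonzero, and each test decrements c. A more verbose sketch of the second command, assuming the sample data is saved in a hypothetical file `cells.txt` (the function name `first_function_awk` is also made up here):

```shell
# Recreate the sample input from the question (cells.txt is a hypothetical name).
cat > cells.txt <<'EOF'
Cell A
function (A+B)
Cell B
function (A^B)
function (A+B)
function (A1A2)
Cell C
function (A1A2)
function ((B1+B2)A2)
EOF

first_function_awk() {
    awk '
    # A non-function line starts a new block: allow two lines to be printed.
    !/function/ { c = 2 }
    # While the counter is positive, print the line and count down.
    c > 0 { print; c-- }
    ' "$1"
}

first_function_awk cells.txt
```

Setting c to a larger number would keep more function lines per block, for example c = 3 for the header plus the first two.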

Ed Morton

This trick will do:

$ uniq -w8 file

Cell A
function (A+B)
Cell B
function (A^B)
Cell C
function (A1A2)

-w8 limits the comparison to the first 8 characters, which is the length of "function". uniq drops contiguous repeated entries, so the first line of each run is always the one kept.

As long as your Cell lines are never repeated contiguously, this is the shortest solution.
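The -w option is not among the options POSIX specifies for uniq, so it may be missing on some systems. A rough awk equivalent of comparing only the first 8 characters of adjacent lines might look like the sketch below; `uniq_w8` and `cells.txt` are hypothetical names, and this is not how uniq itself is implemented.

```shell
# Recreate the sample input from the question (cells.txt is a hypothetical name).
cat > cells.txt <<'EOF'
Cell A
function (A+B)
Cell B
function (A^B)
function (A+B)
function (A1A2)
Cell C
function (A1A2)
function ((B1+B2)A2)
EOF

uniq_w8() {
    awk '
    {
        key = substr($0, 1, 8)               # first 8 characters of this line
        if (NR == 1 || key != prev) print    # keep the first line of each run
        prev = key                           # remember the key for the next line
    }' "$1"
}

uniq_w8 cells.txt
```

Like uniq -w8, this collapses any run of adjacent lines that share their first 8 characters, so two neighbouring Cell lines with the same 8-character prefix would also be merged.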

karakfa