0

I have a text file that sometimes, but not always, contains an array with a unique name, like this:

unique_array=(1,2,3,4,5,6)

I would like to find the size of the array (6 in the example above) when it exists, and skip it or return -1 if it doesn't exist.

Grepping the file tells me whether the array exists, but not how to find its size.

The array can span multiple lines, like this:

unique_array=(1,2,3,
   4,5,6,
7,8,9,10)

Some of the elements in the array can be negative, as in:

unique_array=(1,2,-3,
   4,5,6,
7,8,-9,10)
user2175783

7 Answers

4
awk -v RS=\) -F, '/unique_array=\(/ {print /[0-9]/?NF:0}' file.txt
  • -v RS=\) - delimit records by ) instead of newlines
  • -F, - delimit fields by , instead of whitespace
  • /unique_array=\(/ - look for a record containing the unique identifier
  • /[0-9]/?NF:0 - if the record contains a digit, print the number of fields (i.e. commas+1), otherwise 0

There is a bad bug in the code above: commas preceding the array may be erroneously counted. A fix is to truncate the prefix:

awk -v RS=\) -F, 'sub(/.*unique_array=\(/,"") {print /[0-9]/?NF:0}' file.txt
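
The question also asks for -1 when the array is missing, which the one-liner above does not print. A minimal sketch of one way to add that, assuming a hypothetical wrapper function named array_size and a found flag (neither is part of the original answer):

```shell
# Hypothetical wrapper: prints the element count, or -1 if the array is absent.
array_size() {
    awk -v RS=')' -F, '
        # Strip everything up to and including the opening parenthesis;
        # sub() returns 1 only for the record that contains the array.
        sub(/.*unique_array=\(/, "") {
            print (/[0-9]/ ? NF : 0)
            found = 1
            exit            # stop at the first match; END still runs
        }
        END { if (!found) print -1 }
    ' "$1"
}
```

Calling array_size file.txt prints the count for the first unique_array in the file. Like the original command, it relies on the array containing no ) before its closing parenthesis.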
jhnc
3

Your specifications are woefully incomplete, but guessing a bit as to what you are actually looking for, try this at least as a starting point.

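awk '/^unique_array=\(/ { in_array = 1; n = 0; sub(/^unique_array=\(/, "") }
    in_array && /\)/ { sub(/\).*/, ""); quit = 1 }
    in_array { gsub(/[ \t]/, ""); sub(/,$/, "")   # drop blanks and a trailing comma
      if ($0 != "") n += split($0, arr, ",")      # split(string, array, separator)
      if (quit) { print n; in_array = quit = n = 0 } }' file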

We keep a state variable in_array which tells us whether we are currently in a region which contains the array. This gets set to 1 when we see the beginning of the array, and back to 0 when we see the closing parenthesis. At this point, we remove the closing parenthesis and everything after it, and set a second variable quit to trigger the finishing logic in the next condition. The last condition performs two tasks; it adds the items from this line to the count in n, and then checks if quit is true; if it is, we are at the end of the array, and print the number of elements.

This will simply print nothing if the array was not found. You could embellish the script to set a different exit code or print -1 if you like, but these details seem like unnecessary complications for a simple script.
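
If the -1 and the exit code are wanted after all, one possible embellishment is sketched below. The function name count_unique_array and the found flag are assumptions made for illustration, and the sketch keeps the same restriction that the array starts at the beginning of a line.

```shell
# Hypothetical variant of the state machine: prints -1 and exits non-zero
# when no unique_array is found.
count_unique_array() {
    awk '
        /^unique_array=\(/ { in_array = 1; n = 0; sub(/^unique_array=\(/, "") }
        in_array && /\)/   { sub(/\).*/, ""); quit = 1 }
        in_array {
            gsub(/[ \t]/, "")       # ignore indentation on continuation lines
            sub(/,$/, "")           # a trailing comma would add an empty field
            if ($0 != "") n += split($0, items, ",")
            if (quit) { print n; found = 1; exit }
        }
        END { if (!found) { print -1; exit 1 } }
    ' "$1"
}
```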

tripleee
1

Using sed and declare -a. The test file is like this:

$ cat f
saa

dfsaf

sdgdsag unique_array=(1,2,3,
   4,5,6,
7,8,9,10) sdfgadfg

sdgs
sdgs
sfsaf(sdg)

Testing:

$ declare -a "$(sed  -n  '/unique_array=(/,/)/s/,/ /gp' f | \
                sed 's/.*\(unique_array\)/\1/;s/).*/)/;
                     s/`.*`//g')"

$ echo ${unique_array[@]}
1 2 3 4 5 6 7 8 9 10

And then you can do whatever you want with ${unique_array[@]}
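
For the size the question asks about, bash can report the element count directly with ${#unique_array[@]}. A small sketch, assuming bash and a hypothetical helper named array_size_or_minus_one; the array is assigned by hand here as a stand-in for the declare step above:

```shell
# Stand-in for the array that the declare -a command above would create.
unique_array=(1 2 3 4 5 6 7 8 9 10)

# Hypothetical helper: prints the element count, or -1 if the variable is unset.
array_size_or_minus_one() {
    if declare -p unique_array >/dev/null 2>&1; then
        echo "${#unique_array[@]}"
    else
        echo -1
    fi
}

array_size_or_minus_one
```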

Ivan
1

I think what you probably want is this, using GNU awk for multi-char RS and RT and word boundaries:

$ awk -v RS='\\<unique_array=[(][^)]*)' 'RT{exit} END{print (RT ? gsub(/,/,"",RT)+1 : -1)}' file
Ed Morton
1

With your shown samples, please try the following awk code.

awk -v RS=  '
{
  while(match($0,/\<unique_array=[(][^)]*\)/)){
    line=substr($0,RSTART,RLENGTH)
    gsub(/[[:space:]]*\n[[:space:]]*|(^|\n)unique_array=\(|(\)$|\)\n)/,"",line)
    print gsub(/,/,"&",line)+1
    $0=substr($0,RSTART+RLENGTH)
  }
}
'  Input_file
RavinderSingh13
0

With GNU grep, or a similar grep that supports the -z and -o options:

grep -zo 'unique_array=([^)]*)' file.txt | tr -dc =, | wc -c
  • -z - (effectively) treat file as a single line
  • -o - only output the match
  • tr -dc =, - strip everything except = and ,
  • wc -c - count the result

Note: both one- and zero-element arrays are reported as size 1, and the command prints 0 rather than -1 if the array is not found.
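
Because a match always contributes at least the = character, a count of 0 can only mean the array was not found, so it can be mapped to -1. A minimal sketch, assuming GNU grep and a hypothetical wrapper named grep_array_size:

```shell
# Hypothetical wrapper around the pipeline above: prints -1 when nothing matches.
grep_array_size() {
    count=$(grep -zo 'unique_array=([^)]*)' "$1" | tr -dc '=,' | wc -c)
    if [ "$count" -eq 0 ]; then
        echo -1
    else
        echo "$((count))"   # arithmetic expansion drops any padding from wc
    fi
}
```

Zero-element arrays are still reported as size 1, as in the original pipeline.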

jhnc
-2

Here's an awk solution that works with gawk, mawk 1/2, and nawk:

TEST INPUT

saa

dfsaf

sdgdsag unique_array=(1,2,3,
   4,5,6,
7,8,9,10) sdfgadfg

sdgs
sdgs
sfsaf(sdg)

CODE

{m,n,g}awk '
BEGIN {  __ = "-1:_ERR_NOT_FOUND_"
        RS =  "^$" (_ = OFS = "")
       FS =   "(^|[ \t-\r]?)unique[_]array[=][(]"
     ___ =    "[)].*$|[^0-9,.+-]" 
}  $!NF = NR < NF ? $(gsub(___,_)*_) : __'

OUTPUT

1,2,3,4,5,6,7,8,9,10
RARE Kpop Manifesto