I have an XML file of the format:

<classes>

 <subject lb="Fall Sem 2020">
  <name>Operating System</name>
  <credit>3</credit>
  <type>Theory</type>
  <faculty>Prof. XYZ</faculty> 
 </subject>

 <subject lb="Spring Sem 2020">
  <name>Web Development</name>
  <credit>3</credit>
  <type>Lab</type>
 </subject>

 <subject lb="Fall Sem 2021">
  <name>Computer Network</name>
  <credit>3</credit>
  <type>Theory</type>
  <faculty>Prof. ABC</faculty> 
 </subject>

 <subject lb="Spring Sem 2021">
  <name>Software Engineering</name>
  <credit>3</credit>
  <type>Lab</type>
 </subject>

</classes>

I can get the desired result with the following sed command: sed -En 's/.* lb="([^"]+)".*/\1/p' file

Output:

Fall Sem 2020
Spring Sem 2020
Fall Sem 2021
Spring Sem 2021

I want to store this output in an array, one line per element, i.e.

arr[0]="Fall Sem 2020"

My attempt: arr=($(sed -En 's/.* lb="([^"]+)".*/\1/p' file)). But in this case the output is split on whitespace, so each word becomes its own array element, i.e. arr[0]="Fall"

Bogota

3 Answers

With bash:

# disable job control and enable lastpipe to run mapfile in current environment
set +m; shopt -s lastpipe

sed -En 's/.* lb="([^"]+)".*/\1/p' file | mapfile -t arr

declare -p arr

Output:

declare -a arr=([0]="Fall Sem 2020" [1]="Spring Sem 2020" [2]="Fall Sem 2021" [3]="Spring Sem 2021")

In a script, job control is disabled by default, so set +m is only needed in an interactive shell.
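
To see why lastpipe matters, here is a small sketch. The produce function is a hypothetical stand-in for the sed command, not part of the original answer. Without lastpipe, the last command of a pipeline runs in a subshell, so the array filled by mapfile disappears when the pipeline ends:

```bash
#!/usr/bin/env bash
# Hypothetical stand-in for the sed output: two lines that contain spaces.
produce() { printf '%s\n' 'Fall Sem 2020' 'Spring Sem 2020'; }

# Default behaviour: mapfile runs in a subshell, so the array is lost.
shopt -u lastpipe
unset without
produce | mapfile -t without
echo "without lastpipe: ${#without[@]} element(s)"

# With lastpipe (and job control off), mapfile runs in the current shell.
set +m
shopt -s lastpipe
produce | mapfile -t with
echo "with lastpipe: ${#with[@]} element(s)"
```

Run as a script, the first count should be 0 and the second 2.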

Cyrus

Could you please try the following (assuming the OP doesn't have XML tools and can't install them either).

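# Join the lb values with commas in awk, then split on commas into the array.
oldIFS=$IFS
IFS=',';array=( $(
awk '
BEGIN{ OFS="," }
/<subject lb="/{
  match($0,/".*"/)   # locate the quoted attribute value
  val=(val?val OFS:"")substr($0,RSTART+1,RLENGTH-2)
}
END{
  print val
}' Input_file))
IFS=$oldIFS   # restore the previous field separator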

To print all elements of array use:

echo "${array[@]}"
Fall Sem 2020 Spring Sem 2020 Fall Sem 2021 Spring Sem 2021

To print specific element use:

echo "${array[0]}"
Fall Sem 2020
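
As a possible variation (not part of the original answer), the comma-splitting step can be done with read -a, where IFS is set for that one command only and the shell's global IFS is left alone. The joined string below is a hypothetical stand-in for the awk output:

```bash
#!/usr/bin/env bash
# Hypothetical comma-joined string, standing in for the awk output.
joined='Fall Sem 2020,Spring Sem 2020,Fall Sem 2021'

# IFS=',' applies to this read command only; the shell's IFS is untouched.
IFS=',' read -r -a array <<< "$joined"

# Print one element per line; quoting keeps the spaces inside each element.
printf '%s\n' "${array[@]}"
```

This only handles a single line of input, which matches the awk program above since it prints all values on one line.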
RavinderSingh13

You could use an XML-aware tool such as XmlStarlet (the binary may be installed as xml or xmlstarlet, depending on the system) to extract the attribute you want, and then use readarray and process substitution to read the output into an array:

$ readarray -t arr < <(xml sel -t -v 'classes/subject/@lb' infile.xml)
$ declare -p arr
declare -a arr=([0]="Fall Sem 2020" [1]="Spring Sem 2020" [2]="Fall Sem 2021" [3]="Spring Sem 2021")
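
The readarray plus process substitution pattern works with any command that prints one value per line. As a sketch for systems without XmlStarlet, here it is with the sed command from the question; the temporary file and its trimmed contents are assumptions made for illustration:

```bash
#!/usr/bin/env bash
# Hypothetical sample file, trimmed from the question's XML.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
<classes>
 <subject lb="Fall Sem 2020">
  <name>Operating System</name>
 </subject>
 <subject lb="Spring Sem 2020">
  <name>Web Development</name>
 </subject>
</classes>
EOF

# Process substitution keeps readarray in the current shell, so arr persists.
readarray -t arr < <(sed -En 's/.* lb="([^"]+)".*/\1/p' "$tmp")
rm -f "$tmp"

# Show the resulting array, one element per matching line.
declare -p arr
```

Unlike the pipeline approach, this needs neither lastpipe nor set +m, though the sed pattern remains less robust than a real XML parser.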
Benjamin W.