-1

I have below set of data and need somewhat transposed. I am struggling with script. Any help would be appreciated. All column/values are dynamic

File format:

ID    FieldName          FieldValue

1   Rooms Required?        Yes

1   Country of Meeting     US

2   Rooms Required? 

2   Country of Meeting  

3   Rooms Required? Yes

3   Country of Meeting  US

4   Rooms Required? No

4   Country of Meeting  BL


Output Required:

ID  Rooms Required? Country of Meeting

1     Yes                   US

2       

3     Yes                   US

4     No                     BL

Please help

Sunny
  • 9
  • 3
  • 1
    possible duplicate of https://stackoverflow.com/questions/1729824/an-efficient-way-to-transpose-a-file-in-bash – Jan Myszkier Oct 01 '18 at 20:50
  • 1
    This isn't an actual duplicate, it's sorting the rows based on 2 different columns – Sunny Patel Oct 01 '18 at 20:53
  • What is the output and input field separator? Are the additional empty lines also input/output? Can fields have empty values? – KamilCuk Oct 01 '18 at 20:55
  • You can take any pipe as input separator. Output also as Pipe. First column is ID , so cannot be empty. Second column will have column name so cannot be empty. Third column can be empty as its a value. – Sunny Oct 01 '18 at 20:59

2 Answers2

0

Here is the general idea using join (and bash for the shell):

$ echo ID Rooms Country; \
  join -j1 -o '0 1.4 2.5' -a1 -a2 -e- <(grep -F Rooms data.txt) <(grep -F Country data.txt)

ID Rooms Country
1 Yes US
2 - -
3 Yes US
4 No BL

Adapt it to your needs.

xhienne
  • 5,738
  • 1
  • 15
  • 34
  • Thanks. However I need columns names well. Also can you explain what is Rooms data.txt and Country data.txt – Sunny Oct 02 '18 at 02:22
  • `data.txt` is the file containing all the data, the same you published in your question. "Rooms" and "Country" are words in columns 2 in that file. As for the header, just add an `echo` line (I'll change my answer) – xhienne Oct 02 '18 at 08:54
  • You are hardcoding the columns. I dont want that. – Sunny Oct 02 '18 at 14:58
0

A pure awk solution based on your fields separated by the tab character '\t' is as follows:

awk 'BEGIN { FS = "\t"; PROCINFO["sorted_in"] = "@ind_num_asc" } { if ( $1 !~ /^[0-9]+$/ ) next; A[$1][$2] = $3; H[$2] } END { printf "ID"; for (h in H) printf "\t" h; for (i in A) { printf "\n\n" i; for (j in A[i]) printf "\t" A[i][j] } print "\n" }' filename

And broken down:

awk 'BEGIN {
    FS = "\t"  #Set Field Separator as the Tab
    PROCINFO["sorted_in"] = "@ind_num_asc"  #Set array order as numbers
}
{
    if ( $1 !~ /^[0-9]+$/ )  #Skip all rows without numeric ID
        next
    A[$1][$2] = $3  #Store value in multi-dimensional array
    H[$2]           #Store header name
}
END {
    printf "ID"
    for (h in H)    #Print all headers found
        printf "\t" h
    for (i in A) {  #Print each record with corresponding values
        printf "\n\n" i
        for (j in A[i])
            printf "\t" A[i][j]
    }
    print "\n"
}' filename

Let me know if any further explanations are necessary. This will work with as many fields as you have set and in any order. If records don't have all the same fields, your output may look jagged.

Sunny Patel
  • 7,830
  • 2
  • 31
  • 46
  • Thanks for the response. However, its not working as expected. I am getting data as below. Its not displaying value for Rooms Required? ID Rooms Required? Country of Meeting 1 US 3 US – Sunny Oct 02 '18 at 02:18
  • @Sunny, You'll have to expand on that, I'm not sure how different your results are. This is all hinged on the assumption that your fields are separated with a single tab `'\t'` character. I made this assumption because when I pasted your code in my IDE, the spacing lined up as if the spaces were originally tabs. – Sunny Patel Oct 03 '18 at 14:46