I have a file (data.rdb) with the following format:
date star jdb texp
2013-11-22 epsInd 2400000.23551544 100.
2013-11-22 epsInd 2400000.23551544 100.
2013-11-22 epsInd 2400000.23551544 100.
2013-11-22 HD217987 2400000.23551544 900.
2013-11-22 TOI-134 2400000.23551544 900.
2013-11-22 tauCet 2400000.23551544 60.
2013-11-22 BD+01316 2400000.23551544 300.
2013-11-22 BD+01316 2400000.23551544 300.
2013-11-22 BD+01316 2400000.23551544 300.
2013-11-22 BD+01316 2400000.23551544 300.
Some properties:
- all columns are tab separated
- the columns do not have the same width
- the cells might not have the same length
- the file will have many more columns than shown here, and a few hundred lines
- the column names can be any word, with no tabs, spaces or special characters
How can I move the column with header jdb to be the first column?
Some constraints:
- this will be applied to multiple files, and the column jdb will not always appear at the same position
- ideally the order of the remaining columns should not change
- jdb will always be the 1st column in the end.
Thanks!
UPDATE
This is the awk block I am using at the moment:
BEGIN {
    numCols = split(column_list,cols)
    OFS="\t"
}
{ sub(/\r$/,"") }
NR==1 {
    for (fldNr=1; fldNr<=NF; fldNr++) {
        f[$fldNr] = fldNr
    }
}
{
    for (colNr=1; colNr<=numCols; colNr++) {
        colName = cols[colNr]
        colVal = (colNr=1 ? $(f["jdb"]): (colNr <= $(f["jdb"] ?
                  $(f[colName] -1) : $(f[colName]))))
        printf "%s%s", colVal, (colNr<numCols ? OFS : ORS)
    }
}
but it gives me no output... What I (think I) did:
1. assign each column header value a number
2. iterate over a range
   2.1 if iterator = 0 -> print column jdb
   2.2 if iterator <= column number of jdb -> print column number iterator - 1
   2.3 if iterator > column number of jdb -> print column number iterator
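The steps listed above can also be sketched directly with field numbers. This is only a minimal sketch of the idea, not the block above repaired line by line: the function name move_first_by_index and the variables tgt and pos are made up for illustration, and passing lines through unchanged when the header is missing is an assumption.

```shell
# Hypothetical helper: print the column named $1 first, then the remaining
# columns in their original order. Reads tab-separated input from stdin.
move_first_by_index() {
    awk -v tgt="$1" '
    BEGIN { FS = OFS = "\t" }
    { sub(/\r$/, "") }            # tolerate DOS line endings, as in the block above
    NR == 1 {                     # step 1: find the number of the target column
        for (i = 1; i <= NF; i++)
            if ($i == tgt) pos = i
    }
    !pos { print; next }          # assumption: pass lines through if the header is missing
    {                             # step 2: target field first, then every other field
        out = $pos
        for (i = 1; i <= NF; i++)
            if (i != pos) out = out OFS $i
        print out
    }'
}

# Usage with a two-line sample in the format of data.rdb
printf 'date\tstar\tjdb\ttexp\n2013-11-22\tepsInd\t2400000.23551544\t100.\n' |
    move_first_by_index jdb
```

For this sample the header comes out as jdb, date, star, texp and the data row as 2400000.23551544, 2013-11-22, epsInd, 100. (tab separated).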
(this is a continuation of the question I posed in https://stackoverflow.com/questions/56132249/extract-columns-from-tab-separated-file)
END RESULT
In the end I ended up using @Ed Morton's solution:
$ cat move_to_first.awk
BEGIN { FS=OFS="\t" }
NR==1 {
    cols[++numCols] = tgt
    for (fldNr=1; fldNr<=NF; fldNr++) {
        f[$fldNr] = fldNr
        if ($fldNr != tgt) {
            cols[++numCols] = $fldNr
        }
    }
}
{
    for (colNr=1; colNr<=numCols; colNr++) {
        colName = cols[colNr]
        printf "%s%s", $(f[colName]), (colNr<numCols ? OFS : ORS)
    }
}
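The script expects the header name in the awk variable tgt, which can be set on the command line with -v. Since this has to run on multiple files, a small wrapper might look like the sketch below. The function name apply_move_to_first and the ".moved" output suffix are hypothetical, and the sketch assumes the script above has been saved as move_to_first.awk.

```shell
# Hypothetical helper: run an awk script on several files, writing each
# result next to its input with a ".moved" suffix (the suffix is made up).
# Usage: apply_move_to_first SCRIPT COLUMN FILE...
apply_move_to_first() {
    script=$1
    column=$2
    shift 2
    for file in "$@"; do
        # -v sets the awk variable tgt before the script starts reading input
        awk -v tgt="$column" -f "$script" "$file" > "$file.moved" || return 1
    done
}

# Example call (assumes move_to_first.awk and the .rdb files exist):
# apply_move_to_first move_to_first.awk jdb data.rdb other.rdb
```

Writing to a separate file rather than over the input keeps the original intact in case a file does not contain the expected header.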
As a curiosity, to move the column to the last position, the above code just needs the following modification:
$ cat move_to_last.awk
BEGIN {
    FS=OFS="\t"
}
NR==1 {
    for (fldNr=1; fldNr<=NF; fldNr++) {
        f[$fldNr] = fldNr
        if ($fldNr != target) {
            cols[++numCols] = $fldNr
        }
    }
    cols[++numCols] = target
}
{
    for (colNr=1; colNr<=numCols; colNr++) {
        colName = cols[colNr]
        printf "%s%s", $(f[colName]), (colNr<numCols ? OFS : ORS)
    }
}
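The two scripts differ only in where the target name is added to cols (and in the variable name, tgt versus target), so they could be folded into one. The following is a rough sketch under that assumption: the function name move_col and the where variable are hypothetical. It also adds a check for a missing header, because f[target] would otherwise be empty and $(f[target]) would evaluate to $0, the whole line.

```shell
# Hypothetical helper: move column COLUMN to the "first" or "last" position.
# Usage: move_col COLUMN first|last FILE
move_col() {
    awk -v target="$1" -v where="$2" '
    BEGIN { FS = OFS = "\t" }
    NR == 1 {
        for (fldNr = 1; fldNr <= NF; fldNr++)
            f[$fldNr] = fldNr
        if (!(target in f)) {          # guard: without it $(f[target]) would be $0
            print "no such column: " target | "cat 1>&2"
            exit 1
        }
        if (where == "first") cols[++numCols] = target
        for (fldNr = 1; fldNr <= NF; fldNr++)
            if ($fldNr != target) cols[++numCols] = $fldNr
        if (where != "first") cols[++numCols] = target
    }
    {
        for (colNr = 1; colNr <= numCols; colNr++)
            printf "%s%s", $(f[cols[colNr]]), (colNr < numCols ? OFS : ORS)
    }' "$3"
}

# Example calls (assume data.rdb exists):
# move_col jdb first data.rdb
# move_col jdb last data.rdb
```

Any value of where other than "first" is treated as "last" here, which is a simplification rather than a deliberate interface.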