
I have a text file that I want to convert to a CSV file. Before converting it, I want to add a header to the text file so that the CSV file has the same header. The text file has one thousand columns, so I need one thousand column names. As a side note, the content of the text file is just rows of numbers separated by commas (","). Is there any way to add the header line in bash?

I tried the approach below and it didn't work. First I ran this in Python:

for i in range(1001):
    print "col" + "_" + "i"

I saved the output of this to a text file (`python header.py >> header.txt`) and prepended it to my original text file like this:

cat header.txt filename.txt > newfilename.txt

Then I converted the txt file to a CSV file with `mv newfilename.txt newfilename.csv`. Unfortunately this doesn't work: the header line ends up with twice as many fields as the other rows for some reason. I would appreciate any help solving this problem.

user8034918
  • Yes, there are many ways! If you need a specific answer you should ask a specific question. Where is the header coming from? How do you convert the text file to CSV? What is the current field delimiter, etc.? – karakfa Aug 27 '18 at 17:52
  • What you've pasted looks like python code, not bash. Did you not want to do this in python for whatever reason? – paulski Aug 27 '18 at 18:04
  • @paulski I didn't know how to do it in bash. That's why I ended up doing part of this in Python. – user8034918 Aug 27 '18 at 18:06
  • One problem might be this line: `python header.py >> header.txt`, which APPENDS to header.txt rather than replacing the contents. Does your header.txt have just the one line in it? – paulski Aug 27 '18 at 18:13

4 Answers


Based on the description your file is already comma-separated, so it already is a CSV file. You just want to add a column-header line.

$ awk -F, 'NR==1{for(i=1;i<=NF;i++) printf "col_%d%s", i, (i==NF?ORS:FS)}1' file

This adds as many column headers as there are fields in the first row of the file,

e.g.

$ seq 5 | paste -sd, |      # create 1,2,3,4,5 as a test input
  awk -F, 'NR==1{for(i=1;i<=NF;i++) printf "col_%d%s", i, (i==NF?ORS:FS)}1'

col_1,col_2,col_3,col_4,col_5
1,2,3,4,5
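
To go from there to the CSV file asked about, it should be enough to point the same one-liner at the real input and redirect the result (filename.txt and newfilename.csv as in the question):

# same one-liner, reading the data file and writing the result with a header line
awk -F, 'NR==1{for(i=1;i<=NF;i++) printf "col_%d%s", i, (i==NF?ORS:FS)}1' filename.txt > newfilename.csv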
karakfa

You can generate the column names in bash using one of the options below. Each example generates a header.txt file. You already have code to add this to the beginning of your file as a header.

Using bash loops

Bash loops for this many iterations will be inefficient, but will work.

for i in {1..1000}; do
  echo -n "col_$i "
done > header.txt
echo >> header.txt

or using seq

for i in $(seq 1 1000); do
  echo -n "col_$i "
done > header.txt
echo >> header.txt

Using seq only

Using seq alone will be more efficient.

seq -f "col_%g" -s" " 1 1000 > header.txt
JGC
  • seq supports sequence prefixes. Your solution requires 1,000 iterations, which seems needlessly inefficient. In addition, if you're going to loop anyway, why not use the Bash-centric `for i in {1..1001}` instead of spawning seq? – Todd A. Jacobs Aug 27 '18 at 20:55

Use seq and sed

You can use the seq utility to construct your CSV header, with a little help from Bash expansions. You can then insert the new header row into your existing CSV file, or concatenate the header with your data.

For example:

# construct a quoted CSV header
columns=$(seq -f '"col_%g"' -s', ' 1 1001)

# strip a trailing separator, if there is one
columns="${columns%, }"

# insert headers as first line of foo.csv with GNU sed
sed -i -e "1 i\\${columns}" /tmp/foo.csv

Caveats

If you don't have GNU sed, you can also use cat, sponge, or other tools to concatenate your header and data, although most of your concatenation options will require redirection to a new combined file to avoid clobbering your existing data.

For example, given /tmp/data.csv as your original data file:

seq -f '"col_%g"' -s', ' 1 1001 > /tmp/header.csv
sed -i -e 's/,[[:space:]]*$//' /tmp/header.csv
cat /tmp/header.csv /tmp/data.csv > /tmp/new_file.csv
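
If the sponge utility from moreutils (mentioned above) is available, a variant that writes the combined output back over the original data file might look like this; sponge buffers all of its input before opening the output file, so reading and writing data.csv in the same pipeline is safe:

# sponge soaks up stdin completely before overwriting /tmp/data.csv
cat /tmp/header.csv /tmp/data.csv | sponge /tmp/data.csv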

Also, note that while Bash solutions that avoid calling standard utilities are possible, doing it in pure Bash might be too slow or memory intensive for large data sets.
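
For the header itself, a rough sketch relying on Bash's printf -v builtin and brace expansion (so no very long argument is handed to an external command) might look like this; the concatenation still uses cat, and the /tmp paths are the same hypothetical ones as above:

# build the quoted, comma-separated header entirely in a shell variable
printf -v columns '"col_%s", ' {1..1001}
# drop the trailing separator left by the format string
columns=${columns%, }
# prepend the header to the data in a new file
{ printf '%s\n' "$columns"; cat /tmp/data.csv; } > /tmp/new_file.csv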

Your mileage may vary.

Todd A. Jacobs
  • Thanks for your detailed response, but I got the error below from the third line of code! – user8034918 Aug 27 '18 at 20:34
  • -bash: /bin/sed: Argument list too long – user8034918 Aug 27 '18 at 20:34
  • @user8034918 That's a system-specific limitation; it works for me. Per my answer, if you can't use GNU sed or get it working as an in-place edit because of argument list limitations, then just concatenate the header and data files. – Todd A. Jacobs Aug 27 '18 at 20:51
  • @user8034918 See also [this answer](https://stackoverflow.com/a/11475732/1301972) about "argument list too long", and [using built-ins as workarounds](https://stackoverflow.com/q/47443380/1301972). – Todd A. Jacobs Aug 27 '18 at 20:59
printf "col%s," {1..100} |
sed 's/,$//' |
cat - filename.txt >newfilename.txt

I believe sed should supply the missing final newline as a side effect. If not, maybe try 's/,$/\n/' though this isn't entirely portable, either. You could probably replace the cat with sed as well, something like

... | sed 's/,$//;r filename.txt'

but again, I'm not entirely sure how portable this is.
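
Whichever variant is used, a quick sanity check along these lines (assuming the newfilename.txt produced above) should confirm that the header now has the same number of fields as the data rows, which was the problem in the original attempt:

# print the comma-separated field count of the header and of the first data row
head -n 2 newfilename.txt | awk -F, '{print NF}'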

tripleee