I have data from an Internet table saved in a text file. I need to convert this file to standard .csv (comma-separated values) and clean it up. E.g.:
Data Fechamento Variação Variação (%) Abertura Máxima Mínima Volume
30 Abr 2020 2,00 0,76 61,29% 1,99 2,10 1,80 152.100
29 Abr 2020 1,24 -0,44 -26,19% 1,28 1,71 1,20 125.700
My code:
echo -e "File: \c"
read nome_arq
arq=$(<$nome_arq)
arq=$(echo $arq | sed 's/%//g')
arq=$(echo $arq | sed 's/()//g')
arq=$(echo $arq | sed 's/\.//g')
arq=$(echo $arq | sed 's/\+//g')
arq=$(echo $arq | sed 's/ Abr /_04_/g')
arq=$(echo $arq | sed 's/ Mar /\_03_/g')
arq=$(echo $arq | sed 's/\,/\./g')
arq=$(echo $arq | sed 's/\ /\,/g')
append="_clean"
echo -e $arq >> $nome_arq$append
However, there are no line breaks in the output; the output file is just a single line:
Data,Fechamento,Variação,Variação,Abertura,Máxima,Mínima,Volume,30_04_2020,2.00,0.76,61.29,1.99,2.10,1.80,152100,29_04_2020,1.24,-0.44,-26.19,1.28,1.71,1.20,125700,
What can I do to keep the original line breaks in my output?
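A likely cause, assuming a Bash-like shell: the unquoted `$arq` in `echo $arq` undergoes word splitting, so every run of whitespace (newlines included) collapses into a single space before `echo` sees it. The minimal sketch below uses made-up two-line data to show the difference that quoting makes; the variable names `unquoted` and `quoted` are only for illustration.

```shell
# Sample two-line input standing in for the real file (hypothetical data).
arq=$(printf '%s\n' 'a b' 'c d')

# Unquoted: the shell splits the value on whitespace, newlines included,
# and echo rejoins the resulting words with single spaces.
unquoted=$(echo $arq)

# Quoted: the value is passed through unchanged, newlines intact.
quoted=$(echo "$arq")

printf 'unquoted: %s\n' "$unquoted"
printf 'quoted:\n%s\n' "$quoted"
```

Under that assumption, writing `echo "$arq"` in every step (and in the final redirect) should keep one output line per input line.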
Edit, May 5:
I got the result I wanted with the following code:
append="_clean"
cat $nome_arq|while read z;do echo "$z"|sed "s/\s\+/\"xxxx\"/g; s/^/\"/g; s/$/\"/g";done >> $nome_arq$append
sed 's/%//g' $nome_arq$append > output
rm $nome_arq$append
sed 's/()//g' output > output1
rm output
sed 's/\.//g' output1 > output2
rm output1
sed 's/\+//g' output2 > output3
rm output2
sed 's/\"//g' output3 > output4
rm output3
sed 's/xxxxMaixxxx/_05_/g' output4 > output5
rm output4
sed 's/xxxxAbrxxxx/\_04_/g' output5 > output6
rm output5
sed 's/xxxxMarxxxx/\_03_/g' output6 > output7
rm output6
sed 's/,/\./g' output7 > output8
rm output7
sed 's/xxxx/,/g' output8 > output9
rm output8
Obviously, it's far from optimized. I couldn't use the "tr" command, for example. How can I make my script leaner?
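One way to make this leaner, offered as a sketch rather than a definitive answer: sed processes its input line by line, so a single invocation with several `-e` expressions can replace the chain of temporary files and the `xxxx` placeholder. The function name `clean_table` is hypothetical, and `-E` (extended regular expressions) is assumed to be available, as it is in GNU and BSD sed.

```shell
# clean_table is a hypothetical helper: it reads the raw table on stdin
# and writes CSV on stdout, one output line per input line.
clean_table() {
  sed -E \
    -e 's/[%+]//g' \
    -e 's/\(\)//g' \
    -e 's/\.//g' \
    -e 's/ Mai /_05_/g; s/ Abr /_04_/g; s/ Mar /_03_/g' \
    -e 's/,/./g' \
    -e 's/[[:space:]]+/,/g'
}

# Demo with the header and one row from the question.
printf '%s\n' \
  'Data Fechamento Variação Variação (%) Abertura Máxima Mínima Volume' \
  '30 Abr 2020 2,00 0,76 61,29% 1,99 2,10 1,80 152.100' \
  | clean_table
```

The order of the expressions matters: thousands separators (`.`) are removed before decimal commas become dots, and whitespace is turned into commas last. For a real file, something like `clean_table < "$nome_arq" > "${nome_arq}_clean"` would write the result without intermediate files.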
Edit, May 13:
The final code, with some modifications:
# Ask for the input file name (printf avoids the non-portable echo -e "\c").
printf 'Unstructured file: '
read -r nome_arq
# Work on a copy; quoting keeps file names with spaces intact.
arq="${nome_arq}_clean"
cp "$nome_arq" "$arq"
sed -i 's/%//g;s/()//g;s/\.//g;s/+//g;s/ Mai /_05_/g;s/ Abr /_04_/g;s/ Mar /_03_/g;s/,/./g' "$arq"
sed -r -i 's/[[:space:]]+/,/g' "$arq"
sed -i 's/Data,Fechamento,Variação,Variação,Abertura,Máxima,Mínima,Volume/ref.date,price.close,var,var.perc,price.open,price.high,price.low,volume/g' "$arq"
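The script above handles only Mar, Abr and Mai. If the table ever covers other months, the substitutions could be generated instead of typed by hand. The sketch below assumes the site uses the usual three-letter Portuguese abbreviations (Jan, Fev, Mar, Abr, Mai, Jun, Jul, Ago, Set, Out, Nov, Dez); the function name `month_script` is hypothetical.

```shell
# month_script (hypothetical name) prints one sed substitution per month,
# e.g. "s/ Abr /_04_/g", using the assumed Portuguese abbreviations.
month_script() {
  n=0
  for m in Jan Fev Mar Abr Mai Jun Jul Ago Set Out Nov Dez; do
    n=$((n + 1))
    printf 's/ %s /_%02d_/g\n' "$m" "$n"
  done
}

# Demo: feed the generated script to sed.
printf '%s\n' '30 Abr 2020' '02 Dez 2019' | sed -e "$(month_script)"
```

The generated script can be passed as one more `-e` argument alongside the other substitutions, provided it runs before whitespace is converted to commas.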