2

For example, I have a CSV file as follows:

12345432|1346283301|5676438284971|13564357342151697 ...
87540258|1356433301|1125438284971|135643643462151697 ...
67323266|1356563471|1823543828471|13564386436651697 ...

and hundreds more columns, but I want to remove the first three columns and save the result to a new file (saving to the same file would be even better).

This is the result I want.

13564357342151697 ...
135643643462151697 ...
13564386436651697 ...

I have been searching and trying, but I have not been able to do it. Below is the code I have.

awk -F'|' '{print $1 > "newfile"; sub(/^[^|]+\|/,"")}1' old.csv > new.csv

I would appreciate it if someone could help me. Thank you.

codeforester
Danny

4 Answers

4

You can use `cut`:

cut -f4- -d'|' old.csv  > new.csv
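
To see what the options do: `-d'|'` sets the field delimiter and `-f4-` keeps field 4 through the end of each line. A minimal sketch, assuming a scratch directory and a hypothetical five-column sample file (the rows are made up for illustration, not the asker's data):

```shell
# Work in a throwaway directory so nothing real is overwritten.
cd "$(mktemp -d)" || exit 1

# Hypothetical sample with five pipe-delimited columns.
printf '%s\n' 'a1|b1|c1|d1|e1' 'a2|b2|c2|d2|e2' > old.csv

# -d'|' : split each line on the pipe character
# -f4-  : keep field 4 through the last field
cut -d'|' -f4- old.csv > new.csv

cat new.csv
```

The remaining fields stay joined by the same `|` delimiter, so the output is still a valid pipe-delimited file.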
Bertrand Martel
  • More efficient than `awk`. – codeforester Mar 07 '17 at 03:42
  • Thank you so much. Do you know whether I can cut the first three columns from 10 files at once? Do I need to use a loop? – Danny Mar 07 '17 at 03:51
  • @HengUnn: Please post sample input and expected output so that the requirement is clearer. – RavinderSingh13 Mar 07 '17 at 03:52
  • @RavinderSingh13 The sample input and output are the same as what I posted above. I just have 10 files, and I need to drop the first three columns from all of them. Is there a way to do that for all 10 files in one go? – Danny Mar 07 '17 at 03:58
  • @HengUnn: Just try my awk command above, followed by Input_file1 Input_file2 Input_file3 ............Input_file10, and let me know if that helps. – RavinderSingh13 Mar 07 '17 at 04:00
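
On the question of handling several files at once: `cut` writes everything to one output stream, so a plain shell loop that runs it once per file is one straightforward option. A minimal sketch, assuming hypothetical files named after the `Input_file1`, `Input_file2`, ... placeholders from the comments, and an arbitrary `.new` suffix for the results:

```shell
cd "$(mktemp -d)" || exit 1

# Hypothetical inputs with made-up contents, one line each.
for n in 1 2 3; do
    printf '%s\n' "a$n|b$n|c$n|d$n|e$n" > "Input_file$n"
done

# Run cut once per file; each result goes to its own ".new" file.
for f in Input_file1 Input_file2 Input_file3; do
    cut -d'|' -f4- "$f" > "$f.new"
done

cat Input_file2.new
```

With real data, the explicit file list could be replaced by a glob, provided the glob does not also match the output files.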
1

This is what you're looking for:

awk -F '|' '{$1=$2=$3=""; print $0}' oldfile > newfile

But the output will have leading whitespace, so add the following substitution:

sub(/^[ \t\|]+/,"") (the \| inside the brackets also strips the leading '|' separators left behind by the column removal)

awk -F '|' -v OFS='|' '{$1=$2=$3=""; sub(/^[ \t\|]+/,""); print $0}' oldFile > newFile

SVTAnthony
  • Thanks @SVTAnthony I tried with your code but my output was messed up. All columns are being put into one column. – Danny Mar 07 '17 at 03:59
  • use `awk -i inplace -F '|' '{$1=$2=$3=""; gsub(/\s+/,"|");}1' data.txt` – Bertrand Martel Mar 07 '17 at 04:19
  • I just edited the example to fit your needs. What I missed was the Output Field Separator (OFS). I also did some cleanup for the leading separators that the removal of the fields caused. Thanks for letting me know. – SVTAnthony Mar 07 '17 at 08:05
  • Should set `OFS` _before_ setting `$1=$2=$3=""` -- preferably in BEGIN or commandline `-vOFS='|'` -- and then you only need `sub(/^\|\|\|/,"")` or even simpler `print substr($0,4)` – dave_thompson_085 Mar 08 '17 at 21:26
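
The suggestion in the last comment can be sketched as follows. Assigning to a field makes awk rebuild `$0` by joining the fields with `OFS`, so `OFS` has to be `|` before the assignment happens. The sample file and its contents are hypothetical:

```shell
cd "$(mktemp -d)" || exit 1
printf '%s\n' 'a1|b1|c1|d1|e1' 'a2|b2|c2|d2|e2' > oldFile

# -v OFS='|' takes effect before any record is read, so the rebuilt
# record is joined with '|' again. The three emptied fields leave "|||"
# at the start of $0, and substr($0, 4) skips exactly those characters.
awk -F '|' -v OFS='|' '{ $1 = $2 = $3 = ""; print substr($0, 4) }' oldFile > newFile

cat newFile
```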
1

@Heng: try:

awk -F"|" '{for(i=4;i<=NF;i++){printf("%s%s",$i,i==NF?"":"|")};print ""}'  Input_file

OR

awk -F"|" '{for(i=4;i<=NF;i++){printf("%s%s",$i,i==NF?"\n":"|")};}'  Input_file

You can redirect this command's output into a file as needed.

EDIT:

awk -F"|" 'FNR==1{++e;fi="REPORT_A1_"e;} {for(i=4;i<=NF;i++){printf("%s%s",$i,i==NF?"\n":"|") > fi}}'   Input_file1  Input_file2  Input_file3
RavinderSingh13
  • Thanks @RavinderSingh13! Cut works too. Do you know whether I can remove the first three columns from 10 files at once? Do I need to use a loop with cut or awk? – Danny Mar 07 '17 at 03:54
  • Yes, we could do it with awk, but we need to know your requirement and expected output to help you further. Kindly post them with details. – RavinderSingh13 Mar 07 '17 at 03:57
  • I want to save the output to csv files. If I pass Input_file1 Input_file2 Input_file3 ............Input_file10 as you suggested, does that mean I have to write the output to separate file names too? All my 10 files have similar names, like REPORT_A1_1, REPORT_A1_2, REPORT_A1_3, ........REPORT_A1_10, so I'm wondering whether I can have one function that drops the columns from all 10 files in sequence. – Danny Mar 07 '17 at 04:06
  • I commented here. @RavinderSingh13 – Danny Mar 07 '17 at 04:14
  • Please check out the edit to my answer. You could like it if it solves your question too :) – RavinderSingh13 Mar 07 '17 at 04:25
  • It works too, man. You are awesome. Thank you. One more question: after dropping the first three columns, is it possible to write the result back under the original file name, and to do the same for 10 files at once? @RavinderSingh13 – Danny Mar 07 '17 at 05:37
  • I didn't get that. Do you want to put the output of all 10 Input_files into one single Input_file? – RavinderSingh13 Mar 07 '17 at 06:03
  • If that is the case, you could simply write the command's output to a temp_input_file and then rename it to Input_file. Let me know if this helps. – RavinderSingh13 Mar 07 '17 at 06:03
  • No, that is not what I mean. Let me give an example: say the first three columns have been removed from REPORT_A1_1. I want the data saved back to the same file name, REPORT_A1_1, and I want to do this for all 10 files in one go. Is that possible? Sorry for the unclear information, and thanks for being patient. @RavinderSingh13 – Danny Mar 07 '17 at 08:11
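
The in-place request in the last comment can be handled by combining a loop with the temp-file idea mentioned above: write each result to a temporary file, then rename it over the original. A minimal sketch, assuming files named like the `REPORT_A1_*` examples in the comments (the contents are made up, and the `.tmp.$$` suffix is an arbitrary choice):

```shell
cd "$(mktemp -d)" || exit 1

# Hypothetical stand-ins for the asker's report files.
for n in 1 2 3; do
    printf '%s\n' "a$n|b$n|c$n|d$n|e$n" > "REPORT_A1_$n"
done

# The glob is expanded once, before the loop creates any temp files.
for f in REPORT_A1_*; do
    tmp="$f.tmp.$$"
    # Replace the original only if cut succeeded.
    if cut -d'|' -f4- "$f" > "$tmp"; then
        mv "$tmp" "$f"
    else
        rm -f "$tmp"
    fi
done

cat REPORT_A1_3
```

Redirecting straight back to the input (`cut ... "$f" > "$f"`) would truncate the file before `cut` reads it, which is why the temporary file is needed. Note that the renamed file is a new file, so permissions and hard links of the original are not preserved.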
0
awk -F\| '{print $NF}' file >newfile

13564357342151697 ...
135643643462151697 ...
13564386436651697 ...
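
A note on scope: `$NF` is awk's last field, so this command prints only the final column. It matches the desired output only when every line has exactly four fields. A minimal sketch with a hypothetical five-column line shows how it differs from keeping everything from the fourth column onward:

```shell
cd "$(mktemp -d)" || exit 1
printf '%s\n' 'a1|b1|c1|d1|e1' > file

# $NF is the last field only, so column 4 (d1) is lost here.
awk -F\| '{print $NF}' file > last_only

# For comparison, cut keeps field 4 through the end of the line.
cut -d'|' -f4- file > from_fourth

cat last_only from_fourth
```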
Claes Wikner
  • Hi! Please read http://stackoverflow.com/help/how-to-answer and try to provide some explanation of why this code works. Thanks! – Eel Lee Mar 09 '17 at 15:46
  • Whilst this code snippet is welcome, and may provide some help, it would be [greatly improved if it included an explanation](//meta.stackexchange.com/q/114762) of *how* and *why* this solves the problem. Remember that you are answering the question for readers in the future, not just the person asking now! Please [edit] your answer to add explanation, and give an indication of what limitations and assumptions apply. – Toby Speight Mar 10 '17 at 11:19