I am using bash in a terminal, inside a Docker container on my Mac. I am struggling to figure out how to remove the last 2 columns from my TSV file. It has 7 columns in total; the last 2 are not needed for my work and have to be removed.
Edit: The first picture is the original data file. The second shows what the code is currently doing: it deletes some seemingly random entries from the column. The third picture shows what the end result of this program should be. I am also struggling with the month and year columns, but I deleted that code to simplify the data first.
I tried awk with NF = NF - 2, which does remove the last 2 columns, but for some reason it also deletes some of the data in my 5th column, which I need. So while I got the column deletion I wanted, the code did a little extra. Here is the code:
preprocess() {
    input_file="$1"

    # Extract the base name of the input file
    base_name=$(basename "$input_file" .tsv)

    # Create the new output file name
    output_file="${base_name}_clean.tsv"

    awk -F'\t' 'BEGIN{OFS=FS}
    {
        NF = NF - 2

        print
    }' "$input_file" > "$output_file"
}
I have a few other lines, but they shouldn't cause any issues. They just check that the file exists, etc.
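One thing worth checking, since NF = NF - 2 drops the last two fields of whatever each row actually contains: if some rows have fewer than 7 tab-separated fields, the 5th column would be cut on those rows. The sketch below is only a guess at the cause, not a confirmed diagnosis. The function names count_fields and keep_first_five are hypothetical and not part of my script; the first reports how many rows have each field count, and the second keeps columns 1 to 5 explicitly with cut (which splits on tabs by default) instead of subtracting from NF.

```shell
#!/usr/bin/env bash

# Hypothetical helper: print "<field count> <number of rows>" per distinct
# field count, so rows with fewer than 7 tab-separated fields stand out.
count_fields() {
    awk -F'\t' '{ count[NF]++ }
        END { for (n in count) print n, count[n] }' "$1" | sort -n
}

# Hypothetical alternative to NF = NF - 2: keep the first five columns
# explicitly, regardless of how many fields a given row has.
keep_first_five() {
    local input_file="$1"
    local base_name
    base_name=$(basename "$input_file" .tsv)
    local output_file="${base_name}_clean.tsv"

    # cut uses a tab as its default delimiter; -f1-5 selects fields 1 to 5.
    cut -f1-5 "$input_file" > "$output_file"
}

# Example usage (data.tsv is a placeholder name):
#   count_fields data.tsv
#   keep_first_five data.tsv    # writes data_clean.tsv
```

If count_fields reports anything other than a single line starting with 7, the rows are not uniform, and selecting columns by position is likely safer than subtracting from NF.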