I have a tab-separated text file, supplied.tsv, with file paths in the first column and file sizes in the second, as shown below. I want to check that the filenames are unique.
./statistics/variant_calls/v12_HG03486_hgsvc_pbsq2-ccs_1000.snv.QUAL10.GQ100.vcf.cluster.stats 676
./statistics/variant_calls/v12_HG03486_hgsvc_pbsq2-ccs_1000.snv.QUAL10.GQ100.vcf.stats 788
./v12_config_20200721-092246_HG02818_HG03125_HG03486.json 887
./v12_config_20200721-092246_HG02818_HG03125_HG03486.json 887
./variant_calls/v12_HG02818_hgsvc_pbsq2-ccs_1000.wh-phased.vcf.bgz 566
./variant_calls/v12_HG02818_hgsvc_pbsq2-ccs_1000.wh-phased.vcf.bgz 566
./variant_calls/v12_HG02818_hgsvc_pbsq2-ccs_1000.wh-phased.vcf.bgz.tbi 772
Expected output: Yes all unique filenames
My plan: first, extract the first column from the file:
awk -F"\t" '{print $1}' supplied.tsv > supplied_firstcolumn.txt
Then I would extract the filename from each path and check for duplicate lines. Is there a more efficient way to do this?
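
For reference, the plan above could probably be collapsed into a single awk pass with no intermediate file. This is only a minimal sketch, assuming the columns really are tab-separated and that "filename" means the last component of the path after the final "/". The function name check_unique_names is made up for illustration.

```shell
# Hypothetical helper: report filenames (last path component of column 1)
# that occur more than once in a tab-separated file.
check_unique_names() {
    awk -F'\t' '
        NF == 0 { next }                  # skip blank lines
        {
            n = split($1, parts, "/")     # parts[n] is the filename
            count[parts[n]]++
        }
        END {
            dup = 0
            for (name in count)
                if (count[name] > 1) { print count[name] "\t" name; dup = 1 }
            if (!dup) print "Yes all unique filenames"
            exit dup                      # non-zero status when duplicates exist
        }' "$1"
}

# Usage (assumes supplied.tsv is in the current directory):
#   check_unique_names supplied.tsv
```

Each duplicated filename is printed with its count, and the exit status is non-zero when any duplicate is found, so the check can also be used in a script. Note that a path listed twice counts as a duplicate here, the same as two different directories containing the same filename.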