I'm new here and I'm analyzing some data. While inspecting it, I found a problem in the strings of one column: as you can see below, some strings contain a duplicated prefix. I would like to remove only the duplicated part. Could you suggest a way to do it? There are about 30,000 rows, and only the ones with WT_d8_r2 show this error. Thank you.
KO_d6_r1_AAACATGCACCTAATG-1 7
KO_d6_r1_AAACATGCAGGAATCG-1 8
KO_d6_r1_AAACATGCAGGATAAC-1 18
KO_d6_r1_AAACCAACAATATAGG-1 22
KO_d6_r1_AAACCGAAGCGAGTAA-1 8
WT_d8_r2_WT_d8_r2_AGGCTAAAGTCAATCA-1 20
WT_d8_r2_WT_d8_r2_AGGGCTACAATGAATG-1 3
WT_d8_r2_WT_d8_r2_AGGGCTACACACTAAT-1 3
WT_d8_r2_WT_d8_r2_AGGGCTACAGCTTACA-1 18
WT_d8_r2_WT_d8_r2_AGGGCTACATAGCTGC-1 9
WT_d8_r2_WT_d8_r2_AGGGTTGCAAAGCTCC-1 19
WT_d8_r2_WT_d8_r2_AGGGTTGCAACCCTAA-1 4
WT_d8_r2_WT_d8_r2_AGGGTTGCAGCTCAAC-1 2
I'm expecting this:
KO_d6_r1_AAACATGCACCTAATG-1 7
KO_d6_r1_AAACATGCAGGAATCG-1 8
KO_d6_r1_AAACATGCAGGATAAC-1 18
KO_d6_r1_AAACCAACAATATAGG-1 22
KO_d6_r1_AAACCGAAGCGAGTAA-1 8
WT_d8_r2_AGGCTAAAGTCAATCA-1 20
WT_d8_r2_AGGGCTACAATGAATG-1 3
WT_d8_r2_AGGGCTACACACTAAT-1 3
WT_d8_r2_AGGGCTACAGCTTACA-1 18
WT_d8_r2_AGGGCTACATAGCTGC-1 9
WT_d8_r2_AGGGTTGCAAAGCTCC-1 19
WT_d8_r2_AGGGTTGCAACCCTAA-1 4
WT_d8_r2_AGGGTTGCAGCTCAAC-1 2
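
One possible approach, sketched here in Python since the post does not say which language is in use: a regular expression with a backreference can collapse a prefix that is immediately repeated. The pattern assumes every prefix has the shape letters_d<digits>_r<digits>_ as in the sample rows; the function name remove_duplicated_prefix and the variable names are only illustrative.

```python
import re

# Matches a sample prefix such as "WT_d8_r2_" that is immediately followed by
# an identical copy of itself. The prefix shape is an assumption based on the
# rows shown in the question.
DUPLICATED_PREFIX = re.compile(r"^([A-Za-z]+_d\d+_r\d+_)\1")


def remove_duplicated_prefix(cell_id: str) -> str:
    """Return cell_id with a doubled leading prefix reduced to a single copy."""
    # Strings without a doubled prefix do not match and are returned unchanged.
    return DUPLICATED_PREFIX.sub(r"\1", cell_id)


ids = [
    "KO_d6_r1_AAACATGCACCTAATG-1",
    "WT_d8_r2_WT_d8_r2_AGGCTAAAGTCAATCA-1",
]
cleaned = [remove_duplicated_prefix(s) for s in ids]
print(cleaned)
```

If the data lives in a pandas DataFrame, the same pattern should work with `df["id"].str.replace(r"^([A-Za-z]+_d\d+_r\d+_)\1", r"\1", regex=True)`, where "id" is a placeholder for the real column name.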