
I have this data frame:

CHROM   POS     ID  162014      162015  162016
1       1645    M1  0|1:0.96    0|0:0   0|0:0.33
1       23253   M3  1|1:1.97    0|0:0   0|0:0.33
1       29491   M4  1|1:1.97    0|0:0   0|0:0.33
1       30698   M6  0|0:0.03    1|0:1   1|1:1.67
1       43616   M9  0|0:0.03    1|1:2   1|1:1.67
1       53188   M11 1|1:1.97    0|0:0   0|0:0.33
1       53632   M12 1|1:1.97    0|0:0   0|0:0.33
1       57628   M13 1|1:1.97    0|0:0   0|0:0.33
1       59879   M14 0|0:0.03    1|1:2   1|1:1.67
1       64576   M15 0|0:0.03    1|1:2   1|1:1.67

How can I remove everything after the genotype (0|0, 0|1, 1|0, or 1|1) in every column except the ID, #CHROM, and POS columns in pandas, so that the result looks like this table?

#CHROM  POS     ID  162014  162015  162016
1       1645    M1  0|1     0|0     0|0
1       23253   M3  1|1     0|0     0|0
1       29491   M4  1|1     0|0     0|0
1       30698   M6  0|0     1|0     1|1
1       43616   M9  0|0     1|1     1|1
1       53188   M11 1|1     0|0     0|0
1       53632   M12 1|1     0|0     0|0
1       57628   M13 1|1     0|0     0|0
1       59879   M14 0|0     1|1     1|1
1       64576   M15 0|0     1|1     1|1
Siavash
  • Please provide the tables not as image but in the text – DavideBrex Jun 28 '20 at 15:41
  • Please provide a small set of sample data as text that we can copy and paste. Include the corresponding desired result. Check out the guide on [how to make good reproducible pandas examples](https://stackoverflow.com/a/20159305/3620003). – timgeb Jun 28 '20 at 15:51
  • Thanks! I changed tables. – Siavash Jun 28 '20 at 15:56
  • Is the prefix you want to keep always `'0|1'` or `'1|1'` or can the pattern get more complicated and/or have another length than 3 characters? – timgeb Jun 28 '20 at 16:00
  • Also are the numeric column labels strings or integers? – timgeb Jun 28 '20 at 16:01
  • Actually, I want to keep only 0|1, 0|0, 1|0, and 1|1 in all columns except for ID, #CHROM, and POS, so every other column must contain exactly 3 characters. Their labels are strings. – Siavash Jun 28 '20 at 16:07

1 Answer

Take the first three characters of each element with the `.str` accessor, applied to every column from the fourth one onward.

>>> df.iloc[:, 3:] = df.iloc[:, 3:].apply(lambda s: s.str[:3])
>>> df
   CHROM    POS   ID 162014 162015 162016
0      1   1645   M1    0|1    0|0    0|0
1      1  23253   M3    1|1    0|0    0|0
2      1  29491   M4    1|1    0|0    0|0
3      1  30698   M6    0|0    1|0    1|1
4      1  43616   M9    0|0    1|1    1|1
5      1  53188  M11    1|1    0|0    0|0
6      1  53632  M12    1|1    0|0    0|0
7      1  57628  M13    1|1    0|0    0|0
8      1  59879  M14    0|0    1|1    1|1
9      1  64576  M15    0|0    1|1    1|1
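
If you want something you can paste and run, here is a minimal sketch on three rows of the question's data. It swaps in a different technique from the slicing above: it splits on the `:` separator and keeps the part before it, which would also cope with genotypes that are not exactly three characters long. The names `raw` and `sample_cols` are made up for this example, and I am assuming every column is read as a string.

```python
import io

import pandas as pd

# Three rows copied from the question; whitespace-separated.
raw = """CHROM POS ID 162014 162015 162016
1 1645 M1 0|1:0.96 0|0:0 0|0:0.33
1 30698 M6 0|0:0.03 1|0:1 1|1:1.67
1 43616 M9 0|0:0.03 1|1:2 1|1:1.67
"""

# Assumption: read everything as strings so the .str accessor works on every column.
df = pd.read_csv(io.StringIO(raw), sep=r"\s+", dtype=str)

# Select the sample columns by name instead of by position.
sample_cols = [c for c in df.columns if c not in ("CHROM", "POS", "ID")]

# Keep only the text before the first ':' in each sample column.
df[sample_cols] = df[sample_cols].apply(lambda s: s.str.split(":").str[0])

print(df)
```

Selecting the columns by name rather than with `iloc[:, 3:]` should keep working if the CHROM, POS, and ID columns ever appear in a different order.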
timgeb