4

I have a dataframe as shown below:

import pandas as pd

df = pd.DataFrame({
    'subject_id': [1, 1, 1, 1, 1, 1],
    'val': [5, 6.4, 5.4, 6, 6, 6]
})

It looks like this:

   subject_id  val
0           1  5.0
1           1  6.4
2           1  5.4
3           1  6.0
4           1  6.0
5           1  6.0

I would like to drop the rows where the value in the val column has a non-zero fractional part (that is, it ends with .[1-9]). In other words, I would like to retain values like 5.0 and 6.0 and drop values like 5.4 and 6.4.

I tried the following, but it isn't accurate:

df['val'] = df['val'].astype(int)
df.drop_duplicates()  # it doesn't give expected output and not accurate.
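
To make the problem concrete, here is a minimal sketch of what this attempt does on the sample data. The names `truncated` and `attempt` are introduced only for illustration. Casting to int truncates 6.4 to 6 and 5.4 to 5 rather than dropping those rows, and `drop_duplicates()` then collapses identical rows, so a row that originally held 6.4 survives.

```python
import pandas as pd

df = pd.DataFrame({
    'subject_id': [1, 1, 1, 1, 1, 1],
    'val': [5, 6.4, 5.4, 6, 6, 6],
})

# astype(int) truncates the fractional part instead of removing the row.
truncated = df['val'].astype(int)

# drop_duplicates() returns a new frame; it keeps the first of each
# identical (subject_id, val) pair, regardless of the original value.
attempt = df.assign(val=truncated).drop_duplicates()
print(attempt)
```

Only two rows remain here, and the second one comes from the original 6.4, which is why this route cannot give the expected output.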

I expect my output to contain only the rows with whole-number values in val (5, 6, 6, 6).

The Great
    Both answers below are really good and easy to understand. However, I can mark only one answer, and I am marking @Jezrael's answer for its detailed explanation and the different ways to get to the output. Nonetheless, Allen's answer is also good and useful to know. Thank you both for the help – The Great Sep 02 '19 at 06:20

2 Answers

4

The first idea is to compare the original values with the column cast to integer, and also assign the integers back to get the expected output (integers in the column):

s = df['val']
df['val'] = df['val'].astype(int)

df = df[df['val'] == s]
print (df)
   subject_id  val
0           1    5
3           1    6
4           1    6
5           1    6

Another idea is to test each value with is_integer:

mask = df['val'].apply(lambda x: x.is_integer())
df['val'] = df['val'].astype(int)

df = df[mask]
print (df)

   subject_id  val
0           1    5
3           1    6
4           1    6
5           1    6

If you need floats in the output, filter the original DataFrame without assigning the integers back:

df1 = df[ df['val'].astype(int) == df['val']]
print (df1)
   subject_id  val
0           1  5.0
3           1  6.0
4           1  6.0
5           1  6.0
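
The ideas above can be combined into one self-contained sketch. It builds the mask before any conversion and works on a copy, so the original float column stays available. The variable names (`mask_cast`, `mask_is_integer`, `result`) are chosen here for illustration only.

```python
import pandas as pd

df = pd.DataFrame({
    'subject_id': [1, 1, 1, 1, 1, 1],
    'val': [5, 6.4, 5.4, 6, 6, 6],
})

# Mask 1: a value is whole if casting to int does not change it.
mask_cast = df['val'] == df['val'].astype(int)

# Mask 2: ask each float directly whether it is whole.
mask_is_integer = df['val'].apply(lambda x: x.is_integer())

# Filter first, then convert the copy; df itself keeps its floats.
result = df[mask_cast].copy()
result['val'] = result['val'].astype(int)
print(result)
```

Both masks select the same rows on this data, so either can be used for the filter.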
jezrael
  • If you don't need to convert 'val' to int, you could skip `s` and just use `df = df[df['val'] == df['val'].astype(int)]` – Aryerez Sep 02 '19 at 06:13
  • @Aryerez - I know, but the reason is to get integer output in the `val` column. – jezrael Sep 02 '19 at 06:13
4

Use mod 1 to get the remainder. If the remainder is 0, the number is an integer. Then use the result as a mask to select only those rows.

df.loc[df.val.mod(1).eq(0)].astype(int)

    subject_id  val
0   1           5
3   1           6
4   1           6
5   1           6
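
As a rough sketch of how the remainder mask behaves beyond the sample data, the example below uses a made-up frame (the NaN and negative values are assumptions added for illustration, not part of the question). The mask is applied before the cast, and only the `val` column is converted.

```python
import numpy as np
import pandas as pd

# Hypothetical data: includes a missing value and a negative whole number.
df = pd.DataFrame({
    'subject_id': [1, 1, 1, 1],
    'val': [5.0, 6.4, np.nan, -6.0],
})

remainder = df['val'].mod(1)   # NaN stays NaN; whole numbers give 0
mask = remainder.eq(0)         # NaN compares as False, so that row is dropped

# Select the whole-number rows, then cast only 'val' to int.
kept = df.loc[mask].astype({'val': int})
print(kept)
```

Passing a dict to `astype` limits the conversion to `val`; the one-liner above casts every column, which is harmless when all remaining columns are already integers.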
Allen Qin