I have done the same thing many times before, but now I am not able to select a row based on a column value from my pandas DataFrame. I also tried making this column the index, but that didn't work either. I can only query the first row. This is the DataFrame:

|   | PLAYER       | Pts   |
|---|--------------|-------|
| 0 | SunilNarine  | 379.5 |
| 1 | Shane Watson | 318.0 |

df[df.PLAYER=='SunilNarine']

works fine. The same query for any other record returns nothing. For example,

df[df.PLAYER=='Shane Watson']

returns an empty result. I tried making this column the index too, but that only works for the first record. I also tried:

for player in df['PLAYER']:
    if str(player).strip().capitalize=='Shane Watson'.capitalize:
        print('Y')

It prints nothing.

I have multiple records; I have presented only two here. I am unable to select any row based on the PLAYER column except the first one. Selection works fine on the other columns. I can't figure out what is going wrong here.

MagicBeans
  • Try to improve your data export following [these tips](https://stackoverflow.com/a/20159305/2162212), so people can reproduce your problem – mucio Sep 28 '19 at 19:51
  • What source file encoding do you use? If you are on Python > 2 it is usually UTF-8, but that may not be the case. Check that the encodings match. Also strip all non-printable characters from the string; https://stackoverflow.com/a/93029/8339821 is a good SO answer for that. Try to replace all tabs, double spaces, etc. in the column. `df.col.str.replace(r'\s+', ' ')` may help. – user14063792468 Sep 28 '19 at 19:52
  • This worked. Thanks! Removing all non-printable characters using unicode, as suggested here, somehow didn't work: https://stackoverflow.com/a/93029/8339821 – MagicBeans Sep 28 '19 at 20:35
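A minimal sketch of the whitespace cleanup suggested in the comments, written with the standard library only so it can be run without the original data. The sample names and the helper `clean_name` are hypothetical; the hidden characters in the real column may differ.

```python
def clean_name(value) -> str:
    """Collapse every run of whitespace into a single space and trim the ends."""
    # str.split() with no arguments splits on any Unicode whitespace,
    # which includes tabs, newlines and the no-break space '\xa0'.
    return " ".join(str(value).split())


# Hypothetical raw values: they look fine when printed but do not compare equal.
raw = ["SunilNarine", "Shane\xa0Watson", " Virat  Kohli\n"]
cleaned = [clean_name(v) for v in raw]

print("Shane Watson" in raw)      # the no-break space prevents a match
print("Shane Watson" in cleaned)  # matches after cleanup
```

On the DataFrame itself, something like `df['PLAYER'] = df['PLAYER'].map(clean_name)` before `df[df.PLAYER == 'Shane Watson']` should have the same effect. If you prefer the `str.replace` call from the comment, note that it returns a new Series that has to be assigned back, and that since pandas 2.0 the pattern is only treated as a regular expression when `regex=True` is passed.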

2 Answers

Try the code below:

# Import the pandas library
import pandas as pd

# Build a small DataFrame with the same columns as in the question
data = [['SunilNarine', 379.5], ['Shane Watson', 318.0], ['Virat Kohli', 543]]
df = pd.DataFrame(data, columns=['PLAYER', 'Pts'])

# Print the matching records
print(df[df.PLAYER == 'Virat Kohli'])
print(df[df.PLAYER == 'Shane Watson'])
print(df[df.Pts == 379.5])
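Selection like this works when the names are typed in by hand, so a failing comparison on real data usually points at characters that are not visible when the value is printed. A quick way to check is to look at `repr()` of the values. The sketch below uses made-up sample strings and a hypothetical helper, `hidden_chars`; it is a diagnostic aid, not a fix.

```python
def hidden_chars(value: str) -> list:
    """Return the escaped form of every character outside printable ASCII."""
    return [repr(ch) for ch in value if not (" " <= ch <= "~")]


# Hypothetical values as they might come out of a scraped or exported file.
samples = ["SunilNarine", "Shane\xa0Watson", "Virat Kohli\n"]

for s in samples:
    # repr() shows escapes such as '\xa0' and '\n' that print() hides.
    print(repr(s), hidden_chars(s))
```

On a DataFrame, `df['PLAYER'].map(repr)` gives the same kind of view for a whole column.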

Pranab
0
[unidecode.unidecode(x) for x in df['PLAYER']]

worked well for removing all the special characters. The column had a lot of '\xa' and '\n' characters in it.
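If installing `unidecode` is not an option, the standard library's `unicodedata` module can cover the no-break-space case. This is a swapped-in alternative, not what the answer above used: NFKC normalization folds compatibility characters such as U+00A0 into a plain space, but unlike `unidecode` it does not transliterate accented letters to ASCII. The helper name `to_plain` and the sample values are hypothetical.

```python
import unicodedata


def to_plain(name: str) -> str:
    """Fold compatibility characters and trim surrounding whitespace."""
    # NFKC maps the no-break space U+00A0 to an ordinary space U+0020.
    # strip() then removes leading/trailing whitespace such as '\n'.
    return unicodedata.normalize("NFKC", name).strip()


# Hypothetical raw values with hidden characters.
names = ["SunilNarine", "Shane\xa0Watson\n"]
cleaned = [to_plain(n) for n in names]
print(cleaned)
```

Whichever cleaner is used, the result has to be written back to the column (for example `df['PLAYER'] = cleaned_values`) before the selection will match; evaluating the list comprehension on its own leaves the DataFrame unchanged.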

MagicBeans