I am a beginner in Python and am getting an error while trying to drop rows based on the values in a column of a pandas DataFrame. After the loop has run for a while, it raises a KeyError. Here is the code snippet:

for i in data['FilePath'].keys():
    if '.' not in data['FilePath'][i]:
        value = data['FilePath'][i]
        data = data[data['FilePath'] != value]

The KeyError is raised at the line "if '.' not in data['FilePath'][i]". Please help me fix this error.

phoenix

1 Answer

If I understand your logic correctly, then you should be able to do this without a loop. From what I can see, it looks like you want to drop rows if the FilePath column does not contain a `.`. If this is correct, then below is one way to do this:

Create sample data using a nested list (the first inner list holds the column names):

import pandas as pd

d = [
    ['BytesAccessed','FilePath','DateTime'],
    [0, '/lib/x86_64-linux-gnu/libtinfo.so.5 832.0', '[28/Jun/2018:11:53:09]'], 
    [1, './lib/x86-linux-gnu/yourtext.so.6 932.0', '[28/Jun/2018:11:53:09]'],
    [2, '/lib/x86_64-linux-gnu/mytes0', '[28/Jun/2018:11:53:09]'],
    ]
data = pd.DataFrame(d[1:], columns=d[0])
print(data)

   BytesAccessed                                   FilePath                DateTime
0              0  /lib/x86_64-linux-gnu/libtinfo.so.5 832.0  [28/Jun/2018:11:53:09]
1              1    ./lib/x86-linux-gnu/yourtext.so.6 932.0  [28/Jun/2018:11:53:09]
2              2               /lib/x86_64-linux-gnu/mytes0  [28/Jun/2018:11:53:09]

Filter the data to drop rows that do not contain a `.` at any location in the FilePath column:

data_filtered = (data.set_index('FilePath')
                    .filter(like='.', axis=0)
                    .reset_index())[data.columns]
print(data_filtered)

   BytesAccessed                                   FilePath                DateTime
0              0  /lib/x86_64-linux-gnu/libtinfo.so.5 832.0  [28/Jun/2018:11:53:09]
1              1    ./lib/x86-linux-gnu/yourtext.so.6 932.0  [28/Jun/2018:11:53:09]
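As for the KeyError itself, the likely cause is that the loop walks the index labels of the original `data`, while `data` is rebound to a smaller DataFrame inside the loop. Once a row is dropped (including any later row with the same FilePath value), its label no longer exists, so `data['FilePath'][i]` fails. A boolean mask avoids the loop altogether. The following is only a minimal sketch that assumes the same sample data as above; `has_dot` and `data_masked` are names chosen for illustration:

```python
import pandas as pd

# Same sample rows as above, rebuilt here so the snippet runs on its own
data = pd.DataFrame(
    {
        "BytesAccessed": [0, 1, 2],
        "FilePath": [
            "/lib/x86_64-linux-gnu/libtinfo.so.5 832.0",
            "./lib/x86-linux-gnu/yourtext.so.6 932.0",
            "/lib/x86_64-linux-gnu/mytes0",
        ],
        "DateTime": ["[28/Jun/2018:11:53:09]"] * 3,
    }
)

# regex=False makes '.' a literal dot instead of the regex "any character"
has_dot = data["FilePath"].str.contains(".", regex=False)

# Keep only the rows whose FilePath contains a dot, then renumber the index
data_masked = data[has_dot].reset_index(drop=True)
print(data_masked)
```

With this sample data the mask keeps the first two rows, the same rows that the `filter` approach returns.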
edesz
  • Thanks @W R for this option. I am checking whether FilePath contains '.' because there are many entries which don't contain any '.' – phoenix Nov 04 '18 at 19:29
  • Ok, thanks for the additional info. I've modified the sample data to intentionally contain one row (3rd row) that does not contain any `.`. I also changed the pandas filter to return any rows that contain a `.`. – edesz Nov 04 '18 at 20:42
  • Thank you so much @W R! This works great. I had no idea that I could use an option like 'filter'. – phoenix Nov 04 '18 at 22:28