When I mask one DataFrame with a Boolean Series built from another, pandas emits UserWarning: Boolean Series key will be reindexed to match DataFrame index. How do I avoid this? pandas reindexes the key automatically, but the header of the resulting index column is blank, and I cannot rename it so that I can reference that column in my code. I would also prefer not to rely on this implicit correction.

I have tried to rename the columns manually in two ways: assigning to DataFrame.columns and calling DataFrame.rename(). Either I get an error saying it expected 3 elements rather than 4, or the empty column header that was added is not renamed.

# select the data and filter it; this triggers the warning, and the result has an index column with an empty name

stickData = data[['Time','Pitch Stick Position(IN)','Roll Stick Position (IN)']]
filteredData = stickData[contactData['CONTACT'] == 1]

# moving on from the warning, I tried using rename, which does not raise an error but also does nothing
filteredData.rename(index={0:'Index'})

# I also tried this
filteredData.rename(index={'':'Old_Index'})

# I even went and tried to add the names of the dataframe like so which resulted in ValueError: Length mismatch: Expected axis has 3 elements, new values have 4 elements
filteredData.columns = ['Old_Index','Time','Pitch Stick Position(IN)','Roll Stick Position (IN)']

After the implicit correction from pandas, filteredData.head() looks like this:

Index              Time          Pitch Stick Position(IN)  Roll Stick Position (IN)
0       1421  240:19:06:40.200                  0.007263                 -0.028500
1       1422  240:19:06:40.400                  0.022327                  0.139893
2       1423  240:19:06:40.600                 -0.016409                  0.540756
3       1424  240:19:06:40.800                 -0.199329                  0.279971
4       1425  240:19:06:41.000                  0.013719                 -0.018069

But I would like it to display with the old index labeled as Old_index and, more importantly, without relying on the implicit correction:

Index   Old_index   Time          Pitch Stick Position(IN)  Roll Stick Position (IN)
1       1421  240:19:06:40.200                  0.007263                 -0.028500
2       1422  240:19:06:40.400                  0.022327                  0.139893
3       1423  240:19:06:40.600                 -0.016409                  0.540756
4       1424  240:19:06:40.800                 -0.199329                  0.279971
5       1425  240:19:06:41.000                  0.013719                 -0.018069
jpp
Minutia

2 Answers

There are a few errors in your code:

  1. Don't use chained indexing. Use loc / iloc accessors instead.
  2. Assign back to variables when using methods that don't operate in place.
  3. In general, don't use Boolean indexers derived from other dataframes. If you can guarantee row alignment, then extract the NumPy array representation via pd.Series.values.

For example, this would work, assuming the rows in contactData align positionally with the rows in data:

cols = ['Time','Pitch Stick Position(IN)','Roll Stick Position (IN)']

filteredData = data.loc[(contactData['CONTACT'] == 1).values, cols]\
                   .rename(index={0:'Index'})

Notice we can chain methods such as loc and rename instead of explicitly assigning back to filteredData each time.
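
To make the point about alignment concrete, here is a minimal sketch using made-up data. The values, the extra 'Other' column, and the index labels are all assumptions, since the real data and contactData are not shown in the question. The two frames deliberately have different index labels but the same row order, which is the situation where a positional NumPy mask is appropriate.

```python
import pandas as pd

# Hypothetical stand-in for `data`; the values are invented for illustration.
data = pd.DataFrame({
    'Time': ['t0', 't1', 't2', 't3'],
    'Pitch Stick Position(IN)': [0.1, 0.2, 0.3, 0.4],
    'Roll Stick Position (IN)': [1.0, 2.0, 3.0, 4.0],
    'Other': [9, 9, 9, 9],
})

# Hypothetical stand-in for `contactData`: same number of rows in the same
# order, but with different index labels (assumption).
contactData = pd.DataFrame({'CONTACT': [0, 1, 1, 0]}, index=[10, 11, 12, 13])

cols = ['Time', 'Pitch Stick Position(IN)', 'Roll Stick Position (IN)']

# Convert the Boolean Series to a plain NumPy array so pandas selects rows
# by position instead of trying to align on index labels.
mask = (contactData['CONTACT'] == 1).values

# A positional mask only makes sense if both frames have the same length.
assert len(mask) == len(data)

# Select rows and columns in a single .loc call (no chained indexing).
filteredData = data.loc[mask, cols]
print(filteredData)
```

The result keeps the original index labels of data for the selected rows, so no implicit reindexing of the key is involved.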

jpp

Can you try:

filteredData = stickData[contactData['CONTACT'] == 1].reset_index().rename(columns={'index': 'Old_index'})

Or put this piece somewhere suitable. I don't have your sample data, so I can't test it out:

.reset_index().rename(columns={'index': 'Old_index'})
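
As a rough check of the idea, here is a small sketch on invented data. The values and index labels are assumptions, and the mask is built with the same index as stickData so that no reindexing is needed. It relies on reset_index() turning an unnamed index into a column called 'index', which rename() then relabels.

```python
import pandas as pd

# Hypothetical stand-in for `stickData`; values and index labels are invented.
stickData = pd.DataFrame(
    {
        'Time': ['t0', 't1', 't2', 't3'],
        'Pitch Stick Position(IN)': [0.1, 0.2, 0.3, 0.4],
        'Roll Stick Position (IN)': [1.0, 2.0, 3.0, 4.0],
    },
    index=[1420, 1421, 1422, 1423],
)

# Hypothetical contact flags sharing the same index as stickData (assumption).
contact = pd.Series([0, 1, 1, 0], index=stickData.index, name='CONTACT')

filteredData = (
    stickData[contact == 1]
    .reset_index()                           # old index becomes a column named 'index'
    .rename(columns={'index': 'Old_index'})  # give that column a usable name
)
print(filteredData)
```

An alternative that should give the same result is to name the index first, for example with .rename_axis('Old_index').reset_index().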
Jessica