This question is very similar to one I asked here:
Python Pandas SettingWithCopyWarning copies vs new objects
I'd like to understand how I can exclude records from a given DataFrame (i.e., operate on the DataFrame itself and not on a view of it) while keeping the option of applying additional operations to the result.
I'm struggling to understand how Python manages reference vs. value assignment when operating on pandas DataFrame objects. I'm working with a dataset held in a pandas DataFrame, and I'd like to reduce it based on certain attribute values, then apply additional operations to the reduced result. My preferred method is .query(). Here is a simple example:
import pandas as pd
mydf = pd.DataFrame({'col1': ['A', 'B', 'C'],
                     'col2': ['x', 'y', 'z']})
# Keep only the rows where col1 is 'A' and rebind the name to the result
mydf = mydf.query("col1 == 'A'")
This conceptually accomplishes what I'm looking for: a reduction of the dataset I'm working with, based on a query against it. The question I have is this:
"Is this the correct application of the query function, or should I be doing something else if I have additional operations to perform on 'mydf'?"
I've read through this documentation but still don't understand what pitfalls to watch out for...
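To make "additional operations" concrete, here is a minimal sketch of the kind of follow-up step I have in mind. The names `original`, `subset`, and `col3` are hypothetical, and the explicit `.copy()` after `.query()` is only my assumption about a defensive pattern, not something I know to be required:

```python
import pandas as pd

original = pd.DataFrame({'col1': ['A', 'B', 'C'],
                         'col2': ['x', 'y', 'z']})

# Filter, then take an explicit copy (an assumption on my part) so that
# later writes to `subset` are clearly separate from `original`.
subset = original.query("col1 == 'A'").copy()

# A follow-up operation on the filtered result: add a derived column.
subset['col3'] = subset['col2'].str.upper()

print(subset)                      # one row: A, x, X
print(original.columns.tolist())   # ['col1', 'col2']
```

In this sketch `original` keeps its three rows and two columns, while `subset` holds the single filtered row plus the new column. Whether the `.copy()` is actually necessary here is part of what I'm asking.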