I am starting to work with pandas, so this is probably a pretty obvious question, but I have been struggling with it for a while now and found no solution.

Consider this dataframe:

import datetime
import numpy as np
import pandas_datareader as pdr
apple = pdr.DataReader('AAPL', data_source='yahoo',
                      start=datetime.datetime(2013, 1, 1),
                      end=datetime.datetime(2020, 1, 1))

Now, I can add a new column to this dataframe simply by doing:

apple['new_column'] = np.arange(apple.shape[0])

However, if I use iloc to extract a sub-dataframe and try to add a new column to it:

apple_2 = apple.iloc[1:5,:]
apple_2['test2'] = np.arange(4)

I get the error message:

<stdin>:1: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

What am I doing wrong, and how am I supposed to do this in pandas? The error suggests using .loc, but I do not know how to use it to add new columns.

1 Answer

You can do:

apple_2 = apple.loc[:, 'High':'Close']

This will give you all the columns from 'High' through 'Close' (label-based slices with .loc include both endpoints, so 'Close' is included). There are also other ways to column-slice a dataframe; you can check this question.

EDIT:

apple_2 = apple.loc[:, 'High':'Close']
# add a new column to apple_2
apple_2['new_column'] = np.arange(apple_2.shape[0])
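
If the assignment still raises SettingWithCopyWarning, one common way around it is to make the sub-dataframe an explicit copy before adding the column, or to build it in one step with assign(). The warning appears because pandas cannot tell whether the slice is a view or a copy of the original. The following is only a minimal sketch, assuming apple_2 is meant to be independent of apple. The small apple frame here is made-up stand-in data, so the example runs without the Yahoo download.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the downloaded data (not real AAPL prices).
dates = pd.date_range("2013-01-01", periods=6, freq="D")
apple = pd.DataFrame(
    {
        "High": np.linspace(10.0, 15.0, 6),
        "Low": np.linspace(9.0, 14.0, 6),
        "Close": np.linspace(9.5, 14.5, 6),
    },
    index=dates,
)

# Option 1: take an explicit copy of the slice, then assign as usual.
apple_2 = apple.iloc[1:5, :].copy()
apple_2["test2"] = np.arange(apple_2.shape[0])

# Option 2: assign() returns a new dataframe with the extra column.
apple_3 = apple.iloc[1:5, :].assign(test2=np.arange(4))

# Either way, the original dataframe is left untouched.
print(apple_2)
print("test2" in apple.columns)
```

Both options leave apple unchanged. If the new column is actually wanted on the original dataframe, assign it on apple directly instead of on the slice.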
Melinda
  • Thank you for your answer. My question was more about how to add a new column than about performing a column slice on existing columns. – Álvaro Méndez Civieta Sep 25 '20 at 15:13
  • Something like: apple_2['new_column'] = np.arange(apple_2.shape[0]) – Melinda Sep 25 '20 at 15:24
  • If you execute the very same code from your answer, you will get the error message I was asking about: <stdin>:1: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame. Try using .loc[row_indexer,col_indexer] = value instead. Thank you, but this does not answer the question. – Álvaro Méndez Civieta Sep 26 '20 at 10:36
  • I don't know exactly what you have tried; I edited the answer with code that works for me. The error you mention is caused by using .iloc instead of .loc, so I provided a way of using .loc before adding the new column. Then you add the new column, taking into account the new shape of the df apple_2. – Melinda Sep 26 '20 at 10:47