
I keep getting a warning from pandas in the following scenario: I create a new DataFrame as a copy of a column from another DataFrame, and I have a list that is shorter than that DataFrame. I want to add the list as a new column at a known offset, so that the last list element lines up with the last DataFrame row.

data = data["ColumnName"].copy()
data["NewColumn"] = float("NaN")
data.iloc[offset:]["NewColumn"] = list

This gives me the warning:

SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame. Try using .loc[row_indexer,col_indexer] = value instead

When I do the same with .loc:

data.loc[offset:,"NewColumn"] = list

I get the following warning:

FutureWarning: Slicing a positional slice with .loc is not supported, and will raise TypeError in a future version. Use .loc with labels or .iloc with positions instead.

Can anyone help me understand the problem and show how to fix the warning?

soderdaen

2 Answers


Just suggesting an alternative route:

You can "pad" the list before adding it to the dataframe.

import numpy as np
import pandas as pd

df = pd.DataFrame({'test': [1,2,3,4,5,6,7,8,9,10]})
list_to_add = [11,12,13,14,15]

# Pad the beginning of the list with NaNs so its length matches the df
list_to_add = [np.nan] * (df.shape[0] - len(list_to_add)) + list_to_add

# [nan, nan, nan, nan, nan, 11, 12, 13, 14, 15]

df['new_col'] = list_to_add
    
   test  new_col
0     1      NaN
1     2      NaN
2     3      NaN
3     4      NaN
4     5      NaN
5     6     11.0
6     7     12.0
7     8     13.0
8     9     14.0
9    10     15.0
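
A variation on the same idea, as a rough sketch: instead of padding by hand, you can wrap the list in a Series whose index is the last `len(list_to_add)` labels of the DataFrame and let pandas align it, so the rows without a match are filled with NaN. This assumes the list is non-empty and that the DataFrame index has no duplicate labels; `tail_index` is just a name I made up for the example.

```python
import pandas as pd

df = pd.DataFrame({"test": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]})
list_to_add = [11, 12, 13, 14, 15]

# Labels of the last len(list_to_add) rows (assumes the list is not empty)
tail_index = df.index[-len(list_to_add):]

# Assigning a Series aligns on the index; rows with no matching label get NaN
df["new_col"] = pd.Series(list_to_add, index=tail_index, dtype="float64")

print(df)
```

Whether this is nicer than padding is mostly a matter of taste; the padding version makes the offset more explicit.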
Shubham Periwal

In the following line you are storing a Series in the variable 'data'. From this point on it is no longer a DataFrame:

data = data["ColumnName"].copy()
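
To make the difference visible, here is a small sketch (the frame `source` and its contents are made up for illustration). Selecting with single brackets returns a Series, while selecting with a list of column names returns a one-column DataFrame, which is probably what was intended here. On a Series, `data["NewColumn"] = ...` sets an element labelled "NewColumn" rather than adding a column.

```python
import pandas as pd

# Hypothetical source frame, standing in for the original data
source = pd.DataFrame({"ColumnName": [1, 2, 3, 4, 5], "Other": list("abcde")})

as_series = source["ColumnName"].copy()    # single brackets: a Series
as_frame = source[["ColumnName"]].copy()   # list of names: a one-column DataFrame

print(type(as_series).__name__)
print(type(as_frame).__name__)

# On the DataFrame copy, this really adds a new column
as_frame["NewColumn"] = float("nan")
print(as_frame.columns.tolist())
```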

If 'data' is still a DataFrame at that point, the assignment doesn't raise any warning:

>>> import pandas as pd
>>> data = pd.DataFrame({"ColumnName": [1,2,3,4,5]})
>>> data["NewColumn"] = float("NaN")
>>> list_of_items = [30,43,52]
>>> offset = 2
>>> data.loc[offset:,"NewColumn"] = list_of_items # Here you need to pass both the row and column selection directly to loc
>>> data
   ColumnName  NewColumn
0           1        NaN
1           2        NaN
2           3       30.0
3           4       43.0
4           5       52.0
>>>
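
One caveat, offered as a sketch rather than a definitive fix: `.loc` slices by label, so `data.loc[offset:, ...]` only matches row positions here because the default integer index happens to coincide with them. If the index is something else (the string labels below are an assumption for the example), a purely positional version is to look up the column position with `columns.get_loc` and pass both positions to `.iloc` in a single call.

```python
import pandas as pd

# Example frame with a non-default index, so labels and positions differ
data = pd.DataFrame({"ColumnName": [1, 2, 3, 4, 5]}, index=list("abcde"))
data["NewColumn"] = float("nan")

list_of_items = [30, 43, 52]
offset = len(data) - len(list_of_items)  # list ends at the last row

# Integer position of the target column
col_pos = data.columns.get_loc("NewColumn")

# Row and column are selected in one .iloc call, so there is no chained indexing
data.iloc[offset:, col_pos] = list_of_items

print(data)
```

Doing the row and column selection in one indexer call is what avoids the SettingWithCopyWarning from `data.iloc[offset:]["NewColumn"] = ...`.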