21

pandas offers the ability to look up values by lists of row and column indices:

In [49]: index = ['a', 'b', 'c', 'd']

In [50]: columns = ['one', 'two', 'three', 'four']

In [51]: M = pandas.DataFrame(np.random.randn(4,4), index=index, columns=columns)

In [52]: M
Out[52]: 
        one       two     three      four
a -0.785841 -0.538572  0.376594  1.316647
b  0.530288 -0.975547  1.063946 -1.049940
c -0.794447 -0.886721  1.794326 -0.714834
d -0.158371  0.069357 -1.003039 -0.807431

In [53]: M.lookup(index, columns) # diagonal entries
Out[53]: array([-0.78584142, -0.97554698,  1.79432641, -0.8074308 ])

I would like to use this same method of indexing to set M's elements. How can I do this?

duckworthd
  • Possible duplicate of [How to get a value from a cell of a data frame?](https://stackoverflow.com/q/16729574/1278112) – Shihe Zhang Nov 02 '17 at 08:47
  • No, it's most probably not a duplicate. In the OP's case, specific loadings at (row, col)'s are needed, while in the question linked it is about either indexing one cell or a matrix-like indexing of all [rows, cols]. – user48867 Aug 06 '22 at 20:23

3 Answers

26

Multiple years have passed since this answer was written, so I thought I might contribute a little bit. With the refactoring of pandas, attempting to set a value at a location with

M.iloc[index][col]

may give you a warning about trying to set a value on a copy of a slice:

SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame

See the the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy

In pandas 0.21 and later, the correct "pythonic" way is the pandas.DataFrame.at accessor, which looks like this:

M.at[index,col] = new_value
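
Since .at sets exactly one cell per call, one way to cover the original question (a list of row labels paired with a list of column labels) is a plain loop. This is only a minimal sketch: the zero-filled frame and the new_values list are made up for illustration, and a loop like this may be slow for very large lists of pairs.

```python
import numpy as np
import pandas as pd

index = ['a', 'b', 'c', 'd']
columns = ['one', 'two', 'three', 'four']
# Illustrative frame filled with zeros (the question used random data).
M = pd.DataFrame(np.zeros((4, 4)), index=index, columns=columns)

# Hypothetical replacement values, one per (row, column) pair.
new_values = [1.0, 2.0, 3.0, 4.0]

# Pair up the labels and set one cell per .at call.
for row, col, value in zip(index, columns, new_values):
    M.at[row, col] = value

# Read the same cells back, again one (row, column) pair at a time.
diagonal = [M.at[r, c] for r, c in zip(index, columns)]
```

Only the paired cells change; every other entry of M keeps its old value.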

Answer for older versions (before 0.21): the more "pythonic" way to do this is with the pandas.DataFrame.set_value method. Note that this method returns the resulting DataFrame.

M.set_value(index,column,new_value)

I just thought I'd post this here after figuring out the source of the warnings that can be generated by the .iloc or .ix approaches.

The set_value approach also works for multiindex DataFrames by putting the multiple levels of the index in as a tuple (e.g. replacing column with (col,subcol) )
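
As far as I can tell, the same tuple trick carries over to .at on current versions. A small sketch, assuming a frame with two-level columns (the level names 'x', 'y', 'one', 'two' are invented for the example):

```python
import numpy as np
import pandas as pd

# Illustrative frame with two-level (MultiIndex) columns.
cols = pd.MultiIndex.from_product([['x', 'y'], ['one', 'two']])
df = pd.DataFrame(np.zeros((2, 4)), index=['a', 'b'], columns=cols)

# The column key is the full tuple of levels: (col, subcol).
df.at['a', ('y', 'two')] = 5.0
```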

Ezekiel Kruglick
  • Thanks for posting this, today my "professor" in a Data Science class at a major university told me it's better to make copies and operate on entire columns of dataframes vs. "modifying one value at a time" (with apply)-- which the former option does anyway! There's a real education in this answer that one just can't pay for haha. The link you provide is great: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy also off-topic but interesting is http://pandas.pydata.org/pandas-docs/stable/gotchas.html – JimLohse Nov 01 '16 at 04:41
  • `set_value` is deprecated since version 0.21.0: Use .at[] or .iat[] accessors instead. [pandas Documentation](https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.set_value.html) – Owlright Nov 07 '18 at 11:49
  • @JimLohse - Thanks, updated main answer since it seems like people still find the page. – Ezekiel Kruglick Nov 08 '18 at 17:27
  • Regarding the `SettingWithCopyWarning` resulting from using [`.loc()`](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.loc.html). It is my understanding that you can safely ignore it when you intend to overwrite the original DataFrame. See [this thread](https://stackoverflow.com/questions/20625582/how-to-deal-with-settingwithcopywarning-in-pandas#answer-20627316) for more info. – fpersyn Dec 02 '19 at 16:02
18

I'm not sure I follow you, but you can use DataFrame.ix to select/set individual elements:

In [79]: M
Out[79]: 
        one       two     three      four
a -0.277981  1.500188 -0.876751 -0.389292
b -0.705835  0.108890 -1.502786 -0.302773
c  0.880042 -0.056620 -0.550164 -0.409458
d  0.704202  0.619031  0.274018 -1.755726

In [75]: M.ix[0]
Out[75]: 
one     -0.277981
two      1.500188
three   -0.876751
four    -0.389292
Name: a

In [78]: M.ix[0,0]
Out[78]: -0.27798082190723405

In [81]: M.ix[0,0] = 1.0

In [82]: M
Out[82]: 
        one       two     three      four
a  1.000000  1.500188 -0.876751 -0.389292
b -0.705835  0.108890 -1.502786 -0.302773
c  0.880042 -0.056620 -0.550164 -0.409458
d  0.704202  0.619031  0.274018 -1.755726

In [84]: M.ix[(0,1),(0,1)] = 1

In [85]: M
Out[85]: 
        one       two     three      four
a  1.000000  1.000000 -0.876751 -0.389292
b  1.000000  1.000000 -1.502786 -0.302773
c  0.880042 -0.056620 -0.550164 -0.409458
d  0.704202  0.619031  0.274018 -1.755726

You can also slice by indices:

In [98]: M.ix["a":"c","one"] = 2.0

In [99]: M
Out[99]: 
        one       two     three      four
a  2.000000  1.000000 -0.876751 -0.389292
b  2.000000  1.000000 -1.502786 -0.302773
c  2.000000 -0.056620 -0.550164 -0.409458
d  0.704202  0.619031  0.274018 -1.755726
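
Since .ix was deprecated in 0.20.0 in favor of .iloc and .loc, here is a rough sketch of how the same three assignments might look with those indexers. The zero-filled frame is just a stand-in for M; positions go through .iloc and labels through .loc.

```python
import numpy as np
import pandas as pd

# Stand-in for M, filled with zeros so the effect of each step is visible.
M = pd.DataFrame(np.zeros((4, 4)),
                 index=['a', 'b', 'c', 'd'],
                 columns=['one', 'two', 'three', 'four'])

M.iloc[0, 0] = 1.0            # single cell by integer position
M.iloc[[0, 1], [0, 1]] = 1.0  # 2x2 block: rows 0-1 crossed with columns 0-1
M.loc['a':'c', 'one'] = 2.0   # label slice; the end label 'c' is included
```

Note that the list form still sets a whole block (every listed row crossed with every listed column), not individual (row, column) pairs.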
Andy Hayden
reptilicus
  • This will set individual elements, but given a list of (row, index, value) pairs, is there any easy way to set them all at once? – duckworthd Aug 23 '12 at 22:19
  • I don't think there is; you need to iterate over the (row, index, value) tuples and set the values one by one. If you had three lists index_labels, column_labels, values, this looks like a to-do extension of M.set_value => M.set_value(index_labels, column_labels, values). Another option is to use M.update(), but then you need to construct a different frame first. – Wouter Overmeire Aug 24 '12 at 08:00
  • You can pass a tuple/list into ix() to set values in a dataframe. See edit above. For example M.ix[(0,1),(0,1)] = 1 – reptilicus Aug 24 '12 at 13:44
  • This is a very useful thing to know (I just came here to figure out how to do it!), but unfortunately does not allow me to set values as I had originally hoped. Thanks! – duckworthd Oct 21 '12 at 20:38
  • `Starting in 0.20.0, the .ix indexer is deprecated, in favor of the more strict .iloc and .loc indexers.` [ix-indexer-is-deprecated](http://pandas.pydata.org/pandas-docs/stable/indexing.html#ix-indexer-is-deprecated) – Shihe Zhang Nov 02 '17 at 08:45
0

I'm facing exactly the same issue and I think currently Pandas does not offer a built-in method for this. Note that the difference between OP's goal and the usual value setting is that OP only wants the specific loadings indexed by (row, col) pairs to be set to specific values, but not all loadings (in a matrix-like way, as df.loc[rows, cols]=xxx does). In fact, even the lookup function has been deprecated (see here).

In short, I think one can either:

(1) Use for-loops; or

(2) First transform to numpy, then index numpy arrays, then transform back to pandas dataframe (as the link above shows).
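
Option (2) could look roughly like the sketch below. It assumes unique row and column labels and a single numeric dtype; the label lists and values are invented for the example. Note that get_indexer returns -1 for labels it cannot find, so missing labels should be checked for in real use.

```python
import numpy as np
import pandas as pd

index = ['a', 'b', 'c', 'd']
columns = ['one', 'two', 'three', 'four']
M = pd.DataFrame(np.zeros((4, 4)), index=index, columns=columns)

# Hypothetical (row, col, value) triples, given as three parallel lists.
rows = ['a', 'c', 'd']
cols = ['two', 'two', 'one']
values = [10.0, 20.0, 30.0]

# Translate labels into integer positions.
row_pos = M.index.get_indexer(rows)
col_pos = M.columns.get_indexer(cols)

# NumPy pairs the two position arrays element by element,
# so only the three listed cells are written.
arr = M.to_numpy(copy=True)
arr[row_pos, col_pos] = values

# Transform back to a DataFrame with the original labels.
M = pd.DataFrame(arr, index=M.index, columns=M.columns)
```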

Nevertheless, I think Pandas should add this functionality back!

user48867