
I am processing a matrix by replacing each of its elements with the corresponding value looked up in a second matrix. Without the `groupby` step shown below, it works fine.

ds1 = [[ 4, 13,  6,  9],
      [ 7, 12,  5,  7],
      [ 7,  0,  4, 22],
      [ 9,  8, 12,  0]]

ds2 = [[ 4,  1],
       [ 5,  3],
       [ 6,  1],
       [ 7,  2],
       [ 4, 1 ],
       [ 8,  2],
       [ 9,  3],
       [12,  1],
       [13,  2],
       [22,  3]]

import numpy as np
import pandas as pd

ds1 = pd.DataFrame(ds1)
ds2 = pd.DataFrame(ds2)

# Process ds1 by replacing its elements with values looked up in ds2
print(type(ds2))
ds2 = ds2.groupby(0).mean()  # line X: average the values of duplicate keys
print(type(ds2))
C = np.where(ds1.values.ravel()[:, None] == ds2.values[:, 0])
ds1_new = ds1.values.ravel()
ds1_new[C[0]] = ds2.values[C[1], 1]  # works when line X is commented out; otherwise this line raises an error
ds1_new = ds1_new.reshape(4, 4)

The reason for using `ds2 = ds2.groupby(0).mean()` is to get the average value for duplicate keys. When I comment that line out, the code runs without error.

Version

Python 2.7.3
numpy - 1.9.2
pandas - 0.15.2

Edit

My main goal is to match the values in the first column of ds2 against the elements of ds1 and replace each match with the corresponding value from the second column, so the output would look like

ds1_new = [[ 1,  2,  1,  3],
           [ 2,  1,  3,  2],
           [ 2,  0,  1,  3],
           [ 3,  2,  1,  0]]
nlper
  • I get no errors running your code – EdChum May 25 '15 at 07:35
  • @EdChum: I updated my version, could it be due to version? – nlper May 25 '15 at 07:44
  • That's not that old a version. Fundamentally, you are assigning to a named reference, so it doesn't matter what that groupby does; it will raise no error – EdChum May 25 '15 at 08:12
  • @EdChum: The error is not on that line; it occurs on `ds1_new[C[0]]=ds2.values[C[1], 1]` if I uncomment `ds2 = ds2.groupby(0).mean()` – nlper May 25 '15 at 08:21
  • I was referring to the original edit, where the groupby was the error. I do get an error when running the entire code; the problem, as JohnE has highlighted, is that there are no matches, so you get an empty array. It's a bit weird what you're trying here; you need to fully explain what you're trying to do – EdChum May 25 '15 at 15:06
  • @JohnE: I updated the question; please have a look if you can help – nlper May 27 '15 at 14:35
  • @EdChum: Oh, actually, when I printed `ds2` after `ds2.groupby(0).mean()`, its type remained the same, but the way it looks differs. The problem is in this line only, I guess, but I could not understand how to resolve it – nlper May 27 '15 at 14:40
  • OK, that's helpful. It is appreciated that you made an honest attempt at the solution but merely showing desired results is often very helpful also. – JohnE May 27 '15 at 15:26

1 Answer


I bet this will be easier than you expected. First, let's make ds2 a dictionary rather than a dataframe.

 ds2 = dict([
       [ 4,  1],
       [ 5,  3],
       [ 6,  1],
       [ 7,  2],
       [ 4,  1],
       [ 8,  2],
       [ 9,  3],
       [12,  1],
       [13,  2],
       [22,  3]])

Now, we'll just use ds2 to directly map all the elements in ds1:

ds3 = ds1.copy()
for i in range(4):
    ds3[i] = ds3[i].map( ds2 )

   0   1  2   3
0  1   2  1   3
1  2   1  3   2
2  2 NaN  1   3
3  3   2  1 NaN

If you want 0's instead of NaN, just do ds3.fillna(0).
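
The mapping loop and the `fillna` step can be put together in one runnable sketch. The dictionary name `lookup` and the final `astype(int)` cast are additions for illustration and are not part of the original answer; the cast only restores integer values after `NaN` has forced the columns to float.

```python
import pandas as pd

ds1 = pd.DataFrame([[4, 13, 6, 9],
                    [7, 12, 5, 7],
                    [7, 0, 4, 22],
                    [9, 8, 12, 0]])

# Hypothetical name: the key -> replacement pairs from ds2 as a plain dict.
lookup = {4: 1, 5: 3, 6: 1, 7: 2, 8: 2, 9: 3, 12: 1, 13: 2, 22: 3}

ds3 = ds1.copy()
for col in ds3.columns:
    # Series.map leaves NaN wherever a value has no entry in the dict.
    ds3[col] = ds3[col].map(lookup)

# Replace unmatched entries with 0, then cast back to integers.
ds3 = ds3.fillna(0).astype(int)
print(ds3)
```

Iterating over `ds3.columns` instead of `range(4)` keeps the loop independent of the number of columns.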

For some reason, I couldn't get this to work:

ds3.applymap( ds2 )

But this works and avoids the looping over columns, though the syntax is not quite as simple as it is for a series:

ds1.applymap( lambda x: ds2.get(x,0) )
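
If the averaging from the question is still needed, one possible sketch is below. After `ds2.groupby(0).mean()`, column 0 becomes the index, so `.values` holds only a single column; that would explain why `ds2.values[C[1], 1]` fails in the question. Selecting column 1 before taking the mean gives a Series keyed by the original values, which `Series.map` accepts directly. The names `grouped`, `means`, and `ds1_new` are chosen here for illustration, and this is an assumption about what the question intends rather than the answer's own method.

```python
import pandas as pd

ds1 = pd.DataFrame([[4, 13, 6, 9],
                    [7, 12, 5, 7],
                    [7, 0, 4, 22],
                    [9, 8, 12, 0]])

ds2 = pd.DataFrame([[4, 1], [5, 3], [6, 1], [7, 2], [4, 1],
                    [8, 2], [9, 3], [12, 1], [13, 2], [22, 3]])

# groupby moves column 0 into the index, leaving one data column.
grouped = ds2.groupby(0).mean()
print(grouped.shape)  # 9 unique keys, 1 remaining column

# A Series of averaged replacements, indexed by the key values.
means = ds2.groupby(0)[1].mean()

# Map every column of ds1 through the Series; unmatched values become NaN,
# which are then filled with 0.
ds1_new = ds1.apply(lambda col: col.map(means)).fillna(0)
print(ds1_new)
```

The result holds floats because `mean()` returns floating-point values; append `.astype(int)` if integers are required and the averages are known to be whole numbers.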
JohnE
  • Thanks a lot. Please have a look at http://stackoverflow.com/questions/30489216/matrix-processing-using-pandas-fails-on-larger-data-size – nlper May 27 '15 at 17:25