
Is it possible to use the third column in the following example to "spread out" (unravel) the values in, e.g., a pandas DataFrame in Python without actually duplicating the rows? Say we have an object that looks like this:

X   Y   Count
1   2   3
2   2   2
4   3   1

How can I give Count meaning here without unraveling each row into Count copies of itself? That does not seem like a good solution, because it makes the data take up much more space in memory.

So I don't want the DataFrame to just look like this:

X   Y   Count
1   2   1
1   2   1
1   2   1
2   2   1
2   2   1
4   3   1
eikooc
  • My question actually applies to any programming language – eikooc Apr 20 '16 at 18:42
  • I don't understand your question. You say you want to "spread" the values (without saying what that means), then you say you want to "give Count meaning" (without saying what that means), then you say you want to do KNN clustering. What is it you actually want to do? – BrenBarn Apr 20 '16 at 18:48
  • @BrenBarn I find it hard to formulate. I want the `count` column to have some meaning when doing a KNN clustering or whatever. If the values were just one entry per row it would be easier, but they are added together based on *X* and *Y*. Does that make sense? – eikooc Apr 20 '16 at 18:49
  • Are you looking for something like this? http://stackoverflow.com/questions/26777832/replicating-rows-in-a-pandas-data-frame-by-a-column-value – ayhan Apr 20 '16 at 18:56
  • @ayhan kind of but I would like to avoid duplicating the data if possible as _greole_ is pointing out in the comments – eikooc Apr 20 '16 at 18:58

1 Answer


I think you mean something like this:

new_df = df.loc[df.index.repeat(df['Count'])]

Then row df.loc[n] is repeated df.Count[n] times. It's roughly the reverse of a groupby.
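To make the "reverse of a groupby" remark concrete, here is a minimal sketch using the example table from the question. The variable name `recovered` is only illustrative, and the sketch assumes a default integer index and non-negative integer counts.

```python
import pandas as pd

# The compact table from the question.
df = pd.DataFrame({"X": [1, 2, 4], "Y": [2, 2, 3], "Count": [3, 2, 1]})

# Repeat each index label Count times, then select those labels.
new_df = df.loc[df.index.repeat(df["Count"])]

# Grouping the expanded rows by X and Y and counting them
# gives back the original Count column.
recovered = new_df.groupby(["X", "Y"]).size().reset_index(name="Count")
```

Here `new_df` has six rows (the sum of Count), and `recovered` holds the same three X, Y, Count rows as the original `df`.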

Update

I tried new_df['Count'] = 1 and it raised a SettingWithCopyWarning unless I made an explicit copy:

new_df = df.loc[df.index.repeat(df['Count'])].copy()
new_df['Count'] = 1    # <- now it works without a warning
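If the aim is to avoid the expanded copy altogether, one possible alternative (an assumption about what "giving Count meaning" should do, not something stated in the question) is to keep the compact frame and pass Count as weights wherever a computation supports them. The sketch below uses NumPy's `np.average` with its `weights` argument and builds the expanded frame only to check that both routes agree.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"X": [1, 2, 4], "Y": [2, 2, 3], "Count": [3, 2, 1]})

# Weighted mean of X on the compact frame: each row counts Count times,
# but no rows are duplicated in memory.
weighted_mean_x = np.average(df["X"], weights=df["Count"])

# The same statistic on the expanded frame, built here for comparison only.
expanded = df.loc[df.index.repeat(df["Count"])]
expanded_mean_x = expanded["X"].mean()
```

Both values come out the same (11/6). Whether this carries over to a clustering step depends on the estimator: some accept per-row weights (for example, scikit-learn's `KMeans.fit` takes a `sample_weight` argument), while others do not, in which case expanding the rows may still be needed.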
ptrj