I am trying to obtain a list of tuples from a pandas DataFrame. I'm more used to other APIs like apache-spark, where DataFrames have a method called collect. I searched a bit and found this approach, but the result isn't what I expected; I assume that is because the DataFrame has aggregated data. Is there any simple way to do this?
Let me show my problem:
print(df)
#date        user       Cost
#2016-10-01  xxxx   0.598111
#            yyyy   0.598150
#            zzzz  13.537223
#2016-10-02  xxxx   0.624247
#            yyyy   0.624302
#            zzzz  14.651441
print(df.values)
#[[ 0.59811124]
# [ 0.59814985]
# [ 13.53722286]
# [ 0.62424731]
# [ 0.62430216]
# [ 14.65144134]]
#I was expecting something like this:
[("2016-10-01", "xxxx", 0.598111),
("2016-10-01", "yyyy", 0.598150),
("2016-10-01", "zzzz", 13.537223),
("2016-10-02", "xxxx", 0.624247),
("2016-10-02", "yyyy", 0.624302),
("2016-10-02", "zzzz", 14.651441)]
Edit
I tried what was suggested by @Dervin, but the result was unsatisfactory.
collected = [tuple(x) for x in df.values]
collected
[(0.59811124000000004,), (0.59814985000000032,), (13.53722285999994,),
(0.62424731000000044,), (0.62430216000000027,), (14.651441339999931,),
(0.62414758000000026,), (0.62423407000000042,), (14.655454959999938,)]
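To make the situation reproducible, here is a minimal sketch. It assumes the aggregated frame came from a groupby on date and user, so both end up in a MultiIndex rather than in regular columns, which would explain why df.values only returns Cost. The sample numbers and the names raw, values_only and collected are made up for illustration. Under that assumption, reset_index() moves the index levels back into columns, and itertuples(index=False, name=None) yields plain tuples:

```python
import pandas as pd

# Hypothetical sample data; the real frame is assumed to be built similarly.
raw = pd.DataFrame({
    "date": ["2016-10-01", "2016-10-01", "2016-10-02", "2016-10-02"],
    "user": ["xxxx", "yyyy", "xxxx", "yyyy"],
    "Cost": [0.5, 0.25, 1.5, 2.0],
})

# Aggregating by date and user puts both keys into a MultiIndex.
df = raw.groupby(["date", "user"]).sum()

# df.values only covers the data columns, so each tuple holds just Cost.
values_only = [tuple(x) for x in df.values]

# reset_index() turns the index levels back into ordinary columns;
# itertuples(index=False, name=None) then yields one plain tuple per row.
collected = list(df.reset_index().itertuples(index=False, name=None))
print(collected)
```

If the index levels are not needed as columns afterwards, this leaves df itself untouched, since reset_index() returns a new frame by default.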