I have two datasets. The first has about 60k rows and looks like this:
type power bonus
0 Eletric 10 3
1 Flying 5 5
.. ... ... ...
2 Grass 10 5
[61000 rows x 3 columns]
The second has more than half a million rows and looks like this:
pokemon type attack
0 Pikachu Eletric 105
1 Bulbasaur Grass 90
.. ... ... ...
2 Treeko Grass 105
3 Dragonite Flying 125
[650000 rows x 3 columns]
I want to join the two datasets on type (type == type) and apply this function to the joined table:
points = attack * power + bonus
so at the end I want to obtain a Series that looks like this:
pokemon
Pikachu 1053
Bulbasaur 905
...
Treeko 1055
Dragonite 630
Name: points, Length: 650000
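To make the target concrete, here is a minimal sketch of the computation on the sample rows above. The frame names `types` and `pokemons` are placeholders, and it assumes each type appears only once in the first table, so that every pokemon matches exactly one row:

```python
import pandas as pd

# Sample rows copied from the two tables above.
# `types` and `pokemons` are placeholder names, not names from the real code.
types = pd.DataFrame({
    "type": ["Eletric", "Flying", "Grass"],
    "power": [10, 5, 10],
    "bonus": [3, 5, 5],
})
pokemons = pd.DataFrame({
    "pokemon": ["Pikachu", "Bulbasaur", "Treeko", "Dragonite"],
    "type": ["Eletric", "Grass", "Grass", "Flying"],
    "attack": [105, 90, 105, 125],
})

# Assumption: each type is unique in `types`, so the join is many-to-one.
# validate= makes pandas raise if that assumption does not hold.
joined = (
    pokemons
    .merge(types, on="type", how="left", validate="many_to_one")
    .set_index("pokemon")
)

# points = attack * power + bonus, computed on whole columns at once
points = (joined["attack"] * joined["power"] + joined["bonus"]).rename("points")
print(points)
```

On these four rows it gives the values listed above (1053, 905, 1055, 630). Whether it is fast enough on the full tables is something I have not measured.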
I've already managed to write a solution using DataFrame.apply, but it takes too long imo.
What's the fastest way to do this at that scale? Should I drop pandas and work with native Python data structures?