I'm using pandas to tidy up some live sport data and rank players. When I process the team stats, however, the second team's stats are all being converted into floats, even though they are integers in the data source.
How can I convert these floats back to integers? Bear in mind that some rows actually should be floats (e.g. Accuracy, which represents a percentage).
Here's what I have so far:
import pandas as pd
teams_df = pd.read_excel("GAME_STATS.xlsm", skiprows=8, nrows=12, usecols=[0,1,2])
teams_df.to_csv("team_stats.csv")
This writes to a CSV which results in:
+----------------------------------------------------------+
| | Unnamed: 0 | VS. | Unnamed: 2 |
+----------------------------------------------------------+
| 0 | 342 | Disposals | 339.0 |
| 1 | 232 | Kicks | 214.0 |
| 2 | 110 | Handballs | 125.0 |
| 3 | 23 | Hit Outs | 45.0 |
| 4 | 27 | Frees | 25.0 |
| 5 | S.A. | | |
| 6 | | VS | |
| 7 | 23 | Scoring Shots | 18.0 |
| 8 | 0.5652 | Accuracy | 0.5 |
| 9 | 26.3 | Disposal Per Goal | 37.7 |
| 10 | 14.9 | Disposal Per Scoring Shot | 18.8 |
| 11 | 44 | Inside 50s | 43.0 |
+----------------------------------------------------------+
As you can see, for the team on the right (the Unnamed: 2 column), rows 0-4 as well as 7 and 11 should all be integers. For some reason pandas returns them as floats. Is there a way I can convert these particular columns / rows back to ints, or stop this from happening in the first place?
I've tried using .astype(int), but I only get errors.
When printing as dict this is the output:
print(teams_df.to_dict())
{'Unnamed: 0': {0: 342, 1: 232, 2: 110, 3: 23, 4: 27, 5: 'S.A.', 6: nan, 7: 23, 8: 0.5652, 9: 26.3, 10: 14.9, 11: 44}, 'VS': {0: 'Disposals', 1: 'Kicks', 2: 'Handballs', 3: 'Hit Outs', 4: 'Frees', 5: nan, 6: 'VS', 7: 'Scoring Shots', 8: 'Accuracy', 9: 'Disposal Per Goal', 10: 'Disposal Per Scoring Shot', 11: 'Inside 50s'}, 'Unnamed: 2': {0: 339.0, 1: 214.0, 2: 125.0, 3: 45.0, 4: 25.0, 5: nan, 6: nan, 7: 18.0, 8: 0.5, 9: 37.7, 10: 18.8, 11: 43.0}}
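For reference, here is a minimal sketch of what appears to be going on, built from the dict output above rather than from the spreadsheet itself. The left column contains the string 'S.A.', so it is stored as object and keeps its integers, while the right column contains only numbers and NaN, which pandas stores as float64. That would also explain why .astype(int) fails, since NaN has no integer representation. The Series `right` and the helper `to_int_if_whole` are names made up for this illustration, and keeping the column as object dtype is just one possible workaround, not necessarily the best one.

```python
import math

import pandas as pd

# Hypothetical stand-in for the 'Unnamed: 2' column, copied from the dict output.
right = pd.Series(
    [339.0, 214.0, 125.0, 45.0, 25.0, float("nan"), float("nan"),
     18.0, 0.5, 37.7, 18.8, 43.0]
)

# Numbers mixed with NaN end up in a float64 column, so 339 is shown as 339.0.
print(right.dtype)


def to_int_if_whole(value):
    """Return an int for whole-number floats; leave NaN and fractions as they are."""
    if isinstance(value, float) and not math.isnan(value) and value.is_integer():
        return int(value)
    return value


# Build an object-dtype Series so ints, floats and NaN can coexist cell by cell.
cleaned = pd.Series(
    [to_int_if_whole(v) for v in right.tolist()],
    index=right.index,
    dtype=object,
)
print(cleaned.tolist())
```

The trade-off is that an object column is no longer numeric as far as pandas is concerned, so this kind of conversion is probably best done as the last step before writing the CSV.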