I am new to Pandas and I am just starting to take in the versatility of the package. While working with a small practice csv file, I pulled the following data in:
Rank Corporation Sector Headquarters Revenue (thousand PLN) Profit (thousand PLN) Employees
1.ÿ PKN Orlen SA oil and gas P?ock 79 037 121 2 396 447 4,445
2.ÿ Lotos Group SA oil and gas Gda?sk 29 258 539 584 878 5,168
3.ÿ PGE SA energy Warsaw 28 111 354 6 165 394 44,317
4.ÿ Jer¢nimo Martins retail Kostrzyn 25 285 407 N/A 36,419
5.ÿ PGNiG SA oil and gas Warsaw 23 003 534 1 711 787 33,071
6.ÿ Tauron Group SA energy Katowice 20 755 222 1 565 936 26,710
7.ÿ KGHM Polska Mied? SA mining Lubin 20 097 392 13 653 597 18,578
8.ÿ Metro Group Poland retail Warsaw 17 200 000 N/A 22,556
9.ÿ Fiat Auto Poland SA automotive Bielsko-Bia?a 16 513 651 83 919 5,303
10.ÿ Orange Polska telecommunications Warsaw 14 922 000 1 785 000 23,805
I have two serious problems with it that I cannot seem to find solution for:
1) data in "Ravenue" and "Profit" columns is pulled in as strings because of funny formatting with spaces between thousands, and I cannot seem to figure out how to make Pandas translate into floating point values.
2) Data under "Rank" column is pulled in as "1.?", "2.?" etc. What's happening there? Again, when I am trying to re-write this data with something more appropriate like "1.", "2." etc. the DataFrame just does not budge.
Ideas? Suggestions? I am also open for outright bashing because my problem might be quite obvious and silly - excuse my lack of experience then :)