I just started with programming and have some trouble to correctly import my CSV file.
To import it I use the following code:
data_fundamentals = open(path_fundamentals, newline= '')
reader_fundamentals = csv.reader(data_fundamentals)
header_fundamentals = next(reader_fundamentals)
fundamentals = [row for row in reader_fundamentals]
Then convert it into a DataFrame:
df_fundamentals = pd.DataFrame(fundamentals, columns= header_fundamentals)
Here comes my first problem: Out of the CSV file "fundamentals" I just need certain columns for my DataFrame. I started by inserting them all by hand, which of course is not very efficient. Do you have an easier way?
df_kennzahlen.insert(1, 'Fiscal Year' , df_fundamentals['fyear'])
df_kennzahlen.insert(2, 'Current Assets' , df_fundamentals['act'])
df_kennzahlen.insert(3, 'Net Income/Loss' , df_fundamentals['ni'])
df_kennzahlen.insert(4, 'Total Liabilities' , df_fundamentals['lt'])
df_kennzahlen.insert(5, 'Long-Term Debt' , df_fundamentals['dltp'])
df_kennzahlen.insert(6, 'Cash' , df_fundamentals['ch'])
df_kennzahlen.insert(7, 'Total Assets' , df_fundamentals['at'])
df_kennzahlen.insert(8, 'Trade Payables' , df_fundamentals['ap'])
df_kennzahlen.insert(9, 'R&D-Expenses' , df_fundamentals['xrd'])
df_kennzahlen.insert(10, 'Sales' , df_fundamentals['sale'])
The values in the DataFrame are numbers, but have the string data-type. To convert them I use the following code:
df_kennzahlen['Net Income/Loss'] = pd.to_numeric(df_kennzahlen['Net Income/Loss'], downcast='integer')
df_kennzahlen['Total Liabilities'] = pd.to_numeric(df_kennzahlen['Total Liabilities'], downcast='integer')
df_kennzahlen['Long-Term Debt'] = pd.to_numeric(df_kennzahlen['Long-Term Debt'], downcast='integer')
df_kennzahlen['Cash'] = pd.to_numeric(df_kennzahlen['Cash'], downcast='integer')
df_kennzahlen['Total Assets'] = pd.to_numeric(df_kennzahlen['Total Assets'], downcast='integer')
df_kennzahlen['Trade Payables'] = pd.to_numeric(df_kennzahlen['Trade Payables'], downcast='integer')
df_kennzahlen['R&D-Expenses'] = pd.to_numeric(df_kennzahlen['R&D-Expenses'], downcast='integer')
df_kennzahlen['Sales'] = pd.to_numeric(df_kennzahlen['Sales'], downcast='integer')
Again I have the same problem, it is not very efficient and the values in the DataFrame are not converted correctly. For example a 4680 is displayed as 0.4680 and 3235300 is shown as 323.530. Do you have any ideas how I can make the code more efficient and have the correct values in the DataFrame?