Update 1 and Update 2 are at the end.
I read this in "Get list from pandas DataFrame column":
Pandas DataFrame columns are Pandas Series when you pull them out
However, this is not true in my case.
First part (building the DataFrame from scraped JSON): Because it contains business information I cannot show the full code, but basically it reads one row of data at a time (stored in a Series) and appends it to the end of the DataFrame.
dfToWrite = pandas.DataFrame(columns=[lsHeader]) # Empty with column headers
for row in jsAdtoolJSON['rows']:
    lsRow = []
    for col in row['row']:
        lsRow.append((col['primary'])['value'])
    dfRow = pandas.Series(lsRow, index = dfToWrite.columns)
    dfToWrite = dfToWrite.append(dfRow, ignore_index = True)
Next part (checking the type; please ignore what the function is meant to do):
def CalcMA(df: pandas.DataFrame, target: str, period: int, maname: str):
    print(type(df[target]))
Finally, I call the function ("Raw_Impressions" is a column header):
CalcMA(dfToWrite, "Raw_Impressions", 5, "ImpMA5")
Python console shows:
<class 'pandas.core.frame.DataFrame'>
Additional question: How do I get a list from a DataFrame column if it is not a Series (in which case I could use tolist())?
Update 1: From "Bokeh: AttributeError: 'DataFrame' object has no attribute 'tolist'" I figured out that I need to use .values.tolist(). However, that still doesn't explain why I'm getting another DataFrame, not a Series, when I pull out a column.
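To make the tolist() workaround from Update 1 concrete, here is a minimal sketch. The frame and the column name "a" are made up for illustration, and the double-bracket selection is only a stand-in for whatever makes my real column come back as a DataFrame.

```python
import pandas as pd

# Hypothetical data; "a" stands in for a real column such as "Raw_Impressions".
df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})

# Double brackets return a one-column DataFrame, which has no .tolist().
col_frame = df[["a"]]

# .values is a 2-D NumPy array, so .tolist() yields one inner list per row.
nested = col_frame.values.tolist()

# Taking the first column by position gives a Series, whose .tolist() is flat.
flat = col_frame.iloc[:, 0].tolist()

print(nested)
print(flat)
```

So .values.tolist() on a one-column DataFrame gives a list of single-element lists, while selecting the column by position first gives a flat list.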
Update 2: I found out that df has a MultiIndex, which surprised me:
MultiIndex(levels=[['COST_/CPM', 'CTR', 'ECPM/_ROI', 'Goal_Ratio', 'Hour_of_the_Day', 'IMP./Joins', 'Raw_Clicks_/_Unique_Clicks', 'Raw_Impressions', 'Unique_Goal_/_UniqueGoal_Forecasted_Value']], labels=[[4, 7, 5, 6, 1, 8, 3, 0, 2]])
I don't see the labels when printing the df or writing it to .csv; it looks like a normal DataFrame. I'm not sure where the labels came from.
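One possible source, sketched here with made-up data: the DataFrame is created with columns=[lsHeader], and if lsHeader is itself a list, that is a list inside a list, which pandas appears to treat as the arrays of a MultiIndex. The header names below are a hypothetical subset and the values are invented. This is only a guess at what happens in my real code.

```python
import pandas as pd

# Hypothetical subset of the real headers; the values are invented.
headers = ["Hour_of_the_Day", "Raw_Impressions"]
rows = [[0, 100], [1, 150]]

# Wrapping the header list in another list, as in columns=[lsHeader].
nested_cols = pd.DataFrame(rows, columns=[headers])

# Passing the header list directly.
flat_cols = pd.DataFrame(rows, columns=headers)

# Compare the column index types and what a single-column lookup returns.
print(type(nested_cols.columns))
print(type(nested_cols["Raw_Impressions"]))
print(type(flat_cols.columns))
print(type(flat_cols["Raw_Impressions"]))
```

If this matches the real code, passing lsHeader without the extra brackets should give ordinary columns, and df[target] should then return a Series.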