I was wondering if anyone could help me with parallel coordinate plotting.
First, this is what the data looks like:
It's derived from: https://data.cityofnewyork.us/Transportation/2016-Yellow-Taxi-Trip-Data/k67s-dv2t
I'm trying to normalise some features and then compute the mean trip distance, passenger count and payment amount for each day of the week.
import pandas as pd
# pandas.tools.plotting was moved to pandas.plotting in pandas 0.20
from pandas.plotting import parallel_coordinates
features = ['trip_distance','passenger_count','payment_amount']
#min-max normalise each feature to the range [0, 1]
for feature in features:
    df[feature] = (df[feature]-df[feature].min())/(df[feature].max()-df[feature].min())
#change format to datetime
pickup_time = pd.to_datetime(df['pickup_datetime'], format ='%d/%m/%y %H:%M')
#fill dayofweek column with 0-6 (0: Monday, 6: Sunday)
df['dayofweek'] = pickup_time.dt.weekday
mean_trip = df.groupby('dayofweek').trip_distance.mean()
mean_passenger = df.groupby('dayofweek').passenger_count.mean()
mean_payment = df.groupby('dayofweek').payment_amount.mean()
#parallel_coordinates('notsurewattoput')
If I print mean_trip, it shows the mean for each day of the week, but I'm not sure how to use this to draw a parallel coordinates plot with all 3 means on the same plot.
Does anyone know how to implement this?
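For reference, here is a minimal sketch of one possible approach, not a definitive answer. It assumes the column names from the code above, and the `sample` frame is made-up data standing in for the normalised taxi data. The helper name `daily_means` is hypothetical. The idea is to compute all three means in a single groupby so they land in one DataFrame, then pass that frame to `parallel_coordinates` with `dayofweek` as the class column.

```python
import pandas as pd
import matplotlib.pyplot as plt
from pandas.plotting import parallel_coordinates

features = ['trip_distance', 'passenger_count', 'payment_amount']


def daily_means(df, features):
    """Return one row per weekday with the mean of each feature (hypothetical helper)."""
    # Selecting a list of columns makes groupby return a DataFrame,
    # so all three means end up side by side in one table.
    means = df.groupby('dayofweek')[features].mean()
    # parallel_coordinates expects the class label as a regular column.
    return means.reset_index()


# Made-up sample rows (assumption) standing in for the normalised taxi data.
sample = pd.DataFrame({
    'dayofweek':       [0, 0, 1, 1, 6, 6],
    'trip_distance':   [0.2, 0.4, 0.1, 0.3, 0.5, 0.7],
    'passenger_count': [0.0, 0.5, 0.5, 0.5, 1.0, 0.5],
    'payment_amount':  [0.1, 0.3, 0.2, 0.2, 0.6, 0.8],
})
means = daily_means(sample, features)

if __name__ == '__main__':
    # One line per weekday, one vertical axis per feature.
    ax = parallel_coordinates(means, 'dayofweek', cols=features, colormap='viridis')
    ax.set_ylabel('mean of normalised value')
    plt.show()
```

With the real data, `daily_means(df, features)` would replace the three separate `mean_*` series, since each of those holds only one feature and `parallel_coordinates` needs them as columns of the same frame.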