I have a CSV file exported from two smartwatches. It contains time, date, heart rate (HR), and similar data. When I try to read the file with pandas, it puts everything into the first column and fills the rest of the columns with NaN.
The first row is:
Activity Type,Date,Favorite,Title,Distance,Calories,Time,Avg HR,Max HR,Avg Speed,Max Speed,Elev Gain,Elev Loss,Avg Stride Length,Avg Vertical Ratio,Avg Vertical Oscillation,Training Stress Score®,Grit,Flow,Total Strokes,Avg. Swolf,Avg Stroke Rate,Bottom Time,Min Temp,Surface Interval,Decompression,Best Lap Time,Number of Runs,Max Temp
and a data row looks like this:
"road_biking,2018-08-29 13:02:00,false,""bike"",""51,60"",""1.192"",""02:10:05"",""--"",""--"",""23,8"",""--"",""--"",""--"",""0,00"",""0,0"",""0,0"",""0,0"",""0,0"",""0,0"",""--"",""--"",""--"",""0:00"",""0,0"",""0:00"",""No"",""00:00.00"",""1"",""0,0"""
I have tried various suggestions from Stack Overflow, such as df = pd.read_csv(filename, sep=',').replace('"','', regex=True) (from "pandas data with double quote"), without success.
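This is my code:
import numpy as np
import pandas as pd
# Read the original Garmin export
df_garmin = pd.read_csv("dogacapanoglu garmindata until may1st2019.csv")
# Save a copy, then read it back using the saved index column as the index
df_garmin.to_csv("garmindata_till_may2019")
df_garmin = pd.read_csv("garmindata_till_may2019").set_index("Unnamed: 0")
df_garmin.head()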
df_garmin.columns
returns this: Index(['Activity Type', 'Date', 'Favorite', 'Title', 'Distance', 'Calories', 'Time', 'Avg HR', 'Max HR', 'Avg Speed', 'Max Speed', 'Elev Gain', 'Elev Loss', 'Avg Stride Length', 'Avg Vertical Ratio', 'Avg Vertical Oscillation', 'Training Stress Score®', 'Grit', 'Flow', 'Total Strokes', 'Avg. Swolf', 'Avg Stroke Rate', 'Bottom Time', 'Min Temp', 'Surface Interval', 'Decompression', 'Best Lap Time', 'Number of Runs', 'Max Temp'], dtype='object')
df_garmin.dtypes
returns float64 for every column except 'Activity Type', which is object.
The column names are all read without a problem, but the data from every row ends up in the 'Activity Type' column. The rest of the columns are filled with NaN.
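A quick way to check how a CSV parser sees such a row is to feed it to Python's built-in csv module. The sketch below uses a shortened copy of the data row shown earlier (cut down to six values for illustration). Since the whole line is wrapped in one pair of quotes and the inner quotes are doubled, the parser should treat it as a single field, which would explain why everything lands in the first column:

```python
import csv
import io

# Shortened copy of the data row above: the whole line sits inside one pair
# of quotes, and every inner quote is doubled ("" is an escaped quote).
sample = '"road_biking,2018-08-29 13:02:00,false,""bike"",""51,60"",""1.192"""\n'

fields = next(csv.reader(io.StringIO(sample)))

print(len(fields))  # 1: the entire line is parsed as one field
print(fields[0])    # the inner CSV text, with the doubled quotes un-escaped
```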
What can I do to split the data into the proper columns?
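One approach that might work is to parse each wrapped row twice: once to strip the outer quotes, and once more to split the inner text. The sketch below is only an illustration under assumptions: unwrap_rows is a hypothetical helper name, the sample text is a shortened stand-in for the real file, and values such as "51,60" are assumed to use a decimal comma.

```python
import csv
import io

import pandas as pd


def unwrap_rows(text):
    """Parse CSV text in which some rows are wrapped in an extra pair of quotes.

    Hypothetical helper for illustration, not part of pandas.
    """
    rows = []
    for row in csv.reader(io.StringIO(text)):
        if not row:
            continue  # skip blank lines
        if len(row) == 1:
            # The whole line came back as one field, so parse that field again.
            row = next(csv.reader([row[0]]))
        rows.append(row)
    return rows


# Shortened stand-in for the real file: plain header, quote-wrapped data row.
raw = (
    "Activity Type,Date,Title,Distance,Avg Speed\n"
    '"road_biking,2018-08-29 13:02:00,""bike"",""51,60"",""23,8"""\n'
)

header, *records = unwrap_rows(raw)
df = pd.DataFrame(records, columns=header)

# Assumption: the comma inside a value is a decimal separator.
# Placeholders such as "--" become NaN because of errors="coerce".
df["Distance"] = pd.to_numeric(
    df["Distance"].str.replace(",", ".", regex=False), errors="coerce"
)
print(df)
```

For the real file, the text could be read with open(path, encoding="utf-8", newline="").read() and passed to the helper. The same numeric conversion would have to be repeated for each column that should hold numbers.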