I've got a growing data set (let's call it ADC) where each row looks something like this:
AK1,2018-03-27,22:42:21,AKDT,Running Feedback calls in a loop,Iteration 1,AIN0(SE),0.091 ,AIN1(SE),-0.007 ,AIN2(SE),-0.008 ,AIN3(SE),3.659 ,Temperature,283.824,Iteration 2,AIN0(SE),0.096 ,AIN1(SE),-0.007 ,AIN2(SE),-0.003 ,AIN3(SE),3.649 ,Temperature,283.824,Iteration 3 ...
through Iteration 5. Picture in case it doesn't make sense inline
Thus you would have 5 entire columns of "AIN0", "AIN1", so on and so forth adjacent 5 columns of value of the that measurement. I'd like to make a new row for each iteration that retains the values of the first 5 columns from the original row and removes the AIN*, Temperature columns, making them the column names for the values that currently follow each one, so it looks like this. I've tried various methods using reshape, stack, wide_to_long, but it never quite works.
ADC.melt(id_vars=ADC.columns[0:4]) gives me something like this
pd.wide_to_long(ADC, stubnames="Iteration", i=[ADC.iloc[:,1:4]], j="Number")
throws this error: ValueError: stubname can't be identical to a column name
Where am I going wrong?
EDIT: Thanks for the comment @DJK, here is a copy-pastable, truncated version of the data frame:
AK1,2018-03-27,22:42:21,AKDT,Running Feedback calls in a loop,Iteration 1,AIN0(SE),0.091,AIN1(SE),-0.007,AIN2(SE),-0.008,AIN3(SE),3.659,Temperature,283.824,Iteration 2,AIN0(SE),0.096,AIN1(SE),-0.007,AIN2(SE),-0.003,AIN3(SE),3.649,Temperature,283.824
AK1,2018-03-27,22:47:05,AKDT,Running Feedback calls in a loop,Iteration 1,AIN0(SE),0.101,AIN1(SE),-0.007,AIN2(SE),-0.003,AIN3(SE),3.725,Temperature,285.634,Iteration 2,AIN0(SE),0.101,AIN1(SE),-0.007,AIN2(SE),-0.003,AIN3(SE),3.725,Temperature,285.634
AK1,2018-03-28,00:32:58,AKDT,Running Feedback calls in a loop,Iteration 1,AIN0(SE),0.116,AIN1(SE),1.399,AIN2(SE),-0.008,AIN3(SE),3.872,Temperature,290.462,Iteration 2,AIN0(SE),0.121,AIN1(SE),1.399,AIN2(SE),-0.003,AIN3(SE),3.933,Temperature,290.462
AK1,2018-03-28,09:33:19,AKDT,Running Feedback calls in a loop,Iteration 1,AIN0(SE),0.101,AIN1(SE),1.399,AIN2(SE),0.012,AIN3(SE),3.71,Temperature,258.479,Iteration 2,AIN0(SE),0.101,AIN1(SE),1.405,AIN2(SE),0.012,AIN3(SE),3.705,Temperature,258.479
I would really like it to look like (a few extra spaces added at the beginning to maybe help with readability):
Name, Date,Time,TZ,Process,Iteration,AIN0(SE),AIN1(SE),AIN2(SE),AIN3(SE),Temperature
AK1, 2018-03-27, 22:42:21, AKDT,Running Feedback calls in a loop,1,0.091,-0.007,-0.008,3.659,283.824
AK1, 2018-03-27, 22:42:21, AKDT,Running Feedback calls in a loop,2,0.096,-0.007,-0.003,3.649,283.824
AK1, 2018-03-27, 22:47:05, AKDT,Running Feedback calls in a loop,1,0.101,-0.007,-0.003,3.725,285.634,
AK1, 2018-03-27, 22:47:05, AKDT,Running Feedback calls in a loop,2,0.101,-0.007,-0.003,3.725,285.634
AK1, 2018-03-28, 00:32:58, AKDT,Running Feedback calls in a loop,1,0.116,1.399,-0.008,3.872,290.462
AK1, 2018-03-28, 00:32:58, AKDT,Running Feedback calls in a loop,2,0.121,1.399,-0.003,3.933,290.462
AK1, 2018-03-28, 09:33:19, AKDT,Running Feedback calls in a loop,1,0.101,1.399,0.012,3.71,258.479,
AK1, 2018-03-28, 09:33:19, AKDT,Running Feedback calls in a loop,2,0.101,1.405,0.012,3.705,258.479
UPDATE: I found a solution by reading the CSV in to multiple data frames (one for each iteration) using pd.read_csv(usecols=) to select only the columns containing data and manually naming the columns. I then used pd.concat to combine the dataframes and manually assigned indecies.
import pandas as pd
adc_names=["name", "date", "time", "tz", "iteration", "AIN0(SE)", "AIN1(SE)", "AIN2(SE)", "AIN3(SE)", "temperature"]
Iteration1 = pd.read_csv('ADC_CLEAN.csv', index_col=False, header=None, usecols=[0,1,2,3,5,7,9,11,13,15], names=adc_names)
Iteration2 = pd.read_csv('ADC_CLEAN.csv', index_col=False, header=None, usecols=[0,1,2,3,16,18,20,22,24,26], names=adc_names)
Iteration3 = pd.read_csv('ADC_CLEAN.csv', index_col=False, header=None, usecols=[0,1,2,3,27,29,31,33,35,37], names=adc_names)
Iteration4 = pd.read_csv('ADC_CLEAN.csv', index_col=False, header=None, usecols=[0,1,2,3,38,40,42,44,46,48], names=adc_names)
Iteration5 = pd.read_csv('ADC_CLEAN.csv', index_col=False, header=None, usecols=[0,1,2,3,49,51,53,55,57,59], names=adc_names)
ADC = pd.concat([Iteration1, Iteration2, Iteration3, Iteration4, Iteration5], ignore_index=True)
ADC.iteration = ADC.iteration.str.strip('Iteration ')
ADC = ADC.set_index(["name","date","time","tz","iteration"])