I have a simple data entry form that writes its inputs to a CSV file. Everything seems to work, except that an extra column is being added to the file somewhere along the way, apparently during the user input phase. Here is the code:

import pandas as pd

#adds all spreadsheets into one list
Batteries= ["MAT0001.csv","MAT0002.csv", "MAT0003.csv", "MAT0004.csv", 
"MAT0005.csv", "MAT0006.csv", "MAT0007.csv", "MAT0008.csv"]


#User selects battery to log
def choosebattery():
    # Re-prompt until the selection is a valid battery number
    while True:
        c = int(input("Which battery? (1-8):"))
        if c in range(1, 9):
            # Batteries is 0-indexed, so battery 1 is Batteries[0]
            return Batteries[c - 1]
        print('Sorry, selection must be between 1-8')
cfile = choosebattery()

cbat = pd.read_csv(cfile)


#Collect Cycle input
print ("Enter Current Cycle")
response = None
while response not in {"Y", "N", "y", "n"}:
    response = input("Please enter Y or N: ")
    cy = response

#Charger input    
print ("Enter Current Charger")
response = None
while response not in {"SC-G", "QS", "Bosca", "off", "other"}:
    response = input("Please enter one: 'SC-G', 'QS', 'Bosca', 'off', 'other'")
    if response == "other":
        explain = input("Please explain")
        ch = response + ":" + explain
    else:
        ch = response 

#Location
print ("Enter Current Location")
response = None
while response not in {"Rack 1", "Rack 2", "Rack 3", "Rack 4", "EV001", "EV002", "EV003", "EV004", "Floor", "other"}:
    response = input("Please enter one: 'Rack 1 - 4', 'EV001 - 004', 'Floor' or 'other'")
    if response == "other":
        explain = input("Please explain")
        lo = response + ":" + explain
    else:
        lo = response    

#Voltage        
done = False
while not done:
    choice = float(input("Enter Current Voltage:"))
    # Compare the float directly; range() only matches whole numbers
    if 50 <= choice <= 70:
        vo = choice
        done = True
    else:
        print('Sorry, selection must be between 50 and 70')


#add inputs to current battery dataframe
log = pd.DataFrame([[cy,ch,lo,vo]],columns=["Cycle", "Charger", "Location", "Voltage"])
clog = pd.concat([cbat,log], axis=0)




clog.to_csv(cfile, index = False)
pd.read_csv(cfile)

And I receive:

Out[18]:
   Charger  Cycle   Location    Unnamed: 0  Voltage
0     off      n       Floor           NaN     50.0

Where is the "Unnamed" column coming from?

Josmolio

1 Answer

There's an 'Unnamed' column coming from your CSV. Most likely the lines in your input CSV files end with a comma (i.e. your separator), and pandas interprets the empty field after it as an additional, nameless column. Check whether your lines end with your separator and, if they do, remove it. For example, if your files are comma-separated, change:

Column1,Column2,Column3, 
val_11, val12, val12,
...

Into:

Column1,Column2,Column3 
val_11, val12, val12
...
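
To check whether this applies to your files, here is a minimal sketch of the effect. The file contents in `raw` are made up for illustration (they are not your data), and the clean-up assumes a trailing comma never stands for a genuinely empty last field:

```python
import io

import pandas as pd

# Hypothetical file contents: every line ends with the separator.
raw = "Column1,Column2,Column3,\nval_11,val_12,val_13,\n"

# pandas treats the empty field after the last comma as a fourth column
# and names it after its position.
with_extra = pd.read_csv(io.StringIO(raw))
print(list(with_extra.columns))

# Strip the trailing separator from each line before parsing.
# Assumption: a trailing comma is never a real (empty) last value.
cleaned = "\n".join(line.rstrip().rstrip(",") for line in raw.splitlines())
without_extra = pd.read_csv(io.StringIO(cleaned))
print(list(without_extra.columns))
```

The number in the generated name is the column's position, so a trailing separator on a three-column file shows up as `Unnamed: 3`, not `Unnamed: 0`.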

Alternatively, try specifying the index column explicitly when reading the file, as in this answer. I believe some of the confusion stems from pandas concat reordering your columns.
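
A rough sketch of the index-column route, assuming the file was at some point saved with its index included (one plausible way to end up with a column called `Unnamed: 0`). The frame contents are invented for illustration; the column names are taken from the question:

```python
import io

import pandas as pd

# Hypothetical log that was once written with the index included.
frame = pd.DataFrame({"Cycle": ["n"], "Voltage": [50.0]})
buffer = io.StringIO()
frame.to_csv(buffer)  # index is written by default, under an empty header

# Reading it back naively turns the saved index into an unnamed column.
buffer.seek(0)
naive = pd.read_csv(buffer)
print(list(naive.columns))

# Telling pandas that the first column is the index avoids that.
buffer.seek(0)
indexed = pd.read_csv(buffer, index_col=0)
print(list(indexed.columns))

# concat may return the columns in a different order than expected,
# so select them explicitly afterwards.
order = ["Cycle", "Charger", "Location", "Voltage"]
log = pd.DataFrame([["n", "off", "Floor", 50.0]], columns=order)
combined = pd.concat([indexed, log], ignore_index=True)[order]
print(combined)
```

Selecting `[order]` at the end keeps the written file in the same column layout regardless of how `concat` arranges the union of columns.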

vmg
  • The first thought looks promising, though I can't think how I might remedy that. Since I've already passed index=False, it must be something other than index. I'm at a loss, I can't seem to find a simple workaround for this, should I be using a different method or data structure? – Josmolio May 26 '17 at 22:50
  • @Josmolio if that's the case, just remove the trailing separator in each line. I've updated the answer with an example – vmg Jun 01 '17 at 10:22