I'm writing a program that uses pandas to export all of its data to CSV files. The program runs multiple iterations until it has gone through all of the necessary input files. Pandas rewrites one file on each iteration, but when the program moves on to the next file I think I need it to reset all of the data.

Structure is roughly:

While loop > a few variables are named > program runs > `dataframe=(pandas.DataFrame(averagepercentagelist,index=namelist,columns=header))`

This part works with no problem for one file. When the program moves on to the next file, all of the arrays I use are reset, and I think this is why pandas gives the error `Shape of passed values is (1,1), indices imply (3,1)`.

Please let me know if I need to explain it better.
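
A possible cause, judging from the full program posted further down: `namelist` is created once at module level and only ever appended to, while `averagepercentagelist` is emptied for every file. The sketch below uses no pandas and made-up rows (only the two file names come from this thread); it just tracks the two list lengths that `pandas.DataFrame(data, index=...)` needs to agree.

```python
# Hypothetical stand-ins for two class files and their (name, average) rows.
files = [
    ("year10test.csv", [("Name0", 12.0), ("Name1", 23.0)]),
    ("year11test.csv", [("Name2", 40.0)]),
]

namelist = []  # created once, outside the per-file loop, as in the full program
lengths = []   # (file, len(data), len(index)) at each DataFrame call site

for filename, rows in files:
    averagepercentagelist = []  # reset for every file
    for name, average in rows:
        namelist.append(name)
        averagepercentagelist.append(average)
        # pandas needs len(data) == len(index) at this point
        lengths.append((filename, len(averagepercentagelist), len(namelist)))

for entry in lengths:
    print(entry)
```

On the first row of the second file the data list has 1 entry and the index list has 3, which lines up with the shapes in the error message. Under that assumption, moving `namelist=[]` next to `averagepercentagelist=[]` inside the per-file loop would keep the two lists in step.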

EDIT:

while True:
    try:
        averagepercentagelist=[]
        namelist=[]
        columns=[]
        for row in database:
            averagepercentagelist=["12","23"]
            namelist=["Name0","Name1"]
            columns=["Average percentage"]
            dataframe=(pandas.DataFrame(averagepercentagelist,index=namelist,columns=columns))
    except Exception as e:
        print(e)
        break

SNIPPET:

dataframe= (pandas.DataFrame(averagepercentagelist,index=namelist,columns=header))

currentcalculatedatafrane = 'averages' + currentcalculate

dataframeexportpath = os.path.join(ROOT_PATH,'Averages',currentcalculatedatafrane)

dataframe.to_csv(dataframeexportpath)

FULL PROGRAM SO FAR:

import csv
import os
import re
import pandas
import tkinter as tk
from tkinter import messagebox
from os.path import isfile, join
from os import listdir
import time



ROOT_PATH = os.path.dirname(os.path.abspath(__file__))
indexforcalcu=0
line_count=0
testlist=[]
namelist=[]
header=['Average Percentage']



def clearvariables():
    indexforcalcu=0
    testlist=[]

def findaverageofstudent(findaveragenumber,numoftests):
    total=0
    findaveragenumber = findaveragenumber/numoftests
    findaveragenumber = round(findaveragenumber, 1)
    return findaveragenumber





def removecharacters(nameforfunc):
    nameforfunc=str(nameforfunc)
    elem=re.sub("[{'}]", "",nameforfunc)
    return elem





def getallclasses():
    onlyfiles = [f for f in listdir(ROOT_PATH) if isfile(join(ROOT_PATH, f))]
    onlyfiles.remove("averagecalculatorv2.py")
    return onlyfiles





def findaveragefunc():
    indexforcalcu=-1
    while True:
        try:
            totaltests=0
            line_count=0
            averagepercentagelist=[]
            indexforcalcu=indexforcalcu+1
            allclasses=getallclasses()
            currentcalculate=allclasses[indexforcalcu]
            classpath = os.path.join(ROOT_PATH, currentcalculate)
            with open(classpath) as csv_file:
                classscoredb = csv.reader(csv_file, delimiter=',')
                for i, row in enumerate(classscoredb):
                    if line_count == 0:
                        while True:
                            try:
                                totaltests=totaltests+1
                                rowreader= {row[totaltests]}
                            except:
                                totaltests=totaltests-1
                                line_count = line_count + 1
                                break
                    else:
                        calculating_column_location=1
                        total=0

                        while True:
                            try:
                                total = total + int(row[calculating_column_location])
                                calculating_column_location = calculating_column_location + 1
                            except:
                                break

                        i=str(i)
                        name=row[0]
                        cleanname=removecharacters(nameforfunc=name)
                        namelist.append(cleanname)
                        findaveragenumbercal=findaverageofstudent(findaveragenumber=total,numoftests=totaltests)
                        averagepercentagelist.append(findaveragenumbercal)
                        line_count = line_count + 1
                        dataframe= (pandas.DataFrame(averagepercentagelist,index=namelist,columns=header))
                        currentcalculatedatafrane = 'averages' + i + currentcalculate
                        dataframeexportpath = os.path.join(ROOT_PATH,'Averages',currentcalculatedatafrane)
                        dataframe.to_csv(dataframeexportpath)
                        i=int(i)


        except Exception as e:
            print("ERROR!!!!!!!!!!!!!!!!!!!!!!!!!!!!\n\n",e)
            break





def makenewclass():
    global newclassname
    getclassname=str(newclassname.get())

    if getclassname == "":
        messagebox.showerror("Error","The class name you have entered is invalid.")
    else:
        classname = getclassname + ".csv"
        with open(classname, mode='w') as employee_file:
            classwriter = csv.writer(employee_file, delimiter=',', quotechar='"', quoting=csv.QUOTE_MINIMAL)
            classwriter.writerow(["Name","Test 1"])











root=tk.Tk()




root.title("Test result average finder")


findaveragebutton=tk.Button(root,text="Find Averages",command=findaveragefunc())
findaveragebutton.grid(row=2,column=2,padx=(10, 10),pady=(0,10))


classnamelabel=tk.Label(root, text="Class name:")
classnamelabel.grid(row=1, column=0,padx=(10,0),pady=(10,10))


newclassname = tk.Entry(root)
newclassname.grid(row=1,column=1,padx=(10, 10))


newclassbutton=tk.Button(root,text="Create new class",command=makenewclass)
newclassbutton.grid(row=1,column=2,padx=(0, 10),pady=(10,10))



root.mainloop()
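
One detail in the program above that may be worth checking: the first button is built with `command=findaveragefunc()` (with parentheses), while the second uses `command=makenewclass` (without). The sketch below shows the difference without opening a window; `FakeButton` is a hypothetical stand-in for `tk.Button`, not part of tkinter.

```python
calls = []  # records every time the callback actually runs


def findaveragefunc():
    calls.append("ran")


class FakeButton:
    """Hypothetical stand-in for tk.Button: stores a callback, runs it on click."""

    def __init__(self, command=None):
        self.command = command

    def click(self):
        if callable(self.command):
            self.command()


# With parentheses: the function runs immediately and the button receives
# its return value, which is None here.
eager = FakeButton(command=findaveragefunc())
calls_after_creation = len(calls)
eager.click()  # nothing to call
calls_after_eager_click = len(calls)

# Without parentheses: the function object is stored and runs on each click.
lazy = FakeButton(command=findaveragefunc)
lazy.click()
lazy.click()
calls_after_lazy_clicks = len(calls)

print(calls_after_creation, calls_after_eager_click, calls_after_lazy_clicks)
```

If the same holds for the real button, the averages would be computed once at start-up, before the window appears, and clicking "Find Averages" would do nothing.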

Thanks in advance, Sean

Cyril
  • Can you create a [minimal, complete, and verifiable example](http://stackoverflow.com/help/mcve)? – jezrael Feb 01 '19 at 11:16
  • @jezrael added an edit... I haven't tested this small extract of code; it just better describes the structure. – Cyril Feb 01 '19 at 11:25

1 Answer

Use:

import glob, os
import pandas as pd


ROOT_PATH = os.path.dirname(os.path.abspath(__file__))

#extract all csv files to list    
files = glob.glob(f'{ROOT_PATH}/*.csv')
print (files)

#create new folder if necessary
new = os.path.join(ROOT_PATH,'Averages')
if not os.path.exists(new):
    os.makedirs(new)

#loop each file
for f in files:
    #create DataFrame and convert first column to index
    df = pd.read_csv(f, index_col=[0])
    #compute the average of each row, round it and create a one-column DataFrame
    avg = df.mean(axis=1).round(1).to_frame('Average Percentage')
    #remove the index name if necessary
    avg.index.name = None
    print (avg)

    #create new path
    head, tail = os.path.split(f)
    path = os.path.join(head, 'Averages', tail)
    print (path)  

    #write DataFrame to csv
    avg.to_csv(path)   
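
The loop above appears to expect each input file to hold the names in the first column and numeric test scores in the remaining ones. As a rough illustration of that layout, and of what a row-wise average rounded to one decimal works out to, here is a sketch that uses the standard-library `csv` module instead of pandas. The sample data is made up; the header follows the `Name,Test 1` row written by `makenewclass()` in the question.

```python
import csv
import io
from statistics import mean

# Hypothetical file content: one name column followed by score columns.
sample = "Name,Test 1,Test 2\nName0,10,15\nName1,20,25\n"

reader = csv.reader(io.StringIO(sample))
header = next(reader)  # skip the header row

averages = {}
for row in reader:
    scores = [float(cell) for cell in row[1:]]  # everything after the name
    averages[row[0]] = round(mean(scores), 1)

print(averages)
```

This is only meant to show the expected shape of the data; it does not reproduce how pandas treats missing or non-numeric cells.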
jezrael
  • I tried this earlier and was not successful because I didn't know how to implement this in my code... how would I add "i" to `dataframe.to_csv(dataframeexportpath)` – Cyril Feb 01 '19 at 11:36
  • it is a variable which I have made... it includes root path, subfolder, and file name which changes after multiple iterations have been completed for each file – Cyril Feb 01 '19 at 11:40
  • Just tried that... for some reason the same error persists – Cyril Feb 01 '19 at 11:48
  • Have posted full code - name of the application must be `averagecalculatorv2.py` and there has to be another folder named `Averages` in the program folder. There are not allowed to be any other files in the program folder. – Cyril Feb 01 '19 at 11:52
  • Here is the whole program (I have included the files I would like it to output; right now, it can only output year10test.csv). This program can write the csv files by itself and work out and output the averages for one of the csv files on its own - I just need to get it to work for multiple files. https://drive.google.com/drive/folders/1EaEluKQUwxBMMDMpuE47XnMouBweKeWM?usp=sharing – Cyril Feb 01 '19 at 12:23
  • Input data is the year10test.csv and year11test.csv – Cyril Feb 01 '19 at 12:40
  • The output(s) are meant to be found in the averages folder. Right now the code works for one file with no problem. It is just that I cannot reset pandas dataframe after it has completed that file. The actual code is working... – Cyril Feb 01 '19 at 12:45
  • The input files are in the VERSION 2 test folder. They are named `year10test.csv` and `year11test.csv` – Cyril Feb 01 '19 at 12:48
  • Ok thank you very much! I appreciate your help... I'm only 14 and trying to learn how to handle csv databases in python – Cyril Feb 01 '19 at 12:56
  • @SeanMcC. - Please check the solution for getting the csv files, computing the averages and writing to a new folder – jezrael Feb 01 '19 at 13:59
  • @Sean McC. Sure, not sure about the path, so I created it this way. Also worked only once with tkinter, so cannot help with this code. Only with pandas. – jezrael Feb 01 '19 at 20:32
  • Hey jezrael, this code is so simple and looks very effective. I'm having trouble with the database - I'm getting an error in the read csv file line (line 19). How did you design the database? – Cyril Feb 02 '19 at 09:34
  • Here is a link to the errors I'm getting: https://drive.google.com/open?id=1XlvxqpN5qst_0KOglkm8iQm9BMvgccDQ – Cyril Feb 02 '19 at 10:14
  • @SeanMcC. - Change `df = pd.read_csv(f, index_col=[0])` to `df = pd.read_csv(f, index_col=[0], encoding = "ISO-8859-1")`, check also [this](https://stackoverflow.com/a/18172249) – jezrael Feb 04 '19 at 06:34
  • Hey jezrael, thanks for the help! I just realised I still had python set to 3.7 from a program I was working on earlier... changed it to 3.6 and works like a charm. Thanks, Sean – Cyril Feb 04 '19 at 08:44
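
For reference, the `encoding` change suggested in the comments above can be sketched without pandas as a simple fallback: try UTF-8 first, then ISO-8859-1. The `read_rows` helper and the sample bytes are hypothetical; with pandas the equivalent is passing `encoding=` to `pd.read_csv`, as in jezrael's comment.

```python
import csv
import io


def read_rows(raw_bytes, encodings=("utf-8", "ISO-8859-1")):
    """Decode raw CSV bytes with the first encoding that works, then parse."""
    for encoding in encodings:
        try:
            text = raw_bytes.decode(encoding)
        except UnicodeDecodeError:
            continue  # try the next candidate
        return encoding, list(csv.reader(io.StringIO(text)))
    raise ValueError("none of the candidate encodings could decode the data")


# Made-up file content saved as ISO-8859-1; the "ë" byte is not valid UTF-8 here.
raw = "Name,Test 1\nZoë,50\n".encode("ISO-8859-1")
used, rows = read_rows(raw)
print(used, rows)
```

Note that ISO-8859-1 accepts any byte sequence, so it works as a last resort but can silently produce wrong characters if the file is really in some other encoding.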