
So I'm creating a program which should be able to read 8 separate text files and gather the information from those files into a single file.

The first file contains information about the athletes like this:

number;name;division.

The other files contain results from individual sport events like this:

number;result.

The program should be able to gather all the information about the athletes and put into a single file like this:

number;name;division;event1;event2...;event7.

The number is the athlete's participant number, and all other information should be "linked" to that number.

I'm really confused whether to use dict or list or both to handle and store the information from the text files.

The program is a lot more complex than explained above, but I can work out the details myself. Also, the only allowed import libraries are math, random and time. I know these are pretty vague instructions, but like I said, I don't need a complete, functional program but rather guidelines on how to get started. Thanks!

CodeManX
  • 1
    Is it an exercise or do you need to process the output later on? Sounds like a really bad idea to de-normalize the data, throwing unrelated stuff into a single file with no hierarchical structure or annotation. I can think of many possible solutions, but it depends on what you want... do you want to achieve your goal with 'good' code that makes use of classes etc.? Or just something that works, but needs to be read twice to understand what it does and how? – CodeManX May 06 '14 at 10:12
  • show a short example of the txt files and expected output, the order of the files may determine how easy it will be – Padraic Cunningham May 06 '14 at 10:14
  • @user3452623, into a single txt file or you want to store them grouped in a dict ? – Padraic Cunningham May 06 '14 at 10:34
  • 1
    I'm just curious why `math`, `random` and `time` are allowed to import? – qwetty May 06 '14 at 11:21

3 Answers

0

Consult this post for how to read a file line-by-line.

with open(...) as f:
    for line in f:
        <do something with line>

Consult this post on how to split each line of a CSV.
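As a minimal sketch of the splitting step, assuming each line uses `;` as the separator and ends with a `.` as described in the question (`parse_line` and the sample line are made up for illustration):

```python
def parse_line(line):
    """Split one 'a;b;c.' line into a list of its fields."""
    # Drop the trailing newline and the final '.', then split on ';'.
    return line.rstrip("\n").rstrip(".").split(";")


# Hypothetical athlete line in the number;name;division. format.
number, name, division = parse_line("12;Jane Doe;Seniors.\n")
```

The same helper would work for the result files, where each line yields two fields instead of three.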

Consult this post about how to add to a dictionary. I suggest adding a tuple as each entry in the dictionary.

d['mynewkey'] = 'mynewvalue'

Then concatenate and reassign the tuples to add data from new files:

d['mynewkey'] = d['mynewkey'] + (newval1, newval2, newval3)

And remember, it is the commas that make a tuple, not the parentheses.
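A short sketch of the tuple-per-athlete idea, with made-up keys and values, may make the comma rule concrete:

```python
# Hypothetical sample data: athlete number -> (name, division).
d = {}
d["7"] = ("Jane Doe", "Seniors")

# Concatenation builds a new tuple, which is then reassigned to the key.
d["7"] = d["7"] + (23.5,)       # one result: the trailing comma is required
d["7"] = d["7"] + (25.7, 21.3)  # several results at once

single = 23.5,        # a tuple, even without parentheses
not_a_tuple = (23.5)  # just a float wrapped in parentheses
empty = ()            # the one case where parentheses alone make a tuple
```

Tuples are immutable, so every concatenation creates a new tuple; for seven event files per athlete that cost should be negligible.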

That should get you started.

M. K. Hunter
  • The commas do not make a tuple, if you write t = (1,2,3) you have a tuple, if you have one element and use t = (3,) that creates a tuple, t = 3, also makes a tuple. commas are only significant when you have one element. – Padraic Cunningham May 06 '14 at 10:32
  • If `(3,)` is too cryptic / pythonic, you can also use `tuple([3])`. `tuple()` takes a single iterable, so to create a tuple with multiple elements this way you need an extra pair of parentheses: `tuple((1,2,3))`. Neither `tuple(3)` nor `tuple((3))` works, because `tuple()` will try to loop over the argument, which is a non-iterable int in this case, and will therefore raise a TypeError. – CodeManX May 06 '14 at 10:53
0

First of all, open the CSV file for writing, then open all of your text files.

To do this, use Python's with statement. You can easily open all the text files in one line :)

with open('result.csv', 'w') as csvfile:

    # write column headers
    csvfile.write('number;name;division;event1; ...') 

    with open('file1.txt', 'r') as f1, open('file2.txt' , 'r') as f2, open(...) as f:
        f1_line = f1.readline()
        f2_line = f2.readline()
        # rest of your logic ....

        csvfile.write(';'.join(item for item in [number, name, division, event1, ...]) + '.\n')

Once you have opened all the files, read from them line by line. Collect the lines from all files, extract what you need from each line and write it to the CSV file :)

PS. I don't know how many lines your files will have, but loading everything into memory (a list or dict, whatever) isn't a good idea....
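The line-by-line approach could look roughly like the sketch below. It relies on a strong assumption that the question does not confirm: every file lists the same athletes in the same order. `merge_in_lockstep` is a hypothetical helper; it accepts any iterables of lines (open file objects qualify), and the sample lines are made up.

```python
def merge_in_lockstep(athlete_lines, *result_line_sets):
    """Yield merged 'number;name;division;event1;...;eventN.' rows.

    ASSUMPTION: all inputs list the same athletes in the same order.
    """
    for lines in zip(athlete_lines, *result_line_sets):
        # Strip the newline and the trailing '.', then split on ';'.
        rows = [line.rstrip("\n").rstrip(".").split(";") for line in lines]
        number, name, division = rows[0]
        fields = [number, name, division]
        for other_number, result in rows[1:]:
            if other_number != number:
                raise ValueError("files are not aligned at athlete %s" % number)
            fields.append(result)
        yield ";".join(fields) + "."


athletes = ["1;Jordan;Basketball.\n", "2;Michael;Soccer.\n"]
event1 = ["1;13.\n", "2;23.5.\n"]
event2 = ["1;15.\n", "2;25.7.\n"]
merged = list(merge_in_lockstep(athletes, event1, event2))
# merged == ["1;Jordan;Basketball;13;15.", "2;Michael;Soccer;23.5;25.7."]
```

Only one line per file is held in memory at a time. If the result files are not sorted identically, or an athlete skipped an event, this breaks down and a dict keyed by athlete number is the safer choice.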

qwetty
  • The questioner stated that modules other than `math`, `time` and `random` must not be used. – CodeManX May 06 '14 at 10:50
  • The question was edited by the author after or while I was writing my answer .... No need to press the 'down' button ... – qwetty May 06 '14 at 11:13
0

You can use a dict with the athlete numbers as keys to identify them, and use a class to store all other information in a meaningful and nice way. The results can be added to a list on the athlete object, and each athlete object is identified by its number (which is the dict key).

Sample input athletes.csv:

1;Jordan;Basketball.
2;Michael;Soccer.
3;Ariell;Swimming.

Sample input athletes_events.csv:

2;23.5.
2;25.7.
3;174.5.
1;13.
1;15.
2;21.3.
3;159.9.
2;28.6.
1;19.

Code:

class Athlete:
    def __init__(self, name, division):
        self.name = name
        self.division = division
        self.events = []

athletes = {}

with open("athletes.csv") as file:

    for line in file:
        number, name, division = line.strip(".\n").split(";")
        # could cast number to int, but we don't have to
        athletes[number] = Athlete(name, division)


with open("athletes_events.csv") as file:

    for line in file:
        number, result = line.strip("\n").split(";")
        result = float(result.rstrip("."))
        try:
            athletes[number].events.append(result)
        except KeyError:
            print("There's no athlete with number %s" % number)

# Sort numerically: as strings, "10" would sort before "2".
for number, athlete in sorted(athletes.items(), key=lambda item: int(item[0])):
    print("%s - %s (%s):" % (athlete.name, athlete.division, number))
    for i, result in enumerate(athlete.events, 1):
        print("  Event %i = %s" % (i, result))
    print()

Result:

Jordan - Basketball (1):
  Event 1 = 13.0
  Event 2 = 15.0
  Event 3 = 19.0

Michael - Soccer (2):
  Event 1 = 23.5
  Event 2 = 25.7
  Event 3 = 21.3
  Event 4 = 28.6

Ariell - Swimming (3):
  Event 1 = 174.5
  Event 2 = 159.9

Just replace the print()s by some file writing operation.
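As a sketch of that last step, the helper below builds one output line in the `number;name;division;event1;...` format from the question. `format_row` is a hypothetical name, and the values are taken from the sample data above.

```python
def format_row(number, name, division, events):
    """Build one 'number;name;division;event1;...;eventN.' output line."""
    fields = [str(number), name, division] + [str(result) for result in events]
    return ";".join(fields) + ".\n"


# Values in the shape used by the sample data above.
line = format_row("2", "Michael", "Soccer", [23.5, 25.7, 21.3])
# line == "2;Michael;Soccer;23.5;25.7;21.3.\n"
```

Inside the final loop this could be used as `out.write(format_row(number, athlete.name, athlete.division, athlete.events))`, where `out` is a file opened for writing (the output filename is up to you). Note that results converted with `float()` are written as `13.0` rather than `13`; keep them as strings if the original formatting matters.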

CodeManX
  • I don't know if it changes anything but the files are .txt, not .csv although they are formatted like csv files. – user3452623 May 09 '14 at 09:00
  • The file extension doesn't matter, it will work for any file that is formatted as described: `;` as separator, `.` at the end of a line, `\n` as newline character. – CodeManX May 09 '14 at 09:12
  • I created the following loop to handle all the files containing results: `for lines in results:` `with open(tied, encoding="utf-8") as f:`, where `results` is a list of the file names. Now I'm getting an error: `number, result = line.strip("\n").split(";")` `ValueError: too many values to unpack (expected 2)` – user3452623 May 09 '14 at 09:18
  • The formatting is either different from what you described, or there's something wrong with the semantics of your code. Please edit your question if it doesn't describe the actual situation. – CodeManX May 09 '14 at 09:54
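One way to track down such an error is to check the number of fields before unpacking and report the offending line. A sketch, where `parse_result_line` is a hypothetical helper and the sample lines are made up:

```python
def parse_result_line(line, line_number):
    """Return (number, result) or raise ValueError naming the bad line."""
    fields = line.rstrip("\n").rstrip(".").split(";")
    if len(fields) != 2:
        raise ValueError(
            "line %d: expected 2 fields, got %d: %r"
            % (line_number, len(fields), line)
        )
    number, result = fields
    return number, float(result)


good = parse_result_line("2;23.5.\n", 1)

try:
    # A three-field line, e.g. one from the athletes file.
    parse_result_line("2;Michael;Soccer.\n", 2)
except ValueError as error:
    message = str(error)
```

One possible cause, not confirmed by the thread, is that the athletes file (three fields per line) ended up in the list of result files; the error message from a check like this would show which line is responsible.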