I need to create a matrix from a .txt file. The problem I'm having is that the .txt file has a LOT of information (maybe not that much, but I'm new to programming, so I have trouble dealing with it). Each line of the file has:

  • Client number
  • Name
  • Last name
  • Age
  • Single or married (S or M)
  • Sex (M or F)
  • Favourite animal

like this:

1 James Gordon 35 M M Dog
...
75 Julius Harrison 48 S M Cat

I managed to read the file and create a list for each person, but I also need to calculate averages for age, sex, and so on. I don't know how to separate the elements so I can do the math. Here's the code so far:

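# Read db.txt and split each line into its whitespace-separated fields.
# Each entry of raw is one person: [number, name, last name, age, status, sex, animal].
raw = []
with open('db.txt', 'r') as f:
    for line in f:
        raw.append(line.split())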
Bhargav Rao
PrintName

3 Answers

You are using a list of lists (raw) to hold the data, so you can use list indices to access each field.

For example, the average age:

# split() returns strings, so convert each age to int before summing
ages = [int(r[3]) for r in raw]
average_age = float(sum(ages))/len(ages)
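
A minimal, self-contained sketch of the same idea. It assumes the fields are whitespace-separated and uses the two sample rows from the question in place of db.txt; the name `lines` is only for illustration.

```python
# Sample lines standing in for the contents of db.txt (assumption: one person per line).
lines = [
    "1 James Gordon 35 M M Dog",
    "75 Julius Harrison 48 S M Cat",
]

# Same structure as raw in the question: one list of string fields per person.
raw = [line.split() for line in lines]

# Index 3 holds the age as a string, so convert it before doing arithmetic.
ages = [int(r[3]) for r in raw]
average_age = sum(ages) / float(len(ages))

print(average_age)
```

Note that this only works if every name is a single word; a two-word first name would shift the age out of index 3.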
Anthony Kong

Since you've created a 2D list, you can access any element by its row and column index. For example, to calculate the average age, sum the fourth element (index 3) of each person's sublist, converting it to a number first, and divide by the number of people. The following page offers several ways to do this:

Calculate mean across dimension in a 2D array
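
One possible way to work per column, sketched here under the assumption that every row has the same number of fields: transpose the 2D list with `zip(*raw)` and convert the age column. The sample rows are the two from the question, and `columns` is just an illustrative name.

```python
from statistics import mean

# Two rows in the same shape as raw from the question (all fields are strings).
raw = [
    ["1", "James", "Gordon", "35", "M", "M", "Dog"],
    ["75", "Julius", "Harrison", "48", "S", "M", "Cat"],
]

# zip(*raw) turns rows into columns: one tuple per field.
columns = list(zip(*raw))

# Column 3 is the age; convert the strings to int before averaging.
ages = [int(a) for a in columns[3]]
average_age = mean(ages)

print(average_age)
```

Keep in mind that `zip` silently truncates to the shortest row, so a malformed line would drop trailing columns rather than raise an error.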

Community
Flash

Depending on what you want to do with your matrix, it may be a good idea to put everything in a numpy array. You can slice along either axis more easily, and numpy operations are generally faster than traversing lists in Python.

import numpy as np

# read file and store the file lines in `raw` as you have done above, then
matrix = np.array(raw, dtype=object)
matrix[:,3] = matrix[:,3].astype(int)

average = np.mean(matrix[:,3])

If you want to count how many males you have, you can count how many Ms you have in the gender column.

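# np.where returns a tuple of index arrays, so len() of it would always be 1;
# count the True entries of the boolean mask instead
male_no = np.count_nonzero(matrix[:,5] == 'M')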

However, an even better way to count items, especially for the animals, where you may have more than one or two options, is to use Counter from the collections module.

from collections import Counter

gender_count = Counter(matrix[:,5])
for key, count in gender_count.items():
    print(key, count)
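
Counter should also work directly on the plain list of lists, without numpy. A short sketch for the favourite-animal column, assuming it sits at index 6; the first two rows come from the question and the third is a made-up row added only so the counts differ.

```python
from collections import Counter

raw = [
    ["1", "James", "Gordon", "35", "M", "M", "Dog"],
    ["75", "Julius", "Harrison", "48", "S", "M", "Cat"],
    ["2", "Jane", "Doe", "29", "S", "F", "Dog"],  # hypothetical row, not from the original file
]

# Count how often each favourite animal (index 6) appears.
animal_count = Counter(row[6] for row in raw)

# most_common() yields (item, count) pairs, highest count first.
for animal, count in animal_count.most_common():
    print(animal, count)
```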
Reti43