-2

I'm trying to count the non-empty lines in a file. My code works, but it counts the same name more than once when that name appears on different lines, for example:

john 01/2
jack 01/2

john 02/3

I want lines with a repeated name (and different dates) to count as one.


def number_people(path):

    x = 0
    with open(path) as f:
        for line in f:
            if line.strip():
                x += 1
    return x
gboffi
jack
  • and there are no blank lines – jack Feb 10 '16 at 08:57
  • See http://stackoverflow.com/questions/845058/how-to-get-line-count-cheaply-in-python – Fredrik Pihl Feb 10 '16 at 08:59
  • What do you expect to happen if there is `john 01/2` and `jack 01/2`? – java Feb 10 '16 at 08:59
  • its possible to have same person in the same text file but with different date so I'm trying to calculate them as one – jack Feb 10 '16 at 09:01
  • i am expecting it to count it as 1 because the names are same – jack Feb 10 '16 at 09:03
  • i already did the function but i am stuck with counting the same person as one @Fredrik Pih – jack Feb 10 '16 at 09:05
  • Compare substrings: if there is a match, don't count it. Think about what you will do in a situation like: john 01/2 jack 01/1 john 01/3. There is a match between lines 1 and 3. This is a school question. – java Feb 10 '16 at 09:07
  • You need to start by actually *doing something* with the names rather than just counting lines. Hint: I'd use a [set](https://docs.python.org/2/library/sets.html) to remove duplicates and then just get the length of the set at the end. – SiHa Feb 10 '16 at 09:14

2 Answers

1

If the line always looks like 'Name Date':

def number_people(path):
    x = 0
    namesList = [] # Make a list of names
    with open(path) as f:
        for line in f:
            line = line.strip()
            try:
                name, date = line.split(' ') # if there's a space in between
                if name not in namesList: # If the name is not in namesList...
                    x += 1
                    namesList.append(name) # put the name in namesList
            except ValueError:
                print(line)
                #pass
    return x

EDIT

Fixed the ValueError. Note: the function no longer crashes on lines that do not split into exactly two parts. It prints those lines so you can inspect them; to skip them silently, replace print(line) with pass.
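
If a line can carry more than one date (for example `James 11/21 12/11`), unpacking into exactly two variables will always fail on it. A minimal sketch of one way around that, assuming the name is always the first whitespace-separated token on the line, is to keep only that first token and collect the names in a set. The function name `number_people_first_token` is hypothetical and only used for illustration.

```python
def number_people_first_token(path):
    """Count distinct names, assuming the name is the first token of each line."""
    names = set()  # a set ignores duplicates and has fast membership tests
    with open(path) as f:
        for line in f:
            tokens = line.split()  # splits on any whitespace; [] for blank lines
            if not tokens:
                continue  # skip empty or whitespace-only lines
            names.add(tokens[0])  # adding an existing name has no effect
    return len(names)
```

With this approach the number of dates on a line no longer matters, but a name that itself contains a space would be cut at the first word.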

Nander Speerstra
  • 1
    well, in that case your line doesn't look like 'Name Date'. – Nander Speerstra Feb 10 '16 at 09:10
  • 1
    Try a `print(line)` on the line above the `line.split(' ')` to see what the line looks like. – Nander Speerstra Feb 10 '16 at 09:11
  • and it is not printing the whole line when i try print line – jack Feb 10 '16 at 09:15
  • Can you show the exact error message? And a sample of what it is printing? – Nander Speerstra Feb 10 '16 at 09:17
  • name, date = line.split(' ') # if there's a space in between ValueError: too many values to unpack (expected 2) – jack Feb 10 '16 at 09:19
  • printed James 11/21 12/11 instead of James 11/21 James 12/11 John 12/11 – jack Feb 10 '16 at 09:20
  • You should try something like `for line in f: print(line)` and comment out the rest, because it almost has to be a problem in your file (probably not consistent)... – Nander Speerstra Feb 10 '16 at 09:25
  • i'm trying to count the lines only not printing it and now it is printing all of it – jack Feb 10 '16 at 09:26
  • I know. But your code doesn't work, does it? So you'll have to figure out what's wrong first, and [that's what you do by printing](https://www.codementor.io/python/tutorial/how-to-debug-python-code-beginners-print-line). Those lines can be commented out later, when the script works. – Nander Speerstra Feb 10 '16 at 09:28
0

The function len gives you the number of items in a container, here a set.

Here set is applied to a generator expression, and retains only the generated items that are unique, so that len(set(...)) gives you the number of different elements that are generated by ...

The generator expression iterates over the lines l of your file and discards the empty ones. Each remaining line is split on whitespace into a list, and only its first element, the person's name, is kept.

To sum it up
sequence of names -> set -> unique names -> len -> number of unique names
and this translates to the following function definition

def number_people(path):
    with open(path) as f:  # close the file when done
        return len(set(l.split()[0] for l in f if l.strip()))
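
To see each stage of that pipeline on its own, here is a small sketch that applies the same expression to an in-memory list instead of a file. The sample data is the example from the question; the variable names `lines`, `names`, `unique_names` and `count` are illustrative only.

```python
# Sample lines from the question, including the blank line.
lines = ["john 01/2\n", "jack 01/2\n", "\n", "john 02/3\n"]

# Generator expression: the first token of every non-blank line.
names = (l.split()[0] for l in lines if l.strip())

unique_names = set(names)   # duplicates collapse, leaving 'john' and 'jack'
count = len(unique_names)   # number of distinct names

print(count)  # 2
```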
gboffi