-5

Say I have the following text file. Each color name is an account name, and I want to know how many persons are under each account. The account name is the first word after "Color: ", ending at the first "/" or "-". The file I shared has 3 accounts: red, blue, and black. So red/test/base, red-img-tests, red-zero-tests, and red-replication-tests all belong to the account "red". In the end I need to report how many persons there are under each account; for red, for example, that is red: 4.

---------------------------------
Color: red/test/base
  person: latest
---------------------------------
Color: red-img-tests
  person: latest
---------------------------------
Color: red-zero-tests
  person: latest
---------------------------------
Color: red-replication-tests
  person: latest
---------------------------------
Color: blue
  person: latest
---------------------------------
Color: black/red-config-img
  person: 7e778bb
  person: 82307b2
  person: 8731770
  person: 7777aae
  person: 081178e
  person: c01ba8a
  person: 881b1ad
  person: d2fb1d7
---------------------------------
Color: black/pasta
  person: latest
---------------------------------
Color: black/base-img
  person: 0271332
  person: 70da077
  person: 3700c07
  person: c2f70ff
  person: 0210138
  person: 083af8d

  person: latest
---------------------------------
Color: black/food-pasta-8.0
  person: latest

My expected output is:

    red: 4
    blue: 1
    black: 17

I have thousands of lines, so as you can see, I can't hard-code names like 'red' or 'blue'. The script has to read each account name from the file and check whether it is the same account as on the following Color lines.

For now I am doing the following to extract the account names:

import re
with open("accounts.txt") as f:  # placeholder file name
    for line in f:
        if line.startswith("Color:"):  # skip separator, person, and blank lines
            acc_name = re.split(r"; |, |/|-|:", line)[1].strip()

3 Answers

3

I have a solution using Counter for you:

import collections

data = """
---------------------------------
Color: red/test/base
  person: latest
---------------------------------
Color: red-img-tests
  person: latest
---------------------------------
Color: red-zero-tests
  person: latest
---------------------------------
Color: red-replication-tests
  person: latest
---------------------------------
Color: blue
  person: latest
---------------------------------
Color: black/red-config-img
  person: 7e778bb
  person: 82307b2
  person: 8731770
  person: 7777aae
  person: 081178e
  person: c01ba8a
  person: 881b1ad
  person: d2fb1d7
---------------------------------
Color: black/pasta
  person: latest
---------------------------------
Color: black/base-img
  person: 0271332
  person: 70da077
  person: 3700c07
  person: c2f70ff
  person: 0210138
  person: 083af8d
  """

print (data)
colors = ["black", "red", "blue"]
final_count = []
for line in data.split("\n"):
    for color in colors:
        if color in line:
            final_count.append(color)
            #break # Uncomment this break if you don't want to count
            # two colors in the same line
final_count = collections.Counter(final_count)
print(final_count)

Output

Counter({'blue': 1, 'black': 3, 'red': 5})
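The snippet above counts lines that mention a color from a hard-coded list. As a rough sketch of the variant the question asks for, the following derives each account name from the `Color:` line itself and counts the `person:` lines that follow it. The `sample` text and the `count_persons` helper are illustrative names, not part of the original answer, and the sketch assumes every `person:` line belongs to the most recent `Color:` line.

```python
import re
from collections import Counter

# Shortened sample in the same layout as the question's file (illustrative).
sample = """\
---------------------------------
Color: red/test/base
  person: latest
---------------------------------
Color: red-img-tests
  person: latest
---------------------------------
Color: blue
  person: latest
---------------------------------
Color: black/red-config-img
  person: 7e778bb
  person: 82307b2
"""


def count_persons(text):
    """Return a Counter mapping account name -> number of person lines."""
    counts = Counter()
    account = None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("Color:"):
            # Account = first word after "Color:", cut at the first "/" or "-".
            value = line[len("Color:"):].strip()
            account = re.split(r"[/-]", value, maxsplit=1)[0]
        elif line.startswith("person:") and account is not None:
            # Attribute this person to the most recently seen account.
            counts[account] += 1
    return counts


counts = count_persons(sample)
for name, total in counts.items():
    print(f"{name}: {total}")
# red: 2
# blue: 1
# black: 2
```

Because the account is taken from the start of the `Color:` value, a line such as `black/red-config-img` is attributed to `black` only, and no list of color names is needed.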

From the official Python documentation for the collections module:

This module implements specialized container datatypes providing alternatives to Python’s general purpose built-in containers, dict, list, set, and tuple.

Pitto
0
count = {}

example = "apple apple apple apple red red green green green green green black"

for i in example.split():
    if i not in count:
        count[i] = 1
    elif i in count:
        count[i] += 1


print(count)
  • @pandaflieszeppelin updated the answer, please check it – Basavaraju US Sep 26 '19 at 07:39
  • 1
    While the solution may be correct now, quite a few optimizations are possible. For example, using `defaultdict(int)` and not checking for dict membership in the `elif` part. The best option is to use `Counter`, as other answers have already pointed out. – rdas Sep 26 '19 at 07:44
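The `defaultdict(int)` variant mentioned in the comment could look roughly like this. It is a minimal sketch reusing the answer's `example` string; missing keys start at 0, so no membership check is needed.

```python
from collections import defaultdict

example = "apple apple apple apple red red green green green green green black"

# int() returns 0, so every unseen word starts with a count of 0.
count = defaultdict(int)
for word in example.split():
    count[word] += 1

print(dict(count))
# {'apple': 4, 'red': 2, 'green': 5, 'black': 1}
```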
0

You can use Counter from the built-in collections module. The Python 3 documentation describes Counter in more detail.

from collections import Counter
data = "apple apple apple apple red red green green green green green black"
d = Counter(data.split())

print(d)

A Counter is a dict subclass: each distinct word is stored once as a key, and its value is the number of times that word occurs, which gives you the count directly.
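As a small illustration of that behaviour, the sketch below reads individual counts back out of the Counter. It reuses the same `data` string and relies only on standard `Counter` features: key lookup, a default of 0 for missing keys, and `most_common`.

```python
from collections import Counter

data = "apple apple apple apple red red green green green green green black"
d = Counter(data.split())

print(d["green"])        # 5
print(d["purple"])       # 0 (missing keys count as zero, no KeyError)
print(d.most_common(2))  # [('green', 5), ('apple', 4)]
```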

Alok