-4

I'm just learning Python and have found some things that are less convenient than in Java 8, e.g. counting words.

At first I thought it would be as easy to implement as a dict comprehension like

>>> {x : x**2 for x in range(10)}
{0: 0, 1: 1, 2: 4, 3: 9, 4: 16, 5: 25, 6: 36, 7: 49, 8: 64, 9: 81}

But in practice I found it a little cumbersome:

>>> sent3
['In', 'the', 'beginning', 'God', 'created', 'the', 'heaven', 'and', 'the', 'earth', '.']
>>> word_count = {}
>>> for w in sent3:
...     if w in word_count:
...         word_count[w] += 1
...     else:
...         word_count[w] = 1
...

But in Java 8 it's very convenient to implement:

    List<String> strings = asList("In", "the", "beginning", "God", "created", "the", "heaven", "and", "the", "earth");
    Map<String, Long> word2CountMap = strings.stream().collect(groupingBy(s -> s, counting()));

or

    word2CountMap = new HashMap<>();
    for (String word : strings) {
        word2CountMap.compute(word, (k, v) -> v == null ? 1 : v + 1);
    }

Is there some more advanced usage of Python's dict, which I'm not aware of, that would make this easier to implement?

zhuguowei

3 Answers

5

Here is a shorter way of counting the words, using Counter from the collections module.

>>> from collections import Counter
>>> sent3 = ['In', 'the', 'beginning', 'God', 'created', 'the', 'heaven', 'and', 'the', 'earth', '.']
>>> Counter(sent3)
Counter({'the': 3, 'In': 1, 'beginning': 1, 'God': 1, 'created': 1, 'heaven': 1, 'and': 1, 'earth': 1, '.': 1})

And if you want a plain dict object rather than a Counter:

>>> dict(Counter(sent3))
{'In': 1, 'the': 3, 'beginning': 1, 'God': 1, 'created': 1, 'heaven': 1, 'and': 1, 'earth': 1, '.': 1}
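
In many cases the conversion to dict may not be needed, since Counter is itself a dict subclass. The following is a minimal sketch of that point, reusing the sent3 list from above; the name counts and the word 'serpent' are only illustrative:

```python
from collections import Counter

sent3 = ['In', 'the', 'beginning', 'God', 'created', 'the',
         'heaven', 'and', 'the', 'earth', '.']
counts = Counter(sent3)

# Counter subclasses dict, so ordinary dict operations work on it directly.
print(isinstance(counts, dict))
print(counts['the'])

# Unlike a plain dict, looking up a missing key gives 0 instead of raising
# KeyError, and the lookup does not insert the key.
print(counts['serpent'])

# A Counter can also be updated incrementally with more words.
counts.update(['the', 'earth'])
print(counts['the'], counts['earth'])
```

The zero default for unseen keys is roughly what the `v == null ? 1 : v + 1` branch handles by hand in the Java version.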
P-Gn
anon
3

While you can use collections.Counter(), and I recommend that you do, you can also easily accomplish what you're asking with a dictionary comprehension, a construct that is very idiomatic in Python:

>>> sent3 = ['In',
...  'the',
...  'beginning',
...  'God',
...  'created',
...  'the',
...  'heaven',
...  'and',
...  'the',
...  'earth',
...  '.']
>>> {word : sent3.count(word) for word in sent3}
{'.': 1,
 'God': 1,
 'In': 1,
 'and': 1,
 'beginning': 1,
 'created': 1,
 'earth': 1,
 'heaven': 1,
 'the': 3}
>>> 
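
One caveat worth keeping in mind: the comprehension above calls sent3.count(word) once per element, so the list is rescanned for every word. That is fine for a short sentence but grows quadratically with the input. As a rough sketch (not part of the original answer), a single pass with dict.get or collections.defaultdict stays close to the Java compute loop from the question; the name word_count_dd is only illustrative:

```python
from collections import defaultdict

sent3 = ['In', 'the', 'beginning', 'God', 'created', 'the',
         'heaven', 'and', 'the', 'earth', '.']

# Single pass with a plain dict: get() supplies 0 for words not seen yet.
word_count = {}
for w in sent3:
    word_count[w] = word_count.get(w, 0) + 1

# Same idea with defaultdict(int): a missing key starts at int(), i.e. 0.
word_count_dd = defaultdict(int)
for w in sent3:
    word_count_dd[w] += 1

print(word_count['the'])
print(word_count == dict(word_count_dd))
```

Both loops produce the same counts as the comprehension; they differ only in how many times the list is traversed.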

You see, the problem is rarely that one programming language is less functional than the other. It simply seems that way because when learning a new language, you don't yet have the experience necessary to know of the specific language features that are suited to certain tasks, as is the case here.

However, that's not to say that all languages are the same. There are differences in each language, and each language has a different philosophy and different idioms. When learning a new language, it's better to ask "I can do X in Java this way. What is the idiomatic way to do this in Python?" rather than "I can do X in Java this way. In Python, it's not as convenient."

Christian Dean
0

You should check out collections.Counter:

In [1]: from collections import Counter

In [2]: c = Counter(['In', 'the', 'beginning', 'God', 'created', 'the', 'heaven', 'and', 'the', 'earth', '.'])

In [3]: c
Out[3]:
Counter({'.': 1,
         'God': 1,
         'In': 1,
         'and': 1,
         'beginning': 1,
         'created': 1,
         'earth': 1,
         'heaven': 1,
         'the': 3})
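
Once the counts are in a Counter, its most_common method is often the next thing people reach for. A small sketch, assuming the same c as above (the names top and total are only illustrative):

```python
from collections import Counter

c = Counter(['In', 'the', 'beginning', 'God', 'created', 'the',
             'heaven', 'and', 'the', 'earth', '.'])

# most_common(n) returns the n highest counts as (word, count) pairs.
top = c.most_common(1)
print(top)

# Summing the values gives the total number of tokens that were counted.
total = sum(c.values())
print(total)
```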
sbochins