I am working through some online hands-on exercises in Python for NLTK.

The task is to filter, from the complete set of words in text6, the words whose first letter is upper case and whose remaining letters are all lower case, and then print the number of such words.

Can someone please tell me the exact answer (text6 is a standard text from the NLTK book) and what is wrong with my code?

I tried the code below:

from nltk.book import text6
import re
pattern = '[A-Z]+[a-z]+$'
capsword= [word for word in set(text6) if re.search(pattern, word)]
print(len(capsword))

My actual output is 461, but I am not sure what the expected output is because it is hidden.

Gopesh
  • Post a sample output please. – DirtyBit Apr 02 '19 at 11:32
  • Possible duplicate of [How to capitalize the first letter of each word in a string (Python)?](https://stackoverflow.com/questions/1549641/how-to-capitalize-the-first-letter-of-each-word-in-a-string-python) – Robson Apr 02 '19 at 11:34
  • I don't have any expected output; it is hidden. But when I submit this code, it is not accepted. As per the task requirement, only the count of words (a numeric value) is expected. I also printed the 'capsword' list, and all the words in it appear to match the filter. – Gopesh Apr 02 '19 at 11:34
  • @Robson: I need to count the existing words (the linked question is about modifying words). – Gopesh Apr 02 '19 at 11:40

2 Answers

I changed the pattern (so that it also accepts words containing special characters, such as ABC! or ABC.) and it worked:

from nltk.book import text6
import re
pattern = '[A-Z][a-z*]'
a = [word for word in set(text6) if (re.search(pattern, word))]
print(len(a))
Gopesh
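
For readers comparing the two patterns above, here is a minimal sketch of how each behaves with re.search, next to a stricter re.fullmatch variant. The sample tokens are hypothetical (they are not taken from text6), the "strict" pattern is an assumption about what the task wording means, and nothing here is known to be what the hidden test expects.

```python
import re

# Hypothetical sample tokens, chosen for illustration only (not from text6).
sample = ["Arthur", "KING", "SCENE", "McDonald", "Camelot!", "I", "swallow", "ABCdef"]

patterns = {
    # Pattern from the question: re.search may start anywhere in the word,
    # and [A-Z]+ allows a run of several upper-case letters.
    "question": r"[A-Z]+[a-z]+$",
    # Pattern from the answer: inside [...] the '*' is a literal asterisk,
    # so this only needs one upper-case letter followed by a lower-case one.
    "answer": r"[A-Z][a-z*]",
    # Assumed stricter reading of the task: one upper-case letter, then only
    # lower-case letters, applied to the whole word with re.fullmatch.
    "strict": r"[A-Z][a-z]*",
}

def matches(name, word):
    """Apply the named pattern the way it is used above (search vs. fullmatch)."""
    if name == "strict":
        return re.fullmatch(patterns[name], word) is not None
    return re.search(patterns[name], word) is not None

results = {name: [w for w in sample if matches(name, w)] for name in patterns}

for name, words in results.items():
    print(name, len(words), words)
```

On this sample, the search-based patterns also accept words such as McDonald and ABCdef, which do not have "all other letters in lower case", so the counts can differ depending on how the pattern is anchored.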

This should work for you; it worked for me on Fresco Play:

from nltk.book import text6    
title_words = [word for word in text6 if word.istitle()]

print(len(title_words))

Output: 2672

As per the challenge, we are supposed to use the complete set of words in text6, which is why the code iterates over text6 directly rather than over set(text6).

The istitle() method returns True for a word whose first letter is upper case and whose other letters are all lower case.
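
As a rough illustration of the two points above, the sketch below runs str.istitle() over a small hypothetical token list standing in for text6 (so it runs without the NLTK data) and counts the matches with and without deduplication. The token list and variable names are assumptions made for the example.

```python
# Hypothetical tokens standing in for text6; the real text needs the NLTK data.
tokens = ["Arthur", "KING", "Arthur", "I", "swallow",
          "Camelot", "SCENE", "Camelot", "Ni!"]

# Every occurrence counts, as in the answer's list comprehension over text6.
title_all = [w for w in tokens if w.istitle()]

# Deduplicated variant, comparable to iterating over set(text6) in the question.
title_unique = {w for w in tokens if w.istitle()}

print(len(title_all))     # occurrences of title-cased tokens
print(len(title_unique))  # distinct title-cased tokens
```

Note that istitle() ignores uncased characters such as punctuation, so a token like "Ni!" is accepted, and a single capital letter such as "I" is accepted as well. Counting occurrences rather than distinct words is one plausible reason the answer's total (2672) is larger than the question's (461).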