How to print only if a character is an alphabet?

Question

This is from my previous post here:

How to return 1 only if the last letter of a word is a vowel? Return 0 otherwise

Here is the code I am using:

import sys
import re

pattern = re.compile("^[a-z]+$")  # matches purely alphabetic words
starting_vowels = re.compile("(^[aeiouAEIOU])")  # matches starting vowels
ending_vowels = re.compile("[aeiouAEIOU]$")  # matches ending vowels
starting_vowel_match = 0
ending_vowel_match = 0

for line in sys.stdin:
    line = line.strip()  # removes leading and trailing whitespace
    words = line.lower().split()  # splits the line into words and converts to lowercase
    for word in words:
        if len(word) == 1:
            print(word[0], 1, *((1, 1) if word[0] in 'aeiou' else (0, 0))) # * unpacks startVowel 1 endVowel 1 if word[0] is a vowel
        else:
            print(word[0], 1, 1 if word[0] in 'aeiou' else 0, 0) 
            print(*(f'{letter} 1 0 0' for letter in word[1: -1]), sep='\n')
            print(word[-1], 1, 0, 1 if word[-1] in 'aeiou' else 0)

I want this to print a row only if the character is a letter. For a text file containing the string "It's a beautiful life", the output I would like is:

I am currently seeing this:

I am wondering how to get rid of the special characters in the output. I have tried a couple of things, including adding

        for letter in word:
            if pattern.match(letter):

in the "for letter in word" block, but it does not produce the output I want.

You need to rewrite this: *print(*(f'{letter} 1 0 0' for letter in word[1: -1]), sep='\n')* — DarkKnight, Apr 27 '23 at 09:06
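
A minimal sketch of what that rewrite could look like, assuming "alphabetic" means the ASCII letters a-z (as in the question's unused pattern). The helper name middle_rows is hypothetical and not part of the original code. Building a list and looping over it also avoids the blank line that print(*..., sep='\n') emits when there is nothing to unpack, for example for a two-letter word.

```python
from string import ascii_lowercase


def middle_rows(word):
    """Return the rows for the interior characters of word, skipping non-letters.

    Assumes word is already lowercase, as in the question's loop.
    """
    return [f"{letter} 1 0 0" for letter in word[1:-1] if letter in ascii_lowercase]


# The apostrophe in "it's" is dropped; only the interior letter "t" remains.
for row in middle_rows("it's"):
    print(row)
```

This only covers the interior letters; the first and last characters of the word would still need the same check.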

DarkKnight · Answer 1 · 2023-04-27T10:11:48.003

I'm not sure why the original code compiles patterns with re, since they are never used.

When analysing a word of more than one letter, you need to consider each character in the [1:-1] slice individually.

Something like this:

import sys
from string import ascii_lowercase as LOWER

VOWELS = set('aeiou')

def isvowel(c):
    return int(c in VOWELS)

for line in sys.stdin:
    for word in line.strip().lower().split():
        if len(word) == 1:
            print(word, 1, isvowel(word), isvowel(word))  # a single letter is both the first and the last letter
        else:
            print(word[0], 1, isvowel(word[0]), 0)
            for letter in word[1:-1]:
                if letter in LOWER:
                    print(f'{letter} 1 0 0')
            print(word[-1], '1 0', isvowel(word[-1]))

Output:

The original code had re because it was part of some boilerplate code given to me for a mapreduce problem I am working on. It might be needed later. — Zaku, Apr 28 '23 at 00:01

score -1 · Answer 2 · answered Apr 27 '23 at 09:08

So you want to split a string into words and every word into alphabetical letters. For each letter you want to print:

[letter] [starting_vowel_match] [letter_vowel_match] [ending_vowel_match]

Here would be my approach to this problem:

import re

test = "It's a beautiful life"

for line in test.split("\n"):
    line = line.strip()  # removes leading and trailing whitespace
    words = line.lower().split()  # splits the line into words and converts to lowercase
    for word in words:
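        letters = re.sub(r'[^a-z]', '', word)  # keep letters only; word is already lowercase
        for letter in letters:
            print(
                letter,
                1 if letters[0] in 'aeiou' else 0,   # word starts with a vowel
                1 if letter in 'aeiou' else 0,       # this letter is a vowel
                1 if letters[-1] in 'aeiou' else 0)  # word ends with a vowel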

The result looks different from your example output, but I expected the first row to contain the starting_vowel_match!
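
For comparison, here is a hedged sketch that keeps the four-column layout from the question's code (letter, 1, starting-vowel flag, ending-vowel flag) but strips non-letters before deciding which character is first and last. It assumes "alphabetic" means ASCII a-z; the names letter_rows and NON_LETTERS are made up for illustration and do not come from either answer.

```python
import re

VOWELS = set("aeiou")
NON_LETTERS = re.compile(r"[^a-z]")  # assumption: only ASCII letters count


def letter_rows(word):
    """Return one 'letter 1 start end' row per alphabetic character of word."""
    letters = NON_LETTERS.sub("", word.lower())
    last = len(letters) - 1
    rows = []
    for i, letter in enumerate(letters):
        starts = int(i == 0 and letter in VOWELS)   # first letter is a vowel
        ends = int(i == last and letter in VOWELS)  # last letter is a vowel
        rows.append(f"{letter} 1 {starts} {ends}")
    return rows


for word in "It's a beautiful life".split():
    for row in letter_rows(word):
        print(row)
```

Filtering first means a word such as "life," is treated as ending in "e" rather than in the comma, and a token with no letters at all simply produces no rows.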
