Here is the problem:

  1. Replace input string with the following: The first and last characters, separated by the count of distinct characters between the two.
  2. Any non-alphabetic character in the input string should appear in the output string in its original relative location.

Here is the code I have thus far:

word = input("Please enter a word: ")
first_character = word[0]
last_character = word[-1]
unique_characters = (list(set(word[1:-1])))
unique_count = str(len(unique_characters))
print(first_character[0],unique_count,last_character[0])

For the second part, I have thought about using regex, but I have not been able to wrap my head around it, as it is not something I ever use.

DarthOpto

1 Answer

You can use

import re
pat = r"\b([^\W\d_])([^\W\d_]*)([^\W\d_])\b"
s = "Testers"
print(re.sub(pat, (lambda m: "{0}{1}{2}".format(m.group(1), len(''.join(set(m.group(2)))), m.group(3))), s))

See the IDEONE demo.

The regex breakdown:

  • \b - word boundary (use ^ if you test an individual string)
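  • ([^\W\d_]) - Group 1 capturing a single letter (in Python 3 this matches any Unicode letter by default, and the re.ASCII flag restricts it to ASCII letters; in Python 2, add the re.U flag if you need to match Unicode letters, too)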
  • ([^\W\d_]*) - Group 2 capturing zero or more letters
  • ([^\W\d_]) - Group 3 capturing a letter at...
  • \b - the trailing word boundary (replace with $ if you handle individual strings)
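To make the breakdown concrete, here is a minimal sketch (assuming Python 3; the sample word is only illustrative) that shows what each of the three groups captures:

```python
import re

# Same pattern as in the answer above.
pat = r"\b([^\W\d_])([^\W\d_]*)([^\W\d_])\b"

m = re.search(pat, "Testers")
first, middle, last = m.groups()

print(first, middle, last)   # T ester s
print(len(set(middle)))      # 4 distinct letters: e, s, t, r
```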

In the replacement callback, len(''.join(set(m.group(2)))) counts the distinct letters captured in Group 2 (see this SO post). The ''.join is not strictly needed, since len(set(m.group(2))) returns the same number.
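As a rough sketch of how this could be wrapped up for reuse, the snippet below puts the substitution into a function. The name abbreviate is hypothetical, Python 3 is assumed, and len(set(...)) is used in place of the ''.join variant. Because only the matched words are replaced, punctuation and spaces stay where they were:

```python
import re

# Pattern from the answer: first letter, middle letters, last letter.
WORD = re.compile(r"\b([^\W\d_])([^\W\d_]*)([^\W\d_])\b")

def abbreviate(text):
    """Hypothetical helper: shorten each standalone word of two or more letters."""
    def repl(m):
        # len(set(...)) counts the distinct letters in the middle group.
        return "{0}{1}{2}".format(m.group(1), len(set(m.group(2))), m.group(3))
    return WORD.sub(repl, text)

print(abbreviate("Testers"))        # T4s
print(abbreviate("Hello, world!"))  # H2o, w3d!
```

Note that the count is case-sensitive, so an uppercase and a lowercase form of the same letter in the middle group would be counted separately.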

If two-letter words such as Ts should be left unchanged (Ts > Ts rather than Ts > T0s), replace the * quantifier with + in the second group.
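The difference between the two quantifiers can be checked with a short sketch (Python 3 assumed; the helper name shorten is made up for this illustration):

```python
import re

star = r"\b([^\W\d_])([^\W\d_]*)([^\W\d_])\b"  # middle group may be empty
plus = r"\b([^\W\d_])([^\W\d_]+)([^\W\d_])\b"  # middle group needs one letter or more

def shorten(pat, s):
    """Hypothetical helper applying the answer's replacement with a given pattern."""
    return re.sub(
        pat,
        lambda m: "{0}{1}{2}".format(m.group(1), len(set(m.group(2))), m.group(3)),
        s,
    )

print(shorten(star, "Ts"))       # T0s
print(shorten(plus, "Ts"))       # Ts
print(shorten(plus, "Testers"))  # T4s
```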

Wiktor Stribiżew
  • I tested with a string of `T3sters` and was expecting `T34s` as the output since the non-alpha characters are supposed to remain in the string, but got my original input as the output. – DarthOpto Mar 16 '16 at 13:10
  • So, you do not expect the words to be whole words. I see. Just take out the `\b`s. But then, `T3sters` will yield `T3s3s`. See [this demo](https://ideone.com/SINwrJ). Also, the requirement for a word is that it should be *a sequence of alphabetic characters, delimited by any non-alphabetic characters.* So, `T3sters` cannot yield `T34s` as `3` is not an alphabetic character. – Wiktor Stribiżew Mar 16 '16 at 13:12
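The behaviour discussed in these comments can be reproduced with a small sketch (Python 3 assumed; the helper name shorten is hypothetical). With the word boundaries in place, T3sters contains no match, because T is followed by a digit and there is no boundary between 3 and s. Without them, only the letter run sters is rewritten:

```python
import re

with_b = r"\b([^\W\d_])([^\W\d_]*)([^\W\d_])\b"
without_b = r"([^\W\d_])([^\W\d_]*)([^\W\d_])"

def shorten(pat, s):
    """Hypothetical helper applying the answer's replacement with a given pattern."""
    return re.sub(
        pat,
        lambda m: "{0}{1}{2}".format(m.group(1), len(set(m.group(2))), m.group(3)),
        s,
    )

print(shorten(with_b, "T3sters"))     # T3sters (no match, input unchanged)
print(shorten(without_b, "T3sters"))  # T3s3s
```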