I am pretty new to Python and have a large data frame with a column that has variations of the terms No. and Tp.

e.g.

Title

  • ATPIA No.56 tp.12
  • BTPIB no. 35 Tp.56
  • CTPIC No. 5 Tp. 42

I want to standardise the format (each string is a different length) so that No/Tp are always capitalised and there is always a space between the full stop and the number. So the above would become:

Title

  • ATPIA No. 56 Tp. 12
  • BTPIB No. 35 Tp. 56
  • CTPIC No. 5 Tp. 42

I have been using the replace function

replacement_mapping = {
    "Tp.": "Tp. ",
    "TP. ": "Tp. ",
    "No.": "No. ",
    "no.": "No. "
}
df_new = df.replace(replacement_mapping,regex=True)

This works for the most part; however, the function is also matching letters inside other words in the string

e.g. BTPIB => Btp. B

Is there a way to prevent the replace function from picking up the '.' as an 'I' or a better method to format them?

Thank you!

  • Escape the `.` so it is treated as a literal character instead of the regex wildcard, i.e. replace the dot with `\.` – Edo Akse Jun 15 '22 at 15:03
  • Thanks Edo - How would that appear in the replacement_mapping dict? If I put that character in front of '.' then it is included in the string "no.": "No\. "(sorry if I have misunderstood) – Newbiee6977 Jun 15 '22 at 15:16
  • my bad, you need a double backslash, please see [this answer](https://stackoverflow.com/a/36297048/9267296) – Edo Akse Jun 15 '22 at 18:17
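
The escaping suggested in the comments could look roughly like the sketch below. It uses only the standard `re` module so that it runs on its own. The names `PATTERNS` and `standardise` are made up for illustration, and the word boundary (`\b`) and case-insensitive matching are additions that the comments did not mention, so treat this as one possible approach rather than a definitive fix.

```python
import re

# Hypothetical mapping, written as raw strings so the backslashes reach the regex engine.
# \.  matches a literal full stop instead of "any character"
# \b  requires a word boundary, so letters inside words such as "BTPIB" are skipped
# \s* absorbs any existing spaces so that exactly one space is written back
PATTERNS = {
    r"\bno\.\s*": "No. ",
    r"\btp\.\s*": "Tp. ",
}


def standardise(title: str) -> str:
    """Return the title with No./Tp. capitalised and followed by a single space."""
    for pattern, replacement in PATTERNS.items():
        # IGNORECASE lets one pattern cover "No.", "no.", "NO." and so on
        title = re.sub(pattern, replacement, title, flags=re.IGNORECASE)
    return title


titles = ["ATPIA No.56 tp.12", "BTPIB no. 35 Tp.56", "CTPIC No. 5 Tp. 42"]
cleaned = [standardise(t) for t in titles]
print(cleaned)
```

Assuming the column is called `Title` as in the question, the function could then be applied with something like `df["Title"] = df["Title"].map(standardise)`. Note that a "No." or "Tp." at the very end of a string would gain a trailing space with these patterns.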