0

I tried to locate only number followed by space and a character after it.

Exemple : text = "3 R"

and want it to be like this :

Exemple : text = "3. R"

i've tried this code :

text= re.sub(r'([0-9])(?!.*\d)', r'\1. ', text)

Am getting closer but don't know what should i add to it.

Update

Text :

Évitez les conversations malsaines en utilisant les 3 R, à savoir 
‘reformuler, recentrer et réorienter’. Créez un cadre confortable en 
reformulant les phrases susceptibles de générer des émotions négatives. Vous 
pouvez également reformuler des reproches tels que : « Cela m’ennuie que tu 
passes autant de temps sur des projets de moindre importance qui ne mènent 
nulle part » en disant plutôt « J’aimerais que tu consacres les efforts que 
tu fournis dans ton travail à davantage de nouveaux projets plutôt qu’à 
quelques projets peu importants... Je suis sûr que tu disposes maintenant de 
suffisamment d’expérience pour gérer des projets inédits et ambitieux. »

my regex code :

    text= re.sub(r'\s*(?!\.[’"])([.,?:])(?!(?<=\d.)\d)\s*', r'\1 ', text)
    text= re.sub(r'\s*([-])\s*', r'\1', text)
    text= re.sub(u"\u2013", " ", text)
    text= re.sub(r'(\d)\s+(?=\d)', r'\1', text)
    text= re.sub(r'(\d)\/+(?=\d)', r'\1 ', text)
    text= re.sub(r'([0-9])\b(?!.*\d)',r'\1. ', text)

Output:

Évitez les conversations malsaines en utilisant les 3 R, à savoir 
‘reformuler, recentrer et réorienter’. Créez un cadre confortable en 
reformulant les phrases susceptibles de générer des émotions négatives. Vous 
pouvez également reformuler des 
reproches tels que:  Cela m’ennuie que tu passes autant de temps sur des 
projets de moindre importance qui ne mènent nulle part  en disant plutôt  
J’aimerais que tu consacres les efforts que tu fournis dans ton travail à 
davantage de nouveaux projets plutôt qu’à quelques projets peu importants, Je 
suis sûr que tu disposes maintenant de suffisamment d’expérience pour gérer 
des projets inédits et ambitieux.

i've tried the codes suggested by you guys but not working idk why, text is a long string.

The problem could be due to using too much regex??

I'm using python3.9

snippet

enter image description here

enter image description here

chikabala
  • 653
  • 6
  • 24

2 Answers2

1

Based on the constraints you defined (input/output) and the discussion we had you can use this snippet:

re.sub(r"(\d+)(?:\s+)(\w)", r"\1. \2", text)
Masked Man
  • 2,176
  • 2
  • 22
  • 41
  • thank you sooo much, can u please explain why the previous solution didn't work, bc in regex website it worked but not for me – chikabala Apr 08 '21 at 16:53
  • @anubhava still the only solution that worked for me, i'm already using a regex code to ignore spaces inside a number. – chikabala Apr 08 '21 at 16:55
  • @Adgogo: I have shown how the very first solution I gave is working fine. You can copy/paste my suggested in your code anywhere and test. Fact that you have many more `sub` calls is muddling the water – anubhava Apr 08 '21 at 16:58
  • @anubhava u can check in the snippet that i've tried your code but it didn't work for me, instead it's working on a separated file idk why – chikabala Apr 08 '21 at 17:01
  • No I don't see `r'(\d)(?= \D)'` anywhere in your question – anubhava Apr 08 '21 at 17:03
  • @anubhava please check line 29 on first snippet – chikabala Apr 08 '21 at 17:07
  • That is not code snippet but a screenshot. Anyway I have already explained that `Fact that you have many more sub calls is muddling the water`. Comment out 5 `sub` calls before that and then test it. – anubhava Apr 08 '21 at 17:11
  • 1
    Just FYI: `r"(\d+)(?:\s+)(\w)"` = `r"(\d+)\s+(\w)"`. It is also equal to `re.sub(r"\d+(?=\s+\w)", r"\g<0>. ", text)` – Wiktor Stribiżew Apr 08 '21 at 19:39
0

This works for me:

re.sub('(\d)\s([a-zA-Z])', r'\1. \2', text)

It replaces the 3 R with 3. R. Also works with bigger numbers, like 31789 R, and lowercase 3 r.

LuisAFK
  • 846
  • 4
  • 22