I have some lines in a text file:

Joëlle;Dupont;123456
Alex;Léger;134234

And I want to replace them with:

Joëlle;Dupont;123456;joelle.dupont@mail.com
Alex;Léger;134234;alex.leger@mail.com

I want to replace all accented characters (é, ë…) with their unaccented equivalents (e, e…), but only in the email address, not in the rest of the line.

I know I can use \L…\E to convert uppercase letters to lowercase, but that is not the only thing I need to do.

I searched for:

(.*?);(.*?);(\d*?)\n

And replaced it with:

$1;$2;$3;\L$1.$2@mail.com\E\n

But this does not replace the accented characters:

Joëlle;Dupont;123456;joëlle.dupont@mail.com
Alex;Léger;134234;alex.léger@mail.com

If you have any idea how I could do this in Notepad++, even if it takes more than one replacement, I would appreciate the help.

1 Answer

I don't know your full data set, but you could use the pattern below to replace the variants of e with a plain e:

[\xE8-\xEB](?!.*;)

And replace with e.

[I got the range above from this webpage, taking the column names]

regex101 demo

This regex matches any è, é, ê or ë and replaces them with an e, if there is no ; on the same line after it.
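To check the lookahead behaviour outside Notepad++, here is a minimal Python sketch. It assumes Python's `re` module treats `\xE8`-`\xEB` and `(?!.*;)` the same way as Notepad++ does for this pattern, and that the text uses precomposed (NFC) characters; the variable names are only illustrative.

```python
import re
import unicodedata

# Pattern from the answer: è, é, ê or ë with no ";" later on the same line.
ACCENTED_E = re.compile(r"[\xE8-\xEB](?!.*;)")

# Normalise to NFC so that "ë" is the single code point U+00EB
# (assumption: the file may otherwise contain "e" + combining diaeresis).
line = unicodedata.normalize("NFC", "Joëlle;Dupont;123456;joëlle.dupont@mail.com")

# Only the "ë" in the last field is replaced; the one in the first
# field is still followed by a ";" and is therefore left alone.
result = ACCENTED_E.sub("e", line)
print(result)
```

The first name keeps its accent because the negative lookahead fails wherever a `;` still follows on the line.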


For variations of o:

[\xF2-\xF6](?!.*;)

For c (there's only one, so you can also put in ç directly):

\xE7(?!.*;)

For a:

[\xE0-\xE5](?!.*;)
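The four replacements above can also be run in sequence. The following Python sketch is only an illustration of that idea, not a Notepad++ feature: the helper name `strip_accents_in_last_field` and the `REPLACEMENTS` table are hypothetical, and it assumes the input uses precomposed (NFC) characters and `;` as the delimiter.

```python
import re
import unicodedata

# Character ranges taken from the answer, each paired with its plain letter.
REPLACEMENTS = [
    (r"[\xE8-\xEB]", "e"),  # è é ê ë
    (r"[\xF2-\xF6]", "o"),  # ò ó ô õ ö
    (r"\xE7", "c"),         # ç
    (r"[\xE0-\xE5]", "a"),  # à á â ã ä å
]

# Negative lookahead from the answer: no ";" later on the same line.
NOT_BEFORE_SEMICOLON = r"(?!.*;)"


def strip_accents_in_last_field(line: str) -> str:
    """Apply each replacement only after the last ';' of the line."""
    line = unicodedata.normalize("NFC", line)
    for pattern, plain in REPLACEMENTS:
        line = re.sub(pattern + NOT_BEFORE_SEMICOLON, plain, line)
    return line


print(strip_accents_in_last_field("Alex;Léger;134234;alex.léger@mail.com"))
```

In Notepad++ itself this corresponds to running four separate Replace All passes, one per pattern.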
Jerry
  • They are French letters, and there are also other letters like ç, à, ô… to replace with c, a, o… respectively. But most importantly, what I want to do is, for example, replace é with e, but only in the email address, without modifying the rest of the line. – Alexandre Lagane Jul 10 '14 at 10:51
  • @Minizarbi Added for the other letters and replace each with the 'main' letter. The regex only modifies the last 'column' in your data if `;` is the delimiter. If you have more than one `column` of email addresses, then you might use `(?=[^;@]*@)` instead of `(?!.*;)`. – Jerry Jul 10 '14 at 11:05
  • I didn't know the (?!…) or (?=…) and I think it's the solution to my problem, so it works. And I didn't know the two websites you linked. – Alexandre Lagane Jul 10 '14 at 12:15
  • @Minizarbi If you want me to explain how these work, let me know :) They are called 'lookarounds' in general, and more specifically, `(?= ... )` is a positive lookahead and `(?! ... )` is a negative lookahead. You can also search those terms on the internet, that's where I learned those in the first place :) – Jerry Jul 10 '14 at 12:35
  • I didn't understand in the first place so I looked here : http://perldoc.perl.org/perlre.html#Look-Around-Assertions and now I understand, so thank you :) – Alexandre Lagane Jul 10 '14 at 12:44
  • +1, Great research, why no upvotes...? Could be further simplified for legibility, for instance `[è-ë](?!.*;)` which would work in N++. – zx81 Jul 10 '14 at 23:36
  • Ah, dug this one up I answered a few days ago: [French letters](http://stackoverflow.com/questions/24623395/how-to-filter-the-special-characters-not-including-the-french-letters-using-re/24623419#24623419) Leaving it here for reference if someone feels like starting a French collection. – zx81 Jul 10 '14 at 23:43
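For the case raised in the comments, where more than one column holds an email address, the positive lookahead `(?=[^;@]*@)` can be tried with a short Python sketch. The sample line with two address columns is made up for illustration, and the sketch assumes precomposed (NFC) characters.

```python
import re
import unicodedata

# Variant from the comments: the accented "e" must be followed, within the
# same field (no ";" and no "@" in between), by an "@".
ACCENTED_E_BEFORE_AT = re.compile(r"[\xE8-\xEB](?=[^;@]*@)")

# Hypothetical line with two email columns.
sample = unicodedata.normalize(
    "NFC", "Joëlle;joëlle.dupont@mail.com;Léger;alex.léger@mail.com"
)

# Only the accented letters in the part of each address before "@" change.
cleaned = ACCENTED_E_BEFORE_AT.sub("e", sample)
print(cleaned)
```

Note that this variant only touches the part of each address before the `@`; an accented letter in the domain part would be left as it is.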