0

I'm trying to modify some .htm files, replacing Name1 Lastname1, Name2 Lastname2 with Name1 Lastname1.

I found something here and adapted the code to my needs:

REM @echo off
setlocal disableDelayedExpansion

:Variables
set "_strFind=Titel: Name1 Lastname1, Name2 Lastname2<br>"
set "_strInsert=Titel: Name1 Lastname1<br>"
set /p PC=PC?:
set /p Name=Name?:
set InputFile=\\%PC%\C$\Users\%username%\AppData\Roaming\Microsoft\Signatures\GR.htm
set OutputFile=\\%PC%\C$\Users\%username%\AppData\Roaming\Microsoft\Signatures\GR1.htm


:Replace
">"%OutputFile%" (
  for /f "usebackq delims=" %%A in ("%InputFile%") do (
    if "%%A" equ "%_strFind%" (echo %_strInsert%) else (echo %%A)
  )
)"

This didn't do anything. What did I do wrong, and how can I find the mistake?

EDIT01:

<p class=MsoNormal><span style='font-size:8.0pt;font-family:"Arial",sans-serif'> John Doe GmbH<br> 
Blabla: John Doe, Johnny B.Good<br>
Bla bla bla<br>
Partner:<br>
Bla bla bla bla <o:p></o:p></span></p> 
</td>

I'd like to remove the `, Johnny B.Good` part.

Compo
heyjonny
  • Where are your source data example and expected results? We cannot tell you what is wrong if we don't have any of this. – Gerhard May 20 '19 at 08:23
  • You might want to use `%name%` instead of `%username%`. – Stephan May 20 '19 at 08:24
  • What I can tell you is that if your username is part of a longer string, it will never match the word only, the way you run it currently. – Gerhard May 20 '19 at 08:24
  • Remove the last character on the last line and the first character on the line four above it _(hint: they're both double quotes, **`"`**)_. – Compo May 20 '19 at 09:07
  • @GerhardBarnard The expected result is that ", Name2 Lastname2" is removed from the output. The `"` characters only kept the HTML interpreter from reading the `>` as a tag; they are not part of the code. – heyjonny May 20 '19 at 09:41
  • Yes, but my point is that if you do a straight `if` comparison and the string does not match completely, it will not work. If I see the input example, I will be able to help. – Gerhard May 20 '19 at 09:56
  • Yeah, but I think that's a usual problem with string comparison.

    John Doe GmbH
    Blabla: John Doe, Johnny B.Good
    Bla bla bla
    Partner:
    Bla bla bla bla

    And I'd like to remove the ", Johnny B.Good" part.
    – heyjonny May 20 '19 at 10:28
  • @JonasHuber please paste that into the question and format it correctly. It is impossible to guess, if/where there are line breaks when you do it in comments. – Stephan May 20 '19 at 10:33
  • @Stephan Sorry, I'm new to Stack Overflow. I edited the question. – heyjonny May 20 '19 at 10:44
  • What you appear to want is to find a line which begins with `Titel:`, is followed by a string ending with a **`,`** and ends with `<br>`, and you want to replace everything from and including **`,`** with `<br>`. – Compo May 20 '19 at 10:56

2 Answers

0

As .htm(l) files are most likely encoded in UTF-8, I wouldn't even try it with (pure) batch.

Check the encoding and try this (untested) batch:

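@Echo off
rem Input and output files; adjust the paths as needed.
set "infile=GR.htm"
set "outfile=GR1.htm"
rem Both values end up inside a regular expression, so the dot is escaped to match literally.
set "Replace=, Johnny B\.Good"
set "Prefix=John Doe"
rem Remove the Replace text wherever it directly follows the Prefix text (lookbehind).
powershell -NoP -c "(Get-Content '%infile%' -raw -Enc UTF8) -replace '(?<=%Prefix%)%Replace%'|Set-Content '%outfile%' -Enc UTF8"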
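
If the prefix and the text to remove should be typed in when the script runs, a minimal sketch could look like the following. It assumes the same file names as above; the prompt texts and the two `if not defined` checks are my own additions. The values are passed to PowerShell through environment variables (`$env:...`) and run through `[regex]::Escape`, so that characters such as the dot in `B.Good` match literally. The `-Enc` value still has to match the actual file encoding.

```batch
@echo off
setlocal
rem Sketch only: file names as in the answer above, prompt texts are assumptions.
set "infile=GR.htm"
set "outfile=GR1.htm"
set /p "Prefix=Prefix (text to keep): "
set /p "Replace=Text to remove after the prefix: "
rem An empty answer leaves the variable undefined, so stop early in that case.
if not defined Prefix (echo No prefix given.& exit /b 1)
if not defined Replace (echo No text to remove given.& exit /b 1)
rem PowerShell reads the values from the environment and escapes regex metacharacters.
powershell -NoP -c "$p=[regex]::Escape($env:Prefix); $r=[regex]::Escape($env:Replace); (Get-Content $env:infile -Raw -Enc UTF8) -replace ('(?<='+$p+')'+$r) | Set-Content $env:outfile -Enc UTF8"
```

Reading the input via `$env:` instead of `%Prefix%` means that quotes typed at the prompt do not end up inside the PowerShell command line.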
  • That works fine so far, but there is something wrong with the encoding. Letters like ä, ö, ü are displayed as "?". – heyjonny May 20 '19 at 12:03
  • That's why I wrote "check the encoding"; it's important to read *and* write with the proper one. You might also try default or OEM. –  May 20 '19 at 12:08
  • I thought that with UTF-8 it's possible to display these letters? But OK, I left out the -enc completely and it works now. Thank you for your help :) – heyjonny May 20 '19 at 12:16
  • Is it possible to make Replace and Prefix variables? – heyjonny May 21 '19 at 14:20
  • They already *are* environment variables. It's unclear to me what you are asking for? –  May 21 '19 at 14:32
  • Yes, sorry. I'd like to fill them in while running the script, like this: `set /p Prefix=Prefix:` – heyjonny May 22 '19 at 10:34
0

This is an answer which uses PowerShell (callable from cmd or a batch file) to perform the task.

This version does not need to know any of Name1, Lastname1, Name2 or Lastname2 in advance; only `Titel:` and `<br>`.

@PowerShell -NoP "(GC 'input.htm' -Enc UTF8) -Replace '(Titel:[^,]*),.+?(?=<br>)','$1'|SC 'output.htm' -Enc UTF8"

You'd simply change the name of the input .htm file. If you have several htm files to modify, you could obviously run the PowerShell command in a loop over them, which I will leave to you to code yourself.
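
For several files, one possible loop is sketched below. It assumes the .htm files sit in the current directory and writes the results into a subfolder called `fixed`, which is a hypothetical name chosen here so that no file is read and overwritten at the same time. The pattern requires a comma after the captured part, so lines without one stay unchanged.

```batch
@echo off
rem Sketch only: the folder name "fixed" is an assumption, pick any target you like.
if not exist "fixed\" md "fixed"
for %%F in (*.htm) do (
    rem %%F is the current input file, %%~nxF its name and extension without a path.
    PowerShell -NoP "(GC '%%F' -Enc UTF8) -Replace '(Titel:[^,]*),.+?(?=<br>)','$1'|SC 'fixed\%%~nxF' -Enc UTF8"
)
```

File names containing a single quote would break the quoting inside the PowerShell command, so this sketch only suits plain file names. As with the single-file version, the `-Enc` value has to match the real encoding of the files.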

Compo