You don't need JScript in order to extract such a value from the .html file; you can do it directly with a Batch file.
If the structure of the desired line is always the same:
<span class="pro">/<span class="dip">buːm</span>/</span>
... you can do it with a single line:
for /F "tokens=3 delims=</>" %%a in ('findstr "\"dip\"" phoneme.html') do set "dip=%%a"
echo %dip%
If the line could change, first get the line with the "dip" value via a findstr command, and then extract the dip value:
for /F "delims=" %%a in ('findstr "\"dip\"" phoneme.html') do set "html=%%a"
set "dip=%html:*"dip">=%"
set "dip=%dip:<=" & rem "%"
echo %dip%
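How the two substitutions above work may not be obvious, so here is a commented sketch of the same technique. It assumes a hard-coded sample line instead of reading phoneme.html, and the ASCII value boom is only a hypothetical stand-in for the real phoneme, so the example does not depend on the console code page.

```batch
@echo off
setlocal
rem Hypothetical sample line; in the real script it comes from findstr.
set "html=<span class="pro">/<span class="dip">boom</span>/</span>"

rem Step 1: %var:*text=% deletes everything up to and including the
rem first occurrence of "dip"> so the value now starts at the phoneme.
set "dip=%html:*"dip">=%"

rem Step 2: every < is replaced by  " & rem "  so that, after expansion,
rem the line reads:  set "dip=boom" & rem "/span>/" & rem "/span>"
rem SET keeps only the text before the first <; the rest is a comment.
set "dip=%dip:<=" & rem "%"

echo %dip%
```

The second step relies on the replacement text being inserted before the line is parsed, which is why it needs no loop or extra tokens.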
New code added
This new method is based on the details given in the OP's comments...
1- In your question you specified that you are looking for this string: "dip". However, in your comment it seems that the real string you want is this: "ipa dipa lpr-2 lpl-1". Please note that the second string is very different from the first one because it contains spaces, and most Batch commands are sensitive to spaces, so the code must be modified accordingly. BTW, it is very bad "netiquette" to give us certain data, test the code we wrote with different data, and then say: "Your code not works"! Did you test our code with the data you provided?
2- In my answer I specified: "If the structure of the desired line is always the same:"
<span class="pro">/<span class="dip">buːm</span>/</span>
However, it seems that the real line is very different:
</span><span class="pron dpron">/<span class="ipa dipa lpr-2 lpl-1">buːm</span>/</span></span> <span class="us dpron-i "><span class="region dreg">us</span><span class="daud"> converted= /span span class="pron dpron" / span class="ipa dipa lpr-2 lpl-1" buːm /span / /span /span span class="us dpron-i " span class="region dreg" us /span span class="daud"
I added: "If the line could change..." use the second code.
Why did you test the first code if the real line is entirely different from the line you posted? You should use the second code instead... Helping becomes over-complicated when simple instructions are not followed...
3- In your comment you indicated that the html file is created with this line:
curl dictionary.cambridge.org/de/worterbuch/englisch/boom
When I tested such a line I got this:
<html>
<head><title>301 Moved Permanently</title></head>
<body>
<center><h1>301 Moved Permanently</h1></center>
<hr><center>nginx</center>
</body>
</html>
... but your complaint was: I just get: 'a href='https:'
I really don't know what else to say...
I prepared a test file with these contents:
Any other line...
</span><span class="pron dpron">/<span class="ipa dipa lpr-2 lpl-1">buːm</span>/</span></span> <span class="us dpron-i "><span class="region dreg">us</span><span class="daud"> converted= /span span class="pron dpron" / span class="ipa dipa lpr-2 lpl-1" buːm /span / /span /span span class="us dpron-i " span class="region dreg" us /span span class="daud"
Any other line...
This is the new code:
@echo off
setlocal EnableDelayedExpansion
REM curl dictionary.cambridge.org/de/worterbuch/englisch/boom > phoneme.html
for /F "delims=" %%a in ('findstr /C:"\"ipa dipa lpr-2 lpl-1\"" phoneme.html') do set "html=%%a"
set "dip=%html:*"ipa dipa lpr-2 lpl-1">=%"
set "dip=%dip:<=" & rem "%"
echo %dip%
... and this is the output:
buːm
It seems that the output contains a Unicode character that, of course, cannot be properly managed by a Batch file... :(
PS - The Unicode character could be properly displayed if the chcp 65001 command is used...
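A minimal sketch of that idea follows: the same code with the console switched to the UTF-8 code page first. It assumes that phoneme.html is saved as UTF-8 and that the console font contains the phonetic glyphs. Batch files running under code page 65001 may misbehave on older Windows versions, so treat this as something to try rather than a guaranteed fix.

```batch
@echo off
setlocal
rem Assumption: phoneme.html is UTF-8 and the console font has the glyphs.
rem Switch the console to the UTF-8 code page; the status message is discarded.
chcp 65001 > nul
for /F "delims=" %%a in ('findstr /C:"\"ipa dipa lpr-2 lpl-1\"" phoneme.html') do set "html=%%a"
set "dip=%html:*"ipa dipa lpr-2 lpl-1">=%"
set "dip=%dip:<=" & rem "%"
echo %dip%
```

The code page stays changed for the rest of the console session, so you may want to switch back with chcp and your original code page number afterwards.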