How to get some text information from a non-text file using batch commands in Windows?

Question

Hi guys. I'm trying to get the file version from inside some non-text files. In each of them (near the beginning) there are a few text lines containing information about the file. For example:

[some nontext data (very few)]
version: 455467
build date: 23.11.2010
.....
[rest of the nontext data]

If you want I'll try to make such a file but I can't show you the original files (my company won't allow it). Sorry...

I tried this code:

@echo off
for /f "tokens=1,2" %%A in (file.dat) do if %%A==version: (set version=%%B
goto found)
echo not found
goto end
:found
echo found: %version%
:end
pause

But it works only if "file.dat" is a text file; otherwise I get "not found". If I replace file.dat with 'type file.dat', it never returns (processor usage 100%). If I replace file.dat with 'find /i "version:" file.dat', it works, but it's very, very slow (minutes). Since I have to process many files and have little time, I can't use it. It is a lot faster to open each file manually in a viewer and copy the version number, but the point is that I want to do it from cmd...

Oh, and I can't install other programs on the computer where I'm working....

The OS is Windows XP x86.

Please help me. Thank you.

Best regards, Cosmin

Later edit: I have built a test file so everybody can see and test: http://www.mediafire.com/download.php?r0x5702lkv14jro It's very small (real files are dozens, some even hundreds, of MB).

Later later edit: the test file is useful for testing whether the code finds the number but, being very small, it doesn't give you an idea of how much time is needed for a real data file. But you can do this: measure the time in which the test file is scanned and multiply by "100 MB / 2088 bytes" = 50 219. For example, this works with "find". With "type" it is even slower (I think the growth is exponential, not linear).

Which operating system version? This looks like a job for PowerShell, not plain old MS-DOS batch files. Windows XP and Windows Server 2003: PowerShell V1 and V2 are downloaded and installed (effectively as an OS patch). Windows Server 2008: PowerShell V1 is a 'feature' and can be added; PowerShell V2 is downloaded and installed (effectively as an OS patch). Windows 7/Windows Server 2008 R2: PowerShell V2 is installed by default. – Paulo Scardine, Feb 18 '11 at 03:38
Windows XP. But, like I said, I can't install other programs on that computer, not even PowerShell. But thank you for the idea. – Cosmin, Feb 18 '11 at 03:46

score 2 · Answer 1 · edited May 23 '17 at 12:21

I used a simplified version of jeb's FC read-binary technique (see "converting a binary file to HEX representation using batch file") to read the first 1024 bytes of the DAT file. I preserve only the printable ASCII characters and <LF>; the rest I throw away. I used a compare file containing 1024 <backspace> characters so that I don't have to worry about the gaps in the FC output: FC /B reports only the bytes that differ, and <backspace> is one of the bytes I discard anyway, so any byte missing from the output is one I would have thrown away.

I use a map I developed for my hexDump.bat routine ( http://www.dostips.com/forum/viewtopic.php?p=7038 ) to convert the hex representation back into the ASCII characters.

Then all that is left is some straightforward string manipulation to parse the version. I look for <LF>version:, strip leading spaces, and then take all printable characters up until the next <LF> as the version value.

This solution assumes the version lies within the first 1024 bytes. It could be extended to support the first 8k just by increasing the size of the compare file (for example, 13 doublings instead of 10 in the FOR /L loop give 8192 bytes).

The solution seems plenty fast, and the size of the DAT file should have no impact on performance.

@echo off
setlocal enableDelayedExpansion

:: Build a binary file containing 1024 <backSpace> characters
set compareFile="BS1024.DAT"
if not exist %compareFile% (
  for /f "tokens=1 delims=# " %%a in ('"prompt #$H#$E# & echo on & for %%b in (1) do rem"') do (
    <nul set/p"=%%a" >%compareFile%
  )
  for /l %%n in (1 1 10) do type %compareFile% >>%compareFile%
)

:: Create a variable containing <lineFeed> character (0x0A)
set lf=^


:: Above 2 blank lines are critical - do not remove.

:: Grab the first 1024 bytes, preserving only printable ASCII characters and <lineFeed>
set map= ^^^!^"#$%%^&'^(^)*+,-./0123456789:;^<=^>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^^^^_`abcdefghijklmnopqrstuvwxyz{^|}~
set datFile="test.dat"
set "dat="
for /f "eol=F usebackq tokens=2 skip=1 delims=:[] " %%A in (`fc /b %datFile% %compareFile%`) do (
  if "%%A"=="0A" (set "dat=!dat!!lf!") else (
    set /a "n=0x%%A-32"
    if !n! geq 0 if !n! leq 94 for %%n in (!n!) do set "dat=!dat!!map:~%%n,1!"
  )
)

:: Find the version line and get the value
set "version="
for %%C in ("!lf!") do set "dat2=!dat:*%%~Cversion:=!"
if "!dat2!" neq "!dat!" (
  for /f "tokens=* eol= delims= " %%A in ("!dat2!") do (
    set "version=%%A"
    goto :done
  )
)
:done
set version
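
For readers who find the batch hard to follow, the steps described above (take the first 1024 bytes, keep only printable ASCII and <LF>, look for <LF>version:, strip leading spaces, read up to the next <LF>) can be sketched in Python. This is only an illustration of the logic, not a replacement for the script, since nothing can be installed on the asker's machine. The function name extract_version and the sample bytes are made up for this example, and edge cases such as an empty version line may behave differently from the batch code.

```python
from typing import Optional


def extract_version(data: bytes, limit: int = 1024) -> Optional[str]:
    """Return the text after '<LF>version:' found in the first `limit` bytes."""
    head = data[:limit]
    # Keep printable ASCII (0x20-0x7E) and line feed (0x0A); drop everything else,
    # including carriage returns and NUL bytes.
    text = "".join(chr(b) for b in head if b == 0x0A or 0x20 <= b <= 0x7E)
    marker = "\nversion:"
    pos = text.find(marker)
    if pos < 0:
        return None  # no version line within the inspected bytes
    # Strip leading spaces, then take everything up to the next line feed.
    rest = text[pos + len(marker):].lstrip(" ")
    return rest.split("\n", 1)[0]


# Hypothetical sample shaped like the layout described in the question.
sample = b"\x00\x01\x02\r\nversion: 455467\r\nbuild date: 23.11.2010\r\n\xff\xfe"
print(extract_version(sample))
```

With a real file, only the first 1024 bytes would need to be read (for example, open the file in binary mode and call read(1024)), which is why the size of the DAT file does not matter for this approach.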

jeb · Answer 2 · 2011-02-18T22:05:02.237

0

If there is binary data in front of "version", your IF can't work, because the content of %%A is then something like "{binary}version:".

Give this a try; it tests whether the string "version" is anywhere in the line. If you have "!" in your binary data it could fail, and then the solution would have to be refined.

setlocal EnableDelayedExpansion
for /f "tokens=* delims=" %%A in ('type file.dat') do (
    set "line=%%A"
    set "version=!line:*version=!"
    if "!line!" NEQ "!version!" (
        goto found
    )
)
echo not found
goto end
:found
echo found: %version%
:end

EDIT:

for /f "tokens=* delims=" %%A in (file.dat) do (...

In a normal FOR /F loop that reads the file directly, the main problem is the byte 0x00: as soon as it is found in a line, reading of the file stops immediately.

But type or more can work around this.

edited Feb 18 '11 at 22:05

answered Feb 18 '11 at 12:45

jeb


There isn't binary data in front of "version". "version" is right after "0D0A". I tested your code and it says "not found". But thank you for trying to help me. Oh, and in both our codes the problem is that it never goes after "do"... It's like it doesn't even "iterate" the file... – Cosmin Feb 18 '11 at 15:29
@Cosmin: I have improved the code; now it should work. The key seems to be "type", or you can also try "more". – jeb Feb 18 '11 at 21:50
Quote from my question: "If I replace file.dat with 'type file.dat' it does not return (processor usage 100%)". With the test file it works because the file is very small, but with real files it is extremely slow. – Cosmin Feb 19 '11 at 01:53
