My scenario is the following:
I have a huge file with more than 700,000 lines, which I will call trc.txt here.
Each line of this file has the following structure:
20958 191014 07:43:57.08 CCComRPC DCMSGCFW_E PID:00000864.00001F40 Data:23 < PREP_FIXED::Process 0
I have a second file with 300 lines, which I call classID.txt. Each line has the following structure:
ID_Key;ClassName
720;ComEFM
721;CCComRPC
725;ComSSL
730;WOSA-CRD
731;WOSA-PIN
The aim is to count how often each class occurs in trc.txt. The possible class names are stored in classID.txt, and the class name is the fourth space-separated field of each line in trc.txt (CCComRPC in the example line above).
My approach so far is to store the possible class names in a list variable. For this I use the following for loop (based on this post):
set trcClasses=
for /f "tokens=2 delims=;" %%i in (classID.txt) do set trcClasses=!trcClasses!,%%i
This seems to work perfectly.
To do the counting, I iterate through the large file trc.txt line by line and check each time whether one of the elements of trcClasses occurs in the fourth field. If it does, I increment a simple counter for that class by one, using the following code:
for /f "tokens=4 delims= " %%t in (trc.txt) do (
    set "dataRow=%%~t"
    set "break="
    for %%l in (%trcClasses%) do if not defined break (
        if not "!dataRow:%%l=!"=="!dataRow!" (
            set /a kumSum%%l+=1
            set "break=1"
        )
    )
)
I then print the counts with this:
for %%l in (%trcClasses%) do (
    if (!kumSum%%l! NEQ 0) echo %%l !kumSum%%l!
)
First problem: the console has trouble with some of the entries in classID.txt. I receive something like this:
Error: Division by zero. Missing operator
In my opinion this is caused by some of the names in classID.txt, such as WOSA-PTR or TCP/IP.
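To narrow this down, here is a minimal sketch of what I suspect is going on. It is a standalone illustration, not part of my actual script. My assumption is that set /a treats the "-" and "/" in these names as arithmetic operators instead of as part of the variable name, so I would expect the first two set /a lines to fail while the third works:

```batch
@echo off
setlocal EnableDelayedExpansion

rem Assumption: set /a reads "-" and "/" as arithmetic operators,
rem so these are evaluated as expressions, not as single variable names.
set /a kumSumWOSA-PTR+=1
set /a kumSumTCP/IP+=1

rem A class name without operator characters forms a valid variable name.
set /a kumSumCCComRPC+=1
echo kumSumCCComRPC=!kumSumCCComRPC!

endlocal
```

If that assumption is right, the counter names would need to avoid such characters, but I have not found a clean way to do that yet.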
The bigger problem: Running the code takes approx. 12 minutes!
Any suggestions would be appreciated.