
Assume we have this text file:

*0000000000003000345800091600000000000002082019              0
*000000000000322322930002160000000DOU JIJEL                  1
*000000000000306156240007000000000TIC TAHER                  1

The header always has the following layout:

Positions 1 to 21 always contain:

*00000000000030003458, a fixed value of 21 characters.

Positions 22 to 34 hold 13 characters representing the sum of the amounts found at positions 22 to 34 of every data line in the text file.

To clarify: in the header above, positions 22 to 34 contain:

0009160000000, which is 91 600 000,00. It is an amount of money: the sum of the amounts in the first and second data lines.

First line: 0002160000000, which is 21 600 000,00. Second line: 0007000000000, which is 70 000 000,00.

21 600 000,00 + 70 000 000,00 = 91 600 000,00

For example, if the first line contains 3162160000000, the amount is 31 621 600 000,00; if it contains 0000000541000, the amount is 5 410,00.

Positions 35 to 41 hold seven characters representing the number of amounts (data lines) in the text file. Here positions 35 to 41 contain 0000002, because there are two lines besides the header. If the file had 714 data lines, positions 35 to 41 of the header would contain 0000714, and so on.
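The field layout described above can be checked with a short script. This is only a sketch in Python (not the batch language used in the question); the names `Header`, `parse_header` and `line_amount` are made up for illustration, and the sample strings are rebuilt from the fields quoted in the question.

```python
from typing import NamedTuple


class Header(NamedTuple):
    prefix: str  # positions 1-21, fixed value
    amount: int  # positions 22-34, last two digits are decimals
    count: int   # positions 35-41, number of data lines
    suffix: str  # positions 42-62, copied unchanged


def parse_header(line: str) -> Header:
    # Python slices are 0-based, so position 22 is index 21.
    return Header(line[0:21], int(line[21:34]), int(line[34:41]), line[41:])


def line_amount(line: str) -> int:
    # Data lines carry their amount at the same positions 22-34.
    return int(line[21:34])


# Sample data from the question, assembled field by field.
header = "*00000000000030003458" + "0009160000000" + "0000002" + "082019" + " " * 14 + "0"
lines = [
    "*00000000000032232293" + "0002160000000" + "DOU JIJEL".ljust(27) + "1",
    "*00000000000030615624" + "0007000000000" + "TIC TAHER".ljust(27) + "1",
]

h = parse_header(header)
total = sum(line_amount(l) for l in lines)
print(h.amount == total, h.count == len(lines))
```

Python integers are not limited to 32 bits, so the 13-digit amount can be read as a single number here.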

So, if I have two text files and want to merge them into one, the result should contain only one header followed by all the data lines of both files. The data lines stay unchanged, but the header must be recalculated as explained above. Positions 42 to 62 of the header are always the same in the files I want to merge, so they can be copied as they are. In other words, the header changes only from position 22 to position 41.

I've managed to remove the headers, but I still have to write the new header manually:

@echo off    
setlocal enabledelayedexpansion    
if exist output.txt del output.txt    
set "var="    
for /r %%i in (*.txt) do (    
  if "%%~nxi" NEQ "output.txt" (    
  set "var="    
  for /f "usebackq skip=1 delims=" %%b in ("%%~i") do (    
    set var=%%b    
    if "!var!" NEQ "" Echo !var!    
))) >> output.txt

This code removes the header of each text file.

I would like the new header to be calculated automatically.

Brian Tompsett - 汤莱恩
    Those are quite big numbers to do math with. Math is done with `set /a`, but it can only handle [INT32](https://stackoverflow.com/questions/24817181/why-int32-maxvalue-2147483648). Either write your own calculation subroutines (using string manipulation) or switch to another language (recommended). – Stephan Aug 14 '19 at 19:55

2 Answers
@ECHO OFF
setlocal enabledelayedexpansion    

if exist output.txt del output.txt 
set "headerproduced="
set "header="
set "var="    
:pass2
for /r %%i in (*.txt) do ( 
 if "%%~nxi" NEQ "output.txt" (
  set "var="    
  if defined headerproduced (
   if !headerproduced!==0 set /a headerproduced=1&echo !header!
   for /f "usebackq skip=1 delims=" %%b in ("%%~i") do (    
    set "var=%%b"
    if "!var!" NEQ "" Echo !var!
   )
  ) else (
   rem header not yet produced
   for /f "usebackq delims=" %%b in ("%%~i") do if not defined var (
    if defined header (
     rem subsequent headers - accumulate
     set "var=%%b"
     set /a var1=1!header:~34,7! + 1!var:~34,7!
     set /a var2=1!header:~28,6! + 1!var:~28,6!
     set /a var3=1!header:~21,7! + 1!var:~21,7!
     if !var2! geq 3000000 set /a var3+=1
     set "header=!header:~0,21!!var3:~-7!!var2:~-6!!var1:~-7!!header:~41!"
    ) else (
     rem very first header
     set "header=%%b"
     set "var=1"
    )
   )
  )
 )
) >> output.txt
if not defined headerproduced set /a headerproduced=0&goto pass2

GOTO :EOF

The syntax SET "var=value" (where value may be empty) ensures that stray trailing spaces are NOT included in the assigned value. In your posted code, there appear to be trailing spaces on several lines, especially set var=%%b, which would append 4 extra spaces to each data line.

This code works by using header to contain the new header for the file.

At first, no header has yet been encountered. When a file is read, var is set to nothing and each line of the file is processed.

When the first file is read, if defined header is FALSE so the header line is recorded in header and var set to some value so that it is defined. Subsequent lines of the file are ignored courtesy of the if not defined var gate.

On reading the remaining files, header is now defined, so we need to accumulate data. The accumulated data is assigned back to header in the appropriate spots. We then need to deal with adding the two fields. This is where we encounter batch's quirky maths.

Batch uses 32-bit signed-integers for its maths operations. That's fine for the shorter number, but the longer needs to be split into two - I chose the leading 7 digits and the trailing 6 digits.

The next minor matter is that batch treats a numeric string starting with 0 as octal, so we simply poke a 1 in front of each part, add them up and use the last 6 or 7 digits of the result. The 6-digit portion can overflow, for example 1999998 + 1000003 = 3000001; a result of 3000000 or greater means we have overflow and need to increment the 7-digit portion.
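The split-and-carry arithmetic just described can be traced outside batch. The following is a sketch in Python that imitates the same steps; `add_amounts_split` is a made-up name, and the leading "1" is kept only to mirror the batch trick (Python does not need it).

```python
INT32_MAX = 2**31 - 1  # largest value set /a can hold


def add_amounts_split(a: str, b: str) -> str:
    """Add two 13-digit amount fields as 7 leading + 6 trailing digits."""
    # Prefix each part with "1", as the batch code does to avoid octal parsing.
    hi = int("1" + a[:7]) + int("1" + b[:7])
    lo = int("1" + a[7:]) + int("1" + b[7:])
    # Both partial sums stay within the signed 32-bit range.
    assert hi <= INT32_MAX and lo <= INT32_MAX
    # The two leading 1s add up to 2000000; 3000000 or more means a carry.
    if lo >= 3000000:
        hi += 1
    # Keep only the real digits, dropping the helper prefix.
    return str(hi)[-7:] + str(lo)[-6:]


print(add_amounts_split("0002160000000", "0007000000000"))
print(add_amounts_split("0000000999998", "0000000000003"))
```

The second call exercises the carry case from the text: the trailing parts give 1999998 + 1000003 = 3000001, so the leading part is incremented.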

Once all of the files have been read, header contains the required value but no header has yet been generated. We return to pass2 having set headerproduced to a significant value.

On the second pass, headerproduced now has a value. If that value is 0, we echo out the headerline and alter headerproduced to prevent multiple accumulated-headerlines being produced.

After that, output each line bar the first of each file as before.
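
The two passes can also be sketched in another language, as the comment under the question suggests. This Python sketch works on lists of lines already held in memory (reading and writing the files is left out); `merge_files` is a made-up name, the second sample file is invented for illustration, and the totals are assumed to fit in their 13-digit and 7-digit fields.

```python
from typing import List


def merge_files(files: List[List[str]]) -> List[str]:
    """Each inner list holds one file's lines, header first."""
    first_header = files[0][0]
    total = 0
    count = 0
    # Pass 1: accumulate the amount and line-count fields of every header.
    for lines in files:
        total += int(lines[0][21:34])
        count += int(lines[0][34:41])
    # Rebuild positions 22-41; everything else comes from the first header.
    header = first_header[:21] + f"{total:013d}" + f"{count:07d}" + first_header[41:]
    # Pass 2: emit the new header once, then every non-empty data line.
    merged = [header]
    for lines in files:
        merged.extend(line for line in lines[1:] if line)
    return merged


suffix = "082019" + " " * 14 + "0"
file_a = [
    "*00000000000030003458" + "0009160000000" + "0000002" + suffix,
    "*00000000000032232293" + "0002160000000" + "DOU JIJEL".ljust(27) + "1",
    "*00000000000030615624" + "0007000000000" + "TIC TAHER".ljust(27) + "1",
]
# Hypothetical second file, not taken from the question.
file_b = [
    "*00000000000030003458" + "0000000541000" + "0000001" + suffix,
    "*00000000000032232293" + "0000000541000" + "EXAMPLE".ljust(27) + "1",
]
merged = merge_files([file_a, file_b])
print(merged[0])
```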

Magoo

The 32-bit integers that the set /A command works with can only add numbers of up to 9 decimal digits correctly, but this limitation is easy to overcome: split each large number into two (or more) parts and add the parts separately. When the result of a part exceeds its number of digits, the extra digit (the carry) must be added to the units position of the next part.

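@echo off
setlocal EnableDelayedExpansion

rem Delete the result of a previous run so it is not merged again
if exist output.txt del output.txt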

rem Initialize variables to manage first line
rem 01234567890123456789012345678901234567890123456789012345678901
rem /---unchangeable----\/amnt7\/amC6\/lins7\/----unchangeable---\
rem *0000000000003000345800091600000000000002082019              0
set /A amount7=10000000, amountC6=10000000, lines7=10000000

rem Process all files. Group all output to same file
(for /R %%i in (*.txt) do (

   rem Input from each file
   < "%%i" (

   rem Read the first line and accumulate it
   set /P header=
   set /A amountC6+=10!header:~28,6!
   set /A amount7+=1!header:~21,7!+!amountC6:~1,1!, amountC6=10!amountC6:~2!, lines7+=1!header:~34,7!

   rem Copy the rest of lines
   findstr "^"

   )

rem Send all "rest of lines" to temporary file
)) > output.tmp

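rem Write the merged header followed by the temporary file's lines, then delete the temporary file
rem (the last 7 digits are taken so the fields stay 7 wide when nine or more files are merged)
(
echo %header:~0,21%%amount7:~-7%%amountC6:~2%%lines7:~-7%%header:~41%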
type output.tmp
) > output.txt
del output.tmp
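
The carry rule stated at the start of this answer works for any number of parts. As a rough sketch in Python (the function name `add_chunked` and the default part width of 6 are chosen here for illustration), two fixed-width digit strings can be added part by part from right to left:

```python
def add_chunked(a: str, b: str, width: int = 6) -> str:
    """Add two equal-length digit strings in parts of at most `width` digits."""
    assert len(a) == len(b)
    parts = []
    carry = 0
    end = len(a)
    while end > 0:
        start = max(0, end - width)
        digits = end - start
        # Each partial sum is small enough for 32-bit arithmetic when width <= 9.
        partial = int(a[start:end]) + int(b[start:end]) + carry
        # Anything beyond `digits` digits is the carry for the next part.
        carry, partial = divmod(partial, 10 ** digits)
        parts.append(str(partial).zfill(digits))
        end = start
    # A carry out of the leftmost part is dropped: the field width is fixed.
    return "".join(reversed(parts))


print(add_chunked("0002160000000", "0007000000000"))
```

With 13-digit fields and the default width, this uses three parts (1, 6 and 6 digits) instead of the two parts in the batch code above; the result is the same.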
Aacini