This is because the type command preserves the UTF-8 BOM, so when you combine multiple files that each start with a BOM, the final file will contain multiple BOMs at various places in the middle of the file.
If you are certain that all the SQL files you want to combine start with the BOM, you can use the following script to remove the BOM from each of them before combining them.
This is done by piping the output of type into a command group. The receiving side of the pipe consumes the first 3 bytes (the BOM) with three pause commands, each of which consumes one byte. The rest of the stream is sent to the findstr command, which appends it to the final script.
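To see the byte-skipping trick in isolation, the following minimal sketch pipes a short string through the same construct. The test string ABCDEF is an arbitrary stand-in for a file's contents. If pause behaves as described above, only the part after the first three bytes (DEF) should reach findstr and be printed.

```bat
@echo off
rem Sketch only: ABCDEF is a made-up sample, not part of the original script.
rem Each pause reads one byte from the piped input and its prompt is
rem discarded via >nul, so the three of them swallow "ABC".
rem findstr "^" matches every line and copies whatever is left to the console.
echo ABCDEF|((pause & pause & pause)>nul & findstr "^")
```

The right-hand side has the same shape as the pipe used in the script, with the three BOM bytes replaced by three ordinary letters.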
Since the SQL files are encoded in UTF-8 and may contain any character in the Unicode range, certain code pages will interfere with the operation and may corrupt the final SQL script.
To account for this, the batch file restarts itself under code page 437, which is safe for passing through any byte sequence, unless the active code page is already 437 or 65001.
@echo off
setlocal DisableDelayedExpansion
setlocal EnableDelayedExpansion
for /F "tokens=*" %%a in ('chcp') do for %%b in (%%a) do set "CP=%%~nb"
if !CP! NEQ 437 if !CP! NEQ 65001 chcp 437 >nul && (
    REM For file operations, the script must be restarted in a new instance.
    "%COMSPEC%" /c "%~f0"
    REM Restoring the previous code page
    chcp !CP! >nul
    exit /b
)
endlocal
set "RemoveUTF8BOM=(pause & pause & pause)>nul"
set "echoNL=echo("
set "FinalScript=C:\FinalScript\AllScripts.sql"
:: If you want the final script to start with UTF-8 BOM (This is optional)
:: Create an empty file in Notepad and save it as UTF8-BOM.txt with UTF-8 encoding
:: (in recent Notepad versions, choose "UTF-8 with BOM").
:: Or Create a file in your HexEditor with this byte sequence: EF BB BF
:: and save it as UTF8-BOM.txt
:: The file must be exactly 3 bytes with the above sequence.
(
    type "UTF8-BOM.txt" 2>nul
    REM This assumes that all SQL files start with a UTF-8 BOM.
    REM If not, they will lose their first 3 otherwise legitimate bytes,
    REM resulting in a corrupted final script.
    for %%A in (*.sql) do (type "%%~A" & %echoNL%)|(%RemoveUTF8BOM% & findstr "^")
)>"%FinalScript%"
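A hedged usage sketch follows, assuming the script above has been saved as C:\Tools\CombineSql.bat and the SQL files live in C:\SqlScripts. Both paths and the script name are hypothetical. It creates the output folder, optionally writes the 3-byte BOM file through PowerShell instead of a hex editor, and runs the script from the folder that holds the SQL files.

```bat
@echo off
rem All paths and the name CombineSql.bat are hypothetical placeholders.
rem The output folder must exist, otherwise the final redirection fails.
if not exist "C:\FinalScript\" md "C:\FinalScript"

rem The script reads *.sql and the optional UTF8-BOM.txt from the
rem current directory, so switch to the folder holding the SQL files.
pushd "C:\SqlScripts" || exit /b 1

rem Optional: write the 3-byte BOM file (EF BB BF) using PowerShell.
powershell -NoProfile -Command "[IO.File]::WriteAllBytes((Join-Path (Get-Location) 'UTF8-BOM.txt'), [byte[]](0xEF,0xBB,0xBF))"

call "C:\Tools\CombineSql.bat"
popd
```

Keeping the output folder separate from the source folder should prevent the *.sql wildcard from picking up the partially written AllScripts.sql.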