
I have created a batch script that copies the SQL files in a folder into one big SQL script. The problem is that when I run the combined script, it fails with this error:

Incorrect syntax near 'ï»¿'

I copied the combined SQL script into Notepad++ and set the encoding to ANSI. I see the symbol ï»¿ (BOM) on the lines where the error occurs.

Is there any way I can remove this automatically within my batch script? I don't want to keep removing it manually every time I run this task.

Below is the batch script I have currently

@echo off

set "path2work=C:\StoredProcedures"
cd /d "%path2work%"

echo. > C:\FinalScript\AllScripts.sql

for %%a in (*.sql) do (

    echo. >>"C:\FinalScript\AllScripts.sql"
    echo GO >>"C:\FinalScript\AllScripts.sql"
    type "%%a">>"C:\FinalScript\AllScripts.sql"
    echo. >>"C:\FinalScript\AllScripts.sql"
)
user10127407
    "ANSI" does not have a BOM. "ï»¿" is what you get when you interpret a UTF-8 file with a BOM as if it were ANSI. Even then, it should ONLY appear at the very beginning of the file. But you say that you saw "ï»¿" on the lines (plural) where it is happening, so not just at the very beginning of the very first line. In that case, it is **not** a BOM, but a non-breaking zero-width space. – MSalters Sep 11 '18 at 10:25

4 Answers


This happens because the type command copies each file byte for byte, including its UTF-8 BOM. When you combine several files that each start with a BOM, the final file ends up with BOMs in the middle, one at the start of each appended file.

If you are certain that every SQL file you want to combine starts with a BOM, you can use the following script to strip the BOM from each file before appending it.

It works by piping the output of type. The receiving side of the pipe consumes the first 3 bytes (the BOM) with three pause commands, each of which consumes one byte. The rest of the stream is sent to findstr, which writes it to the final script.

Because the SQL files are UTF-8 encoded and may contain any Unicode character, certain code pages can interfere with this operation and corrupt the final SQL script.

To account for this, the batch file restarts itself under code page 437, which is safe for passing arbitrary byte sequences through, unless the active code page is already 437 or 65001.

@echo off
setlocal DisableDelayedExpansion


setlocal EnableDelayedExpansion
for /F "tokens=*" %%a in ('chcp') do for %%b in (%%a) do set "CP=%%~nb"
if  !CP! NEQ 437 if !CP! NEQ 65001 chcp 437 >nul && (

    REM For file operations, the script must be restarted in a new instance.
    "%COMSPEC%" /c "%~f0"

    REM Restoring previous code page
    chcp !CP! >nul
    exit /b
)
endlocal


set "RemoveUTF8BOM=(pause & pause & pause)>nul"
set "echoNL=echo("
set "FinalScript=C:\FinalScript\AllScripts.sql"

:: If you want the final script to start with UTF-8 BOM (This is optional)
:: Create an empty file in NotePad and save it as UTF8-BOM.txt with UTF-8 encoding.
:: Or Create a file in your HexEditor with this byte sequence: EF BB BF
:: and save it as UTF8-BOM.txt
:: The file must be exactly 3 bytes with the above sequence.
(
    type "UTF8-BOM.txt" 2>nul

    REM This assumes that all SQL files start with a UTF-8 BOM.
    REM If one does not, it will lose its first 3 otherwise legitimate bytes,
    REM resulting in a corrupted final script.
    for %%A in (*.sql) do (type "%%~A" & %echoNL%)|(%RemoveUTF8BOM% & findstr "^")

)>"%FinalScript%"
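
Since the script above only works if every file really starts with a BOM, it may be worth checking that first. Below is a minimal PowerShell sketch of such a check. The function name Test-Utf8Bom is hypothetical, not a built-in cmdlet, and the folder in the usage comment is the one from the question.

```powershell
# Hypothetical helper: returns $true when the file starts with the
# UTF-8 BOM bytes EF BB BF, and $false otherwise.
function Test-Utf8Bom {
    param([Parameter(Mandatory)][string]$Path)

    # Pass a full path: .NET resolves relative paths against the process
    # working directory, which can differ from the PowerShell location.
    $stream = [System.IO.File]::OpenRead($Path)
    try {
        $head = New-Object byte[] 3
        $read = $stream.Read($head, 0, 3)
    }
    finally {
        $stream.Dispose()
    }

    return ($read -eq 3 -and
            $head[0] -eq 0xEF -and
            $head[1] -eq 0xBB -and
            $head[2] -eq 0xBF)
}

# Usage: list the .sql files that do NOT start with a BOM
# Get-ChildItem 'C:\StoredProcedures' -Filter '*.sql' |
#     Where-Object { -not (Test-Utf8Bom -Path $_.FullName) } |
#     Select-Object -ExpandProperty FullName
```

If the usage line prints any file names, those files would lose their first three bytes in the batch script above.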
sst

TypeWithoutBOM.bat

@echo off
set "RemoveUTF8BOM=(pause & pause & pause)>nul"
type "%~1"|(%RemoveUTF8BOM% & findstr "^")

This batch file works like the type command, but drops the first 3 bytes of the file it displays. It assumes the file starts with a UTF-8 BOM.
Usage: TypeWithoutBOM UTF8-file.txt > newfile.txt
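
If some files might not start with a BOM, dropping three bytes unconditionally would damage them. As a rough sketch of a more defensive variant, the PowerShell function below copies a file and skips the first three bytes only when they are EF BB BF. The name Copy-WithoutUtf8Bom and its parameters are made up for this illustration.

```powershell
# Hypothetical helper: copies Source to Destination, skipping a leading
# UTF-8 BOM (EF BB BF) only if one is actually present.
function Copy-WithoutUtf8Bom {
    param(
        [Parameter(Mandatory)][string]$Source,
        [Parameter(Mandatory)][string]$Destination
    )

    # Reads the whole file into memory, which is fine for SQL scripts
    # of ordinary size.
    $bytes = [System.IO.File]::ReadAllBytes($Source)

    $skip = 0
    if ($bytes.Length -ge 3 -and
        $bytes[0] -eq 0xEF -and
        $bytes[1] -eq 0xBB -and
        $bytes[2] -eq 0xBF) {
        $skip = 3
    }

    $out = [System.IO.File]::Create($Destination)
    try {
        # Write everything after the BOM (or the whole file if no BOM)
        $out.Write($bytes, $skip, $bytes.Length - $skip)
    }
    finally {
        $out.Dispose()
    }
}

# Usage (full paths recommended):
# Copy-WithoutUtf8Bom -Source 'C:\StoredProcedures\proc.sql' -Destination 'C:\Temp\proc.sql'
```

The file name proc.sql in the usage comment is only a placeholder.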

Michael Hutter

As MSalters already mentioned in his comment, ï»¿ is, according to Wikipedia, the ANSI representation of a UTF-8 BOM.

PowerShell is much better suited than batch to tasks that involve encodings:

## Q:\Test\2018\09\11\SO_522772705.ps1
Set-Location 'C:\StoredProcedures'
Get-ChildItem '*.sql' | ForEach-Object {
    "`nGO"
    Get-Content $_.FullName -Encoding UTF8
    ""
} | Set-Content 'C:\FinalScript\AllScripts.sql' -Encoding UTF8

To stay on topic with the batch-file tag, here is a batch file that invokes PowerShell for the essential part:

:: Q:\Test\2018\09\11\SO_522772705..cmd
@echo off
set "path2work=C:\StoredProcedures"
cd /d "%path2work%"

powershell -NoProfile -Command "Get-ChildItem '*.sql'|ForEach-Object{\"`nGO\";Get-Content $_.FullName -Enc UTF8;\"\"}|Set-Content 'C:\FinalScript\AllScripts.sql' -Enc UTF8"
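
One thing to be aware of: in Windows PowerShell 5.1, Set-Content -Encoding UTF8 writes a single BOM at the start of the output file. If the combined script should contain no BOM at all, one option is to write it through .NET with a BOM-less UTF8Encoding. The sketch below assumes the same GO-separated layout as the question. Join-SqlScripts and its parameter names are hypothetical.

```powershell
# Hypothetical helper: concatenates all .sql files in SourceDir into
# OutputFile, separated by GO lines, and writes UTF-8 without a BOM.
function Join-SqlScripts {
    param(
        [Parameter(Mandatory)][string]$SourceDir,
        [Parameter(Mandatory)][string]$OutputFile
    )

    $lines = New-Object 'System.Collections.Generic.List[string]'

    Get-ChildItem -Path $SourceDir -Filter '*.sql' | Sort-Object Name | ForEach-Object {
        $lines.Add('')
        $lines.Add('GO')
        # File.ReadAllLines detects a leading UTF-8 BOM and drops it
        foreach ($line in [System.IO.File]::ReadAllLines($_.FullName)) {
            $lines.Add($line)
        }
        $lines.Add('')
    }

    # $false = do not emit a BOM when writing
    $utf8NoBom = New-Object System.Text.UTF8Encoding($false)
    [System.IO.File]::WriteAllLines($OutputFile, $lines.ToArray(), $utf8NoBom)
}

# Usage:
# Join-SqlScripts -SourceDir 'C:\StoredProcedures' -OutputFile 'C:\FinalScript\AllScripts.sql'
```

Keep the output file outside the source folder, as in the question, so that it is not picked up as an input on the next run.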

You just need to change the encoding to UTF-8 without BOM and save the file

[Image: Notepad++ BOM]

Note that the menu items are slightly different in older Notepad++ versions.

phuclv
    Is there no automated way of doing this? This batch file is run in the middle of a build server process. – user10127407 Sep 11 '18 at 10:14
    Are you generating the batch files automatically? If so, configure the generator to stop emitting BOMs instead. In that case you need to provide the code for the generator. – phuclv Sep 11 '18 at 10:22