0

I have a bat file where I encode some CSV files to UTF-8, and then import the files to a SQLite database. This is the code I have:

echo Codificando ficheros...
powershell -Command "&{ param($Path); (Get-Content $Path) | Out-File $Path -Encoding UTF8 }" CSVs\\prefijo.csv
powershell -Command "&{ param($Path); (Get-Content $Path) | Out-File $Path -Encoding UTF8 }" CSVs\\materiales.csv
powershell -Command "&{ param($Path); (Get-Content $Path) | Out-File $Path -Encoding UTF8 }" CSVs\\zonas.csv
powershell -Command "&{ param($Path); (Get-Content $Path) | Out-File $Path -Encoding UTF8 }" CSVs\\responsables.csv
powershell -Command "&{ param($Path); (Get-Content $Path) | Out-File $Path -Encoding UTF8 }" CSVs\\lideres.csv
powershell -Command "&{ param($Path); (Get-Content $Path) | Out-File $Path -Encoding UTF8 }" CSVs\\pass.csv
powershell -Command "&{ param($Path); (Get-Content $Path) | Out-File $Path -Encoding UTF8 }" CSVs\\capacidadLotes.csv
powershell -Command "&{ param($Path); (Get-Content $Path) | Out-File $Path -Encoding UTF8 }" CSVs\\boletosIndirectos.csv
@echo OFF
echo Generando fichero de base de datos _datosPrecargados.db...
.\sqlite3.exe BaseDatos\_datosPrecargados.db < Sentencias_sql\Sentencias_DatosPrecargados.sql
@echo OFF
echo Generando fichero de base de datos _inventario.db...
.\sqlite3.exe BaseDatos\_inventario.db < Sentencias_sql\Sentencias_Inventario.sql
pause

Before I set the PowerShell lines to encode the files it worked fine. But before encoding I have a problem. The data base seems to be ok but when I do a select sentence looking for the first column of the first record, it is not found.

E.G. Table "zones":

zone     name
----     ----
Z001     Zone 1
Z002     Zone 2
Z002     Zone 3

If I execute the sentence "SELECT * FROM zones", it lists all the records, ok.

If I execute the sentence "SELECT * FROM zones WHERE zone="Z002"", the record is listed, ok.

But if I execute the sentence "SELECT * FROM zones WHERE zone="Z001"", the record is not found.

If I don't set the PowerShell line in the bat file, it doesn't happens and it works fine, but I need to encode the files because I have special characters inside the CSV files like "ñ" or "ó".

I don´t know what can I am doing wrong.

Thanks in advance!!

javiazo
  • 1,892
  • 4
  • 26
  • 41
  • I'm thinking some non-printable chars are being stored in there, it might be storing the BOM data in there. I wonder if more than 4 is returned from `Select LEN(zone) as Length From zones where name='Zone 1'` – Andy Arismendi Jul 17 '13 at 09:55
  • thats right, i get one character more than i can see, I hace tried encoding with ASCII and the problem dissapear but i am not able to use special characters ;( – javiazo Jul 17 '13 at 10:11
  • i am going to try with other encode types – javiazo Jul 17 '13 at 10:12
  • I wonder what would happen if you use notepad++ and save one as UTF8 (no BOM) and one with UTF8 (with BOM) and import both and see if the one without BOM is OK or not. – Andy Arismendi Jul 17 '13 at 10:23
  • it works, is the first thing i though, but it have to be a bat file and the csv are generated with excel – javiazo Jul 17 '13 at 10:28

2 Answers2

0

If the characters in the files were broken before, transcoding won't fix them, so your PowerShell calls are unlikely to help with the underlying problem. I rather suspect that you need to change your codepage instead:

chcp 1251 >nul

The command changes the codepage of the current console (i.e. the current CMD instance) to the given value (see the documentation). >nul is just to suppress the output, which is inconsequential to the rest of the script.

Ansgar Wiechers
  • 193,178
  • 25
  • 254
  • 328
0

Before you do the import make sure both the console and the files are encoded in what you want

Copy and past the script in your Powershell windows

 foreach($FileNameInUnicodeOrWhatever in get-childitem)
 {
    $FileName = $FileNameInUnicodeOrWhatever.Name

    $TempFile = "$($FileNameInUnicodeOrWhatever.Name).ASCII"

    get-content $FileNameInUnicodeOrWhatever | out-file $FileNameInUnicodeOrWhatever -Encoding ASCII 

    remove-item $FileNameInUnicodeOrWhatever

    rename-item $TempFile $FileNameInUnicodeOrWhatever

    write-output $FileNameInUnicodeOrWhatever "converted to ASCII ->" $TempFile
}

I would also set your encoding on the console, if you follow the link below

How to bulk convert unicode file names to Ascii file names windows CMD or powershell

Community
  • 1
  • 1
Transformer
  • 6,963
  • 2
  • 26
  • 52