I have an auto-generated file that is encoded in UTF-16LE and I want to write a Ruby script that searches for a version number via a regular expression and replaces it with a new version number. Here is what I initially was trying to use:

File.open(filepath,"rb:UTF-16") do |file|
  file.each do |line|
    line.gsub!(/FILEVERSION \d\.\d/,FILEVERSION)
  end
end

This did not work: I got the error "incompatible encoding regexp match (US-ASCII regexp with UTF-16 string)". I tried to force-encode my FILEVERSION string as UTF-16 but got the same error. One of my coworkers said that you can't effectively use regexes with UTF-16 encoding. Is there a workaround for this problem?

Stew C

1 Answer

It should work if you're careful to encode everything in UTF-16LE.

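# Build the pattern from a UTF-16LE string so its encoding matches the file's.
re = Regexp.new('FILEVERSION \d\.\d'.encode('UTF-16LE'))
File.open(filepath, "rb:UTF-16LE") do |file|
  file.each do |line|
    # Changes only the in-memory line; nothing is written back to the file.
    line.gsub!(re, FILEVERSION.encode('UTF-16LE'))
  end
end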
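
If you would rather not build a UTF-16LE regexp at all, another option is to transcode the text to UTF-8, run an ordinary regexp over it, and transcode back when saving. The following is only a minimal sketch of that alternative, not the approach shown above: `replace_file_version` and its parameters are hypothetical names, it assumes the whole file fits in memory, and it assumes the file really is valid UTF-16LE (a leading BOM, if present, passes through unchanged).

```ruby
# Hypothetical helper: replace every "FILEVERSION d.d" in a UTF-16LE file.
# Returns true if the file content changed, false otherwise.
def replace_file_version(path, replacement)
  # Read the raw bytes, label them as UTF-16LE, then transcode to UTF-8.
  text = File.binread(path).force_encoding("UTF-16LE").encode("UTF-8")

  # A plain regexp literal matches now, because UTF-8 is ASCII-compatible.
  # The block form inserts the replacement literally (no backslash handling).
  updated = text.gsub(/FILEVERSION \d\.\d/) { replacement }

  # Transcode back to UTF-16LE and write the bytes out in binary mode,
  # so line endings are left exactly as they were.
  File.binwrite(path, updated.encode("UTF-16LE"))
  updated != text
end
```

With the names from the question, a call might look like `replace_file_version(filepath, FILEVERSION)`, assuming `FILEVERSION` holds the full replacement text.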
pdw
  • I don't get errors anymore but it isn't changing anything in the file. Not sure what's wrong, the file isn't read-only – Stew C Jul 21 '14 at 18:52
  • You still need to write modified lines back somewhere, of course. That won't happen automatically. I thought this question was just about the encoding problem. – pdw Jul 21 '14 at 18:55
  • Ahh ok. I forgot that I opened the file for reading. Thanks! – Stew C Jul 21 '14 at 18:58