I'm trying to iterate over a remote nginx log file (a gzip-compressed .gz file) in Rails, and I get this error partway through the file:
ArgumentError: invalid byte sequence in UTF-8
I also tried forcing the encoding, although the string already appears to be UTF-8:
logfile = logfile.force_encoding("UTF-8")
The method that I'm using:
def remote_update
  uri = "http://" + self.url + "/localhost.access.log.2.gz"
  source = open(uri)
  gz = Zlib::GzipReader.new(source)
  logfile = gz.read
  # prints UTF-8
  print logfile.encoding.name
  logfile = logfile.force_encoding("UTF-8")
  # prints UTF-8
  print logfile.encoding.name
  logfile.each_line do |line|
    print line[/\/someregex\/1\/(.*)\//,1]
  end
end
I'm really trying to understand why this is happening (I looked through other SO threads with no success). What's wrong here?
Update:
Added the exception's trace:
ArgumentError: invalid byte sequence in UTF-8
from /Users/T/workspace/sample_app/app/models/server.rb:25:in `[]'
from /Users/T/workspace/sample_app/app/models/server.rb:25:in `block in remote_update'
from /Users/T/workspace/sample_app/app/models/server.rb:24:in `each_line'
from /Users/T/workspace/sample_app/app/models/server.rb:24:in `remote_update'
from (irb):2
from /Users/T/.rbenv/versions/2.2.3/lib/ruby/gems/2.2.0/gems/railties-4.2.5/lib/rails/commands/console.rb:110:in `start'
from /Users/T/.rbenv/versions/2.2.3/lib/ruby/gems/2.2.0/gems/railties-4.2.5/lib/rails/commands/console.rb:9:in `start'
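The trace points at the regex match (`[]`) rather than at `force_encoding`. This fits a string that is labeled UTF-8 but contains bytes that are not valid UTF-8. Below is a minimal sketch that reproduces this outside Rails. It assumes the log contains at least one stray non-UTF-8 byte; the sample line and the `?` replacement character are made up for illustration. `valid_encoding?` and `scrub` are standard `String` methods (`scrub` requires Ruby 2.1 or later).

```ruby
# Hypothetical log fragment containing a byte (0xFF) that can never occur in UTF-8.
line = "/someregex/1/abc\xFF/ 200".force_encoding("UTF-8")

# force_encoding only relabels the string; it does not validate or convert bytes.
encoding_name = line.encoding.name    # "UTF-8"
valid_before  = line.valid_encoding?  # false

# Matching a regex against the invalid string raises, as in the trace above.
error_class = begin
  line[/\/someregex\/1\/(.*)\//, 1]
  nil
rescue ArgumentError => e
  e.class
end

# String#scrub replaces the invalid bytes, after which the match can run.
clean       = line.scrub("?")
valid_after = clean.valid_encoding?   # true
captured    = clean[/\/someregex\/1\/(.*)\//, 1]
```

If this is indeed the cause, checking `line.valid_encoding?` inside the `each_line` block should identify which log lines carry the offending bytes.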