With this zip file, this Node script successfully outputs the files:

const child_process = require('child_process')
const util = require('util')
const exec = util.promisify(child_process.exec)
exec(`unzip -Z1 metamorpR.zip`).then(zip_contents => {
    if (zip_contents.stderr) {
        throw new Error(`unzip error: ${zip_contents.stderr}`)
    }
    console.log(zip_contents.stdout)
})

Output:

metamorpR.z5
Варианты Прохождения.txt
Интерактивная Литература.pdf

But when I run the script from within Docker, the Cyrillic file names come out as question marks.

Using this Dockerfile:

FROM node:16-alpine
RUN apk add --no-cache unzip
COPY . .
ENTRYPOINT ["node", "unzip.js"]

Build and run (substitute the image ID from your own build):

docker build .
docker run --rm 1dc072

Output:

metamorpR.z5
??????? ????????.txt
???????????? ??????????.pdf

I think this means the locales aren't set correctly within the Docker image? Any ideas how to fix this?
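
One way to check that suspicion is to print which locale a process actually sees inside the container. This is only a diagnostic sketch: the helper names `effectiveCtypeLocale` and `isUtf8Locale` are made up for illustration, and it assumes the usual POSIX precedence (LC_ALL, then LC_CTYPE, then LANG, falling back to "C"). A UTF-8 result does not by itself guarantee that unzip will honour it.

```javascript
// Hypothetical helper (not part of the original script): work out which
// locale governs character handling, using the POSIX precedence
// LC_ALL > LC_CTYPE > LANG, with "C" as the fallback when none is set.
function effectiveCtypeLocale(env) {
    // Empty strings count as unset, so `||` is the right operator here.
    return env.LC_ALL || env.LC_CTYPE || env.LANG || 'C'
}

// True when the locale name carries a UTF-8 codeset suffix,
// e.g. "ru_RU.UTF-8" or "en_US.utf8".
function isUtf8Locale(locale) {
    return /\.utf-?8(@.*)?$/i.test(locale)
}

// Report what the current process (and any child it spawns) inherits.
const current = effectiveCtypeLocale(process.env)
console.log(`effective locale: ${current} (UTF-8: ${isUtf8Locale(current)})`)
```

Running this with the same ENTRYPOINT as unzip.js would show whether the container exports any locale at all before unzip is even involved.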

curiousdannii
  • This is [related to missing `locales`](https://github.com/sgerrand/alpine-pkg-glibc/issues/5). You can [apply this patch](https://github.com/sgerrand/alpine-pkg-glibc) during the build and then generate the locales, but even then `unzip` on Alpine doesn't appear to respect them. This [Stack Overflow post](https://stackoverflow.com/a/37835009/1423507) is related. – masseyb Dec 19 '21 at 14:00

3 Answers

TL;DR

unzip on Alpine doesn't appear to honour locales, and neither does unzip on Debian. unzip on Ubuntu does honour them (however, there is no official Node image based on Ubuntu).

On Ubuntu:

FROM ubuntu:18.04
# Suppress interactive prompts from apt during the build
ENV DEBIAN_FRONTEND=noninteractive
RUN apt-get update && \
    apt-get install -y --no-install-recommends \
        locales \
        unzip && \
    apt-get clean
# Enable ru_RU.UTF-8 in /etc/locale.gen, generate it, and make it the default
RUN sed -i -e 's/# ru_RU.UTF-8 UTF-8/ru_RU.UTF-8 UTF-8/' /etc/locale.gen && \
    locale-gen && \
    update-locale LANG=ru_RU.UTF-8 LC_ALL=ru_RU.UTF-8 && \
    ldconfig
ENV LANG=ru_RU.UTF-8
COPY metamorpR.zip /metamorpR.zip
# List the archive so the file names can be inspected
CMD ["unzip", "-l", "metamorpR.zip"]

... the file names in the unzip output come through intact (screenshot: locales on ubuntu)

... however, the same build FROM node:16-bullseye doesn't produce the same result (screenshot: locales on debian node image)

On Alpine, you could apply this patch (the sgerrand glibc packages) during the build and then generate the locales; however, unzip still doesn't appear to use them:

FROM node:16-alpine
RUN apk add --no-cache unzip wget
RUN wget -q -O /etc/apk/keys/sgerrand.rsa.pub https://alpine-pkgs.sgerrand.com/sgerrand.rsa.pub && \
    wget https://github.com/sgerrand/alpine-pkg-glibc/releases/download/2.34-r0/glibc-2.34-r0.apk && \
    wget https://github.com/sgerrand/alpine-pkg-glibc/releases/download/2.34-r0/glibc-bin-2.34-r0.apk && \
    wget https://github.com/sgerrand/alpine-pkg-glibc/releases/download/2.34-r0/glibc-i18n-2.34-r0.apk && \
    apk add glibc-2.34-r0.apk glibc-bin-2.34-r0.apk glibc-i18n-2.34-r0.apk && \
    rm /glibc-2.34-r0.apk /glibc-bin-2.34-r0.apk /glibc-i18n-2.34-r0.apk && \
    /usr/glibc-compat/bin/localedef -i ru_RU -f UTF-8 ru_RU.UTF-8
ENV LANG=ru_RU.UTF-8
COPY metamorpR.zip /metamorpR.zip
CMD ["unzip", "-l", "metamorpR.zip"]

(screenshot: locales on alpine)

masseyb
  • Ugh, this is quite the mess. There's a musl-locales Alpine package, which might do roughly the same as your final Dockerfile, but it also has no effect. I tried p7zip too, but no luck there either. I might try an npm zip package, but I had wanted to use unzip directly because when I last looked none of the npm zip packages were very high quality... – curiousdannii Dec 20 '21 at 00:05
  • The npm unzip packages continue to look bad to me. (There was one that looked okay, until I got to the known issues: UTF-8 file names!) But installing Node into an Ubuntu image is working. Thank you so much! – curiousdannii Dec 20 '21 at 00:49

Thanks to @masseyb's answer, I was able to get it working with this Dockerfile, which basically just installs Node manually into an Ubuntu image. The main downside is that the image is twice the size, but the Dockerfile is comparatively simple, so that's an acceptable trade-off to me.

FROM ubuntu:20.04
# apt-get is used throughout; plain apt warns that its CLI is not stable for scripts
RUN apt-get update && \
    apt-get install -y curl locales unzip && \
    curl -fsSL https://deb.nodesource.com/setup_16.x | bash - && \
    apt-get install -y nodejs && \
    rm -rf /var/lib/apt/lists/* && \
    localedef -i en_US -c -f UTF-8 -A /usr/share/locale/locale.alias en_US.UTF-8
ENV LANG=en_US.UTF-8
COPY . .
ENTRYPOINT ["node", "unzip.js"]
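
With an image like this, the child process inherits LANG from the container. If you would rather not depend on that, a rough sketch of passing the locale explicitly from Node could look like the following. The helper names `withLocale`, `parseNames` and `listZip` are invented for illustration, and this only helps where the locale has been generated in the image and the unzip build honours it (Ubuntu, going by the answer above).

```javascript
const { execFile } = require('child_process')
const { promisify } = require('util')
const execFileAsync = promisify(execFile)

// Hypothetical helper: copy an environment and force the given locale.
// LC_ALL is set as well because it overrides LANG when both are present.
function withLocale(env, locale) {
    return { ...env, LANG: locale, LC_ALL: locale }
}

// Hypothetical helper: split `unzip -Z1` output into one name per entry.
function parseNames(stdout) {
    return stdout.split('\n').filter(line => line.length > 0)
}

// List archive entries with an explicit locale instead of relying on
// whatever the container happens to export.
async function listZip(zipPath, locale = 'en_US.UTF-8') {
    const { stdout } = await execFileAsync('unzip', ['-Z1', zipPath], {
        env: withLocale(process.env, locale),
    })
    return parseNames(stdout)
}
```

Usage would be something like `listZip('metamorpR.zip').then(names => console.log(names))`. Using execFile with an argument array also avoids passing the file name through a shell.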
curiousdannii

Apparently, some versions of unzip available from the Ubuntu repositories can automatically decode file names if you specify the -a switch.

  • Isn't the `-a` flag for file contents, not names? It made no difference at least when I tried it. – curiousdannii Dec 19 '21 at 23:53
  • It's hard to verify without knowing which version of unzip you're using. Try `unzip -O cp866 Archive.zip` (replace cp866 with the charset for your locale), or pipe the output through iconv, e.g. `iconv -f cp1252 -t cp850`. I think it can be done without fiddling with locales and doubling the image size. – Joel Tenta Dec 20 '21 at 03:47
  • @JoelTenta the `-O CHARSET` option ("specify a character encoding for DOS, Windows and OS/2 archives") is not available for `unzip` on `alpine`. You can check with `docker run --rm --entrypoint sh alpine:3 -c "apk add unzip && unzip"`. – masseyb Dec 21 '21 at 11:00
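
For what it's worth, the iconv idea from the comments can also be approximated inside Node, without an extra binary. The sketch below is only that: `decodeLegacy` is a made-up helper, it assumes the file names are stored in a legacy code page such as cp866 and reach stdout as raw bytes (neither is confirmed for the archive in the question), and it assumes a Node build with full ICU so that the `ibm866` decoder is available.

```javascript
// Hypothetical helper: decode raw bytes from a legacy code page into a
// JavaScript string. 'ibm866' is the WHATWG name for cp866; Node only
// ships it when built with full ICU (assumed here).
function decodeLegacy(bytes, encoding = 'ibm866') {
    return new TextDecoder(encoding).decode(bytes)
}

// "Привет" encoded as cp866 bytes, standing in for captured stdout.
const sample = Buffer.from([0x8f, 0xe0, 0xa8, 0xa2, 0xa5, 0xe2])
console.log(decodeLegacy(sample))
```

To get bytes rather than an already-decoded string from a child process, pass `{ encoding: 'buffer' }` in the options to exec or execFile. If the tool has already replaced the characters with question marks, as in the Alpine output in the question, there is nothing left to decode and this won't help.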