I am trying to alter some .html files while building a Docker image. I have created an add_style.py script which alters the HTML to add a line of CSS. First, let me show you the Dockerfile:

FROM golang:1.15.2-buster as ClaatSetup
RUN CGO_ENABLED=0 go get github.com/googlecodelabs/tools/claat

FROM alpine:3.10 as ClaatExporter
WORKDIR /app
COPY --from=ClaatSetup /go/bin/claat /claat
COPY docs/ input/
RUN /claat export -o output input/**/*.md


FROM alpine:3.10 as AppCompiler
RUN apk add --update git nodejs npm make python gcc g++ && \
    npm install -g gulp-cli

WORKDIR /app

# against caching when there is a new commit
ADD "<..>" /tmp/devalidateCache
RUN git clone <..>
WORKDIR /app/codelabs-site

# Install dependencies
RUN npm install && npm install gulp

# Copy exported codelabs from previous stage
COPY --from=ClaatExporter /app/output codelabs/

# Build everything
RUN gulp dist --codelabs-dir=codelabs

# Replace the symlink with the actual content (see below for description)
WORKDIR /app/codelabs-site/dist
RUN rm codelabs
COPY --from=ClaatExporter /app/output codelabs/



FROM caddy:alpine as Deployment
WORKDIR /app

COPY --from=AppCompiler /app/codelabs-site/dist/ .

#injecting of css to widen codelabs
RUN apk add --no-cache python3 py3-pip
RUN pip3 install bs4
COPY add_style.py codelabs/add_style.py
RUN python3 codelabs/add_style.py

EXPOSE 80
EXPOSE 443
CMD ["caddy", "file-server"]

Then the add_style.py itself:

import os
from bs4 import BeautifulSoup, Doctype

for root, dirnames, filenames in os.walk(os.getcwd()):
    for filename in filenames:
        if filename.endswith('index.html'):
            fname = os.path.join(root, filename)
            print('Filename: {}'.format(fname))
            with open(fname, 'r+', encoding = 'utf-8') as handle:
                soup = BeautifulSoup(handle.read(), 'html.parser')
                head = soup.head
                head.append(soup.new_tag('style', type='text/css'))
                head.style.append('.instructions {max-width: 1000px;} #fabs {max-width: 1200px;}')
                   
                #write altered html
                handle.write(str(soup.prettify()))
                handle.flush()

The script runs without errors. Printing soup.prettify() also shows the altered HTML with the added CSS. However, if I read the file again in the same script, it shows the old version. I tried executing this script at different stages of the Docker build, but it does not seem to write the file, and no errors are shown.

Sjaak Rusma
  • If you are not using volumes, but copying files during the build of the image, the changes you are doing to those files are only kept while the container is running and are lost once it stops. Could it be that? – Way Too Simple Nov 09 '21 at 13:53
  • I am not sure if this is true. Copying seems to work fine, altering is just not working. – Sjaak Rusma Nov 09 '21 at 14:29
  • Maybe you can try handle.seek(0) before handle.write(...)? Since you read the file, the pointer might be at the end of it when you write. Check https://stackoverflow.com/questions/6648493/how-to-open-a-file-for-both-reading-and-writing – Way Too Simple Nov 09 '21 at 17:18
  • @Armando oh wow! i was just about to come up with a different solution but this fixed it! Thanks!! – Sjaak Rusma Nov 09 '21 at 17:32

1 Answer

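As per @Armando's comment, the solution is to call handle.seek(0) before writing, since reading leaves the file position at the end of the file, and handle.truncate() afterwards to drop any leftover old content: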

   #write altered html
   handle.seek(0)
   handle.write(str(soup.prettify()))
   handle.truncate()
   handle.flush()
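
For anyone who wants to try the pattern outside of Docker, here is a minimal, self-contained sketch of the same read, seek, write, truncate sequence. The helper names rewrite_in_place and add_style are hypothetical and only for illustration, and a plain string replace stands in for the BeautifulSoup edit so the example runs without bs4.

```python
import os
import tempfile


def rewrite_in_place(path, transform):
    """Read a text file, apply transform to its content, and overwrite the file."""
    with open(path, 'r+', encoding='utf-8') as handle:
        original = handle.read()   # reading leaves the file position at the end
        updated = transform(original)
        handle.seek(0)             # go back to the start before writing
        handle.write(updated)
        handle.truncate()          # drop leftover old content if the new text is shorter
    return updated


def add_style(html):
    # Hypothetical stand-in for the BeautifulSoup edit in add_style.py
    style = '<style type="text/css">.instructions {max-width: 1000px;}</style>'
    return html.replace('</head>', style + '</head>', 1)


if __name__ == '__main__':
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, 'index.html')
        with open(path, 'w', encoding='utf-8') as f:
            f.write('<html><head></head><body></body></html>')
        rewrite_in_place(path, add_style)
        with open(path, encoding='utf-8') as f:
            print(f.read())
```

Without the seek(0), the write starts wherever the read stopped, so the new document ends up after the old one instead of replacing it. An alternative that avoids 'r+' altogether is to read the file with one open() call and write it back with a second one in 'w' mode.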
Sjaak Rusma