
I generate an HTML file programmatically. As you might imagine, it's quite ugly, but it works perfectly. Is there a GitHub Action, or a workflow I can write, that will convert the file into nicely formatted HTML?

Writing a workflow that uses Python is fine too. However, BeautifulSoup fails to indent my file correctly: the output is missing some tags, perhaps because the generated HTML is untidy (stray line breaks and so on). It also indents with a single space, and I need 4 spaces.

Some other tools I looked into:

  • html5print - Appears to be unmaintained; idle for 5 years
  • HTML Tidy - Doesn't seem to work with Python 3.x

Don't know if I will be able to run the following in a workflow file via actions -

I haven't explored other languages, but I am open to them, especially Go and Ruby.

jar

1 Answer


You can install tidy via apt and run it directly in the CI pipeline. Assuming you have a Python script that generates page.html for you, here is a configuration that runs tidy on the generated file:

page.html ("generated" file)

<!DOCTYPE html><html><head><script src="bundle.js"></script><title>My Page</title></head><body><div id="app"><p>Paragraph 1</p></div></body></html>

.github/workflows/test.yml

---
name: Test

on: push

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2.1.0
      # here is some step that generates HTML file with python
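      # refresh the package index first, then install tidy non-interactively
      - run: sudo apt-get update && sudo apt-get install -y tidy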
      - run: tidy page.html > page2.html
      - name: Print generated page
        run: cat page2.html

Result

[Screenshot of the workflow output]

tidy is highly customizable, so you can configure it to suit your needs. Run `man tidy` or consult the official documentation to see the available options.
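Since the page is generated by a Python script anyway, one option is to call tidy from that script once it is installed. Below is a minimal sketch, not a definitive implementation: the function names `build_tidy_command` and `run_tidy` and the file names are hypothetical, the 4-space default is taken from the question, and the flags are standard tidy options (`--indent`, `--indent-spaces`, `--tidy-mark`, `--force-output`, `-o`). It assumes the `tidy` binary is on the PATH.

```python
import subprocess
from typing import List


def build_tidy_command(src: str, dst: str, indent_spaces: int = 4) -> List[str]:
    """Return the argument list for a tidy run that indents the output.

    The helper name and the file names passed in are illustrative only.
    """
    return [
        "tidy",
        "--indent", "auto",                      # indent block-level content
        "--indent-spaces", str(indent_spaces),   # width of one indent level
        "--tidy-mark", "no",                     # do not add a generator meta tag
        "--force-output", "yes",                 # write output even if errors are found
        "-o", dst,                               # output file
        src,                                     # input file
    ]


def run_tidy(src: str, dst: str, indent_spaces: int = 4) -> int:
    """Run tidy and return its exit status.

    tidy exits with 0 on success, 1 when it only emitted warnings and
    2 when it found errors, so only statuses other than 0 and 1 are
    treated as failures here.
    """
    result = subprocess.run(build_tidy_command(src, dst, indent_spaces), check=False)
    if result.returncode not in (0, 1):
        raise RuntimeError(f"tidy failed with exit status {result.returncode}")
    return result.returncode


if __name__ == "__main__":
    # Print the command instead of running it, so this works without tidy installed.
    print(" ".join(build_tidy_command("page.html", "page2.html")))
```

Tolerating exit status 1 matters in CI: a workflow step that runs tidy directly fails on any non-zero status, so warnings alone could otherwise break the build.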

fabasoad
  • I understand that it's supposed to work, but when I run it on my rendered HTML, it just left-aligns everything, with no indentation whatsoever. In the head, it also splits things across 2 lines instead of one. I don't think it works well. – jar May 09 '20 at 19:22
  • Did you play with `tidy` options? [This](https://stackoverflow.com/a/8937448/470214) solution works for me perfectly, shows formatted HTML with indentation - [Print generated page](https://github.com/fabasoad/business-card/runs/674156604). – fabasoad May 14 '20 at 10:56
  • Looks like you are onto something. Please give me some time; I'm bogged down with a few things here. I will check and let you know soon. – jar May 14 '20 at 11:01
  • `tidy --indent auto --indent-spaces 2 --tidy-mark no --force-output yes -o index_output.html index.html` seems to work for me now. – jar May 19 '20 at 14:04