10

I have a small local web-application of 2 HTML files, 6 CSS files and 11 JS files.

  1. Would the web-application still work if all of these files were (properly) copy-pasted in a single HTML file, e.g. putting the JS in <script> tags in the header, and putting the CSS in <style> tags?

  2. Does anyone know of a tool that could automatically and safely merge a collection of JS, CSS and HTML files into a single HTML?

Searching online, I only found tools that can combine or minify files of one type at a time, but not create the merged HTML file (e.g. AIOM+, HTMLcompressor). I did find an application called Inliner, but it seems to run on Node.js, which I'm not familiar with and don't currently use.

In short, I'm looking for a simple standalone tool that could read all the linked files in the HTML and rewrite the HTML by embedding those files' content. If that's asking too much, then just a confirmation that doing the job manually would result in a working file, or any tips to keep in mind when doing so. Thanks!

sc28
  • 1
    It is possible, you put the ` – marcellothearcane Jun 20 '17 at 07:30
  • 1
    Possibly of interest: https://stackoverflow.com/questions/3135929/html-javascript-css-compact-tool, https://stackoverflow.com/questions/28195447/merge-html-css-js-and-assets-to-one-file, https://stackoverflow.com/questions/7454050/put-css-and-javascript-in-files-or-main-html – marcellothearcane Jun 20 '17 at 07:31
  • 1
    The answer to your first question is YES. As for your second question, asking for tools is against SO policy. – Racil Hilan Jun 20 '17 at 07:35
  • Thanks for these comments, some interesting stuff in the linked questions, but not quite what I need either. I'll attempt the manual copy paste following these recommendations and see if it works out. – sc28 Jun 20 '17 at 08:08
  • @sc28 did you find a solution? – mzuba Nov 25 '21 at 11:24

3 Answers

6

I wrote a simple Python script for it.

This is my tree:

root-folder
├── index.html
├── build_dist.py
├── js
│   └── somescript.js
├── css
│   ├── styles1.css
│   └── styles2.css
└── dist

I run the script:

cd root-folder
python build_dist.py

And a oneindex.html file is created in the dist folder.
This file contains all the js and css from the files specified with link and script tags in index.html.
You can use this file anywhere.
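
Before moving the merged file elsewhere, it may be worth checking that nothing in it still points at a separate file. The following is a minimal sketch of such a check, not part of build_dist.py; the names `ExternalRefFinder` and `find_external_refs` are made up for this illustration, and it only looks at `link`, `script` and `img` tags.

```python
from html.parser import HTMLParser


class ExternalRefFinder(HTMLParser):
    """Collect (tag, url) pairs for resources still loaded from separate files."""

    def __init__(self):
        super().__init__()
        self.refs = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "link":
            url = attrs.get("href")
        elif tag in ("script", "img"):
            url = attrs.get("src")
        else:
            return
        # data: URIs are already embedded in the page, so they do not count
        if url and not url.startswith("data:"):
            self.refs.append((tag, url))


def find_external_refs(html_text):
    """Return every (tag, url) pair that still references an external resource."""
    finder = ExternalRefFinder()
    finder.feed(html_text)
    finder.close()
    return finder.refs


if __name__ == "__main__":
    sample = (
        '<html><head><style>p{}</style>'
        '<link rel="stylesheet" href="css/a.css"></head>'
        '<body><img src="data:image/png;base64,AAAA">'
        '<script>var x = 1;</script></body></html>'
    )
    print(find_external_refs(sample))
```

To check a real build, pass the text of dist/oneindex.html to `find_external_refs`; an empty list suggests that all stylesheets, scripts and images were inlined.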

Note:

  1. The HTML file must be "index.html" in the root folder.
  2. It only works for a single HTML file. I don't know what you want to do with multiple HTML files.
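
One more thing to keep in mind when inlining JS, whether by hand or with a script: if a JS file contains the text `</script` (for example inside a string), the browser ends the inline script element at that point. A common workaround is to write it as `<\/script`, which means the same thing inside ordinary JS string and regex literals. Below is a minimal sketch of that step; the helper name `escape_for_inline_script` is hypothetical and is not part of build_dist.py, and it would change the value of raw template strings such as `String.raw`.

```python
import re


def escape_for_inline_script(js_text):
    """Rewrite `</script` as `<\\/script` so the HTML parser does not end the element early."""
    # the group keeps the original letter case of "script"
    return re.sub(r"</(script)", r"<\\/\1", js_text, flags=re.IGNORECASE)


if __name__ == "__main__":
    print(escape_for_inline_script('document.write("<script>x()</script>");'))
```

The helper would be applied to each JS file's text before it is placed inside a `<script>` element.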

build_dist.py code:

# build_dist.py

from bs4 import BeautifulSoup
from pathlib import Path
import base64

original_html_text = Path('index.html').read_text(encoding="utf-8")
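# name the parser explicitly to avoid a warning and parser-dependent results;
# this assumes index.html has its own <html>, <head> and <body> tags
soup = BeautifulSoup(original_html_text, 'html.parser')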

# Find stylesheet link tags. example: <link rel="stylesheet" href="css/somestyle.css">
for tag in soup.find_all('link', href=True):
    # skip links that are not stylesheets (e.g. favicons)
    if 'stylesheet' in (tag.get('rel') or []):
        file_text = Path(tag['href']).read_text(encoding="utf-8")

        # remove the tag from soup
        tag.extract()

        # insert style element
        new_style = soup.new_tag('style')
        new_style.string = file_text
        soup.html.head.append(new_style)


# Find script tags. example: <script src="js/somescript.js"></script>
for tag in soup.find_all('script', src=True):
    if tag.has_attr('src'):
        file_text = Path(tag['src']).read_text(encoding="utf-8")

        # remove the tag from soup
        tag.extract()

        # insert script element
        new_script = soup.new_tag('script')
        new_script.string = file_text
        soup.html.body.append(new_script)

# Find image tags.
for tag in soup.find_all('img', src=True):
    if tag.has_attr('src'):
        file_content = Path(tag['src']).read_bytes()

        # replace filename with base64 of the content of the file
        # (the MIME type is hard-coded, so this assumes PNG images)
        base64_file_content = base64.b64encode(file_content)
        tag['src'] = "data:image/png;base64, {}".format(base64_file_content.decode('ascii'))

# Save onefile
with open("dist/oneindex.html", "w", encoding="utf-8") as outfile:
    outfile.write(str(soup))
Shalom Craimer
Noam Nol
2

You could consider using webpack. It is not easy to understand at first, but this is a good tutorial to start with.

itacode
1

1.

Generally, yes

2.

I don't know of a tool for merging multiple HTML files, but here is a Python script (GitHub) for merging CSS/JS/images into a single HTML file. Compared with Noam Nol's answer:

  • ... it does not have external dependencies
  • ... it will also handle non-png images properly.
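
To illustrate what handling non-PNG images involves: the data URI has to carry a type that matches the image. The script below derives it from the file extension. The following is a separate minimal sketch that uses the standard `mimetypes` module instead, which also covers cases where the extension and the MIME subtype differ. The function name `to_data_uri` is made up for this illustration and is not part of htmlmerger.py.

```python
import base64
import mimetypes


def to_data_uri(filename, data):
    """Build a data: URI for `data`, guessing the MIME type from `filename`."""
    mime_type, _ = mimetypes.guess_type(filename)
    if mime_type is None:
        # unknown extension: fall back to a generic binary type
        mime_type = "application/octet-stream"
    encoded = base64.b64encode(data).decode("ascii")
    return "data:{};base64,{}".format(mime_type, encoded)


if __name__ == "__main__":
    # in a real merge, `data` would be the bytes read from the image file
    print(to_data_uri("logo.png", b"abc"))
```

The returned string can be used directly as the `src` value of an `img` tag.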

Usage: python3 htmlmerger yourfile.html

Code from GitHub: htmlmerger.py

Below is the content of the file on GitHub.

from html.parser import HTMLParser
import os
import sys
import base64


gHelp = """
Merge JS/CSS/images/HTML into one single file
Version: 1.0

Usage:
  htmlmerger inputfile [optional: outputfile]

"""


def getFileContent (strFilepath):
  content = ""
  with open (strFilepath, "r") as file:
    content = file.read ()
  return content



def getFileContentBytes (strFilepath):
  content = b""
  with open (strFilepath, "rb") as file:
    content = file.read ()
  return content


class HtmlMerger(HTMLParser):
  """
    Call "run(htmlContent, basedir)"  to merge
    script/css/images referenced within htmlContent
    into one single html file.
  """
  def __init__(self):
    super().__init__()
    self._result = ""
    self._additionalData = ""
    self._baseDir = ""
    self.messages = []



  def _addMessage_fileNotFound(self, file_asInHtmlFile, file_searchpath):
    self.messages.append ("Error: Line " + str (self.getpos ()[0]) +
                        ": Could not find file `" + str (file_asInHtmlFile) +
                        "`; searched in `" + str (file_searchpath) + "`." )



  def _getAttribute (self, attributes, attributeName):
    """Return attribute value or `None`, if it does not exist"""
    for attr in attributes:
      key = attr[0]
      if (key == attributeName):
        return attr[1]
    return None


  def _getFullFilepath (self, relPath):
    return os.path.join (self._baseDir, relPath)


  def handle_starttag(self, tag, attrs):

    # Style references are within `link` tags. So we have to
    #  convert the whole tag
    if (tag == "link"):
      href = self._getAttribute (attrs, "href")
      if (href):
        hrefFullPath = self._getFullFilepath (href)
        if (not os.path.isfile (hrefFullPath)):
          self._addMessage_fileNotFound (href, hrefFullPath)
          return
        styleContent = getFileContent (hrefFullPath)
        self._result += "<style>" + styleContent + "</style>"
        return

    self._result += "<" + tag + " "

    for attr in attrs:
      key = attr[0]
      value = attr[1]

      # main work: read source content and add it to the file
      if (tag == "script" and key == "src"):
        #self._result += "type='text/javascript'"
        strReferencedFile = self._getFullFilepath (value)
        if (not os.path.isfile (strReferencedFile)):
          self._addMessage_fileNotFound (value, strReferencedFile)
          continue
        referencedContent = getFileContent (strReferencedFile)
        self._additionalData += referencedContent

        # do not process this key
        continue

      if (tag == "img" and key == "src"):
        imgPathRel = value
        imgPathFull = self._getFullFilepath (imgPathRel)
        if (not os.path.isfile (imgPathFull)):
          self._addMessage_fileNotFound (imgPathRel, imgPathFull)
          continue

        imageExtension = os.path.splitext (imgPathRel)[1][1:]
        imageFormat = imageExtension

        # convert image data into browser-understandable src value
        image_bytes = getFileContentBytes (imgPathFull)
        image_base64 = base64.b64encode (image_bytes)
        src_content = "data:image/{};base64, {}".format(imageFormat,image_base64.decode('ascii'))
        self._result += "src='" + src_content + "'"

        continue



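      # boolean attributes (e.g. `async`) have no value
      if (value is None):
        self._result += key + " "
        continue

      # choose the right quotes
      if ('"' in value):
        self._result += key + "='" + value + "' "
      else:
        self._result += key + '="' + value + '" '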

    self._result +=  ">"

  def _writeAndResetAdditionalData(self):
    self._result += self._additionalData
    self._additionalData = ""

  def handle_endtag(self, tag):
    self._writeAndResetAdditionalData ()
    self._result += "</" + tag + ">"


  def handle_data(self, data):
    self._result += data

  def run(self, content, basedir):
    self._baseDir = basedir
    self.feed (content)
    return self._result



def merge(strInfile, strOutfile):

  if (not os.path.isfile (strInfile)):
    print ("FATAL ERROR: file `" + strInfile + "` could not be accessed.")
    return

  baseDir = os.path.split (os.path.abspath (strInfile))[0]

  #read file
  content = getFileContent (strInfile)

  parser = HtmlMerger()
  content_changed = parser.run (content, baseDir)

  # log errors
  if (len (parser.messages) > 0):
    print ("Problems occurred")
    for msg in parser.messages:
      print ("  " + msg)
    print ("")

  # debug:
  if (False):
    print (content_changed)
    exit ()


  # write result
  with open (strOutfile, "w") as file:
    file.write (content_changed)



def main():
  args = sys.argv[1:] # cut away pythonfile
  if (len (args) < 1):
    print (gHelp)
    exit()

  inputFile = args[0]

  # get output file name
  outputFile = ""
  if (True):
    outputFile = os.path.splitext (inputFile)[0] + "_merged.html"

    if (len (args) > 1):
      outputFile = args[1]

    if (os.path.isfile (outputFile)):
      print ("FATAL ERROR: Output file " + outputFile + " already exists")
      exit ()

  # run the actual merge
  merge (inputFile, outputFile)


# only run when executed as a script, not when imported
if __name__ == "__main__":
  main()
DarkTrick