
A Python program that I compiled with PyInstaller turned out to be over 400 MB. The program's GUI is based on htmlPy, which is "a wrapper around PySide's QtWebKit library." The large size is due partly to the program's use of numpy, scipy, and nltk, and partly to the graphics libraries.

To reduce the size of the program, I installed UPX. This brought the size down to slightly over 100 MB, which is large but acceptable.

The first problem is that PyInstaller didn't detect htmlPy and didn't include it in the compiled program. This can be fixed by copying the htmlPy module from my Python installation into the 'dist' directory created by PyInstaller. After doing this, the version of the program compiled without UPX worked fine.

The second problem is with the UPX-compressed build: after adding htmlPy to the 'dist' directory, running the executable crashes at the point where the GUI is created. I'm not sure whether this is due to a problematic interaction between UPX and Qt, or between UPX, Qt, and htmlPy. The Windows "Problem Signature" is the following:

Problem signature:
  Problem Event Name:   APPCRASH
  Application Name: main.exe
  Application Version:  0.0.0.0
  Application Timestamp:    00000000
  Fault Module Name:    QtCore4.dll
  Fault Module Version: 4.8.7.0
  Fault Module Timestamp:   561e435a
  Exception Code:   c0000005
  Exception Offset: 000000000010883a

Any ideas as to what's going on here, and how to fix it?

EDIT:

These are the contents of my .spec file:

# -*- mode: python -*-

block_cipher = None

added_files = [
     ( 'htmlPy/binder.js', 'htmlPy' ),
     ( 'templates/*', 'templates' ),
   ]
a = Analysis(['main.py'],
             pathex=['C:\\..\\My_App'],
             binaries=None,
             datas=added_files,
             hiddenimports=[],
             hookspath=[],
             runtime_hooks=['rthook_pyqt4.py'],
             excludes=[],
             win_no_prefer_redirects=False,
             win_private_assemblies=False,
             cipher=block_cipher)
pyz = PYZ(a.pure, a.zipped_data,
             cipher=block_cipher)
exe = EXE(pyz,
          a.scripts,
          exclude_binaries=True,
          name='My_App',
          debug=False,
          strip=False,
          upx=True,
          console=True )
coll = COLLECT(exe,
               a.binaries,
               a.zipfiles,
               a.datas,
               strip=False,
               upx=True,
               name='My_App')

These are the contents of rthook_pyqt4.py:

import sip

sip.setapi(u'QDate', 2)
sip.setapi(u'QDateTime', 2)
sip.setapi(u'QString', 2)
sip.setapi(u'QTextStream', 2)
sip.setapi(u'QTime', 2)
sip.setapi(u'QUrl', 2)
sip.setapi(u'QVariant', 2)

Edit 2:

Here's some of the initialization code (standard htmlPy fare):

app.static_path = path.join(BASE_DIR, "static/")
print "Step 1"
app.template_path = path.join(BASE_DIR, "templates/")
print "Step 2"
app.template = ("index.html", {"username": "htmlPy_user"})
print "Step 3"
...

The program crashes before it gets to Step 3.

Boa
  • Can you show us your `setoop.py` ? How add your module as import or additional ? – dsgdfg Sep 29 '16 at 06:10
  • @dsgdfg Sorry - what's "setoop.py," and to which imported module(s) are you referring? – Boa Sep 29 '16 at 15:54
  • setup.py code , how to add a library ? – dsgdfg Sep 30 '16 at 05:31
  • Although it sometimes works, copying the module in the pyinstaller folder is not a good idea. Try to add the missing module using the [`--hidden-import`](https://pythonhosted.org/PyInstaller/usage.html#what-to-bundle-where-to-search) option and see if there is still a problem. – Repiklis Sep 30 '16 at 12:04
  • @Repiklis: I tried `--hidden-import`, and I'm still required to copy and paste the htmlPy folder and the `binder.js` file it contains (the python files in the folder can be omitted), manually into the executable program's directory, because UPX omits it. Once again, after I manually copy/paste in the htmlPy directory (this time, only containing `binder.js`), and run the program, it crashes. Could the problem have something to do with UPX omitting/not compressing htmlPy's `binder.js` file? Is there a way to force UPX to include this folder in the distribution directory that it generates? – Boa Sep 30 '16 at 19:40
  • @dsgdfg - there's no setup.py file involved in any part of the compilation process. – Boa Sep 30 '16 at 19:43
  • I assume that "no setup.py" means no [`.spec`](https://pythonhosted.org/PyInstaller/spec-files.html) file. I am afraid that you need one in this case to add the `binder.js` as a [data file](https://pythonhosted.org/PyInstaller/spec-files.html#adding-data-files). I'll add a detailed answer below. – Repiklis Oct 01 '16 at 21:12
  • @Repiklis - I've managed to use the .spec file to include the necessary data files, and they appear to have been added successfully. As before, the program crashes at startup. – Boa Oct 02 '16 at 02:39
  • Can you run from the command line and post the error? It would be better if you add a small piece of code that reproduces the problem. – Repiklis Oct 02 '16 at 09:51
  • @Repiklis - I was running from the command line. The Python interpreter doesn't throw an Exception at any point. The only error message is the Windows Problem Signature which I posted above. I edited the comment to specify the point at which the program crashes. – Boa Oct 02 '16 at 15:10
  • Thanks for the upvotes, everyone. I'd like to award the bounty points (there are 23 hours left to do that), but we don't yet have any answers suggesting possible diagnostic measures, nor a likely path to a solution. I guess I'll try to get in touch with the UPX and pyInstaller guys, to see if they can offer any suggestions. – Boa Oct 04 '16 at 20:11
  • Are you using a virtualenv? Also, have you tried adding the path of the QtCore4.dll to "added files" -- For instance, adding `('C:\\Python35-32\\Lib\\site-packages\\....\\QtCore4.dll', '.')` – sytech Oct 13 '16 at 14:33
  • @Gator_Python - I'm not using a virtualenv. I have not tried adding the Qt files to `added_files`, but they are present (including QtCore4.dll) in the generated program's main folder. – Boa Oct 13 '16 at 19:15
  • If you're not using a virtualenv, you should point to the system's dlls to make sure the runtime hooks work appropriately -- Those, most likely, are the dlls being used when you execute it in Python. I vaguely recall seeing a similar issue in the past and this was a solution, but it may not be your case. Weird stuff happens when you copy/paste into new directories -- It's a common cause of that Windows error. I think @Repiklis was on the right track with his previous comment. A virtualenv is also recommended, according to PyInstaller. – sytech Oct 13 '16 at 19:18
  • @Gator_Python - I pasted the contents of the `rthook_pyqt4.py` file above - if I'm not mistaken, doesn't the `runtime_hooks=['rthook_pyqt4.py']` clause in the .spec file ensure that the correct versions of the Qt DLLs are included? – Boa Oct 13 '16 at 23:55
  • I also had some troubles with pyinstaller, some of them similar to yours. This is summarized in https://stackoverflow.com/questions/46818993/compiling-python-with-pyinstaller – Stéphane Nov 10 '17 at 12:56

1 Answer


Your two big concerns relate to:

  1. correctness - app with UPX won't run
  2. size - 400 MiB is "too big", while 100 MiB lets you reach a bigger set of users

The app might be more useful to more people if it's smaller, but it's useful to no one if it won't run. You suspect that UPX helps with concern 2 but that its interaction with Qt (and possibly htmlPy) breaks concern 1.
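One way to test that suspicion without giving up UPX entirely would be to leave the Qt binaries uncompressed and let UPX compress everything else. The following is only a sketch under assumptions: `select_upx_excludes` is a hypothetical helper, the name patterns are guesses based on the `QtCore4.dll` fault module, the example paths are made up, and it assumes a PyInstaller release recent enough to accept an `upx_exclude` list in the spec's `EXE`/`COLLECT` calls (older releases may not support it).

```python
import fnmatch
import os

# Hypothetical patterns for binaries to leave uncompressed. QtCore4.dll is the
# fault module from the crash report; the other patterns are guesses.
DEFAULT_PATTERNS = ("qt*.dll", "pyside*.dll", "shiboken*.dll", "python*.dll")


def select_upx_excludes(binary_paths, patterns=DEFAULT_PATTERNS):
    """Return sorted base names of binaries matching any pattern (case-insensitive)."""
    excludes = set()
    for path in binary_paths:
        # Accept both Windows and POSIX separators in the input paths.
        name = os.path.basename(path.replace("\\", "/"))
        if any(fnmatch.fnmatchcase(name.lower(), p) for p in patterns):
            excludes.add(name)
    return sorted(excludes)


if __name__ == "__main__":
    # Made-up example paths, for illustration only.
    print(select_upx_excludes([
        "C:\\dist\\My_App\\QtCore4.dll",
        "C:\\dist\\My_App\\QtWebKit4.dll",
        "C:\\dist\\My_App\\numpy.core.multiarray.pyd",
    ]))
```

In the spec file, the resulting list would be passed as `upx_exclude=...` to `COLLECT`, assuming your PyInstaller version has that parameter. If the crash disappears with the Qt DLLs excluded, UPX-compressed Qt binaries are the likely culprit, and most of the size savings on numpy and scipy should remain.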

It would be interesting to build a simple HelloWorld app, package it with pyInstaller + UPX, and keep embellishing it with additional dependencies (like Qt) until you see it break in a way like the current breakage.
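That incremental approach could be scripted roughly as follows. This is a minimal sketch, not a tested recipe: `incremental_scripts` is a hypothetical helper, and the dependency order in the example is only a guess at a sensible sequence. Note that merely importing a module may not be enough to reproduce a crash that happens when the GUI is created, so the later steps may need a line or two of real Qt code.

```python
import os
import tempfile


def incremental_scripts(dependencies):
    """Yield (label, source) pairs; each source imports one more dependency than the last."""
    lines = ['print("hello world")']
    yield "step0_baseline", "\n".join(lines) + "\n"
    for index, module in enumerate(dependencies, start=1):
        # Keep the print last, so reaching it means every import succeeded.
        lines.insert(len(lines) - 1, "import " + module)
        label = "step%d_%s" % (index, module.replace(".", "_"))
        yield label, "\n".join(lines) + "\n"


if __name__ == "__main__":
    out_dir = tempfile.mkdtemp()
    # Assumed order: cheapest dependency first, the suspects (Qt, htmlPy) last.
    for label, source in incremental_scripts(
            ["numpy", "PySide.QtCore", "PySide.QtWebKit", "htmlPy"]):
        path = os.path.join(out_dir, label + ".py")
        with open(path, "w") as handle:
            handle.write(source)
        print(path)
```

Each generated script would then be built with PyInstaller with UPX enabled and run in order; the first step whose executable crashes points at the dependency that UPX is breaking.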

It might be more productive to abandon UPX in favor of other approaches, such as packaging with NSIS. You might use a tool like strace (on Windows, something like Process Monitor) to monitor which of your distributed files are actually used during system testing runs, and prune unused files during packaging. Proxying file requests through FUSE would yield similar information. You might also list the dependencies of your published app and rely on pip or conda to download them in parallel, if "elapsed install time" is really what drives your desire to shrink 400 MiB down to 100 MiB.
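The pruning idea might look something like the sketch below. It assumes you have already extracted the list of file paths the app touched during a test run (from strace output on Linux, or from a Process Monitor log on Windows); parsing that log is not shown, and `unused_files` is a hypothetical helper name.

```python
import os


def unused_files(dist_dir, accessed_paths):
    """Return relative paths under dist_dir that never appear in accessed_paths."""
    root = os.path.normcase(os.path.abspath(dist_dir))
    # Normalize case and form so Windows-style paths compare reliably.
    accessed = set(os.path.normcase(os.path.abspath(p)) for p in accessed_paths)
    unused = []
    for folder, _subdirs, files in os.walk(root):
        for name in files:
            full = os.path.join(folder, name)
            if os.path.normcase(full) not in accessed:
                unused.append(os.path.relpath(full, root))
    return sorted(unused)
```

Treat the result as a list of candidates rather than a delete list: a file that was not touched during one test run may still be needed on a code path the tests did not exercise.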

J_H