I made a simple web-scraping script that also sends email, and tried to turn it into an .exe app using PyInstaller. I did not expect the final .exe to be so huge: it is around 360 MB. I built it from cmd with this command: pyinstaller --onefile scriptName.py

Is there any way to reduce the size of the .exe file? It is just a web-scraping script that displays the scraped text in cmd.

My imports:

from bs4 import BeautifulSoup
import pandas as pd 
import time
import smtplib


Lmfao
  • Show us your code, especially the imports. If you use packages like OpenCV, I assume that could blow up the size. I also advise against the --onefile flag if you cannot bring the package size down, to avoid long loading times. – mnikley Nov 06 '21 at 17:19
  • I edited my post and added my imports – Lmfao Nov 06 '21 at 17:29

1 Answer

I tested a minimal example based on your dependencies, and just by using a few functions from those imports I already got a 65 MB package. So I think it is quite common for a package like yours to be that large. As far as I know, PyInstaller follows your imports and bundles each imported package as a whole, together with its own dependencies (pandas pulls in NumPy, for example), rather than only the functions and classes you actually call. So I am not sure you can reduce the size much further without changing which packages your code imports.
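To get a rough idea of which import contributes most to the bundle, you could compare the installed size of each package. This is only a sketch: the helper name package_size_mb is made up for illustration, and the size on disk is just a proxy, since it ignores what each package pulls in as its own dependencies and what PyInstaller compresses.

```python
import importlib.util
from pathlib import Path


def package_size_mb(name: str) -> float:
    """Return the approximate on-disk size of an importable top-level module, in MB.

    Hypothetical helper for illustration; it does not follow dependencies.
    """
    spec = importlib.util.find_spec(name)
    if spec is None:
        raise ModuleNotFoundError(f"No module named {name!r}")

    if spec.submodule_search_locations:
        # A package: sum the size of every file below its directory.
        root = Path(next(iter(spec.submodule_search_locations)))
        total = sum(f.stat().st_size for f in root.rglob("*") if f.is_file())
    elif spec.origin and spec.origin not in ("built-in", "frozen"):
        # A single-file module such as smtplib.py.
        total = Path(spec.origin).stat().st_size
    else:
        # Built-in or frozen modules live inside the interpreter itself.
        total = 0

    return total / (1024 * 1024)


if __name__ == "__main__":
    for module_name in ("bs4", "pandas", "numpy", "smtplib", "time"):
        try:
            print(f"{module_name}: {package_size_mb(module_name):.1f} MB")
        except ModuleNotFoundError:
            print(f"{module_name}: not installed")
```

If pandas and NumPy dominate the list, replacing pandas with something lighter is likely the change that shrinks the .exe the most.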

However, based on this post, you should build the PyInstaller package from within a virtual environment that contains only the absolutely necessary packages. Furthermore, you could try to eliminate dependencies your code does not really need.
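The virtual-environment approach could look roughly like the sketch below. It only builds the three commands (create the environment, install the dependencies, run PyInstaller) and prints them. The directory name build-env, the helper name build_commands and the package list are assumptions for illustration, so adjust them to your project.

```python
import sys
from pathlib import Path


def build_commands(env_dir, script, packages):
    """Return the commands for a clean-environment PyInstaller build.

    Hypothetical helper: nothing is executed here, the commands are only assembled.
    """
    env = Path(env_dir)
    # Virtual environments put their executables in Scripts\ on Windows, bin/ elsewhere.
    on_windows = sys.platform == "win32"
    bin_dir = env / ("Scripts" if on_windows else "bin")
    env_python = str(bin_dir / ("python.exe" if on_windows else "python"))

    return [
        # 1. Create an empty virtual environment.
        [sys.executable, "-m", "venv", str(env)],
        # 2. Install only what the script really needs, plus PyInstaller itself.
        [env_python, "-m", "pip", "install", "pyinstaller", *packages],
        # 3. Build with the environment's own interpreter.
        [env_python, "-m", "PyInstaller", "--onefile", script],
    ]


if __name__ == "__main__":
    commands = build_commands("build-env", "scriptName.py", ["beautifulsoup4", "pandas"])
    for command in commands:
        print(" ".join(command))
```

To actually run the build, pass each command to subprocess.run(command, check=True) in order, or simply type the printed commands into cmd.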

Other than that, try packaging it without --onefile and compressing the output folder into a zip file afterwards, or even use Inno Setup to build a compressed installer; there are tutorials on how to use it. I managed to get a package from, I think, ~600-700 MB down to a 150 MB installer. However, this requires some additional work and might not be what you want. This is the test code I used for my 65 MB PyInstaller package:

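from bs4 import BeautifulSoup
import pandas as pd
import time
import smtplib

# Parse deliberately malformed HTML; naming the parser avoids a bs4 warning
soup = BeautifulSoup("<p>Some<b>bad<i>HTML", "html.parser")
print(soup.prettify())


def some_function():
    print("Starting function")
    time.sleep(1)
    print("Ending function")


some_function()

# Try an SMTP connection; connection failures and SMTP errors are OSError subclasses
try:
    with smtplib.SMTP("domain.org", timeout=10) as smtp:
        smtp.noop()
        print("Did something with smtplib.SMTP")
except OSError:
    print("Couldn't connect via smtplib")

# Build a small DataFrame so that pandas is actually used
mydataset = {
    "cars": ["BMW", "Volvo", "Ford"],
    "passings": [3, 7, 2],
}

myvar = pd.DataFrame(mydataset)

print(myvar)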
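
As for packaging without --onefile and zipping the result: without that flag PyInstaller writes a folder to dist/<name>, and a short sketch of compressing it could look like this. The helper name zip_app_folder and the dist/scriptName path are assumptions for illustration.

```python
import shutil
from pathlib import Path


def zip_app_folder(dist_dir, app_name):
    """Zip dist_dir/app_name into dist_dir/app_name.zip and return the archive path.

    Hypothetical helper; it expects PyInstaller to have been run without --onefile.
    """
    dist = Path(dist_dir).resolve()
    app_folder = dist / app_name
    if not app_folder.is_dir():
        raise FileNotFoundError(f"{app_folder} does not exist; run PyInstaller first")

    # base_dir keeps the top-level folder name inside the archive.
    archive = shutil.make_archive(
        str(dist / app_name), "zip", root_dir=str(dist), base_dir=app_name
    )
    return Path(archive)


if __name__ == "__main__":
    try:
        result = zip_app_folder("dist", "scriptName")
        print(f"Created {result} ({result.stat().st_size / 1024 / 1024:.1f} MB)")
    except FileNotFoundError as error:
        print(error)
```

How much this saves depends on how compressible the bundled libraries are, so compare the size of the zip with the size of the --onefile .exe before deciding.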
mnikley