
I built a Flask frontend for a fairly detailed app, and after a lot of coding it runs perfectly locally. Now I want to make it public, and I have read that this needs an Apache or nginx frontend. I worked out the instance size I needed, started it on AWS, and ran through these two tutorials, which prove that the base case works. Once the Apache setup is done, the Apache welcome page shows when I navigate to the IP of the AWS instance. When I temporarily substitute the "Hello World" code for my Flask app and refresh the browser, "Hello World" is rendered.

Also, if I substitute the flaskapp.py code from here, Apache serves to the browser a list of the modules found by pip, including NumPy and pandas, which it otherwise complains about not finding! That script, however, is easily killed by adding import numpy at the top, which gives me the following error messages:

ImportError: cannot import name 'multiarray'

and

Importing the multiarray numpy extension module failed.  Most
likely you are trying to import a failed build of numpy.
If you're working with a numpy git repo, try `git clean -xdf` 
(removes all files not under version control).  
Otherwise reinstall numpy.

both in the apache log.

The real trouble starts when I put my app back in place of Hello World: it simply refuses to import pandas. Here is the last section of the Apache log:

[Mon Jul 03 19:42:14.691081 2017] [wsgi:error] [pid 12300:tid 140131159766784] [client 93.21.05.132:52560] Traceback (most recent call last):
[Mon Jul 03 19:42:14.691126 2017] [wsgi:error] [pid 12300:tid 140131159766784] [client 93.21.05.132:52560]   File "/var/www/html/flaskapp/flaskapp.wsgi", line 30, in <module>
[Mon Jul 03 19:42:14.691130 2017] [wsgi:error] [pid 12300:tid 140131159766784] [client 93.21.05.132:52560]     from similar_items_frontend import app as application
[Mon Jul 03 19:42:14.691137 2017] [wsgi:error] [pid 12300:tid 140131159766784] [client 93.21.05.132:52560]   File "/var/www/html/flaskapp/similar_items_frontend.py", line 4, in <module>
[Mon Jul 03 19:42:14.691140 2017] [wsgi:error] [pid 12300:tid 140131159766784] [client 93.21.05.132:52560]     from item_similar_to_doc import similar_to_doc
[Mon Jul 03 19:42:14.691145 2017] [wsgi:error] [pid 12300:tid 140131159766784] [client 93.21.05.132:52560]   File "/home/ubuntu/find-similar/frontend/../item_similar_to_doc.py", line 1, in <module>
[Mon Jul 03 19:42:14.691148 2017] [wsgi:error] [pid 12300:tid 140131159766784] [client 93.21.05.132:52560]     from PullVectors import p
[Mon Jul 03 19:42:14.691153 2017] [wsgi:error] [pid 12300:tid 140131159766784] [client 93.21.05.132:52560]   File "/home/ubuntu/find-similar/frontend/../RunSimilarity.py", line 5, in <module>
[Mon Jul 03 19:42:14.691156 2017] [wsgi:error] [pid 12300:tid 140131159766784] [client 93.21.05.132:52560]     import pandas as pd
[Mon Jul 03 19:42:14.691161 2017] [wsgi:error] [pid 12300:tid 140131159766784] [client 93.21.05.132:52560]   File "/home/ubuntu/.pyenv/versions/miniconda3-latest/envs/find_similarProject/lib/python3.6/site-packages/pandas/__init__.py", line 19, in <module>
[Mon Jul 03 19:42:14.691164 2017] [wsgi:error] [pid 12300:tid 140131159766784] [client 93.21.05.132:52560]     "Missing required dependencies {0}".format(missing_dependencies))
[Mon Jul 03 19:42:14.691180 2017] [wsgi:error] [pid 12300:tid 140131159766784] [client 93.21.05.132:52560] ImportError: Missing required dependencies ['numpy']

I feel like I've tried everything. It has been many painful hours of hair-pulling.

Here is 000-default.conf:

WSGIPythonPath /home/ubuntu/.pyenv/versions/miniconda3-latest/envs/vtenv4YTproject/lib/python3.6/site-packages/

<VirtualHost *:80>
        # The ServerName directive sets the request scheme, hostname and port that
        # the server uses to identify itself. This is used when creating
        # redirection URLs. In the context of virtual hosts, the ServerName
        # specifies what hostname must appear in the request's Host: header to
        # match this virtual host. For the default virtual host (this file) this
        # value is not decisive as it is used as a last resort host regardless.
        # However, you must set it for any further virtual host explicitly.
        #ServerName www.example.com

        ServerAdmin webmaster@localhost
        DocumentRoot /var/www/html

        WSGIDaemonProcess flaskapp threads=5
        WSGIScriptAlias / /var/www/html/flaskapp/flaskapp.wsgi

        <Directory flaskapp>
              WSGIScriptReloading On
              WSGIProcessGroup %{GLOBAL}
              WSGIApplicationGroup %{GLOBAL}
              Order allow,deny
              Allow from all
        </Directory>



        # Available loglevels: trace8, ..., trace1, debug, info, notice, warn,
        # error, crit, alert, emerg.
        # It is also possible to configure the loglevel for particular
        # modules, e.g.
        #LogLevel info ssl:warn

        ErrorLog ${APACHE_LOG_DIR}/error.log
        CustomLog ${APACHE_LOG_DIR}/access.log combined

        # For most configuration files from conf-available/, which are
        # enabled or disabled at a global level, it is possible to
        # include a line for only one particular virtual host. For example the
        # following line enables the CGI configuration for this host only
        # after it has been globally disabled with "a2disconf".
        #Include conf-available/serve-cgi-bin.conf
</VirtualHost>

# vim: syntax=apache ts=4 sw=4 sts=4 sr noet

I should mention that my Flask app runs perfectly (to the point of Flask serving the page) in its virtualenv when I am logged in to the AWS instance over SSH. It took a bit of tinkering to get that working, but it finds all its dependencies and does not crash at the pandas import.

To try to fix it I looked here and here, added some code from here, and went through the extensive docs here. My flaskapp.wsgi file now looks like this:

import sys
import logging
import os.path
sys.path.append(os.path.join(os.path.dirname(os.path.realpath(__file__)), os.pardir))
import config
import site

prev_sys_path = list(sys.path) 

site.addsitedir('/home/ubuntu/.pyenv/versions/miniconda3-latest/envs/find_similarProject/lib/python3.6/site-packages')

new_sys_path = []

for item in list(sys.path): 
    if item not in prev_sys_path:
        new_sys_path.append(item) 
        sys.path.remove(item)
sys.path[:0] = new_sys_path 

logging.basicConfig(stream=sys.stderr)
sys.path.insert(0, "/var/www/html/flaskapp")

from similar_vids_frontend import app as application
application.secret_key = config.WSGI_KEY

I can find no reason why the pandas / numpy imports fail under Apache but work fine when run directly in the virtualenv. I wonder whether it has something to do with permissions and groups, as per the docs:

Do be aware that the user that Apache runs your code as will need to be able to access the Python virtual environment. On some Linux distributions, the home directory of a user account is not accessible to other users. Rather than change the permissions on your home directory, it might be better to consider locating your WSGI application code and any Python virtual environment outside of your home directory.

But is Ubuntu one of those distributions? And if so, what then: install the virtual environments and the project under the / folder?

Grateful for any tips or a solution.

cardamom
  • Does [this question](https://stackoverflow.com/questions/12931013/wsgipythonpath-is-not-working) help you? – syntonym Jul 03 '17 at 21:11
  • Thanks. I saw that but didn't have a clue what "daemon mode" is or whether I was using it. Am beginning to think it is that I am using https://github.com/pyenv/pyenv-virtualenv but had not actually done `sudo apt install virtualenv` although everything runs unless apache tries to run it – cardamom Jul 03 '17 at 21:21
  • You are using `WSGIDaemonProcess` so I think you are indeed using the daemon mode the answer is talking about. Have you tried the fix the answer suggests? – syntonym Jul 03 '17 at 21:24
  • Yes, I took away that `WSGIPythonPath` line at the top and added onto a line in the middle `WSGIDaemonProcess flaskapp threads=5 python-path=/home/ubuntu/.pyenv/versions/miniconda3-latest/envs/find_similarProject/bin` but the problem is still there `ImportError: Missing required dependencies ['numpy']` – cardamom Jul 03 '17 at 21:42
  • The `python-path` option should point to your sitepackages, so probably `WSGIDaemonProcess flaskapp threads=5 python-path=/home/ubuntu/.pyenv/versions/miniconda3-latest/envs/find_similarProject/lib/python3.6/site-packages`. – syntonym Jul 03 '17 at 21:56
  • Thanks I tried that. Is probably something with pyenv / virtualenv [and Python 3.6](https://stackoverflow.com/questions/29950300/what-is-the-relationship-between-virtualenv-and-pyenv) not mixing with this wsgi / apache thing will see if I can solve it – cardamom Jul 04 '17 at 08:39
  • As what user does apache run, `apache`? If you run your application as that user from the terminal, does it still work? – syntonym Jul 04 '17 at 09:28
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/148296/discussion-between-cardamom-and-syntonym). – cardamom Jul 04 '17 at 09:58
  • You have specified a daemon process for mod_wsgi but are not using it, as you aren't setting ``WSGIProcessGroup`` correctly, it should name the daemon process group and not be ``%{GLOBAL}``. You are also setting up the whole virtual environment with mod_wsgi in a way which is not recommended. Go read http://modwsgi.readthedocs.io/en/develop/user-guides/virtual-environments.html – Graham Dumpleton Jul 04 '17 at 11:16
  • You also can't force mod_wsgi compiled for a specific version of Python, to use a Python virtual environment created with a different Python version or installation. If you want to use a non system Python installation, you will need to compile mod_wsgi from source code for the Python installation you want to use. Suggest you not use system packaged mod_wsgi and use ``pip`` method of installing it as outlined in https://pypi.python.org/pypi/mod_wsgi – Graham Dumpleton Jul 04 '17 at 11:18
  • Thanks @GrahamDumpleton I am a bit suspicious when it tells me `mod_wsgi/4.3.0 Python/3.5.2 configured` when I compiled it for Python 3.6. I for now got rid of the virtualenv and am just using miniconda3-latest in pyenv but that hasn't helped anything. I will have a read about the *daemon process group* and reset it. – cardamom Jul 04 '17 at 11:25
  • So you are telling me NOT to set it to `%{GLOBAL}` but I can see [back in February](https://stackoverflow.com/a/42406858/4288043) you told someone with a similar problem to do exactly that, something else must be different.. – cardamom Jul 04 '17 at 11:37
  • The value given to ``WSGIProcessGroup`` should match the name of the ``WSGIDaemonProcess`` directive. That other post was about ``WSGIApplicationGroup`` which is different. – Graham Dumpleton Jul 04 '17 at 11:41
  • Does WSGI have a problem with Python 3.6.1? I thought of trying `anaconda3-4.4.0` in case it did but then my own code wouldn't run locally anymore, crashed with a horrible stack of error messages. – cardamom Jul 04 '17 at 14:18
  • Python 3.6 works fine, but mod_wsgi must have been compiled for it. You can't as I said before try and force mod_wsgi compiled for 3.5, to use a virtual environment created with 3.6. Use procedures in http://modwsgi.readthedocs.io/en/develop/user-guides/checking-your-installation.html for validating how your mod_wsgi was built. – Graham Dumpleton Jul 04 '17 at 21:18
  • I was struggling with setting up mod_wsgi with Apache and Python on Windows. All I was missing was `WSGIProcessGroup %{GLOBAL}`. Thanks for posting the question. Note: I put the config not in Directory section but just below WSGIScriptAlias. – Vikram3891 Nov 27 '19 at 12:42

1 Answer

This now works, mainly thanks to Graham's recipe here and also this answer about loading modules into apache. The proof that it works is that the following variation of the hello world app:

from flask import Flask
import numpy
app = Flask(__name__)

@app.route('/')
def hello_from_np():
  a = numpy.array([4,5,6])
  return str(a)

if __name__ == '__main__':
  app.run()

gives you this in the browser when you go to the IP:

[4 5 6]

...rather than a bunch of errors in the apache log about numpy. It's all set up for Python 3.6:

AH00489: Apache/2.4.7 (Ubuntu) mod_wsgi/4.5.15 Python/3.6 configured
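
The startup banner only says what mod_wsgi was configured with. To check from inside a request which interpreter is really running the code, a throwaway WSGI script along the following lines could be swapped in temporarily. This is a minimal sketch and not part of the original setup: the helper name `describe_interpreter` is made up for illustration, and only the standard library is used.

```python
import sys


def describe_interpreter():
    """Return a plain-text summary of the interpreter running this code."""
    lines = [
        "version: " + sys.version.replace("\n", " "),
        "prefix: " + sys.prefix,
        "sys.path:",
    ]
    # One search-path entry per line, indented for readability.
    lines.extend("  " + entry for entry in sys.path)
    return "\n".join(lines)


def application(environ, start_response):
    """WSGI entry point; mod_wsgi looks for the name `application` by default."""
    body = describe_interpreter().encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]
```

If the page reports a 3.5 interpreter while the environment was built with 3.6, mod_wsgi was compiled against a different Python, which is the situation Graham describes in the comments on the question.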

I wanted to record a few details as a note to self, and in case anyone else is having trouble and trying to reproduce this. Maybe it saves you days of barking up the wrong tree.

I switched from Ubuntu Server 16.04 to 14.04.5 LTS, in case that was part of the issue, but also because pyenv was performing poorly, crashing halfway through the installation of many of the older Python versions I was testing. I also chose to leave out pyenv-virtualenv, as no other projects were planned for this instance and it was one less thing to go wrong.

Starting from the clean install of Ubuntu, in addition to installing apache2, pyenv and libapache2-mod-wsgi (activated with a2enmod), I deliberately ran the following lines before installing the requirements with pip. The requirements include numpy, and this way it doesn't complain:

sudo apt-get install gcc
sudo apt-get install g++

This is also needed:

sudo apt-get install apache2-dev

Then, inside the environment, after installing the requirements:

pip install mod_wsgi
mod_wsgi-express module-config

The second line returned:

LoadModule wsgi_module "/home/ubuntu/.pyenv/versions/miniconda3-latest/lib/python3.6/site-packages/mod_wsgi/server/mod_wsgi-py36.cpython-36m-x86_64-linux-gnu.so"
WSGIPythonHome "/home/ubuntu/.pyenv/versions/miniconda3-latest"

As for loading the module, the docs encourage you to paste these lines into a file httpd.conf, which does not exist, at least in my versions of Apache and Ubuntu. After a lot of reading, I avoided creating one and also avoided pasting them into apache2.conf, which seems to be related. Rather, following the instructions for Ubuntu linked to at the top, edit this file:

sudo vi /etc/apache2/mods-available/wsgi.load

paste this line into it and save:

LoadModule wsgi_module /path/to/mod_wsgi-py36.cpython-36m-x86_64-linux-gnu.so

then run this and restart the apache server if necessary:

sudo a2enmod wsgi
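
Because the two lines printed by `mod_wsgi-express module-config` end up in two different files (the `LoadModule` line in wsgi.load, the `WSGIPythonHome` line at the top of 000-default.conf), the split can be sketched in Python. This is only an illustration under stated assumptions: the helper name `split_module_config` is hypothetical, and the sample text is the output quoted earlier in this answer, not something the script fetches itself.

```python
def split_module_config(output):
    """Split `mod_wsgi-express module-config` output by directive name.

    Returns a dict mapping each directive (the first word of a line) to
    the full line, so each can be pasted into the right Apache file.
    """
    directives = {}
    for line in output.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        name = line.split(None, 1)[0]
        directives[name] = line
    return directives


# Sample input: the two lines quoted earlier in this answer.
sample = (
    'LoadModule wsgi_module "/home/ubuntu/.pyenv/versions/miniconda3-latest/'
    'lib/python3.6/site-packages/mod_wsgi/server/'
    'mod_wsgi-py36.cpython-36m-x86_64-linux-gnu.so"\n'
    'WSGIPythonHome "/home/ubuntu/.pyenv/versions/miniconda3-latest"\n'
)

parts = split_module_config(sample)
print(parts["LoadModule"])      # line for /etc/apache2/mods-available/wsgi.load
print(parts["WSGIPythonHome"])  # line for the top of 000-default.conf
```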

000-default.conf looks like this:

WSGIPythonHome "/home/ubuntu/.pyenv/versions/miniconda3-latest"

<VirtualHost *:80>
    # The ServerName directive sets the request scheme, hostname and port that
    # the server uses to identify itself. This is used when creating
    # redirection URLs. In the context of virtual hosts, the ServerName
    # specifies what hostname must appear in the request's Host: header to
    # match this virtual host. For the default virtual host (this file) this
    # value is not decisive as it is used as a last resort host regardless.
    # However, you must set it for any further virtual host explicitly.
    #ServerName www.example.com

    ServerAdmin webmaster@localhost
    DocumentRoot /var/www/html
    WSGIDaemonProcess helloapp threads=5 python-path=/var/www/html/frontend/
    WSGIScriptAlias / /var/www/html/frontend/helloapp.wsgi

    <Directory flaskapp>
        WSGIProcessGroup helloapp
        WSGIApplicationGroup %{GLOBAL}
        Order allow,deny
        Allow from all
    </Directory>

    # Available loglevels: trace8, ..., trace1, debug, info, notice, warn,
    # error, crit, alert, emerg.
    # It is also possible to configure the loglevel for particular
    # modules, e.g.
    #LogLevel info ssl:warn

    ErrorLog ${APACHE_LOG_DIR}/error.log
    CustomLog ${APACHE_LOG_DIR}/access.log combined

    # For most configuration files from conf-available/, which are
    # enabled or disabled at a global level, it is possible to
    # include a line for only one particular virtual host. For example the
    # following line enables the CGI configuration for this host only
    # after it has been globally disabled with "a2disconf".
    #Include conf-available/serve-cgi-bin.conf
</VirtualHost>
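
Whether a configuration like the one above really handles requests in daemon mode can be checked from inside a request. The sketch below follows the idea in the mod_wsgi "checking your installation" guide: mod_wsgi puts a `mod_wsgi.process_group` key into the WSGI environ, and it is an empty string in embedded mode. The helper name `run_mode` is an assumption made for illustration, and this is a temporary diagnostic script, not part of the app.

```python
def run_mode(environ):
    """Return 'daemon' or 'embedded' based on mod_wsgi's request environ."""
    # mod_wsgi sets this key to the daemon process group name; it is an
    # empty string when the request is handled in embedded mode.
    return "daemon" if environ.get("mod_wsgi.process_group") else "embedded"


def application(environ, start_response):
    """WSGI entry point that reports how mod_wsgi is running this request."""
    text = "mod_wsgi mode: %s (process group: %r, application group: %r)" % (
        run_mode(environ),
        environ.get("mod_wsgi.process_group", ""),
        environ.get("mod_wsgi.application_group", ""),
    )
    body = text.encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]
```

With `WSGIProcessGroup helloapp` taking effect, the page would be expected to report daemon mode with process group 'helloapp'; a report of embedded mode suggests the `<Directory>` block is not matching the script's location.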
cardamom
  • For the purposes of the original question this 000-default.conf works, in that numpy imports. But, stranger than fiction, scipy does NOT import! Moreover, it does not even throw an error message like `import numpy` did. To fix this, see [here](https://stackoverflow.com/questions/16823388/using-scipy-in-django-with-apache-and-mod-wsgi): the `WSGIApplicationGroup %{GLOBAL}` line needs to be moved from where it is to just under the `WSGIScriptAlias` line – cardamom Jul 07 '17 at 13:11
  • Where you have ```` it should be ````. As it was you wouldn't even be using daemon mode. That Apache didn't fail the request due to access forbidden suggests your permissions elsewhere in Apache are lax and letting Apache serve up stuff from anywhere on the file system. Suggest you add ``WSGIRestrictEmbedded On`` outside of ``VirtualHost`` to cause failure if embedded mode used by mistake. – Graham Dumpleton Jul 08 '17 at 21:11
  • Your command (placed below the bottom VirtualHost) did indeed break it after it has been working perfectly since Saturday: `Embedded mode of mod_wsgi disabled by runtime configuration: /var/www/html/frontend/flaskapp.wsgi` I need it online now, have shared the link around, so hopefully nobody finds a way to hack it in the mean time, removed `WSGIRestrictEmbedded On` again. Probably granting *execute* rights to the files in the frontend folder was enough, didn't need to add the whole Python repo to www-data group. Will find out what safe settings are and restore in a quiet time of the day. – cardamom Jul 11 '17 at 11:30
  • Shouldn't cause any security issue, using embedded mode has performance and recoverability issues, if Apache server uses typical configuration suited more to static files and PHP applications. Read http://blog.dscpl.com.au/2012/10/why-are-you-using-embedded-mode-of.html – Graham Dumpleton Jul 11 '17 at 11:48
  • Thanks for the link *not necessarily going to work very well for a dynamic web application with a large memory footprint that performs better when kept persistent in memory*. Good it won't be hacked, it uses a lot of memory and has crashed once so far. Will try to convert it properly to daemon mode within a week or so. – cardamom Jul 11 '17 at 11:57