
I have a kind of proxy server running on the PhantomJS WebServer module, and I noticed that the server is being killed because of its memory consumption.

Every time the server gets a new request it creates a child client process. The problem I see is that this process stays alive indefinitely.

Here is the server I'm using:

server.js

I thought response.close() would close the client connection and kill its process, but it does not.

Here is the list of child processes displayed on htop:

[Image: list of processes shown in htop]

(There are many more of these processes; this is just a fragment of the list.)

I really need to kill those processes because they are using all the free memory. Am I missing something?

I could simply restart the server, but the memory will still be wasted.

Thank you!

EDIT:

The processes I mentioned before are threads, not independent processes as I thought (check this).

Every HTTP request creates a new thread, and that's OK, but the thread is not killed after the script ends.

Also, I found out that no new threads are created if the request handler doesn't run casper (I mean casper.run(..)).

So new threads are created only if the server runs a casper instance. The problem is that this instance doesn't end when the run function finishes.

I tried casper.done() as mentioned below, but it kills the whole process instead of just the currently running thread. (I did not find any documentation for this function.)

When I execute other casper scripts outside the server on the same machine, the threads they create and the whole phantom process end successfully. What could be happening?

I am using PhantomJS 2.1.1 and CasperJS 1.1.1.

Please ask me anything if you want more or specific information.

Thanks again for reading!

Alstrat

1 Answer

This is a well-known issue with casper:

https://github.com/casperjs/casperjs/issues/1355

It has not been fixed by the casper guys and is currently marked as an enhancement. I guess it's not on their priority list.

Anyway, the workaround is to write a server-side component, e.g. a node.js server, that handles the incoming requests and, for every request, runs a casper script in a new child process to do the scraping. The child process exits when casper finishes its job, so its threads and memory are released with it. This is not an optimal solution: opening a child process for every request is not cheap, and an approach like this will be hard to scale heavily. It is, however, a sufficient workaround. There is more on this approach in the link above.
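To make the workaround concrete, here is a minimal sketch of such a node.js server. It is only an illustration under some assumptions, not a tested drop-in: the names runInChild and createScrapeServer, the 30 second timeout, the url query parameter, and the script name scrape.js are all made up for this example, and it assumes the casper script prints its result to stdout and then exits.

```javascript
const http = require('http');
const { spawn } = require('child_process');

// Run a command in its own child process and collect its output.
// When the child exits, the OS reclaims all of its threads and memory.
// (runInChild is a hypothetical helper name, not part of casper or node.)
function runInChild(command, args, timeoutMs) {
  return new Promise((resolve, reject) => {
    const child = spawn(command, args);
    let stdout = '';
    let stderr = '';
    // Kill the child if it runs too long, so a stuck scrape cannot linger.
    const timer = setTimeout(() => child.kill('SIGKILL'), timeoutMs);
    child.stdout.on('data', (chunk) => { stdout += chunk; });
    child.stderr.on('data', (chunk) => { stderr += chunk; });
    // 'error' fires when the process could not be started at all.
    child.on('error', (err) => { clearTimeout(timer); reject(err); });
    // 'close' fires once the process has ended and its streams are closed.
    child.on('close', (code, signal) => {
      clearTimeout(timer);
      resolve({ code, signal, stdout, stderr });
    });
  });
}

// Build an HTTP server that starts one casper process per request.
// casperBinary and scriptPath are supplied by the caller (assumed values).
function createScrapeServer(casperBinary, scriptPath) {
  return http.createServer((req, res) => {
    // Assumption: the page to scrape is passed as ?url=... and the casper
    // script reads it with casper.cli.get('url').
    const target = new URL(req.url, 'http://localhost').searchParams.get('url') || '';
    runInChild(casperBinary, [scriptPath, '--url=' + target], 30000)
      .then((result) => {
        res.writeHead(result.code === 0 ? 200 : 502, { 'Content-Type': 'text/plain' });
        res.end(result.stdout);
      })
      .catch((err) => {
        res.writeHead(500, { 'Content-Type': 'text/plain' });
        res.end('Could not start child process: ' + err.message);
      });
  });
}
```

You would start it with something like createScrapeServer('casperjs', 'scrape.js').listen(8080). Since every request gets a fresh process, nothing casper leaves behind can pile up inside the server itself. The cost is the process startup on each request.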

Mostafa Ali
  • Hi, I tried this but it kills my server. I also tried other commands like casper.die(), but all of them end with the server down. I've updated the question. – Alstrat Jun 01 '16 at 05:43
  • What URL do you use to test this server when it runs locally? I got this running locally and played around a little to simplify the logic in the process_request function so that casper opens a few pages, and monitored it with htop. I do not see multiple threads being opened. I am testing with apache bench to send requests concurrently and I do not see this strange behaviour. How do you test this server? Please give me a URL you use. – Mostafa Ali Jun 01 '16 at 15:19
  • I can reproduce it now. I'll come back with a solution when I figure it out. It's very strange how lots of threads are opened with each incoming request. The server fails quickly indeed! – Mostafa Ali Jun 01 '16 at 15:38
  • I modified my answer for your convenience, this should be a better guide now. I'll have a go at fixing this issue myself and creating a pull request for the casper guys. I'll update here when I'm done :) Cheers! – Mostafa Ali Jun 01 '16 at 16:06
  • I'm trying to run it from a PHP server with a system call. Unfortunately, the program does not recognize the phantomjs binary. As you mentioned, I think that calling casperjs as a normal system command will solve my memory problems. Thank you so much. I'll update this question soon with my results. – Alstrat Jun 09 '16 at 05:01
  • Now I'm using Casper from PHP without any memory problems! Thank you again. The PhantomJS WebServer was the problem. – Alstrat Jun 13 '16 at 00:45