
I can't solve this problem by myself after several days of trying already. This is the problem:

We need to display information on the screen (HTML) that is being generated in real time inside a PHP file.

The PHP script performs very active crawling and returns huge arrays of URLs. Each URL needs to be displayed in the HTML in real time, as soon as PHP captures it, which is why we are using ob_flush() and flush() to echo and print the arrays as soon as we get them.

Meanwhile we need to display this information somehow so users can see it while the script works (since it could take more than an hour to finish).

As far as I understand, this can't be done with AJAX, since we would need to make only one request and keep reading the information inside the array. I'm not totally sure whether Comet can do something like this either, since it would interrupt the connection as soon as it gets new information, and the array is growing very rapidly.

Additionally, and just to make things more complex, there's no real need to print or echo the URLs inside the array, since the HTML file is included as the user interface of the same file that is processing and generating the array we need to display.

Long story short; we need to place here:

<ul>
    <li></li>
    <li></li>
    <li></li>
    <li></li>
    <li></li>
    ...
</ul>

A never-ending, real-time-updated list of URLs being generated and pushed into an array, 1,000 lines below, in a PHP loop.

Any help would be really more than appreciated. Thanks in advance!

Chris Russo
    Would it work to write the PHP array to a file and parse it with Javascript? You could ajax the file any time you want to, and it will have all of the results while the PHP continuously adds to it. – Jon Egeland Sep 14 '12 at 20:54
  • Hi Jon, and thanks a lot for your quick response. It's a good approach, and we have already been thinking about that, but it's not possible since the load is already really huge. – Chris Russo Sep 14 '12 at 20:58
  • was curious if sockets worked out for you.... – Stephen Sep 18 '12 at 13:30

5 Answers


Try WebSockets.

They offer real-time communication between client and server, and using socket.io provides cross-browser compatibility. It gives you basically the same results as long-polling/Comet, but with less overhead between requests, so it's faster.

In this case you would use WebSockets to send updates to the client about the current status of the processing (or whatever it is doing).

See this: Using PHP with Socket.io
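A minimal sketch of the idea, assuming socket.io is installed on a Node.js relay that the PHP crawler hands URLs to (the names startRelay, batchUrls, and push are illustrative, not part of the linked article):

```javascript
// Group freshly crawled URLs into messages of at most `size` items,
// so one emit carries a manageable payload.
function batchUrls(urls, size) {
  const batches = [];
  for (let i = 0; i < urls.length; i += size) {
    batches.push(urls.slice(i, i + size));
  }
  return batches;
}

// Hypothetical relay: browsers connect via socket.io and receive every
// URL found so far, then live batches as the crawler reports them.
function startRelay(port) {
  // Lazy require so batchUrls can be used without socket.io installed.
  const io = require('socket.io')(port);
  const pending = [];
  io.on('connection', (socket) => {
    // New client: send everything collected so far, then live updates.
    socket.emit('urls', pending);
  });
  // In a real setup the PHP crawler would call this (e.g. via an HTTP
  // endpoint or a queue) each time it captures new URLs.
  return function push(urls) {
    pending.push(...urls);
    for (const batch of batchUrls(urls, 100)) {
      io.emit('urls', batch);
    }
  };
}
```

The client side would simply listen for the 'urls' event and append each received URL as an `<li>`.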

Jamund Ferguson
  • I've been urged a lot here not to just post a link as an answer, and I suggest you do the same, by explaining why WebSockets would be more ideal than the answers already posted. And you are right, that is the best solution of them all .. ;) – dbf Sep 14 '12 at 21:01
  • I'm familiar with them, and this sounds like a good possibility as well. I will take a look at this. Is there any solution you would suggest for making this implementation? Thanks a lot! – Chris Russo Sep 14 '12 at 21:10
  • Thanks, we're proceeding with this approach!! – Chris Russo Sep 14 '12 at 21:27

I think the best way to do this would be to have the first PHP script save each record to a database (MySQL or SQLite perhaps), and then have a second PHP script which reads from the database and outputs the newest records. Then use AJAX to call this script every so often and add the records it sends to your table. You will have to find a way of triggering the first script.

The JavaScript should record the id of the last URL it already has and send it in the AJAX request; then PHP can select all rows with ids greater than that.

If the number of URLs is so huge that you can't store a database that large on your server (one might ask how a browser is going to cope with a table as large as that!) then you could always have the PHP script which outputs the most recent records delete them from the database as well.
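The "send the last id you have" contract described above can be sketched with plain functions, using an in-memory array to stand in for the database table (selectNewRows and nextRequest are illustrative names; the real server side would be a `SELECT ... WHERE id > ?` in PHP):

```javascript
// Server side of the contract: everything newer than lastId.
function selectNewRows(rows, lastId) {
  return rows.filter((r) => r.id > lastId);
}

// Client side: remember the highest id seen so far for the next request.
function nextRequest(lastId, received) {
  for (const r of received) {
    if (r.id > lastId) lastId = r.id;
  }
  return lastId;
}

// Simulated exchange: the table grows while the client polls.
const table = [
  { id: 1, url: 'http://example.com/a' },
  { id: 2, url: 'http://example.com/b' },
  { id: 3, url: 'http://example.com/c' },
];
let lastId = 1; // the client already has row 1
const fresh = selectNewRows(table, lastId); // rows 2 and 3
lastId = nextRequest(lastId, fresh); // now 3
```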

Edit: When doing a lot of MySQL inserts there are several things you can do to speed them up. There is an excellent answer here detailing them. In short, use MyISAM, and insert as many rows as you can in a single query (keep a buffer array in PHP, add URLs to it, and when it is full insert the whole buffer in one query).
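The buffer-then-batch-insert idea (the answer describes doing it in PHP) can be sketched as follows; UrlBuffer and makeBatchInsert are illustrative names, the naive quote-doubling here is only a placeholder for real parameterized escaping:

```javascript
// Build one statement inserting every buffered row at once.
function makeBatchInsert(table, column, urls) {
  const values = urls.map((u) => `('${u.replace(/'/g, "''")}')`).join(', ');
  return `INSERT INTO ${table} (${column}) VALUES ${values}`;
}

class UrlBuffer {
  constructor(limit, execute) {
    this.limit = limit;     // flush once this many URLs are buffered
    this.execute = execute; // callback receiving the SQL string
    this.urls = [];
  }
  add(url) {
    this.urls.push(url);
    if (this.urls.length >= this.limit) this.flush();
  }
  flush() {
    if (this.urls.length === 0) return;
    this.execute(makeBatchInsert('urls', 'url', this.urls));
    this.urls = [];
  }
}
```

The crawler calls add() per URL and flush() once at the end, so a run of N URLs costs roughly N/limit queries instead of N.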

gandaliter
  • Hi Gandaliter, thanks a lot for your comment! It's the fastest implementation we have tried in the system, but it still strongly impacts performance, since it's a huge number of insert queries. Additionally, it implies creating a second file just to read the information that is being processed at the same origin. Does anything else come to your mind? – Chris Russo Sep 14 '12 at 21:03

Suppose you used a scheme where PHP was writing to a Memcached server.

Each record you write gets a key: rec1, rec2, rec3, and so on.

You also store a current_min and a current_max.

You have the user constantly polling with AJAX. With each request they include the last key they saw; call this k. The server then returns all the records from k to max.

If no records are immediately available, the server goes into a wait loop for a maximum of, say, 3 seconds, checking for new records every 100 ms.

If records become available, they are sent immediately.

Whenever the client receives updates, or the connection is terminated, it immediately starts a new request...

Writing a new record is just a matter of inserting at max+1 and incrementing min and max, where max-min is the number of records you want to keep available...
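The key scheme above can be simulated with a plain object standing in for Memcached (the current_min/current_max names follow the answer; writeRecord and readSince are illustrative, not the real memcached client API):

```javascript
// Plain object standing in for the Memcached store.
const store = { current_min: 1, current_max: 0 };

// Append a record at max+1, evicting old ones so at most `keep` remain.
function writeRecord(url, keep) {
  store.current_max += 1;
  store['rec' + store.current_max] = url;
  while (store.current_max - store.current_min + 1 > keep) {
    delete store['rec' + store.current_min];
    store.current_min += 1;
  }
}

// Return every record after key k (the last one the client saw).
function readSince(k) {
  const from = Math.max(k + 1, store.current_min);
  const out = [];
  for (let i = from; i <= store.current_max; i++) {
    out.push(store['rec' + i]);
  }
  return out;
}
```

Clamping the start of the read to current_min handles clients whose last-seen key has already been evicted.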

Stephen
  • Thanks Stephen, it's an interesting approach. Do you think it should work faster than the database or the socket options? – Chris Russo Sep 14 '12 at 21:17
  • Certainly faster than a DB option, as the Memcached inserts/deletes would be O(1) and stored completely in memory. WebSockets are probably optimal for sending the data, but they are somewhat inconsistently supported by browsers at the moment. FF briefly removed them from FF4 and put them back in for FF5 because a security concern was found at the standards level. – Stephen Sep 14 '12 at 21:18
  • Thanks a lot Stephen, a great solution as well. – Chris Russo Sep 14 '12 at 21:29

An alternative to WebSockets is COMET.

I wrote an article about this, along with a follow-up describing my experiences.

COMET in my experience is fast. WebSockets are definitely the future, but if you're in a situation where you just need to get it done, you can have COMET up and running in under an hour.

Definitely some sort of shared-memory structure is needed here: perhaps an in-memory temp table in your database, or Memcached as Stephen already suggested.
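The COMET/long-poll cycle (one pending request, answered as soon as data exists or after the server's wait window, then immediately re-issued) can be sketched with the HTTP round trip abstracted away; longPollLoop and poll are illustrative names, and `rounds` only bounds the simulation:

```javascript
// `poll` stands in for one long-poll round trip: given the last id the
// client saw, it returns newer records, possibly none (a timeout).
function longPollLoop(poll, rounds) {
  const received = [];
  let lastId = 0;
  for (let i = 0; i < rounds; i++) {
    const records = poll(lastId); // blocks server-side until data or timeout
    for (const r of records) {
      received.push(r.url);
      lastId = r.id;
    }
    // A real client loops forever, re-requesting immediately.
  }
  return received;
}

// Stubbed server responses: two empty rounds simulate timeouts.
const responses = [
  [],
  [{ id: 1, url: 'http://a' }],
  [],
  [{ id: 2, url: 'http://b' }, { id: 3, url: 'http://c' }],
];
let round = 0;
const poll = () => responses[round++] || [];
const urls = longPollLoop(poll, 4);
```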

Tim G
If I were you, I would try to solve this in two ways.

First of all, I would encode the output array as JSON, and with JavaScript's setTimeout function decode it and append it to <ul id="appendHere"></ul>, so that when the list is updated it will automatically update itself, like a cron job in JS.

The second way: if you can't take output while processing, then inserting the data into MySQL is meaningless, I think; use MongoDB or similar to increase speed. That way you'll get what you need with your key and never duplicate an inserted value.
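A sketch of the setTimeout-based refresh described above; the "appendHere" id matches the <ul> in the answer, while renderItems, refreshLoop, and fetchJson are illustrative names (fetchJson stands in for the AJAX + JSON.parse step):

```javascript
// Turn newly decoded JSON entries into <li> markup for the <ul>.
function renderItems(urls) {
  return urls.map((u) => '<li>' + u + '</li>').join('');
}

// Like a cron job in JS: fetch, append, and re-schedule forever.
function refreshLoop(fetchJson, intervalMs) {
  const tick = () => {
    const urls = fetchJson();
    // Guarded so the helper above can also run outside a browser.
    if (typeof document !== 'undefined') {
      document.getElementById('appendHere').innerHTML += renderItems(urls);
    }
    setTimeout(tick, intervalMs);
  };
  tick();
}
```

In the page you would call something like `refreshLoop(loadLatestUrls, 1000)` after the <ul> exists.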

Emre Karataşoğlu