2

I just joined a project, and have been going over the code. We need to export a lot of data out to Excel for our internal users. There are roughly 5 people who would have access to this functionality at a given time. In order to output to Excel, here's what I found:

  • retrieve data from DB, store in $_SESSION
  • show HTML page view of data

  • when the user wants to export

    • retrieve the DB data from $_SESSION
    • create a string in memory of a CSV
    • print the HTTP Headers with Excel as the filetype
    • print out the CSV formatted strings

This storage in $_SESSION is happening even when the user is not going to export. This strikes me as terribly inefficient, since the $_SESSION variable could explode in size, since each of the DB table retrievals can be up to 30MB per table, and the expiration on $_SESSION is set to 24 hours. So potentially, there could up to 5 users in the system, with up to 150MB $_SESSION variables. Sound about right?

Has anyone else ever seen something like this? Is this like a bootleg Memcache? Wouldn't it be best to just write the data to a flat-file that's updated once every few hours?

Apollo Clark
  • 806
  • 9
  • 16

4 Answers4

2

I do store database some data in session, like ID or small object that I use on every page.

But when it come to larger dataset that I can't extract on the fly for each page load, I often prefer to store them in a MEMORY/HEAP table ( or a temporary file ), and just store an ID in the session so I'll be able to extract them easily.

You might want to take a look at this question about session max size: Maximum size of a PHP session

Community
  • 1
  • 1
FMaz008
  • 11,161
  • 19
  • 68
  • 100
0

I have seen this as well and it is a bad idea. Sometimes you want to display on screen a table of data but also make it available for export, but there is no good reason for stuffing into session memory. If the OS needs to swap and the session gets written to file then you have file IO speed issues, so in some cases it is likely slower than a fresh query to the database.

Duane Gran
  • 490
  • 4
  • 9
0

$_SESSION in your example is being used to store data which is not needed in order to ensure consistency across page views, so this is pointless. It's there to store stuff like last_page_viewed, not to duplicate the DB itself. The only reason to do it like that is if the DB calls to get the data are so hideously expensive that even with the storage inefficiency you describe, it improves performance. This is very unlikely and it sounds like lazy coding.

If they want to export, there should be a zip function that reads all the data using the same SQL and packages it into an excel file on demand. Ideally using MVC so that the same code can be fed into the HTML or the zipper function ;)

Matt Gibson
  • 14,616
  • 7
  • 47
  • 79
0

Your solution could work if your database not update frequently otherwise your users may get outdated data. (And I don't think it's worth to store in the session data anyway.)

As you explained here, I think you are going to use this in a LAN and you don't have more than 5 concurrent users. If I'm right about that why don't you just read database straightly from the database and show it on HTML(I guess you can use paging and don't want to show all 30MB data in a single HTML page) again Export all the data to Excel straight from the DB when user request it :)

Dasun
  • 3,244
  • 1
  • 29
  • 40
  • Good points! Ultimately, this isn't my code, this some WTF code that I've found on my new project. I can understand why they decided to cache is, since some of these queries are very complex and take a long time to run (upwards of several minutes). I think I'll refactor this over to a MEMORY table in MySQL. – Apollo Clark Feb 14 '12 at 18:16