file: data.txt (11617 lines)

   user  datetime
   23   2015-03-01 08:04:15 
   15   2015-05-01 08:05:20 
  105   2015-05-01 08:07:10 
   15   2015-06-01 08:08:29 
  105   2015-06-01 08:12:48 

I only need the rows from 2015-06. I'm reading the file with fgets and checking each line's datetime, but it is really slow: more than 50 s.

    $d='data.txt';
    import($d);
    function import($d){
        $handle = fopen($d, "r") or die("Couldn't get handle");
        if ($handle) {
            while (!feof($handle)) {
                $buffer = fgets($handle, 4096);
                $line=explode("\t",$buffer);
                if(date("Y-m",strtotime($line[1]))=="2015-06"){
                   mysql_query("INSERT INTO `table` ....");
                }
                else{
                   //break? when month>6
                }
            }
            fclose($handle);
        }
    }

SOLUTION: less than 2s!!!! (thanks to Kevin P. and Dragon)

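            // inside the while loop:
            if(substr($line[1],0,7)=="2015-06"){
               // append one "(...)" tuple, comma-separated after the first
               $sql.=(empty($sql)?"":",")."('".$line[1]."'.........)";
            }
            elseif(substr($line[1],0,7)>"2015-06"){
               break;// when month>6
            }
            // after the loop: one query for all collected rows
            mysql_query("INSERT INTO `table` ....".$sql);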
xtr3mz
  • for the date compare, just use `substr()` `if( substr($line[1],0,7)== "2015-06"){ ...` –  Aug 11 '15 at 01:34
  • 1
    @jpw: if it's a web application, making a user wait 50 seconds is huge. – Amadan Aug 11 '15 at 01:37

3 Answers

Can't be helped; use something faster than PHP. For instance, you can use grep or awk to read the file and filter it quickly. For example:

    $lines = explode("\n", `awk '$2 ~ /^2015-06/ { print }' data.txt`);

EDIT: Also, with a length argument fgets is not guaranteed to give you whole lines. fgets($handle, 4096) returns at most 4095 bytes per call, so a longer line gets split in the middle, which will make the fragment not match if you are lucky, or break your code due to missed assumptions (such as the length of the $line array when constructing the SQL) if not.*


*) Or vice versa - it would be better for it to break completely, that is at least an obvious error yelling to be fixed; as opposed to silent data droppage.
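
If the filtering does stay in PHP, the points above can be sketched roughly as follows. This is only a sketch under assumptions: `filterMonth` is a hypothetical helper name, the columns are assumed to be tab-separated, and the file is assumed to be sorted by datetime so the loop can stop early. Calling fgets without a length argument reads up to the end of each line.

```php
<?php
// Hypothetical helper (not from the question): returns [user, datetime]
// pairs whose datetime starts with $month, e.g. "2015-06".
// Assumes tab-separated columns and rows sorted by datetime ascending.
function filterMonth(string $path, string $month): array
{
    $rows = [];
    $handle = fopen($path, 'r');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $path");
    }
    // Without a length argument, fgets() reads until the end of the line.
    while (($buffer = fgets($handle)) !== false) {
        $fields = explode("\t", rtrim($buffer, "\r\n"));
        if (count($fields) < 2) {
            continue; // blank or malformed line
        }
        $datetime = trim($fields[1]);
        // Skip the header row and anything that does not look like a date.
        if (!preg_match('/^\d{4}-\d{2}-\d{2}/', $datetime)) {
            continue;
        }
        $prefix = substr($datetime, 0, 7);
        if ($prefix === $month) {
            $rows[] = [trim($fields[0]), $datetime];
        } elseif ($prefix > $month) {
            break; // sorted input: nothing later can match
        }
    }
    fclose($handle);
    return $rows;
}
```

The string comparison on the "YYYY-MM" prefix works because that format sorts lexicographically in date order; no strtotime or date call is needed per line.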

Amadan

Maybe insert multiple entries into the DB at once instead of running a query for every matching line?

In which case it's similar to this
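
A minimal sketch of that batching idea, assuming the matching rows have already been collected as [user, datetime] pairs. The helper name `buildBatchInsert`, the table name `log`, and the column names are all hypothetical placeholders, since the question elides the real ones. It builds one multi-row INSERT with placeholders instead of concatenating values into the SQL string.

```php
<?php
// Hypothetical helper: builds a single multi-row INSERT statement plus the
// flat parameter list for a prepared statement.
// Table and column names are placeholders, not taken from the question.
function buildBatchInsert(array $rows): array
{
    if ($rows === []) {
        return ['', []]; // nothing to insert
    }
    // One "(?, ?)" group per row, joined by commas.
    $placeholders = implode(',', array_fill(0, count($rows), '(?, ?)'));
    $sql = "INSERT INTO `log` (`user`, `datetime`) VALUES $placeholders";

    // Flatten the rows in the same order as the placeholders.
    $params = [];
    foreach ($rows as [$user, $datetime]) {
        $params[] = $user;
        $params[] = $datetime;
    }
    return [$sql, $params];
}

// Usage with PDO (the $pdo connection is an assumption):
// [$sql, $params] = buildBatchInsert($rows);
// if ($sql !== '') {
//     $pdo->prepare($sql)->execute($params);
// }
```

For very large batches it may be worth splitting the rows into chunks (for example with array_chunk) so a single statement stays under the server's packet size limit.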

Kevin P.

Maybe you should use grep to filter out the lines you do not need.

GerBawn