I have a database with ~100000 rows and ~150 columns of data in the format:
data_id data_name monthly_data1 monthly_data2 monthly_data3 "" ""
0 product1_data-item1 20 30 10 "" ""
1 product1 data-item2 10 30 20 "" ""
2 product1 data-item3 9 10 23 "" ""
3 product2 data-item1 40 13 12 "" ""
4 product2 data-item2 31 12 32 "" ""
5 product2 data-item3 23 49 23 "" ""
The data set above is a basic sample of the data, in reality, there are over 2000 products with 50+ data-items each (approx 100,000 rows) and approx 140 columns of data which is annual data.
I need to search the database for each data item - for each product (i.e. each row) and determine if the value in the columns month_data1 through month_data140 for each data-item is within a pre-determined minimum/ maximum range for that specific data item.
Here is the format of my code as is, it is working, just very slowly, approx 20 seconds to complete all 50 checks for each product for each year.
$numberProducts = 2000;
$numberLineItems = 50;
$numberMonths = 140;
for($m=0;$m<$numberMonths;$m++){
for($p=0;$p$numberProducts;$p++){
$dataMonth = 'data_month'.$m+1;
$q="SELECT $dataMonth FROM product_table WHERE data_id='".($p*$numberLineItems)."'";
$q=mysql_query($q);
while($row=mysql_fetch_assoc($q)){
$dataVal = $row[$dataMonth];
}
mysql_free_result($q);
if(($dataVal>=$dataMin1)&&($dataVal<=$dataMax1)){
$q="SELECT $dataMonth FROM product_table WHERE data_id='".($p*$numberLineItems+1)."'";
$q=mysql_query($q);
while($row=mysql_fetch_assoc($q)){
$dataVal = $row[$dataMonth];
}
mysql_free_result($q);
if(($dataVal>=$dataMin2)&&($dataVal<=$dataMax2)){
$q="SELECT $dataMonth FROM product_table WHERE data_id='".($p*$numberLineItems+2)."'";
$q=mysql_query($q);
while($row=mysql_fetch_assoc($q)){
$dataVal = $row[$dataMonth];
}
mysql_free_result($q);
if(($dataVal>=$dataMin3)&&($dataVal<=$dataMax3)){
.
.
.
AND SO ON. All the way to the 50th data item for each product for each month, checking to see if the monthly value for that data item for that product is within a predetermined range --- The pre-determined range ( dataMin/dataMax) is different for each separate data item per product but the same per specific data item between products.
I am looking for a way to speed up the code, different query combination, server setting, loop style, etc. which may help to omptimize things and get down to just a few seconds required for output. ANY ideas would be appreciated.
My first thought was to change the select statement to select the entire database $q = "SELECT * FROM product_table"; and put the data into a multidimensional array to do the min/max checks and avoid the 14,000,000 queries but I get hit the "out of memory" limit.
There has got to be a better way...