
$networks is a string with multiple values separated by commas. Example:

$networks = "Programming,Movies,Hollywood". (There can be around 150 values or more.)    

Now the MySQL query goes here:

$query = "SELECT * FROM posts
          WHERE ( isActive = 1 AND 
                  isDeleted = 0 AND 
                  onid = '0' AND 
                  question = 0 AND 
                  ( ".$networkqur." )
                ) 
          ORDER BY id DESC
          LIMIT 0,100";

Here $networkqur is built as:

$network = explode(',',$networks);

$networkqur = '';
for( $i = 0; $i < count($network); $i++ ){
    $networkqur .= 'networks LIKE "%'.$network[$i].'%"';
    if( $i != count($network) - 1 ){
        $networkqur .= ' OR ';
    }
}

So effectively $networkqur becomes:

$networkqur = 'networks LIKE "%Programming%" OR 
               networks LIKE "%Movies%" OR 
               networks LIKE "%Hollywood%"';

And the effective query becomes:

SELECT * FROM posts
WHERE ( isActive = 1 AND 
        isDeleted = 0 AND 
        onid = '0' AND 
        question = 0 AND 
        ( networks LIKE "%Programming%" OR 
          networks LIKE "%Movies%" OR 
          networks LIKE "%Hollywood%" ) 
      ) 
ORDER BY id DESC
LIMIT 0,100;

Now, as we can see, there can be 150 OR conditions in the query, plus multiple AND conditions, with sorting at the end.
Is there any replacement for this particular query, or any other faster way to execute it?

RiggsFolly
jackkorbin
  • You need to break `networks` out into its own table and start using `JOIN`s and `IN` – sjagr Dec 08 '14 at 20:13
  • ^ But don't you think breaking into other table would increase the complexity and also the time? –  Dec 08 '14 at 20:15
  • I don't think, there is any quicker solution possible. You can use regexp, but I think it's even a bit slower (but easier to write in this case). – Rikudou_Sennin Dec 08 '14 at 20:15
  • @Namrata123 Increase the complexity from a multitude of `LIKE` conditions with wildcards through repeated row data in a single column?? Hell no! There's a reason this technology is called a _relational database_. – sjagr Dec 08 '14 at 20:16
  • You're using `LIKE '%...%'`. There is pretty much NOTHING you can do to speed up those queries - they cannot use indexes. You should switch to a full-text search instead, which at least CAN be indexed. – Marc B Dec 08 '14 at 20:21
  • take a look at this previous post: http://stackoverflow.com/questions/13791656/mysql-query-run-faster – unixmiah Dec 08 '14 at 20:23
  • @MarcB Can you please elaborate your solution? – jackkorbin Dec 08 '14 at 20:42
  • This is yet another "I'm breaking [First Normal Form](https://en.wikipedia.org/wiki/First_normal_form) and now it's making my life miserable. How do I make it faster/easier/less complex?" question. The answer, as always, is *don't break First Normal Form*. – Bacon Bits Dec 08 '14 at 20:43
  • http://dev.mysql.com/doc/refman/5.0/en/fulltext-search.html – Marc B Dec 08 '14 at 20:43
  • ^ i guess yes. Because the "networks" column also has the values separated by commas. So we have to match many to many and check if one exist or not. – jackkorbin Dec 08 '14 at 20:45
  • Check this question: http://stackoverflow.com/questions/22225820/mysql-query-with-multiple-or-statements – SaidbakR Dec 08 '14 at 20:53
  • I would create a new table so you can do `=/IN` comparisons on these tags. In the short term, you could mirror them from your `networks` table by running a cron update, until such time as no code writes to the old location. You can then drop the column. – halfer Dec 08 '14 at 21:20
  • Where is @Bill Karwin? – HddnTHA Dec 08 '14 at 21:40
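Marc B's full-text suggestion from the comments above would look roughly like this (a sketch, not from the original post; it assumes MySQL 5.6+ for InnoDB `FULLTEXT` support - on 5.5 and earlier the table would need to be MyISAM - and note it still leaves multiple values stored in one column):

```sql
-- One-time: add a full-text index on the comma-separated column
ALTER TABLE posts ADD FULLTEXT INDEX ft_networks (networks);

-- Then match terms through the index instead of LIKE '%...%'
SELECT *
FROM posts
WHERE isActive = 1 AND isDeleted = 0 AND onid = '0' AND question = 0
  AND MATCH(networks) AGAINST('Programming Movies Hollywood' IN BOOLEAN MODE)
ORDER BY id DESC
LIMIT 0, 100;
```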

2 Answers

2

You have made a terrible mistake in the design of your database. There are six rules of database normalization which should be followed when designing a database, and you've broken the first and most important one -- store one and only one piece of information in each column.

There is no way to optimize your current design short of implementing a complex full-text indexing system on top of your database. In addition, there's no way to enforce relational integrity (checking that only permitted network values are stored). You cannot index those values and you cannot use them in JOINs. Furthermore, you have an embedded-string issue in which searching for, say, "Holly" will match "Holly", "Hollywood", and "Holly Hunter".

You should instead break that networks column out into a separate table named post_networks with columns post_id and network. Add one row per post-and-network combination, add a PRIMARY KEY on (post_id, network) and possibly an index on (network, post_id), and you will be able to do this query (and lots more) more quickly and securely.
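A minimal sketch of that schema and the rewritten query (table and column names as suggested above; the posts primary key column is assumed to be `id`, and the VARCHAR width is illustrative):

```sql
-- One row per (post, network) pair instead of a comma-separated column
CREATE TABLE post_networks (
    post_id INT NOT NULL,
    network VARCHAR(64) NOT NULL,
    PRIMARY KEY (post_id, network),
    KEY idx_network_post (network, post_id)
);

-- Exact, index-backed matching replaces the chain of LIKE '%...%' conditions
SELECT DISTINCT p.*
FROM posts p
JOIN post_networks pn ON pn.post_id = p.id
WHERE p.isActive = 1
  AND p.isDeleted = 0
  AND p.onid = '0'
  AND p.question = 0
  AND pn.network IN ('Programming', 'Movies', 'Hollywood')
ORDER BY p.id DESC
LIMIT 0, 100;
```

DISTINCT guards against a post appearing once per matched network; with 150 terms the IN list stays a single indexed lookup rather than 150 wildcard scans.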

Larry Lustig
  • As I said, a user can have 150 networks; that means in the separate user-network-relationship table there can be 100k rows for 1k users. Don't you think it can be trouble iterating 100k rows? – jackkorbin Dec 08 '14 at 21:08
  • Just because your table will be pretty small (100,000 rows) is no reason not to create the structure correctly. In fact, I would create another table, even smaller (150 rows) to list the allowable networks and then "protect" the `network` field in `post_networks` as a FOREIGN KEY into the smaller lookup table. – Larry Lustig Dec 08 '14 at 21:10
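The lookup-table idea from the comment above, sketched in SQL (constraint and column names are illustrative, and this assumes the post_networks table from the answer exists):

```sql
-- Small lookup table (~150 rows) of allowable networks
CREATE TABLE networks (
    name VARCHAR(64) NOT NULL PRIMARY KEY
);

-- "Protect" post_networks.network with a foreign key into the lookup table,
-- so only permitted network values can ever be stored
ALTER TABLE post_networks
    ADD CONSTRAINT fk_post_networks_network
    FOREIGN KEY (network) REFERENCES networks (name);
```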
0

If your list of terms is relatively fixed (i.e. not user-generated), then one solution here would be to build your own indexing table and then do your checks on that. If you only touch the index when adding/editing the main data, this could greatly improve the speed of your SELECT statements. Your index table would have your main key per record (ideally a PRIMARY KEY) and then the reference to what you're looking for. You could then add an INDEX on the term column, and your speed should increase dramatically because you're doing a lookup on an indexed column.

Ideally, you should never do lookups on unindexed data.
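The indexing table described above might be sketched like this (a guess at the intended structure; table and column names are illustrative, and the table is assumed to be maintained whenever the main row is added or edited):

```sql
-- Maintained on INSERT/UPDATE of posts, not at query time
CREATE TABLE post_terms (
    post_id INT NOT NULL,
    term VARCHAR(64) NOT NULL,
    PRIMARY KEY (post_id, term),
    INDEX idx_term (term)   -- SELECT lookups hit this index
);

-- Checks become indexed lookups instead of wildcard scans
SELECT DISTINCT post_id
FROM post_terms
WHERE term IN ('Programming', 'Movies', 'Hollywood');
```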

Machavity