
Objective: I'm trying to determine the best approach for table structure and queries; specifically, how best to create composite index keys to optimise a four-table join with about 500k rows in each table.
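For illustration, here is a minimal sketch of the idea (the table and column names are hypothetical, not from the linked post): a composite index should lead with the columns the JOIN and WHERE clauses filter on, and EXPLAIN will confirm whether MySQL actually uses it.

```sql
-- Hypothetical schema; names are placeholders for illustration only.
CREATE TABLE orders (
    id          INT UNSIGNED NOT NULL AUTO_INCREMENT,
    customer_id INT UNSIGNED NOT NULL,
    status_id   TINYINT UNSIGNED NOT NULL,
    created_at  DATETIME NOT NULL,
    PRIMARY KEY (id),
    -- Composite key: the leftmost columns must match the query's equality filters.
    KEY idx_cust_status (customer_id, status_id, created_at)
) ENGINE=InnoDB;

-- EXPLAIN shows whether the optimiser picks idx_cust_status.
EXPLAIN SELECT * FROM orders
WHERE customer_id = 42 AND status_id = 1
ORDER BY created_at;
```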

The Mental Block: It's building the tables and filling them with test data that I'm lousy at (SQL rookie). This SO post describes 75% of what I'm trying to solve, and I think I could complete my understanding if I could just get the answer's code to run on my localhost (Apache 2, MySQL 5.1, phpMyAdmin).

The Request for help: Jake Feasel's sqlfiddle site is a great tool, but according to him it does not scale to that level. Still, can someone use sqlfiddle to show me how I might build this environment on my local machine? I hope to extrapolate your sqlfiddle post to replicate the example code I mentioned above.

An assumption: Most of my SQL experience has been tweaking an ADOdb Database Abstraction Library, but I realise MySQL code will differ slightly in the testing environment. Still, I assume the optimisation knowledge I'd gain by learning how to set up a local SQL testing environment, and the subsequent db structure and queries, will be applicable and beneficial when implemented in an ADOdb environment.

Ricalsin
  • You're going to need to provide more detail as to the data structure you're trying to represent, as well as the result set you'd like to produce. – Jake Feasel May 28 '12 at 18:21
  • @Jake_Feasel, I'm just trying to understand how to setup an environment to run the posted example code to the SO question I linked to. Once I get that then if I still need help solving my specific problem then I can come back to SO with a link to my own sqlfiddle example. – Ricalsin May 28 '12 at 18:25
  • Sorry, I hadn't read closely enough before. See my answer below. – Jake Feasel May 28 '12 at 18:44
  • BTW, sqlfiddle is a great tool to help formulate queries that work. Thank you for making this site and for posting your code on GitHub; I plan on checking out your Twitter Bootstrap usage. Your site does look, feel and work great. :) – Ricalsin May 29 '12 at 19:50

2 Answers


What you really need is a data generator tool that will help you populate a database with thousands or millions of records. Once you have a bloated database with meaningful data, you can start your performance tests, experimenting with the relationships, indexes and joins that will help you detect what really needs to be optimized.

One that I have personally used in the past is GenerateData, but there are others.
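If you'd rather stay inside MySQL than use an external generator, a self-join trick can bloat a table quickly. A sketch with placeholder table and column names:

```sql
-- Placeholder table; seed a handful of rows, then let INSERT ... SELECT multiply them.
CREATE TABLE test_data (
    id   INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    name VARCHAR(64) NOT NULL
);

INSERT INTO test_data (name) VALUES
('a'),('b'),('c'),('d'),('e'),('f'),('g'),('h'),('i'),('j');

-- Each pass inserts COUNT(*) squared new rows (10 -> 110 total, then ~12k, and so on);
-- cap with LIMIT once a pass would overshoot the row count you need.
INSERT INTO test_data (name)
SELECT CONCAT(t1.name, '-', t2.name)
FROM test_data AS t1
CROSS JOIN test_data AS t2
LIMIT 500000;
```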

jfoliveira

http://sqlfiddle.com/#!2/b771b/2

Here, you can see that I've copied your table definitions and pasted them into the schema panel. I've also added some dummy data into each of them. Using that basic structure, I've executed your query as well; you can see that it produces the same execution plan that your pastie is displaying (expand the "+ View Execution Plan" link on the fiddle).

You can see that it does run fine on SQL Fiddle. That addresses the question in your title ("How to create a SQL test environment"). What part of your other request "how I might build this environment on my local machine?" are you having trouble with?

Jake Feasel
  • So if I want to add 500k names into the poster table field - in order to test the access times - would I use a random number generator and loop through it, or use an editor like vim to generate a comma-delimited text string and load it into the INSERT statement (and if so, how)? I'm just wondering what the best approach is to basically do what you just did, but with 500k rows and without commenting out the WHERE clause. I realise sqlfiddle does not scale to that level, but if I can see how it's done for 5 rows then I can extrapolate that into my local setup. – Ricalsin May 28 '12 at 19:55
  • @Ricalsin generating random data for your inserts would certainly work. Just realize that your execution plans are going to be the same regardless of 5 rows or 500k. Considering this is such a simple query, however, I suspect you'll get most of your performance gains by optimizing hardware and operating system settings, rather than query tuning. – Jake Feasel May 28 '12 at 20:03
  • Jake, this isn't my query. My query will be joining 4 tables using multiple composite indexes (described in the first line of my post). My need was to understand how to build a similar test environment (locally) and I used an SO post to request how to setup an environment with 500k rows inside a table. Apparently it's not as easy as it seems. I've read where you can: `INSERT INTO MyTable(ID, Name) SELECT ID, NAME FROM OtherTable` but that requires you to have that info in your db - which I do not right now. I cannot test access times using 5 rows, I need to study the efficiency of a plan. – Ricalsin May 28 '12 at 20:11
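One way to get 500k rows without any pre-existing source data (a sketch; `poster` and its columns are guesses at the table described above) is to seed a single row and let repeated `INSERT ... SELECT` passes double the table, using `UUID()` so every name is distinct:

```sql
CREATE TABLE poster (
    id   INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    name VARCHAR(36) NOT NULL
);

-- Seed one row; each pass below then doubles the table.
-- 2^19 = 524,288, so the loop finishes in about 19 iterations.
INSERT INTO poster (name) VALUES (UUID());

DELIMITER //
CREATE PROCEDURE fill_poster()
BEGIN
    WHILE (SELECT COUNT(*) FROM poster) < 500000 DO
        -- UUID() is re-evaluated per row, so every inserted name differs.
        INSERT INTO poster (name) SELECT UUID() FROM poster;
    END WHILE;
END //
DELIMITER ;

CALL fill_poster();
```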