4

I am fairly new when it comes to designing databases and for a project I need to set one up now. I'm not sure if my question makes sense but I have two pieces of data for example, daily weather and the other list of cities.

Every day(or more often) daily weather will get updated with new weather for each region, but also as new cities get added the cities database will grow also(not as quickly as we collect daily weather but it can grow as we add more cities and even countries)..each Table has other tables to enrich it, such as cities have other tables with stats on the city(population, current mayor, etc..)

The question is as the data gets larger and larger, is it better to have weather in one table and cities in another table or to have one database each for these two and keep everything sperate. The reason I am asking is for two reasons. One as both get larger, is it easier to move one database to a different physical location(i.e. different server) or can I do it with tables as well? Also someone told me for future performance you need to keep your indexes as small as possible and not too larger or it'll be slower, I thought of breaking data up between databases as a way of doing it. And lastly, in the future I may want to put out an API that gives people access to one and not the other(i.e. only weather but no city stats or vise versa) or give api access to both. I'm kind of more concerned about scale as if things go well then by the nature of this system, the data will always keep growing.

Sorry for this long question..I'm just a bit confused on which approach to take. Lastly, if I do go down the wrong road..how easy is it to recover in mysql? say I create one database but later realize I need two or vise versa.

Lostsoul
  • 25,013
  • 48
  • 144
  • 239
  • What sizes are you dealing with here? –  Mar 14 '11 at 20:27
  • Well right now we're estimating about 5 gigs but as we add more data sources and columns, the size might snowball. – Lostsoul Mar 14 '11 at 20:33
  • What is your snowball estimate? You need to consider how big your database will be in the next *n* months or years to design it properly, or ask others how they plan to design it. That will dictate all your other measures. –  Mar 14 '11 at 20:35
  • before you start prematurely optimising your design take a moment to read this to see what you can do with a well crafted normalised innodb engine based design http://stackoverflow.com/questions/4419499/mysql-nosql-help-me-to-choose-the-right-one-on-a/4421601#4421601 – Jon Black Mar 14 '11 at 20:40
  • To be honest, I'm not sure how it'll grow. We plan to add data sources based on clients requests. So Client says "your info is great but it'd be awesome if you had state information in addition to cities" then we'll go and grab that and our database will all of a sudden grow. We plan to continually add new data sources to our database. This is the confusing part because there's an unknown growth component. With the existing data model, it'll probably grow slowly(daily numeric info), until we add another gig from another source then it might be slow again then more and so on.. – Lostsoul Mar 14 '11 at 20:50

1 Answers1

2

In MySQL, your database is the same as your schema. Your tables are the entities that contain columns and rows.

You want to normalize your tables. Certainly, read the Wikipedia information on database normalization. However, to make it easier, i'll condense database normalization to one easy to remember rule.

I solemnly swear that each data element relies on the key, the whole key, and nothing but the key, so help me Codd.

As far as to whether your tables should be in more than one database or schema, I'd suggest that no, it's not necessary to have your tables in multiple schemas or databases.

Gilbert Le Blanc
  • 50,182
  • 6
  • 67
  • 111