2

I have a database that's formatted like this:

URLID   USERNAME   URL
0       jake       abc.com
1       jake       123.com
2       po         google.com
3       po         googl3.com
4       po         ranoutoffakeurlslol.com

However, I want it to be something like this:

USERNAME    URLS
jake        {abc.com, 123.com}
po          {google.com, googl3.com, ranoutoffakeurlslol.com}

How would I do it? If it's not possible, then, how can I make method 1 more efficient? Thanks in advance!

marc_s
  • 732,580
  • 175
  • 1,330
  • 1,459
Shahar
  • 1,687
  • 2
  • 12
  • 18

2 Answers2

1

Since this is a one-to-many relationship (i.e. one user can be associated with many URLs) you should create a separate table for storing the URLs:

URLID   USERNAME
1       jake
2       po

URLID   URL
1       abc.com
1       123.com
2       google.com
2       googl3.com
2       ranoutoffakeurlslol.com
weenoid
  • 1,156
  • 2
  • 11
  • 24
  • It's no less efficient than extracting the values via some sort of string manipulation. To keep the list sorted you should create a clustered index on the URL column. – weenoid Aug 06 '13 at 19:46
  • Ohhh a clustered index. Thanks! – Shahar Aug 06 '13 at 19:47
  • 1
    It is usually not a good idea to think of RDBMS's in the same way you would think of imperative programming languages. In languages such as Java, array accesses are fast and look-ups in collections are slower. RDBM's, however, are especially designed and optimized for table reference lookups and are pretty good at them. The rationale that column arrays are faster in an RDBMS than a table reference lookup is very likely to be incorrect. MySQL, last I knew, offered no support for the SQL specification on column arrays. PostgreSQL and HSQLDB both do support them. – scottb Aug 06 '13 at 20:45
0

I personally would not want to set up a database in the latter format, as it introduces a good deal of formatting issues whenever you want to pull data from that.

Normally, what I would do, is split the data into either two or three tables, depending on whether didn't usernames can be associated with the same URL or not. If they can't (one to many relationship) I would build two tables:

USERS
USERID  USERNAME 
0       jake      
1       po    

URLS
URLID   USERID   URL
0       0      abc.com
1       0      123.com
2       1      google.com
3       1      googl3.com
4       1      ranoutoffakeurlslol.com

Or, if urls can have different users (many to many), USERS would be the same, but URLS would be:

URLS
URLID     URL
0        abc.com
1        123.com
2        google.com
3        googl3.com
4        ranoutoffakeurlslol.com

and there would be a third table to join them

USERURLS
USERID    URLID
0         0
0         1
1         2
1         3
1         4

This way looks a bit complicated, but repeats less data (only ints instead of strings), and is much easier to expand later on, say if you wanted to add address, phone number, password to the usernames, or anything else. I would look into the Normal Forms (NF) of relational databases, if you aren't familiar with them. (wikipedia article)

David M
  • 131
  • 8
  • Would this be better than the method described above? If I just have the following two tables: USERID | USERNAME and=> USERID | URL, that seems a bit more space-efficient. – Shahar Aug 06 '13 at 19:53
  • For one-to-many, they are the same solutions. Can your data have a url repeat between unique users? For instance, Jake has abc.com. Is it possible po also has abc.com? – David M Aug 06 '13 at 20:17
  • I actually have two pieces of data and for one of them a url repeat is possible. For the one with the url repeat, I'll most likely use your method. For the other, I'll use the other one. Thanks! – Shahar Aug 07 '13 at 00:00