5

I'm in a fix here. I'm building a home reservation site and the client requires a filter search facility to allow visitors to search and filter through properties based upon criteria. The thing is that his list of criteria is exceedingly long, largely boolean values and calculable values stuff like:

Attached bathroom, balcony, smoking, alcoholic, maximum number of occupants, cable TV, Internet, carpeted, airconditioning, central heating, room service, etc., etc., etc...

I'm thinking of having to create a field for each of these, but there's a very strong chance that the number of preferences might even go up. I dished out the idea of storing everying in a serialised object as a string as then it would be impossible to search using an SQL query. Do I have any options apart from setting up individual fields for each preference here?

Thanks. I'm using PHP MySQL.

Peter Mortensen
  • 30,738
  • 21
  • 105
  • 131
Ali
  • 7,353
  • 20
  • 103
  • 161

5 Answers5

5

Could you have a many-to-many table that ties the ID of the property to a reference table that has Attributes? The lookup table would contain a row for every attribute that the property has?

Main Table

ID PropertyName OtherCols

10 CottageA Stuff

20 CottageB OtherStuff

Attributes

100 Smoking

200 AirConditioning

300 Room Service

400 TV

500 Internet

Lookup Table

ID AttributeID

10 100

10 200

10 300

20 100

20 400

20 500

Brett McCann
  • 2,469
  • 2
  • 27
  • 44
  • Yes. You could do that. But if the Attribute monetary values change (because you don't want duplicate attributes), you will need to go deeper in your normalization. – OldWest Jan 18 '11 at 18:24
  • That sounds interesting, but then how would I be able to run a search query - I'm trying to avoid heavy joins. – Ali Jan 18 '11 at 18:27
  • @OldWest I didn't get that? How could there be the chance of duplicate attributes? – Ali Jan 18 '11 at 18:28
  • That's true but do the individual attributes have monetary values? I didn't think that was part of the equation. Is that was was meant by calculable values? Or are these strictly search criteria? Either way, something to consider. – Brett McCann Jan 18 '11 at 18:30
  • @Ali Well you have a many (properties) to many (attributes) relationship. Without de-normalizing the data structure (and adding all those columns) you'll be faced with a join. This is where you have to balance performance with design. – Brett McCann Jan 18 '11 at 18:36
  • @Brett SO ideally whats more important, the thing is that based upon expected usage, entries of this nature are not as intense as the number of retrieval queries that would be executed. So the aim is to have a solution that is faster in search and retrieval. – Ali Jan 19 '11 at 09:51
1

I did exactly the same search a couple years back for a hotel catalogue. We used a BitMask for that, e.g. we stored a single number representing all the possible values, e.g.

HotelTable 
ID Name         …    Features
 1 SuperHotel   …    5

Features
ID Name
 1 Balcony
 2 Shower
 4 Barrier-Free
 8 Whatever
 … …

In the example above SuperHotel would have a Balcony and be Barrier-Free (4+1).

While this worked well, I am not sure I'd handle it this way again. Basically, all these features are the same as tags, so you could just as well use the known approaches to create a tagging table.

Community
  • 1
  • 1
Gordon
  • 312,688
  • 75
  • 539
  • 559
  • Thats interesting, whats the maximum limit of values I could stretch using this – Ali Jan 18 '11 at 18:29
  • @Ali See the discussion below my answer to the linked BitMask question for that please. – Gordon Jan 18 '11 at 18:38
  • @Gordon: How did you handle maintaining data integrity when deleting/adding new amenities? – webbiedave Jan 18 '11 at 19:52
  • @webbiedave Adding is not an issue. You just add them with the next power of 2 value. We never had to delete or modify any values. I guess you'd have to recalculate all values to do so. – Gordon Jan 18 '11 at 20:21
  • @Gordon - I seem to get the idea a bit but I dont know how could I be able to program the code to retrieve and enter the details, plus this means the criteria field would be a binary one right? – Ali Jan 21 '11 at 07:17
  • @Ali as for the How, check out the links in the first linked question. There is example implementations. Shouldn't be hard to find more via Google. No, the field would be an integer field. – Gordon Jan 21 '11 at 07:40
  • @Gordon - ok, one more thing would it make sense to store the features values in a single table or hard code them in a config file eg: config[features] = array(1=>shower, 2=>balcony...) – Ali Jan 21 '11 at 07:47
  • @Ali whatever you think is more maintainble :) – Gordon Jan 21 '11 at 08:08
1

I would suggest creating a field for each criteria. This is going to allow you to have a the fastest search capability. With that being said you could always create a TEXT or a MEDIUMTEXT and store JSON in that field. So you could use json_encode on an array such as:

$amenities['bathroom'] = 1;
$amenities['balcony'] = 1;
$amenities['smoking'] = 0;

Then if you were looking for a home that had a balcony you could do:

SELECT * FROM `homes` WHERE `json_field` LIKE '%balcony: 1%'

And instead of a LIKE you could always use FULLTEXT searches if you have that capability on your server.

jb1785
  • 722
  • 1
  • 7
  • 19
  • I thought so too but thats what I was hoping to avoid i.e having to make a separate field for each attribute. Since that would mean that everytime I have to add a new attribute I would need to change my table structure and adjust my model classes as well... – Ali Jan 18 '11 at 18:26
1

You could do a mapping like suggested before but include a value in the mapping

Hotel Table

create table hotel_table (
 id int(4) unsigned not null auto_increment primary key,
 hotel_name varchar(40) not null,
 ...other row info
);

Hotel Criteria

create table hotel_criteria (
 id int(4) unsigned not null auto_increment primary key,
 criteria_name varchar(40) not null
);

Hotel Criteria Map

create table hotel_criteria_map (
 id int(4) unsigned not null auto_increment primary key,
 hotel_id int(4) unsigned not null,
 criteria_id int(4) unsigned not null,
 string_data varchar(20) null, #use this to add in extra string information for criteria
 decimal_data decimal(6,2) null, #use this to add in extra decimal information for criteria
 #... either of the above or other extra info, just giving examples ...
 unique key (hotel_id,criteria_id),
 foreign key (hotel_id) references hotel_table(id),
 foreign key (criteria_id) references hotel_criteria(id)
);

You could then select those values:

select * from hotel_table where id={your hotel id}; #hotel info

select m.*,c.criteria_name from hotel_criteria_map m, hotel_criteria c where m.criteria_id=c.id and hotel_id={your hotel id}; #criteria info

There may be a better way to do this but just a suggestion. You would only enter the criteria for a certain hotel into the map if it pertained to that certain hotel (essentially not having any criteria map rows that would be bool 0).

thekoehl
  • 66
  • 5
0

You should read up on database normalization. IMHO the goal is to structure your tables in such a way that allow these queries to be run efficiently and with less logic on the php side. For example, if each of the above mentioned search criteria have a value associated with them, the values should be stored in a separate table, and if there are many options to many values, you will need to setup a different type of relationship. This initial time looking at the database from all angles will save you a lot of make-shift php code and possible prevent an entire rebuild of the database once you realized you've bottlenecked. Hope this was at all helpful. Good luck!

OldWest
  • 2,355
  • 7
  • 41
  • 61