12

I'm designing a multilingual e-commerce site. Products have different properties. Some properties are different for each language (like color), other properties are the same for all languages (like SKU). Properties are not predefined, for example cars have other properties than espresso machines.

I'd like to design the database schema such that:

  1. Searching and representing all products from category x in language y is fast
  2. The amount of duplicate data is low
  3. I don't want to use files with translations

I'm thinking about using a schema like this:

{
 _id: ObjectID("5dd87bd8d77d094c458d2a33"),

 multi-lingual-properties: ["name", "description"],

 name: { en: "Super fast car",
         nl: "Hele snelle auto"},

 description: { en: "Buy this car",
                nl: "Koop deze auto"},

 price: 20000,

 sku: "SFC106X",

 categories: [ObjectID("4bd87bd8277d094c458d2a43")]
}

Is there a better alternative to this schema? What problems will I encounter when I use this schema?

Mukesh Soni
  • 1,119
  • 3
  • 13
  • 23
i.amniels
  • 1,781
  • 1
  • 25
  • 37
  • 3
    In my experience, ecommerce systems tend to have highly relational database schemas - are you sure MongoDB is right for this? – Neville Kuyt Sep 23 '11 at 12:41
  • @Neville K Yes: http://spf13.com/post/mongodb-ecommerce-a-perfect-combination and http://kylebanker.com/blog/2010/04/30/mongodb-and-ecommerce/ – i.amniels Sep 23 '11 at 16:19
  • I am totally prepared to accept I'm a cynical old goat, but the fact proponents of MongoDB are in favour of MongoDB in this context wouldn't be enough to sway me. I quite like the idea of NoSQL for the catalogue part of an ecommerce site - products are notoriously polymorphic. I'm not sure I'd want to do the business logic part - cart, check-out, payment, addressing, fulfillment - without my relational comfort blanket... – Neville Kuyt Sep 25 '11 at 19:54
  • @Neville K I agree, I only want to use MongoDB for the catalog and to describe the contents of orders and invoices. This are parts which rely a lot on EAV in traditional SQL apps. The payments, check-out, stock management will be build on SQL. Afaik you are not a cynical old goat ;-) Critics are important for every project. – i.amniels Sep 26 '11 at 09:05
  • 1
    Are you using plain-vanilla Mongo queries, or some object-document mapper like Doctrine Mongo ODM or similar? I'm working on something similar now (using Doctrine) and will post something when I came up with a solution proposal. – Jakub P. Sep 26 '11 at 11:48
  • @Jakub P. I'm thinking about using Yii framework (which is awesome). I haven't decided on which ODM I am going to use. There is an Active record like extension available for Yii, but I'm open for anything right now. I need something to handle the localization, I'm afraid current available PHP ODM's don't offer support for this. An ODM which offers some functionality for combining SQL and Mongo and which offers localization support would be perfect. – i.amniels Sep 26 '11 at 15:10

1 Answers1

7

Later than I expected, but here's what we're implementing now...

Background: Our system is supposed to be able to import product inventory from multiple ecommerce sites, so flexibility & i18n are important.

EAV model:

db.products.find()

{ "_id" : ObjectId("4e9bfb4b4403871c0f080000"), 
"name" : "some internal name", 
"sku" : "10001", 
"company" : { /* reference to another collection */ }, "price" : 99.99,
"attributes": { 
  "description": { "en": "english desc", "de": "deutsche Beschreibung" },
  "another_field": { "en": "something", "de": "etwas"}, 
  "non_i18n_field": { "*": xxx } 
 }
}

We also need metadata for attributes, which includes admin editing tips (what input forms to use) and i18n for names of the attributes. Example:

db.eav.attributes.find()

{ "_id" : ObjectId("127312318"), 
"code" : "description", 
"labels" : { "en" : "Description", "de": "Beschreibung" }, 
"type" : "text-long", 
"options" : [], 
"constraints" : [ ]
}

The idea is that attribute metadata will be quite large and won't get copied. Most of the time operations will be done using values of the (dynamic) attributes. If attribute metadata is necessary to display UI etc. it can be loaded & cached separately, and referenced by the attribute code.

So by default everything that's an attribute is i18n-enabled.

Queries for i18n-enabled-attributes are simple:

db.products.find({ attributes.description.en: "searched val" })

Non translated attributes may be a hassle though since they'd need special treatment in queries:

attributes.description.*

Not sure what we'll do with those attributes yet. E.g. numeric values don't require translation. Any thoughts on that are welcome.

We're still not using this structure though, so these are really pre-implementation ideas. I'm expecting more issues to surface while we start using this in practice, i.e. doing UI for CRUD operations etc.

Jakub P.
  • 5,416
  • 2
  • 21
  • 21
  • 1
    FWIW just an extension of the above solution: Most attributes we actually used don't require internationalization because they are numerical or cross-language codes (e.g. UPC/EAN product codes). So to make things easier we don't use the "*" as a fake locale to avoid unnecessary nesting. So for a non-i18n attribute like "weight" we'll have just { ... attributes: { weight: 100; ...} } and only for actual text multilang attributes like 'description' we'll have additional nesting, e.g. {... atributes: { desc: { 'en': ..., 'de':...} } }. HTH – Jakub P. May 09 '12 at 22:48