PHP MVC: Data Mapper pattern: class design

Question

I have a web MVC application with domain objects and data mappers. The class methods of the data mappers contain all database querying logic. I'm trying to avoid mirroring any database structure and, therefore, to achieve the maximum flexibility in constructing the sql statements. So, in principle, I'm trying to not make use of any ORM or ActiveRecord structure/pattern AT ALL.

Let me give you an example: Normally, I could have an abstract class AbstractDataMapper inherited by all specific data mapper classes - like the UserDataMapper class. And then I could define a findById() method in AbstractDataMapper, to fetch a record of a specific table - like users - by a given id value, e.g. user id. But this would imply that I'd always fetch a record from a single table, without the possibility to use any left joins to also fetch some other details from some other tables corresponding to the given id - user id.

So, my question is: Under these conditions - to which I myself obliged to, should I implement an abstract data mapper class, or each data mapper class should contain its own completely "proprietary" implementation of the data-access layer?

I hope I could express my idea clear. Please tell me, if I was somehow unclear or you have any questions.

Thank you very much for your time and patience.

As someone who has offered 500 bounty myself, here is a free advice: don't. I had to ask mods to delete that post later, because the answers I received were crap. It will scare away people, who are not all that confident. And, if the issue isn't of the ASAP variety, you would get better results by offering 100 point bounty for two weeks. Hell, with 500 point budget, you can pin a post in "bounties" section for more than a month. — tereško, Jul 01 '17 at 10:40
@tereško Thank you for your comment and your good will advice. I didn't knew that. Therefore I reedited the bounties part. In the end, my goal is to attract good answers indeed. — , Jul 01 '17 at 12:28

score 5 · Accepted Answer · edited Jun 20 '20 at 09:12

5

If I understood your point ...

Having all your concrete mappers inheriting SQL from a common class has several issues that you have missed:

parameter names in your domain objects depend on the names of columns
there is a "fetching method" in mappers, that don't have a corresponding table
you still have configuration (table name), that is expected by superclass
the DB schema must have id as name for all of your PRIMARY KEY columns

Now, I'm gonna try to unpack each of those.

Parameter and column names

To create a shared findById() method, the only pragmatic approach is to build it around something like this:

"SELECT * FROM {$this->tableName} WHERE id = :id"

The main issue actually is the wildcard * symbol.

There are two major approaches for populating an entity using a data mapper: use setters or use reflection. In both cases the "names" of a parameters/setters is implied by columns, that you have selected.

In a normal query you can do something like SELECT name AS fullName FROM ..., which lets you to use the query for re-naming the fields. But with a "unified approach", there are no good options.

Each mapper can fetch data by `id`?

So, the thing is, unless you have a mapper-per-table structure (in which case an active record starts look like pragmatic option), you will end up with few (really common) "edge case" scenarios for your mappers:

only used to save data
deals with collection and not singular entities
aggregates data from multiple tables
works with a table, that has composite key
it's actually not a table, but an SQL view
... or combination of the above

Your original idea would work just fine in a small scale project (with one or two mappers being an "edge case"). But with a large project, the usage of findById() will be the exception not the norm.

Independent parenting?

To actually get this findById() method in the superclass, you will need a way to communicate the table name to it. Which would mean, that you have something like protected $tableName in you class definition.

You can mitigate it by having abstract function getTableName() in your abstract mapper class, which, when implemented, returns a value of global constant.

But what happens, when your mapper need to work with multiple tables.

It seems like a code smell to me, because information actually crosses two boundaries (for lack of better word). When this code breaks, the error will be shown for SQL in the superclass, which isn't where the error originated from (especially, if you go with constants).

Naming the primary key

This is a bit more controversial opinion :)

As far as I can tell, the practice of calling all primary columns id comes from various ORMs. The penalty, that this incurs, applies only to readability (and code maintenance). Consider these two queries:

SELECT ar.id, ac.id 
  FROM Articles AS ar LEFT JOIN 
       Accounts AS ac ON ac.id = ar.account_id 
 WHERE ar.status = 'published'

SELECT ar.article_id, ac.account_id 
  FROM Articles AS ar LEFT JOIN 
       Accounts AS ac USING(account_id)
 WHERE ar.status = 'published'

As the DB schema grows and the queries become more complex, it gets harder and harder to actually keep track of, what the "id" stands for in what case.

My recommendation would be to try same name for column, when it is a primary as when it is a foreign key (when possible, because in some cases, like for "closure tables, it's not viable). Basically, all columns that store IDs of same type, should have the same name.

As a minor bonus, you get the USING() syntax sugar.

TL;DR

Bad idea. You are basically breaking LSP.

edited Jun 20 '20 at 09:12

Community

1
1

answered Jul 01 '17 at 10:34

tereško

58,060
25
98
150

I just read your answer and I discovered that I need more time - beeing a bit tired right now. I just wanted to say: Thank you very much for it, tereško. I shall definitely read it in-depth later. – Jul 01 '17 at 12:35
Hi. I read your answer, actually many times. Wow :-) I'd like to ask you two things regarding your answer, if I may. You built all your arguments around `findById()` example. By that you meant to show me just one case, but appliable also to other mapper methods/situations. Have I understood right? If that's the case, then yes, you understood my point very good. I was afraid that I'll receive answers only regarding the use of a `findById()` method, so without taking the whole data mapper structure/design in consideration. – Jul 01 '17 at 16:04
My second question is about part (3): by "_code smell... information actually crosses two boundaries_" you meant the fact that I could fetch data from multiple tables at once, or you meant something about the class structure(s)? – Jul 01 '17 at 16:15
And my last question, in order to be sure: In TL;DR did you mean _Bad idea to implement an abstract data mapper class, because you are basically breaking LSP by using it_? I'm sorry for bothering you with such questions, but they are ones in which I couldn't correlate my understanding with your own perspective. Thanks again for a great answer. – Jul 01 '17 at 16:18
The `findById()` was just a convenient example for any SQL code, that is moved to superclass, even methods like `delete()` or `exists()` (which do not suffer from the "wildcard-issue". As for that "two boundaries" part .. well .. it was really hard to pin down it even seems like an issue. What it boils down to is - debugging. If there is a change in DB schema and someone does not know all the parts, where this change would affect the PHP code (common problem in larger project where you actually have a team). – tereško Jul 01 '17 at 17:46
(cont.) Then, when an error occurs, the error message will just point you to an SQL syntax error from in a generic superclass. The person, who gets tasked with fixing it, will have to go through at least two other files to find the correct location for the fox. And, if you work in a team, how sure are, that, when an intern gets assigned, the "fix" will be in the correct place. Those boundaries are are separation between where the error was caused and where the mistake was made. – tereško Jul 01 '17 at 17:48
(cont.) As for the TL;DR bit - there is nothing wrong with having a superclass for all the mappers. I use them myself (even multiple levels, because not all mappers need to work with SQL). The LSP violating part is that not all concrete implementation will need that `findById()` method. You should not move the persistence logic itself in the superclass (in your case - the SQL bits). – tereško Jul 01 '17 at 17:53
Thanks! I'm a bit confused, because I never worked with data mappers for things other than database access. I hope I understood right: you actually tried to show me that, when it gets to methods dealing with the database access I should keep these ones (findById, delete, insert, select, update, etc) inside each specific data mapper class, without the need of an abstract data mapper class. And that it's better so. >> – Jul 01 '17 at 18:09
>> And, if Iater arise the need for other tasks to be made by the data mappers, then I could inherit the specific data mapper classes from a parent class (be it abstract or not), but still preserving all database access methods inside the specific data mapper classes. Have I understood right? – Jul 01 '17 at 18:10
I would be endlessly grateful, Tereško, if you would give me an answer to my last (understanding) question from my last two comments. Because your answer to this question will be the closing key to the image I made myself from your explanations on the data mapper principle and use-cases. I know it looks silly, but I'm not a native English speaker and that's why sometimes I have understanding problems with phrases in English. Thank you very much for your patience. – Jul 02 '17 at 18:37
Ugh.. Yes, the point was, that for persistence logic to be in any way maintainable, it should be kept in the the concrete implementations. That does not mean, that you should avoid using abstract classes. Just don't try putting SQL in them, – tereško Jul 02 '17 at 21:20
@aendeerei And I said nothing about one concrete mapper inheriting from another concrete mapper. If they have different SQL code, then they most likely are completely unrelated. And your mappers don't need to all have the same interface. You can have one mapper only contain `fetch()` and `store()` methods, while another has methods `exists()`, `fetch()` and `delete()`. And you can have multiple mappers for the same entity/domain object. In large projects the "which mapper to use" even goes in a different type of classes - repositories. – tereško Jul 02 '17 at 21:25
THANK YOU, Teresko! Now I understand all perfectly, my image is very clear now. Let me be honest: I'm impressed by your experience and vision on the subject. – Jul 02 '17 at 22:04
And no no, I also didn't meant about a concrete mapper inheriting from another concrete mapper. And I didn't understood so from you neither. I wanted to say, that the parent class can be `abstract class AbstractDataMapper`, or just `class AbstractDataMapper`. But for sure having the role of an abstract data mapper parent. Now I can study the problem in-depth by applying it in the way you described and I already imagining since some long time, but without a certitude. There were many pieces which just didn't fit... Yes, I already use repositories, but they depend on how I build my mappers. – Jul 02 '17 at 22:07

PHP MVC: Data Mapper pattern: class design

1 Answers1

Parameter and column names

Each mapper can fetch data by `id`?

Independent parenting?

Naming the primary key

TL;DR

Linked

PHP MVC: Data Mapper pattern: class design

1 Answers1

Parameter and column names

Each mapper can fetch data by id?

Independent parenting?

Naming the primary key

TL;DR

Linked

Each mapper can fetch data by `id`?