0

Summary

How do you get objects and their related objects using pure PHP and PDO without running into N+1 hell?


I'm trying to understand the fundamental way to instantiate objects and their relations without using a framework or ORM. Typically when I ask and provide examples, I get responses like "you would never use results that way", etc and it blows my mind considering the examples are so common in the real world I use these types of queries everyday in procedural code...

So, let's say you have a User class, and a PhoneNumber class. A User can have many PhoneNumbers. Well, that's a perfectly normal real-world use-case. You want to get all Users and related PhoneNumbers. How would you retrieve results from PDO, instantiate the User and populate each User's private $phoneNumbers = []; property? My greenhorn OOP mind would say foreach over an associated result set, sorting it by user_id and then looping over that result set instantiating a new User(), setting its properties and looping over each user, instantiating a new PhoneNumber() and pushing it to the property, but that most definitely seems "wrong".

When people say those types of examples are not common-place (still shocked), I always go back to the Amazon/e-commerce example. You have an Order History page. That would have an Order class and an OrderItem class. You can view all of your Orders and the OrderItems purchase on all orders on a single page.

Since I don't want to get pounced on for not having any code, this is definitely incorrect, but this is what I'm trying to understand the "how" of how to instantiate results of object's, and related results for original object's.

In procedural I could do this in a one-query solution. I'd expect at maximum two queries in an OOP solution. One that gets all Orders for a user and one that gets all OrderItems. In the below snippet. $clientOrders contains all order's from a client. Great, I now have all order_id's I would need to get OrderItems from but how to get those results and populate them against each Order?

$orderRepo = new OrderRepository(); // This is a must.
$orderItemsRepo = new OrderItemRepository(); // Unsure if this would be needed/used.

$clientOrders = $orderRepo->byClientId($clientId);

foreach ($clientOrders as $clientOrder) {

$orderItemsRepo->byOrderId($clientOrder->getId()) // definitely wrong

}

So, I'm fairly confident the foreach is not proper, I thought it'd be used to instantiate OrderItems and push them to each $clientOrder, either way I definitely know I'm not going to be querying the DB in an iteration.

halfer
  • 19,824
  • 17
  • 99
  • 186
alturic
  • 53
  • 2
  • 8
  • ActiveRecord or DataMapper the question is still the same. Your second sentence is exactly what I'm asking. You have a single query that returns all Orders and their OrderItems, so you now have a ~50 row result set. Those 50 rows are across 5 orders. How do you instantiate 5 new Order(), and push their related new OrderItem() to each? – alturic Feb 02 '19 at 15:43
  • Subjective, really? I don't mean that facetiously. It seems pretty common (well, nowadays uncommon to not use an ORM heh) to pull results from a DB and those results also have related results you want to use. I thought it would be a straightforward answer if I could make sure I got the question across properly. :-( Purely curious, is the factory what ORM's do basically? – alturic Feb 02 '19 at 16:03
  • I'm trying to understand what you mean by "how" and "wrong" here. Why are they in quotes? Typically one adds quotes to words in order to emphasise they don't have their normal meaning. In this case however, you seem to be using the ordinary meaning. – halfer Feb 02 '19 at 21:45
  • Correct, however when someone doesn't know if something is right or wrong (see: proper, or normal) you would say - I did X, but it feels "wrong". As to the "how", it was meant as I know **how** to get the data, I don't know **how** (as in proper, since foreach'ing seems wrong) to populate the relations. An example using quotes outside of literal quoting of a word or phrase. Dominos has terrible "pizza", which is implying their pizza is terrible. Apologies for confusing you, it's very common in the US to quote words this way. – alturic Feb 02 '19 at 21:51
  • Oh, and just to reiterate, the reason it feels wrong/doesn't seem right is because like Ray's example, it did not seem right to iterate over a result set, and instantiate items on each iteration as I've been told numerous times you should never instantiate objects in a loop. – alturic Feb 02 '19 at 21:58
  • Right, it must be US-centric - in the UK if you said that someone was serving "pizza" you'd be implying it's not actually pizza (possibly because it is terrible). I am not sure that applies here anyway, since your "how" is the same as a normal "how"! I've tried to adjust it for an international audience. – halfer Feb 02 '19 at 22:15

1 Answers1

-1

What you're trying to avoid is the looping query for each order's order items.

You want to have a constant number of queries, not N+1 where the 1 is the original query for the list of orders and the N are subqueries required to get each orders order items as you iterate over the orders.

I'd suggest instead of one query per $clientOrder:

  • Batch all the $clientOrder id's together in an array, then combine to make comma separated list
  • Create a select in that allows you to pass in your list of id's
  • Assuming the order items have a client order id, then loop through the results of order items and add them to the appropriate $clientOrder manually

Whether it's orders to items or users to phone numbers I'm assuming you've got 2 tables where the it's a one-to-many type relationship based on the id of the outer object (user or order).

In clauses aren't easy last time I checked PDO, but here's a link to show an example: https://stackoverflow.com/a/14767651/722263

If you do this, you've now go 1 query to pull the client orders and 1 query to pull all the order items (or users to phone numbers as earlier in your post).

Here's some psuedo PHP:

$orders = [];

for(getAllOrders() as $order) {
   $orders[$order->getId()] = $order;
}

for(getAllOrderItemsByOrderIds(array_keys($orders)) as $orderItem) {
   $order[$orderItem->getOrderId()].addOrderItem($orderItem);   
}

Where getAllOrders() calls a single PDO query to get all the orders using fetch class and getAllOrderItemsByOrderIds() creates a single PDO select to return all the order items also with fetch class.

You could even combine the two queries into a single one (as you note), return a massive result set, and manually build each class instead of using PDO FETCH_CLASS, but I find this sort of thing a micro optimization, especially when you get more and more relationships to deal with. I try to keep queries tied to the object they're constructing, not return multiple different entities in giant flat result sets unless there is a major performance need involve. Makes it easier to deal with caching as well.

halfer
  • 19,824
  • 17
  • 99
  • 186
Ray
  • 40,256
  • 21
  • 101
  • 138
  • Oh, I know the "how" as to how to get the results, implode client_id's to placeholders, etc. It's the third bullet-point I'm lost on. Say I do have to use two queries, one to get all the Orders, and one to get all Order's OrderItems. How do you push the correct OrderItem to it's related Order. Does that make sense? – alturic Feb 02 '19 at 15:46
  • Yes, does the order item table have the id of the order? The fact you can look it up by order id seems to imply that. – Ray Feb 02 '19 at 15:47
  • Just loop over all your orders, stick them in an associateive array $id => $order, then loop over the order items, lookup from the order associative array the order, and do somehting like `$order->addOrderItem($orderItem)` where the Order instance has a collection of items you can append. The n+1 hell you're trying to avoid are queries to the DB to get the data, not necessarily loops to build objects. – Ray Feb 02 '19 at 15:48
  • Correct. I imagine I'd have to foreach over the result set that contained the OrderItems, how do you "push" (populate) those objects to the correct Order objects? Just to make sure we're on the same page, I'm expecting an Order class to have an `$orderItems = [];` property. So that I can do `$order->getOrderItems` and that would return the array of OrderItem's. I could also iterate over each `$order->getOrderItems as $orderItem` and be able to $orderItem->productName, etc. – alturic Feb 02 '19 at 15:51
  • So because I'm not using an ORM, I would indeed be foreaching a lot, and that's not technically wrong, as the ORM (kinda/maybe/does) do that behind the scenes? That's what I was hung up on. It felt weird to be foreach'ing. It doesn't help of course that Doctrine or Eloquent are black boxes of magic to me as I'm *not* experience enough with OOP to be able to see the hydration part of instantiating and populating models. – alturic Feb 02 '19 at 15:53
  • I'm assuming orderItem has an orderId. If that's the case 2 queries, one for orders, then take all order id's and get all the orderItems. Then iterate through the order items and add to the matching order. – Ray Feb 02 '19 at 15:54
  • @alturic see the psuedo code I added ot my answer – Ray Feb 02 '19 at 16:01
  • That does indeed make sense, I truly thought having the foreach's would be wrong/felt weird. Same with having a method on Order for `->addOrderItem()`, or User `->addPhoneNumber()`, but again I'm sure ORM's are indeeding doing this type of "black magic" behind the scenes. *Black magic meaning I'm not experienced enough to read the code of Doctrine to "see" this. :P – alturic Feb 02 '19 at 16:11
  • Nothing is wrong using for each to construct objects in a factory, there's no magic. Like I noted, the `n+1` issue you're trying to avoid is looping where you a query on each iteration of the loop for each set of sub objects. Then to build 10 orders you need to make 11 queries if you do it naive `n+1` way, one to get the list of orders, then 10 as you query for each order's order items. SQL queries and other I/O are likely the most expensive operations, so you're trying to minimize these. – Ray Feb 02 '19 at 16:17
  • As funny as it sounds, I'm actually pretty proficient with procedural PHP (maybe more SQL as this point, ha) but yea whenever someone comes across an N+1 situation, that's almost always time to pause IMO because you're doing something wrong. I appreciate your help and marked this as the answer. – alturic Feb 02 '19 at 16:20