
I've recently come across the idea that moving data-manipulation logic entirely into SQL code would yield a performance gain, but I'm unsure to what extent this is true, which is why I'm posting this question.

Assume we have a database (say, a small gallery) containing galleries, photos, and tags. Each photo can be a member of many galleries, and each photo or gallery can have many tags. I create a single stored procedure for handling photo and gallery management (e.g. PhotoCreateOrUpdate) that accepts an array of galleries (as serialized text, e.g. GalleryA:Info:Tag1|Tag2), the photo's information, the same for photo tags, and an array of galleries or tags to be deleted when updating the record. If a gallery exists, the relations are added; otherwise it is created, and likewise for tags and everything else.
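For concreteness, the procedure I have in mind would look roughly like this (a sketch only; the procedure and parameter names are assumptions, and the serialized format is the GalleryA:Info:Tag1|Tag2 one described above):

```sql
-- Hypothetical sketch of the single-round-trip procedure described above.
CREATE PROCEDURE dbo.PhotoCreateOrUpdate
    @PhotoTitle        nvarchar(200),
    @Galleries         nvarchar(max),        -- e.g. 'GalleryA:Info:Tag1|Tag2'
    @PhotoTags         nvarchar(max),        -- e.g. 'Tag1|Tag4'
    @GalleriesToDelete nvarchar(max) = NULL,
    @TagsToDelete      nvarchar(max) = NULL
AS
BEGIN
    SET NOCOUNT ON;
    -- Each serialized parameter has to be split apart and iterated over
    -- here, which is where the cursors mentioned below come in.
END
```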

I've decided to do this instead of relying on Entity Framework because I believe that this way I can send only one request to the database from the web end, and this can result in a performance gain, as I avoid a lot of foreach loops and certain joins can be done faster.

However, when deserializing the formatted text, I have to use cursors a few times, which would be equivalent to the loops when using Entity Framework.

I also think that another advantage of this approach would be to segregate concerns into their own contexts and keep the ASP.NET code clean.

I want to know to what extent my assumptions hold; that is the point of this question.

Arnold Zahrneinder
  • Were you experiencing any sort of performance problems? Because sometimes optimizing code that has no performance issues can lead you into worse design and logical issues. – Vojtěch Dohnal Aug 21 '17 at 09:29
  • @VojtěchDohnal: Yes, the scenario I described is simple, but in some real-world cases a single transaction (driven by a friendly UI design) may require making many queries in EF, doing heavy joins, and sending many updates to the server. On the other hand, if the same DB needs to be consumed by another application, such as a Java EE app, then the same logic has to be rewritten over and over. – Arnold Zahrneinder Aug 21 '17 at 09:33
  • There are very few instances where cursors are an appropriate solution to anything. If you feel the need to rewrite EF code in plain SQL, but you're using cursors, this almost certainly means you don't know SQL better than EF does. In particular, the idea that joins are "heavy" and need your help to optimize them into micro-operations is flawed. With EF, you may need to pay close attention to how related entities are pulled in, and you might use stored procedures to optimize some statements if EF can't hack it, but cursors -- just say no. – Jeroen Mostert Aug 21 '17 at 09:38
  • Apropos "serialized text" -- table-valued parameters, XML or (from SQL Server 2016 onwards) JSON are better ways to pass structured data than separated values, since the server can actually do something with those. How you store the data relationally is another matter. – Jeroen Mostert Aug 21 '17 at 09:42
  • @JeroenMostert: So how can I avoid cursors in SQL, particularly when I have an array that I need to loop over? – Arnold Zahrneinder Aug 21 '17 at 10:23
  • @Arrrr: (T-)SQL doesn't have arrays. TVPs, XML and JSON all allow processing without cursors. Arbitrary strings [need a separate split function](https://stackoverflow.com/q/2647/4137916) (from SQL Server 2016 onwards, `STRING_SPLIT` finally does this natively). In all instances, think tables and set-based operations, not cursors. The only instance where cursors are unavoidable is when you must execute a statement that doesn't operate on tables and doesn't accept multiple values. Those are rare. – Jeroen Mostert Aug 21 '17 at 10:48
  • @JeroenMostert: Do you mean that I should treat everything as a table, and instead of an array-like data structure, turn it into a temporary table, get the number of rows, and then select by row number in a loop? – Arnold Zahrneinder Aug 21 '17 at 14:30
  • @Arrrr: No, I'm suggesting you shred the data into tables if they're not already tables and then issue the appropriate `INSERT` and `UPDATE` statements to insert all rows at once using those tables as sources (wrapping things in a transaction if multiple steps are required). No loops, anywhere. (If you insist on storing the data as separated strings in the database itself, you're honestly better off doing the processing entirely on the client side or delegating to a NoSQL document store, because you'll never get good performance out of SQL that way.) – Jeroen Mostert Aug 21 '17 at 14:42
  • At this point, we're getting very specific and SO is warning me to quit it with the comments (and rightly so). If you need advice about a particular, practical scenario that you have, and there's an actual, demonstrable *problem* (in terms of performance or code structure) post that as a question so we can go into specifics. All I'm pointing out is that cursors are neither necessary nor desirable. – Jeroen Mostert Aug 21 '17 at 14:46
  • @JeroenMostert: Thank you very much indeed; now I understand what you've been talking about. Thanks again, because I learned a totally new approach today. – Arnold Zahrneinder Aug 21 '17 at 14:52
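To illustrate the set-based approach suggested in the comments above: with a table-valued parameter the server receives the tags as an actual table, and a couple of set-based `INSERT` statements replace the cursor entirely. This is a minimal sketch under assumed names; the `Tags` and `PhotoTags` tables, the type, and the procedure are hypothetical, not code from the question.

```sql
-- 1. Declare a table type once; the client passes the tags as a TVP.
CREATE TYPE dbo.TagList AS TABLE (TagName nvarchar(100) NOT NULL);
GO
-- 2. The procedure processes the whole set at once: no cursor, no loop.
CREATE PROCEDURE dbo.PhotoAddTags
    @PhotoId int,
    @Tags    dbo.TagList READONLY
AS
BEGIN
    SET NOCOUNT ON;

    -- Create any tags that don't exist yet, in a single statement:
    INSERT INTO dbo.Tags (Name)
    SELECT t.TagName
    FROM @Tags AS t
    WHERE NOT EXISTS (SELECT 1 FROM dbo.Tags AS tg WHERE tg.Name = t.TagName);

    -- Link every passed tag to the photo, again in a single statement:
    INSERT INTO dbo.PhotoTags (PhotoId, TagId)
    SELECT @PhotoId, tg.Id
    FROM dbo.Tags AS tg
    JOIN @Tags AS t ON t.TagName = tg.Name;
END
```

The same shape works if the input arrives as XML, JSON, or (on SQL Server 2016+) a string shredded with `STRING_SPLIT`: first turn it into a table, then join against it.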

0 Answers