9

In SAP HANA I am used to create Calculation Views.

Previously I learned that Calculation Views (which after compilation are column-views) are to be prefered over Database-SQL-Views. Now with CDS-Views I am not sure if this is still the case. Especially with regards to performance.

Also what is now the difference between a table function (which replaced scripted calculation views) and CDS Views?

Suncatcher
  • 10,355
  • 10
  • 52
  • 90
Thorsten Niehues
  • 13,712
  • 22
  • 78
  • 113

4 Answers4

11

Ok, this is a question that I believe requires some background to be answered.

A long, long time ago...

When SAP HANA was first developed, it heavily reused concepts and technology from other, already existing SAP products (TREX, P*TIME, MaxDB, Business Warehouse Accelerator).
One of the fundamental elements of the high query performance was (and is) the column store data-storage, which came in large parts from the TREX/BWA products. These products, in turn, had been solutions to very specific problems (full-text search for catalogs and speed-up of analytical queries from the SAP Business Warehouse data warehouse product).
Especially the BWA use case reflects in the column views of SAP HANA. Due to the limited use case of supporting SAP BW queries, no general SQL/relational query support was required (e.g. no arbitrary join-chain optimizations, no SQL features beyond SQL:92 etc.) whereas other, rather exotic features (like "vertical join") that could be used by SAP BW, were built into a query tool/engine ("engine" clearly was a very popular term with the SAP developers).

Once HANA proved successful as a platform to run SAP BW on, the next step was to add flexibility and make more general platforms like SAP Netweaver (the software that SAP's business solution products run on/with) working on SAP HANA. Now, SQL features were added and those required additional capabilities from the query optimizer and execution "engines". Query optimization had to be flexible and fast and should lead to query performance that would still beat the existing RDBMS vendors' offering (which had been around for 40+ years). This, clearly, is a hard problem and throwing is operational aspects of DB development (scaling, solution deployment, data federation, etc.).

This led to an overlapping development of different tools addressing different aspects of DB development.
SQL support and the underlying SQL optimizer were made more powerful, so much so, that (some) SQL queries could be as fast or faster than those modeled in calculation views. And since both of these "query frontends" eventually had to talk to the same internal data structures (row/column store) it was desirable to have just a single query optimizer, that would support all the different use cases.
Somewhere around HANA 1 SPS11/12 most calculation views started to be "unrolled" internally to feed into the common optimizer (that was what the "Execute in SQL Engine" flag was about).

I'd say, since then, the performance argument for using calculation views only holds in very specific circumstances.

I mentioned the overlapping developments and CDS (core data services) is one of them. The idea here is a very different one from SQL. While SQL gives you "the way to talk to the database", CDS wants to give your application a single data definition, that is used by the UI, the program logic and the data storage/query execution.

SQL != CDS

This probably needs some context (again): a major usage pattern of how SQL databases are used by application developers is that the application is written in some form of OO-implementation and the talking to the DB is left to a mapping layer/library (e.g. O/R-mappers). This means, that the knowledge of what the application is about (aka business process knowledge), is spread out in the application.

There is some information about it in the UI (labels, formatting, visibility, ...), some of it is in the application-object model (object dependencies, hierarchies, value domains...) and then some of it is in the queries against the database.

Such scattered knowledge/definition makes it hard to make changes consistent, which in turn, slows the development process and in turn prolongs the time until the application can run and deliver some positive outcome.
"Time-to-value" is the thing under optimization here as this is important for companies that give the promise of "success through innovation".

Ok, so this CDS thing is now part of the development models proposed by SAP and nearly en-passant also addresses topics like schema evolution and deployment of the data model. It is, in fact, independent of the actual database platform as shown in the CDS for ABAP variety.

How does this lead back to query performance? It does not really.

CDS' advantage is that one can provide more information about the data model than what is possible in HANA SQL.
Associations and joins with cardinality declaration (albeit now retrofitted to plain SQL) can enable the optimizer to use additional optimizations. Yet, the same optimizer and the same query execution "engines" are used here.

So, from a (query execution) performance point of view, it does not make a big difference, as long as no query semantics are required for which CDS does not have syntax (e.g. some window functions).

The main point of CDS really is about application development process performance and whether that works well with how you do development really depends on how much of it you can use.

Now for the question "scripted calc view" vs. "table function" vs. "CDS view".

Looking at these different object types from the point of "what can I do with them functionally?" will result in the observation "basically, the same". The difference lies in how these can be optimized (scripted calc views cannot be generally unrolled into the global query to be optimized), and what one can do with the object once created.
Table functions allow for very easy reuse across multiple views and queries. They also provide the option to provide parameters into the function (similar to parameterized views) and in addition allow for imperative coding. Functionally speaking, table functions are a kind of swiss-army knife; one can do nearly anything with them and they still can be part of global query optimization.

CDS views, as mentioned above, are nothing "special" in terms of query runtime or optimization. The main reason why CDS views are "a thing" is that with HANA SAP started to develop development models (such as XS, XSA, CAM) that revolve around "virtual data models".

The idea for those is that the structure of tables very often is stable and changes only little over time.
In a way, this is the "write-schema" of applications that enter the data into tables.
The "read-schema" is most of the time different from that. Queries re-combine the normalized data into records that the application can map into objects. This allows applications to look at the data differently than the original application. With "virtual data models" these queries are baked into tangible development artifacts (the views) that can be reused across the application. In fact, these can be treated as if this was the database with its tables, presented in a way that makes sense for the application.

Once again, if that is something that is beneficial for your application development depends on how your application development looks like.

Can you use HANA without CDS? Absolutely, and there are many areas where CDS lacks (i.e. the limited syntax and feature mapping to HANA features) but it does have its merits.

Should you abandon calculation views?

I would not necessarily change existing developments if they still serve their purpose, but calculation views certainly are an odd development object. Training folks in using those and SQL most likely is overly expensive compared to just sticking to SQL.

Personally, I prefer the code-based SQL development (better tooling available, allows for easier comparison with other DBMS, doesn't require WEB IDE/HANA Studio).
The only thing, SQL based development does not provide is the extended annotations/semantic information used by the SAP analytic frontend tools (SAC & BO) - these really are specific to CDS and Information Models (calculation views) but barely used by other analytic tools.

And that's my take on it.

Lars Br.
  • 9,949
  • 2
  • 15
  • 29
  • 1
    Thx for your answer - and even including development of SAP HANA - this helps to understand things better – Thorsten Niehues Feb 11 '20 at 13:48
  • `Associations and joins with cardinality declaration (albeit now retrofitted to plain SQL)` could you please explain what means "retrofitted to plain SQL"? internally converted by SQL optimizer? – Suncatcher Nov 18 '21 at 07:49
  • I meant that the SQL syntax in HANA allows to specify join cardinalities. That’s not standard SQL but a HANA-specific extension that was not available in earlier releases. Thus “retrofitted” as in “built in later, to the already existing design of HANA SQL syntax”. – Lars Br. Nov 18 '21 at 08:14
2

I would add that

  • Calculation Views are semantically richer. A SQL View does not know about measures, dimensions, hierarchies. https://blogs.sap.com/2019/08/26/what-is-the-difference-calcview-versus-sql-view/
  • The difference from the execution plan point of view is getting less and less. In Hana 2.0 SP4 most graphical calc views are turned internally into a single SQL statement to be executed by the SQL engine. So in that sense, using a CalcView gives you the additional information about the model plus the query performance of the SQL engine.

Lars' explanation of CDS is perfect. Nothing to add there.

Werner Daehn
  • 577
  • 6
  • 16
0

But Imagine the situation when you can't create a table function because of limited license (aka runtime version). Just stay with scripted views.

The main advantage of Hana artifacts over CDS at present is the ability to use input parameters in complex cases to optimize resources and query performance - when your logic is pushed down into DB instead of AS / app. But many native SQL features are still not available in graphical views (for example - exists, JOIN on BETWEEN), so I think that 10 years later HANA artifacts will become "very rare".

So learn CDS syntax :)

refeline
  • 39
  • 3
  • I’m not aware of any license that would allow creating scripted views but not table functions.That’s certainly not true for “runtime” licenses. – Lars Br. Feb 28 '20 at 10:42
0

Always a glad experience reading an article or pov from Lars, on any media (StackOverflow, SAP blog, article, twitter).

I just want to point out that another thing that I miss from the SQL scripting (SP, TF, SF) is the join optimization and SQL propagation that Information View has.

This is for me the focus to flexible models (apart from dynamic join that is only relevant for certain scenarios), to deliver one view that will perform depending on which columns the user or app will request. For the semantics use, I can simply expose a TF inside an information view to add some.

You can tell me that CDS have both options available (join optimization, SQL propagation, and annotation) but for advanced or complicated scenarios (window functions not present at CDS), and also for non-SAP developers, it will be more simple and the go-to approach for beginners