6

It would be nice to create an ORM like Active Record or Hibernate that can process chained queries like this:

User = User:new():for_login("stackoverflow_admin"):for_password("1984"):load().

How can we do this? Either exactly like that, in one line, or at least something similar in spirit and meaning.

Maybe there are some preprocessing tools that can help with this?

P_A
Oleg Chirukhin
  • 1
    It seems that this problem requires a lot more research. Therefore, I accepted the answer, which indicates the direction of research. Thanks to all the participants, it was very informative. – Oleg Chirukhin Jan 14 '16 at 04:37

4 Answers

9

"Chaining" is a confused form of functional composition, sometimes relying on return values, sometimes operating directly over state mutations. This is not how Erlang works.

Consider:

f(g(H))

is equivalent to:

G = g(H),
f(G)

but may or may not be equivalent to:

g(H).f()

In this form functions are stacked up, and the operations proceed "toward" the return-value side (which is nearly always the left-hand side in most programming languages). In languages where OOP is forced on the programmer as the sole paradigm available (such as Java), however, this function-flowing form is very often not possible without excess noise because of the requirement that all functions (class methods) be essentially namespaced by a class name, whether or not any objects are involved:

F.f(G.g(h.get_h()))

More typically in Java operations over the data are added to the object itself and as much data as possible is held in object instances. Transform methods do not have a "return" value in quite the same way, they instead mutate the internal state, leaving the same object but a "new version" of it. If we think of a mutating method as "returning a new version of the object" then we can think of the dot operator as a bastardized functional composition operator that makes values sort of "flow to the right" as the mutated object is now having additional methods invoked, which may continue to mutate the state as things move along:

query.prepare(some_query).first(100).sort("asc")

In this case the execution "flow" moves to the right, but only because the concept of functional composition has been lost -- we are "ticking" forward along a chain of mutating state events instead of using actual return values. That also means that in these languages we can get some pretty weird flip-flops in the direction of execution if we take stacking too far:

presenter.sort(conn.perform(query.prepare(some_query)).first(100), "asc")

Quick, at a glance tell me what the inner-most value is that kicks that off?

This is not a particularly good idea.

Erlang does not have objects, it has processes, and they don't work quite the way you are imagining above. (This is discussed at length here: Erlang Process vs Java Thread.) Erlang processes cannot call one another or perform operations against one another -- they can only send messages, spawn, monitor and link. That's it. So there is no way for an "implicit return" value (such as in the case of a chain of mutating object values) to have an operation defined over it. There are no implicit returns in Erlang: every operation in Erlang has an explicit return.

In addition to explicit returns and a computational model of isolated-memory concurrent processes instead of shared-memory objects, Erlang code is typically written to "crash fast". Most of the time this means that nearly any function that has a side effect returns a tuple instead of a naked value. Why? Because the assignment operator is also a matching operator, which also makes it an assertion operator. Each line in a section of a program that has side-effects is very often written in a way to assert that an operation was successful or returned an expected type or crash immediately. That way we catch exactly where the failure happened instead of proceeding with possibly faulty or missing data (which happens all the time in OOP chained code -- hence the heavy reliance on exceptions to break out of bad cases).

With your code I'm not sure if the execution (and my eye, as I read it) is supposed to flow from the left to the right, or the other way around:

User = User:new():for_login("stackoverflow_admin"):for_password("1984"):load().

An equivalent that is possible in Erlang would be:

User = load(set_password(set_uid(user:new(), "so_admin"), "1984"))

But that's just silly. From whence have these mysterious literals arrived? Are we going to call them in-line:

User = load(set_password(set_uid(user:new(), ask_uid()), ask_pw()))

That's going to be pretty awkward to extract yourself from if the user enters an invalid value (like nothing) or disconnects, or times out, etc. It will also be ugly to debug when a corner case is found -- what call to what part failed and how much stuff unrelated to the actual problem is sitting on the stack now waiting for a return value? (Which is where exceptions come in... more on that waking nightmare below.)

Instead the common way to approach this would be something similar to:

register_new_user(Conn) ->
    {ok, Name} = ask_username(Conn),
    {ok, Pass} = ask_password(Conn),
    {ok, User} = create_user(Name, Pass),
    User.

Why would we do this? So that we know when this crashes exactly where it happened -- that will go a long way to telling us how and why it happened. If any of the return values are not a tuple of the shape {ok, Value} the process will crash right there (and most of the time that means taking the connection with it -- which is a good thing). Consider how much exception handling a side-effecty procedure like this actually requires in a language like Java. The long-chain one-liner suddenly becomes a lot more lines:
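To make the crash location concrete, here is a minimal runnable sketch of the function above. The module name `register_demo` and the three helper functions are hypothetical stand-ins (the answer does not define them), and `Conn` is assumed to be a plain map of canned answers rather than a real connection.

```erlang
-module(register_demo).
-export([register_new_user/1]).

%% Same shape as the function in the answer: every line asserts success.
register_new_user(Conn) ->
    {ok, Name} = ask_username(Conn),
    {ok, Pass} = ask_password(Conn),
    {ok, User} = create_user(Name, Pass),
    User.

%% Hypothetical stand-ins. Conn is assumed to be a map of canned answers.
ask_username(#{name := Name}) -> {ok, Name};
ask_username(_) -> {error, dropped_connection}.

ask_password(#{pass := Pass}) when Pass =/= "" -> {ok, Pass};
ask_password(_) -> {error, {password, empty}}.

%% Pretend persistence: just builds a map describing the user.
create_user(Name, Pass) -> {ok, #{name => Name, pass => Pass}}.
```

Calling `register_demo:register_new_user(#{name => "so_admin", pass => ""})` fails on the `ask_password` line with a `badmatch` error carrying `{error,{password,empty}}`, so both the failing step and the reason are visible without any handler code in the function itself.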

User =
    try
        User:new():for_login("so_admin"):for_password("1984"):load()
    catch
        {error, {password, Value}} ->
            % Handle it
        {error, {username, Value}} ->
            % Handle it
        {error, db_create} ->
            % Handle it
        {error, dropped_connection} ->
            % Handle it
        {error, timeout} ->
            % Handle it
        %% Any other errors that might possibly happen...
    end.

This is one super annoying outcome of uncertain (or even overly long) compositions: it stacks all your error cases together and they have to be handled by propagating exceptions. If you don't throw exceptions in the bad cases above within those calls then you don't know where something went wrong and you have no way of signaling the process or its manager that execution should terminate, retry, etc. The cost of the one line solution is at least an additional dozen lines added only to this one procedure, and we haven't even addressed what should happen in those error handlers!

This is the brilliance of Erlang's "let it crash" philosophy. The code can be written in a way that makes only assumptions of success as long as we assert those assumptions. Error handling code can be extracted out somewhere else (the supervisors) and state can be restored to a known condition outside of the main business logic when something goes wrong. Embracing this creates robust concurrent systems; ignoring it creates brittle crystals.
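As a rough illustration of moving error handling out of the business logic, the sketch below runs a function in its own process and observes how it ended through a monitor. The module name `crash_demo` and the function `run/1` are hypothetical; a real system would use an OTP supervisor, which does far more than this.

```erlang
-module(crash_demo).
-export([run/1]).

%% Runs Fun in a separate, monitored process and reports how it ended.
%% A toy stand-in for supervision, not a replacement for an OTP supervisor.
run(Fun) ->
    {Pid, Ref} = spawn_monitor(fun() -> exit({done, Fun()}) end),
    receive
        {'DOWN', Ref, process, Pid, {done, Result}} ->
            %% Fun returned normally; its value travels in the exit reason.
            {ok, Result};
        {'DOWN', Ref, process, Pid, Reason} ->
            %% Fun crashed (for example on a failed match); we only observe it.
            {crashed, Reason}
    end.
```

With this, `crash_demo:run(fun() -> {ok, V} = maps:find(missing, #{}), V end)` returns a `{crashed, Reason}` tuple whose reason starts with `{badmatch, error}`, while the calling process keeps running and can decide whether to retry or give up.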

The cost of all this, however, is those one-line assertions. In concurrent programs this is a profoundly beneficial tradeoff to make.

zxq9
  • 1
    I'm very surprised and very grateful that you wrote such a long and detailed explanation on this subject; it is really very important, and I will give a link to this post to my friends who are learning Erlang. (also sorry for my bad english, bla-bla-bla) (also you can just skip to the last line of this comment) As an important addition to what you said, I have to indicate the subject area of the original question. It's not about writing any generic code, it's about writing an "ORM". – Oleg Chirukhin Jan 06 '16 at 09:33
  • "ORM" (exactly quoted), at this stage of technology development, is not about the object-relational mismatch, but about the "SQL - real world" mismatch. We can't write in SQL - it's obviously a very bad idea. So we're forced to use our main language (Java or Erlang) as a DSL to emulate SQL features. It's all about transforming our main language into a DSL. In practice, nobody knows the internals of such tools, so this DSL should provide very simple and straightforward APIs and (if possible) expose them via auto-completion in an IDE. Very clear, very concise syntax for our Java-as-SQL or Erlang-as-SQL. – Oleg Chirukhin Jan 06 '16 at 09:33
  • As a part of this domain, we build an SQL expression step by step, increasing its complexity. Doesn't it sound like a stateful thing? The classic builder pattern in OOP. – Oleg Chirukhin Jan 06 '16 at 09:33
  • We can't think about query details, it's too complex. Imagine a query over 30 tables with very complex join rules; it will take hours just to imagine this structure. Moreover, imagine how many days you will spend creating a mega query, dozens of pages of text, that you have to write to save this data back to the DB. I'm exaggerating, but not too much; it is common practice in large enterprise information systems. So an "ORM" tool should provide a lot of meta-information, and this meta-information should transform the "clean and concise" calls in an uncontrollable way. Why uncontrollable? – Oleg Chirukhin Jan 06 '16 at 09:34
  • Because control takes too much time, and we should write code fast. It should automatically decide what kind of join we want, how to aggregate information, how to sort it, etc., and in the end we want to see just records (or collections in Java), and don't want to know how this magic works. – Oleg Chirukhin Jan 06 '16 at 09:34
  • Now imagine that for some parts of an application this is half the work: writing CRUD for databases. We can't test it manually, because we have to write code fast, and testing every line in every part of the call chain is quite the opposite of "fast". Therefore, everything about CRUD for databases relies completely on the quality of the underlying framework ("ORM"). We don't even write unit tests for it; the quality of the underlying framework is everything. If a query fails, it's just game over - we don't want to know why the select failed and at which stage. – Oleg Chirukhin Jan 06 '16 at 09:34
  • So the "catch" block in your example will always be kind of empty (let it crash) or, rarely, filled with an auto-generated "let's try again another 10 times" (the user of the API doesn't want to know what it takes for the ORM to retrieve data, he just waits for data or an error, so all these "catches" should be hidden deep in the underlying framework). – Oleg Chirukhin Jan 06 '16 at 09:34
  • So basically we're already in a situation where we are creating a DSL for corner cases of two base languages (SQL and Erlang/Java/...), intended solely to write CRUD code as fast as possible, sacrificing safety and comprehensibility of the flow. Really, we can sacrifice every single principle of our main language to achieve this goal - as in Java we sacrifice everything when dealing with Hibernate or auto-generated code for SOAP and XML (complex metaprogramming and auto-generated code are the heaviest sins, but not when dealing with SQL/SOAP/XML/...). – Oleg Chirukhin Jan 06 '16 at 09:35
  • Probably you're right that Erlang is not very good at tasks like that. So maybe we should change the syntax with preprocessors or something. Yesterday I read about this: https://github.com/esl/parse_trans, but have not yet had time to try it in practice. – Oleg Chirukhin Jan 06 '16 at 09:35
  • 1
    So the question is how to create Erlang-as-SQL to write database code extremely fast, no matter how many basic features of the language we'll sacrifice down the road. I suggested solving this problem in the most simple and intuitive way (for an OOP programmer): by chaining methods. Can you suggest a better way (without writing tons of code and brackets and Lisp-like expressions with a "from the center" approach, which really slows down coding and renders complex queries unreadable)? – Oleg Chirukhin Jan 06 '16 at 09:35
  • 1
    @OlegChiruhin You want to build a query statement from underlying elements -- that's not at all the same thing as OOP-style chaining. You might want to meet me in http://chat.stackoverflow.com/rooms/75358/erlang-otp (within a matter of minutes a mod will almost certainly remove all your comments here). In any case you don't want an ORM -- you want a query builder, and that's not the same thing. ORM's tend to do far more damage than they do good. (On trivial data you don't need it, and on non-trivial data they incur *profound* tech debt.) – zxq9 Jan 06 '16 at 09:46
  • 2
    @OlegChiruhin On the subject of query generators, they are *very* similar to any other parsing/compiling task, but when talking to a database you have to take your "inner" language and from that produce something that is understandable by the database. Building query compositions in Erlang is *not hard* -- it is actually much easier in an FP language than an OOP one. The main annoyance is that SQL is neither a calculus nor an algebra, adhering instead to arbitrary syntax and semantics. (>.<) That's why I did [something else](http://zxq9.com/ryuq/). – zxq9 Jan 06 '16 at 10:54
6

Although I found @zxq9's answer very informative, I think mentioning the history of imitating Java-like OOP style in Erlang could be helpful.

Method chaining needs state and there were efforts to include "Stateful Modules" into Erlang:

  • Parametrized Module: A proposal that lets developers implement modules that accept parameters and hold them as module state for use in their functions. It was added to Erlang long ago as an experimental feature, but the technical board decided to remove syntactic support for it in R16 because of the conceptual and practical incompatibilities it caused.

  • Tuple Module: A backwards-compatibility mechanism for parametrized modules, which is also introduced in detail in chapter 8 of Programming Erlang (Second Edition). There is no official documentation for it, and it remains a controversial feature that adds complexity and ambiguity without adding any power in the Erlang way.

Imitating the coding style of languages with different paradigms, like Ruby and Java, in Erlang, which is a functional language with different concepts, could be exciting but adds no value.

Community
Hamidreza Soleimani
  • 4
    Indeed, things like this have been tried (and [come up in discussion here quite a bit](http://stackoverflow.com/a/29552156/988678)) but just don't work well. Fortunately Erlang the language is easy to learn because it has been protected from the blind inclusion of random features (but as can be seen above *we have experimented a lot*). The OP really appears to be getting at "how to write a query statement generator" -- and this does not require dot-operator style chaining (that actually makes it *harder*). It's hard to fit a query generator howto into an answer to this question, though. – zxq9 Jan 06 '16 at 10:39
  • 2
    Importing coding styles from languages with a different paradigm is generally not a good idea. As @zxq9 showed in his answer, something which looks reasonable and "natural" in one style can have many reasonable interpretations in another. In FP, would we be chaining values, as is natural in OO, or would we go functional and chain functions? – rvirding Jan 07 '16 at 11:31
  • 2
    We did try to keep Erlang very simple and explicit so you could see exactly what is intended. And parametrized modules, and their implementation method, tuple modules, just don't fit very well and resulted in much strangeness. Unfortunately the tuple modules are still with us. – rvirding Jan 07 '16 at 11:34
1

Take a look at BossDB. It's a compiler chain and run-time library for accessing a database via Erlang parameterized modules.

P_A
0

This pattern works if every function in the chain takes the same number of arguments:

chain(Val, []) -> Val;
chain(Val, [H|T]) ->
    chain(H(Val), T).

Example:

9> example:chain(1, [fun(X) -> X + 1 end, fun(X) -> X + 10 end]).
12
10> example:chain("a", [fun(X) -> X ++ "b" end, fun(X) -> X ++ "c" end]).
"abc"
11> example:chain(<<1,2,4>>, [fun binary:bin_to_list/1, fun lists:reverse/1, fun binary:list_to_bin/1]).
<<4,2,1>>
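The same idea can be written as a fold and pointed at the query-building use case raised in the comments. This is only a sketch: the module name `query_demo` and the helpers `for_login/1` and `for_password/1` are hypothetical, and the query is assumed to be a plain map rather than anything a real database driver understands.

```erlang
-module(query_demo).
-export([chain/2, for_login/1, for_password/1]).

%% Same behaviour as the recursive chain/2 above, expressed with lists:foldl/3:
%% each fun receives the result of the previous one.
chain(Val, Funs) ->
    lists:foldl(fun(F, Acc) -> F(Acc) end, Val, Funs).

%% Hypothetical query "setters": each returns a fun that adds one condition
%% to a query description held in a map.
for_login(Login) ->
    fun(Query) -> Query#{login => Login} end.

for_password(Pass) ->
    fun(Query) -> Query#{password => Pass} end.
```

For example, `query_demo:chain(#{table => users}, [query_demo:for_login("so_admin"), query_demo:for_password("1984")])` yields a map holding the table, login and password, read left to right like the chained call in the question. Turning that map into SQL would be a separate step that this sketch does not attempt.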
Aaron Lelevier