I want to write a crude Euler simulation of a set of PDEs. I read the PDE tutorial on tensorflow.org and I am a little puzzled about how to do this properly. I have two specific questions but would welcome further feedback if there is anything I have overlooked or misunderstood.

The following code is from the tutorial:

# Discretized PDE update rules
U_ = U + eps * Ut
Ut_ = Ut + eps * (laplace(U) - damping * Ut)

# Operation to update the state
step = tf.group(
  U.assign(U_),
  Ut.assign(Ut_))

Question 1

Isn't there a bug here? If U.assign(U_) happens to execute before Ut_ is computed, then Ut_ will be computed from the updated value of U rather than the value from the same time step. I would have thought that the correct way to do it would be as follows:

delta_U = tf.Variable(dU_init)    # dU_init, dUt_init: initial arrays, defined elsewhere
delta_Ut = tf.Variable(dUt_init)

delta_step = tf.group(
    delta_U.assign(Ut),
    delta_Ut.assign(laplace(U) - damping * Ut)
)

update_step = tf.group(
    U.assign_add(eps * delta_U),
    Ut.assign_add(eps * delta_Ut)
)    

We could then run Euler integration steps by alternating evaluations of delta_step and update_step. If I understand correctly, this could be done via separate invocations of Session.run():

with tf.Session() as sess:
    ...
    for i in range(1000):
        sess.run(delta_step)
        sess.run(update_step)

Question 2

It seems frustrating that a single operation can't be defined that combines both steps in a fixed order, e.g.

combined_update = tf.group(delta_step, update_step)    

with tf.Session() as sess:
    ...
    for i in range(1000):    
        sess.run(combined_update)

but according to an answer on this thread, tf.group() does not guarantee any particular evaluation order. The approach described on that thread for controlling evaluation order involves something called "control dependencies"; can they be used in this instance, where we want to ensure that repeated evaluations of two tensors are made in a fixed order?

If not, is there another way to control the order of evaluation of these tensors, beyond explicitly using sequential Session.run() calls?
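
For concreteness, here is my (possibly wrong) understanding of the basic control-dependency mechanism, as a minimal sketch with a throwaway variable x of my own; I am not sure whether it extends to the repeated two-step case above:

import tensorflow as tf

x = tf.Variable(1.0)
first = x.assign(x + 1.0)             # x becomes 2.0
with tf.control_dependencies([first]):
    # ops *created* inside this block cannot run until `first` has run
    second = x.assign(x * 10.0)       # reads the updated x, so x becomes 20.0

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    sess.run(second)                  # running `second` forces `first` to run first
    print(sess.run(x))                # 20.0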

Update (12/02/2019)

Based on jdehesa's answer, I investigated in greater detail. The results support my original intuition that there is a bug in the PDE tutorial: it produces incorrect results because of the inconsistent evaluation order of the tf.assign() calls, and adding control dependencies (at least in the way I tried) does not resolve this. However, the method from the PDE tutorial usually produces correct results, and I don't understand why.

I checked the results of running the assignment operations in an explicit order, using the following code:

import tensorflow as tf
import numpy as np

# define two variables a and b, and the PDEs that govern them
a = tf.Variable(0.0)
b = tf.Variable(1.0)
da_dt_ = b * 2
db_dt_ = 10 - a * b

dt = 0.1 # integration step size

# after one step of Euler integration, we should have
#   a = 0.2 [ = 0.0 + (1.0 * 2) * 0.1 ]
#   b = 2.0 [ = 1.0 + (10 - 0.0 * 1.0) * 0.1 ]

# using the method from the PDE tutorial, define updated values for a and b

a_ = a + da_dt_ * dt
b_ = b + db_dt_ * dt

# and define the update operations

assignA = a.assign(a_)
assignB = b.assign(b_)

# define a higher-order function that runs a particular simulation n times
# and summarises the results

def summarise(simulation, n=500):
  runs = np.array( [ simulation() for i in range(n) ] )

  summary = { tuple(run): 0 for run in np.unique(runs, axis=0) }

  for run in runs:
    summary[tuple(run)] += 1

  return summary  

# check the results of running the assignment operations in an explicit order

def explicitOrder(first, second):
  with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    sess.run(first)
    sess.run(second)
    return (sess.run(a), sess.run(b))

print( summarise(lambda: explicitOrder(assignA, assignB)) ) 
# prints {(0.2, 1.98): 500}

print( summarise(lambda: explicitOrder(assignB, assignA)) ) 
# prints {(0.4, 2.0): 500}

As expected, if we evaluate assignA first, a gets updated to 0.2, and this updated value is then used to update b to 1.98. If we evaluate assignB first, b is first updated to 2.0, and this updated value is then used to update a to 0.4. Both orderings give the wrong answer for the Euler integration step: we ought to get a = 0.2, b = 2.0.

Next, I tested what happens when the order of evaluation is left unspecified, grouping the two assignments with tf.group() and no control dependencies.

noCDstep = tf.group(assignA, assignB)

def implicitOrder():  
  with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    sess.run(noCDstep)
    return (sess.run(a), sess.run(b))


print( summarise(lambda: implicitOrder()) ) 
# prints, e.g. {(0.4, 2.0): 37, (0.2, 1.98): 1, (0.2, 2.0): 462}  

Occasionally, this produces the same result as evaluating assignB followed by assignA, or (more rarely) evaluating assignA followed by assignB. But most of the time, there is an entirely unexpected result: the correct answer to the Euler integration step. This behaviour is both inconsistent and surprising.

I tried to resolve this inconsistent behaviour by introducing control dependencies as suggested by jdehesa, using the following code:

with tf.control_dependencies([a_, b_]):
  cdStep = tf.group(assignA, assignB)

def cdOrder():   
  with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    sess.run(cdStep)
    return (sess.run(a), sess.run(b))

print( summarise(lambda: cdOrder()) )  
# prints, e.g. {(0.4, 2.0): 3, (0.2, 1.98): 3, (0.2, 2.0): 494}

It appears that control dependencies do not resolve this inconsistency, and it is not clear that they make any difference at all. I then tried implementing the approach originally suggested in my question, which uses additional variables to enforce the computation of deltas and updates independently:

da_dt = tf.Variable(0.0)
db_dt = tf.Variable(0.0)

assignDeltas = tf.group( da_dt.assign(da_dt_), db_dt.assign(db_dt_) )
assignUpdates = tf.group( a.assign_add(da_dt * dt), b.assign_add(db_dt * dt) )

def explicitDeltas():
  with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    sess.run(assignDeltas)
    sess.run(assignUpdates)
    return (sess.run(a), sess.run(b))

print( summarise(lambda: explicitDeltas()) )
# prints {(0.2, 2.0): 500}

As expected, this consistently computes the Euler integration step correctly.

I can understand why sometimes tf.group(assignA, assignB) produces an answer consistent with running assignA followed by assignB, and why it sometimes produces an answer consistent with running assignB followed by assignA, but I don't understand why it usually produces an answer that is magically correct (for the Euler integration case) and consistent with neither of these orders. What is going on?

1 Answer

Indeed, you can make sure that things run in the order that you want using control dependencies. In this case, you just need to make sure that U_ and Ut_ are computed before the assignment operations are executed. I think (although I'm not absolutely sure) that the code in the tutorial is probably correct, and that for Ut_ to be computed with the updated U you would need to have something like:

U_ = U + eps * Ut
U = U.assign(U_)   # rebind U to the assign op, forcing the update to happen first
Ut_ = Ut + eps * (laplace(U) - damping * Ut)   # laplace(U) now uses the updated U
step = Ut.assign(Ut_)

However, whenever you want to make sure that something gets executed before something else, you can just write the dependencies explicitly:

# Discretized PDE update rules
U_ = U + eps * Ut
Ut_ = Ut + eps * (laplace(U) - damping * Ut)

# Operation to update the state
with tf.control_dependencies([U_, Ut_]):
    step = tf.group(
      U.assign(U_),
      Ut.assign(Ut_))

This will make sure that, before any of the assignment operations are executed, both U_ and Ut_ will have been computed first.

EDIT: Some additional explanation about the new snippets.

In the first snippet in your update (12/02/2019), the code runs first one assignment, then the next. As you said, this is obviously wrong, since the second update will use the already updated value of the other variable.

The second snippet, if I'm not mistaken (correct me if I'm wrong), is what the tutorial proposes: grouping the assignment operations. Since you say you have seen instances of this producing the wrong result, I suppose it is not always safe to evaluate it like this. However, it is not surprising that you frequently get the right result. Here TensorFlow will compute all the values necessary to update both variables. Since the evaluation order is not deterministic (when there are no explicit dependencies), it may happen, for example, that the update of a runs before b_ is computed, in which case you would get the wrong result. But it is reasonable to expect that, most of the time, both a_ and b_ will be computed before either a or b is updated, and in that case you get exactly the Euler integration result, because both deltas were computed from the old values.

In the third snippet, you use control dependencies, but not in an effective manner. What your code indicates is that the group operation should not run before a_ and b_ are computed. However, that does not mean much: the group operation is pretty much a no-op with dependencies on its inputs. The control dependencies there only affect this no-op, but do not prevent the assignment operations from running earlier. As I suggested originally, you should instead put the assignment operations within the control dependencies block, to make sure that the assignments do not happen sooner than they should (in my snippet I also put the group operation within the block for convenience, but it does not really matter whether that is inside or outside).
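
To make this concrete, an effective version for your a/b example would look something like the sketch below (cdStepFixed is just a name I made up; it reuses your definitions of a, b, a_ and b_):

with tf.control_dependencies([a_, b_]):
    # The assignment ops are *created* inside this block, so each of them
    # carries a control dependency on both a_ and b_ and cannot run until
    # both tensors have been computed from the old values of a and b.
    cdStepFixed = tf.group(a.assign(a_), b.assign(b_))

Running cdStepFixed in place of cdStep should then consistently give (a, b) = (0.2, 2.0).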

jdehesa
  • I've edited the question to indicate why control dependencies do not seem to solve the problem. – Simon McG McG Mar 02 '19 at 14:57
  • @SimonMcGMcG I edited my answer with some more explanations, hope that helps. – jdehesa Mar 04 '19 at 10:16
  • Thanks, moving the assignment operations inside the control dependency block solves the problem. I think some of my confusion came from not really understanding what objects persist between successive calls to `Session.run()`. Suppose some operation `x` is defined with a control dependency on `y`. I had imagined that this meant that `y` had to be evaluated at some point *in the session* before `x` could be evaluated (which would not be very helpful). But it actually means that `y` has to be (re-)evaluated *within the current call to `Session.run()`* before `x` can be evaluated, right? – Simon McG McG Mar 06 '19 at 16:29
  • @SimonMcGMcG That is correct, control dependencies (or dependencies in general) refer to the control flow of each single call to `Session.run()`. Intermediate node values are not saved between calls to `run()` (only variables and a few other [stateful objects](https://stackoverflow.com/q/52636943) like random numbers) so, as you say, if you have `with tf.control_dependencies([y]): x = ...` it means `x` will not be computed until `y` has been evaluated _within_ this call to `Session.run()`. – jdehesa Mar 06 '19 at 16:44