
I’m walking through GPT-2's source code on GitHub, trying to understand how it all works. I’m stumped on one function and I’m hoping somebody can explain to me what’s happening.

https://github.com/nshepperd/gpt-2/blob/finetuning/src/model.py

The code can be found in model.py, in the link above. Here it is specifically:

def shape_list(x):
   """Deal with dynamic shape in tensorflow cleanly."""
   static = x.shape.as_list()
   dynamic = tf.shape(x)
   return [dynamic[i] if s is None else s for i, s in enumerate(static)]

I did some research on what tf.shape() returns and on the difference between static and dynamic shapes here: How to understand static shape and dynamic shape in TensorFlow?

I also read through this series of articles: https://medium.com/analytics-vidhya/understanding-the-gpt-2-source-code-part-3-9796a5a5cc7c

Despite all that reading, I’m not entirely sure what’s going on. What isn’t clear to me is the last statement:

return [dynamic[i] if s is None else s for i, s in enumerate(static)]

What exactly is it saying here? My guess is that the function's purpose is to determine whether the value of x has been defined yet. If it hasn’t, then it will return the static shape; if it has, then it will return the dynamic shape.

Am I way off here?

junfanbl
  • Does this answer your question? [How to understand static shape and dynamic shape in TensorFlow?](https://stackoverflow.com/questions/37096225/how-to-understand-static-shape-and-dynamic-shape-in-tensorflow) – GPhilo Jan 31 '20 at 13:13
  • I have read through that link. I even posted it in my question above. I believe I understand the concept of returning a shape during graph computation (dynamic) and during graph definition (static). What I'm having an issue with is the last statement. What is the logic behind it? – junfanbl Jan 31 '20 at 13:53
  • Then your problem is not with TensorFlow, but with list comprehensions in Python. The last statement says "build a list where each element is the matching element from `static` if that's not `None`, otherwise the matching element from `dynamic`" (it looks horrible spelled out in a comment, but basically it's just a for loop rearranged. Look up list comprehensions and it will be clear in no time) – GPhilo Jan 31 '20 at 13:58
  • Oh okay, that makes more sense. Thank you for your help. How do I mark your comment as an answer? – junfanbl Jan 31 '20 at 14:19
  • You don't for comments, but I'll remove it and move it to an answer (where I can also explain a bit better) – GPhilo Jan 31 '20 at 14:20

1 Answer


Your problem is not with TensorFlow, but with list comprehensions in Python, which are a more Pythonic way to build lists from other iterables.
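One detail that may help when reading such a comprehension: the `a if cond else b` part sits in the element position, so it is a conditional expression that chooses a value for every element, not a filter. A minimal sketch with made-up values (`values` and `fallback` are illustrative names, not taken from the GPT-2 code):

```python
values = [3, None, 5]
fallback = [30, 40, 50]

# Conditional expression in the element position:
# exactly one output element per input element.
picked = [fallback[i] if v is None else v for i, v in enumerate(values)]

# A trailing `if` is a filter instead: it drops elements.
filtered = [v for v in values if v is not None]

print(picked)    # [3, 40, 5]
print(filtered)  # [3, 5]
```

The result of the first form always has the same length as the input, which is what you want when building a shape.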

The last statement is (almost*) equivalent to:

ret = []
for i, s in enumerate(static):
  if s is None:
    ret.append(dynamic[i])
  else:
    ret.append(s)
return ret

*: About the "almost" above: the comprehension is usually a little faster. Both versions grow the list one element at a time, but in CPython the comprehension appends with a dedicated bytecode instruction (LIST_APPEND), while the loop has to look up and call ret.append on every iteration. For a list as short as a tensor's shape, the difference is negligible.
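To see the per-dimension choice without needing TensorFlow, here is a rough sketch that uses plain Python lists as stand-ins. The names `pick_dims` and `pick_dims_loop` and the example shapes are assumptions made for illustration; in the real function, `dynamic` is the tensor returned by `tf.shape(x)`, so `dynamic[i]` would be a scalar tensor rather than a string.

```python
def pick_dims(static, dynamic):
    """Per dimension: keep the static size if it is known, else use the dynamic one."""
    return [dynamic[i] if s is None else s for i, s in enumerate(static)]


def pick_dims_loop(static, dynamic):
    """Same logic written as an explicit loop."""
    ret = []
    for i, s in enumerate(static):
        if s is None:
            ret.append(dynamic[i])
        else:
            ret.append(s)
    return ret


# Hypothetical tensor whose batch size is unknown when the graph is built.
static = [None, 1024, 768]
# Stand-in for tf.shape(x): placeholders for values only known at run time.
dynamic = ["dyn0", "dyn1", "dyn2"]

print(pick_dims(static, dynamic))       # ['dyn0', 1024, 768]
print(pick_dims_loop(static, dynamic))  # ['dyn0', 1024, 768]
```

So the function does not return either the static shape or the dynamic shape as a whole. It mixes them: known dimensions come back as plain Python integers, and only the unknown ones fall back to the dynamic value.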

GPhilo