1

If you need to parse XML in which some entries may or may not be present, you often end up with patterns like this:

for planet in system.findall('.//planet'):
    row['discoveryyear'] = int(planet.findtext("./discoveryyear")) if planet.findtext("./discoveryyear") else None

Is there a nicer way to do that? I would like to avoid the second planet.findtext call, but I also don't want to write another line of code to store the value in a variable first.

Daniel Wehner
  • Why don't you want to write another line of text? – timgeb Jun 06 '14 at 08:22
  • You can use ``defaultdict`` (http://stackoverflow.com/questions/5900578/how-collections-defaultdict-work), but I don't see any problem with ``value or None``. The Zen of Python favors readability and says explicit is better than implicit. I'd add an extra line for the if and then the None instead of squeezing it into a single line; this helps debugging (you can quickly find why something fails later). – CppLearner Jun 06 '14 at 08:22
  • @timgeb I think he doesn't want to create an extra variable to hold the return value – Tim Jun 06 '14 at 08:29

3 Answers

7

Instead of the try/except solution, I propose a helper function:

def find_int(xml, text):
    found = xml.findtext(text)
    return int(found) if found else None

row['discoveryyear'] = find_int(planet, "./discoveryyear")

(Note that found is also falsy if it's '', which is a good case to return None for as well.)
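
For anyone who wants to try this out, here is a minimal self-contained sketch using xml.etree.ElementTree. The sample XML is made up for illustration (it is not the OP's data); only find_int is taken from the answer above.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data, invented for this illustration:
# one planet with a year, one with an empty element, one without the element.
SAMPLE = """
<system>
  <planet><name>a</name><discoveryyear>1995</discoveryyear></planet>
  <planet><name>b</name><discoveryyear></discoveryyear></planet>
  <planet><name>c</name></planet>
</system>
"""

def find_int(xml, text):
    # findtext returns None for a missing element and '' for an empty one;
    # both are falsy, so both end up as None.
    found = xml.findtext(text)
    return int(found) if found else None

system = ET.fromstring(SAMPLE)
years = [find_int(planet, "./discoveryyear") for planet in system.findall(".//planet")]
print(years)  # [1995, None, None]
```

The missing element and the empty element are treated the same way, which is probably what you want for a column like this.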

RemcoGerlich
5

This will do (except if it's discovered in year 0 haha):

row['discoveryyear'] = int(planet.findtext("./discoveryyear") or 0) or None
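
To make the year 0 caveat concrete, here is a small sketch of how the two or operators behave. The function year_or_none is a hypothetical wrapper used only for the demonstration, and its text argument stands in for whatever planet.findtext(...) returns.

```python
def year_or_none(text):
    # `text or 0` turns None and '' into 0 before int() sees them;
    # the trailing `or None` then turns any resulting 0 into None.
    return int(text or 0) or None

# None = element missing, "" = element empty, "0" = a genuine zero value.
results = [year_or_none(t) for t in (None, "", "1995", "0")]
print(results)  # [None, None, 1995, None]
```

So a genuine "0" is indistinguishable from a missing value, which is harmless for discovery years but worth remembering before reusing the trick on fields where 0 is meaningful.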
Fabricator
3

To avoid the extra function call, you could wrap it in a try/except:

try:
    row['discoveryyear'] = int(planet.findtext("./discoveryyear"))
except TypeError:  # raised if planet.findtext("./discoveryyear") is None
    row['discoveryyear'] = None

This also doesn't store the return value in a separate variable.
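
One caveat worth checking: int(None) raises TypeError, but int('') raises ValueError, so an element that exists but is empty would not be caught by the except clause above. A small sketch to show the difference (outcome is a hypothetical helper that exists only for this demonstration):

```python
def outcome(text):
    # Report which exception int() raises for a given findtext-style result.
    try:
        return int(text)
    except TypeError:
        return "TypeError"
    except ValueError:
        return "ValueError"

# None = element missing, "" = element present but empty.
outcomes = [outcome(None), outcome(""), outcome("1995")]
print(outcomes)  # ['TypeError', 'ValueError', 1995]
```

If empty elements should also map to None, catching (TypeError, ValueError) would cover both, at the cost of also swallowing non-numeric text.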

Tim
  • In my opinion (and according to clean-code rules), exceptions should not be used for situations like this. It seems to be fine for "discoveryyear" to be missing, so the code should handle that case explicitly; exceptions should be reserved for exceptional cases, things that are not expected to happen. I don't know about the performance of exceptions in Python, but in C# they are among the more expensive operations. I like the solution posted by @RemcoGerlich. – this.myself Jun 06 '14 at 08:35
  • @this.myself I agree, and I have to admit I would not use this in my own code, but I think it's an OK solution to what the OP wants: no extra function call and no extra variable to store the return value. – Tim Jun 06 '14 at 08:39