I'm new to Python and trying to learn about machine learning. The following code block comes from a Udacity assignment.

def maybe_download(filename, expected_bytes, force=False):
  """Download a file if not present, and make sure it's the right size."""
  dest_filename = os.path.join(data_root, filename)
  if force or not os.path.exists(dest_filename):
    print('Attempting to download:', filename) 
    filename, _ = urlretrieve(url + filename, dest_filename, reporthook=download_progress_hook)
    print('\nDownload Complete!')
  statinfo = os.stat(dest_filename)
  if statinfo.st_size == expected_bytes:
    print('Found and verified', dest_filename)
  else:
    raise Exception(
      'Failed to verify ' + dest_filename + '. Can you get to it with a browser?')
  return dest_filename

I understand most of it. However, I'm confused by the filename, _ = urlretrieve(...) part. What is being assigned here? I stepped through it in a debugger and found that filename = '.\\notMNIST_large.tar.gz' is unchanged before and after this statement.

So my question is: what does filename, _ = urlretrieve(...) really mean? Is this some kind of advanced technique for assigning a value through a hidden expression?

Mac Chen
  • `urlretrieve` function is returning 2 values. What is the value for `_` after the call? – gcw Nov 25 '17 at 14:46

2 Answers

In Python, the _ variable is an idiom for "I am not interested in this return value". The urlretrieve function returns a tuple (an immutable sequence) containing two values: the local filename under which the downloaded content was saved, and the headers of the response. So the headers are essentially "thrown away" because the calling code does not care about them.

Here is a little toy example that further illustrates the concept:

def myFunction():
  return 1, 2

a, _ = myFunction()
print(a)  # prints 1
Eric

Python has a feature named "tuple unpacking" that lets you assign the elements of a tuple (or of any sequence, actually) to individual variables. So if you have, for example, a (width, height) tuple and want to assign its elements to separate variables, you can do it in a single statement:

size = (42, 124)
width, height = size
print(width)
print(height)
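As a small sketch of the "any sequence" remark above (assuming Python 3; all names here are made up for illustration), unpacking also works on lists and strings, and it fails with a ValueError when the number of names does not match the number of elements:

```python
# Unpacking works with any sequence, not only tuples.
first, second = [1, 2]   # a list
x, y = "ab"              # a string of length 2

# The number of names on the left must match the number of elements.
try:
    a, b = (1, 2, 3)
except ValueError as exc:
    error_message = str(exc)
    print("Unpacking failed:", error_message)
```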

Some functions make use of this by returning a tuple, which the caller usually unpacks right at the call site:

def foo():
    return "a", 42

letter, answer = foo()

Sometimes the caller just doesn't care about one of those values, so assigning the unwanted value to _ is a common convention meaning "I don't care about this value" (FWIW '_' is nothing special or magic, it's a valid variable name just like 'a' or 'foo' or whatever else).

urlretrieve() happens to be one of those functions. According to its documentation, it will:

Return a tuple (filename, headers) where filename is the local file name under which the object can be found, and headers is whatever the info() method of the object returned by urlopen() returned (for a remote object, possibly cached)

https://docs.python.org/2/library/urllib.html#urllib.urlretrieve

The reason for returning the local filename is that urlretrieve can be called without a filename argument, in which case it saves the content in a temporary file with a generated name and returns that name so the caller can access the file.

In your snippet there's no real reason to reassign this value to the existing filename variable: it's supposed to be the very same name anyway, so the reassignment is confusing at best. Inspecting the headers (which are ignored here) to check that what we got is really what we expected (content type, encoding, etc.) would have been a better idea.
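A rough sketch of both points, assuming Python 3 (where urlretrieve lives in urllib.request). To keep it runnable offline, a local file:// URL stands in for the real download, and the file names and payload are invented for the example. It keeps the headers instead of discarding them and checks that the returned name is the destination that was passed in:

```python
import os
import tempfile
from pathlib import Path
from urllib.request import urlretrieve

# Create a small local file to stand in for the remote resource (assumption).
work_dir = tempfile.mkdtemp()
source_path = os.path.join(work_dir, "source.txt")
payload = b"hello, notMNIST"
with open(source_path, "wb") as f:
    f.write(payload)

source_url = Path(source_path).as_uri()             # a file:// URL
dest_filename = os.path.join(work_dir, "copy.txt")

# Keep the headers instead of throwing them away with `_`.
returned_name, headers = urlretrieve(source_url, dest_filename)

# The returned name is just the destination we supplied.
same_name = (returned_name == dest_filename)

# The headers can be used as a sanity check on what was received.
reported_size = int(headers["Content-Length"])
actual_size = os.stat(dest_filename).st_size
print(same_name, reported_size == actual_size)
```

With a real HTTP server, which headers are present depends on the server, so a check like this one should tolerate a missing Content-Length.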

bruno desthuilliers