I am trying to run the Python UDF below in Pig:

@outputSchema("word:chararray")
def get(s):
    out = s.lower()
    return out

I am getting the error below:

  File "/home/test.py", line 3, in get
    out = s.lower()
AttributeError: 'NoneType' object has no attribute 'lower'
Dair
alok tanna

1 Answer

You should handle the case where s is None. Most example UDFs guard against bad input in some way, such as this one:

from pig_util import outputSchema

@outputSchema('decade:chararray')
def decade(year):
    """
    Get the decade, given a year.

    e.g. for 1998 -> '1990s'
    """
    try:
        base_decade_year = int(year) - (int(year) % 10)
        decade_str = '%ss' % base_decade_year
        print('input year: %s, decade: %s' % (year, decade_str))
        return decade_str
    except (ValueError, TypeError):
        # int() raises ValueError for non-numeric strings and TypeError for None
        return None

Your traceback shows that s arrives as None (which is how a null field in Pig reaches the UDF), so check for it before calling lower(). One possible fix:

@outputSchema("word:chararray")
def get(s):
    if s is None:
        return None
    return str(s).lower()
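
To check the guard without running a Pig job, something like the following sketch can be run with plain Python. Note that the `outputSchema` defined here is a hypothetical stand-in written only for this test, not Pig's real decorator; inside Pig you would keep using the one Pig provides.

```python
# Hypothetical stand-in for Pig's outputSchema decorator (assumption: it only
# needs to attach the schema string so the function can be called outside Pig).
def outputSchema(schema):
    def wrap(func):
        func.outputSchema = schema
        return func
    return wrap


@outputSchema("word:chararray")
def get(s):
    # A null field reaches the UDF as None; pass it through untouched.
    if s is None:
        return None
    # str() lets non-string inputs (e.g. an int field) be lowercased safely.
    return str(s).lower()


# Exercise the three cases: a normal string, a null, and a non-string value.
results = [get("Hello"), get(None), get(42)]
print(results)
```

Whether to return None, return an empty string, or filter nulls out in the Pig script before calling the UDF depends on what the downstream steps expect.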
Dair
  • One might even argue that you should make sure that you're dealing with a string if you want to call lower(). So maybe something like `str(s).lower()` would be even safer. Personally I would also prefer None as return value over an empty string if the input is None, but depending on the data/expected result even failing would be an option (None could be filtered out before calling the UDF). – LiMuBei Mar 03 '15 at 08:51
  • Good answer, but worth noting that this example's string formatting is not supported in more recent versions of Python (see the discussion [here](http://stackoverflow.com/questions/13451989/pythons-many-ways-of-string-formatting-are-the-older-ones-going-to-be-deprec)). – C8H10N4O2 Dec 13 '16 at 17:15