
I am running a Hive query that uses get_json_object to read JSON strings from files in HDFS, and I ran into some strange behavior. If the JSON is as follows:

{"data":{"oneSlash":"aaa\bbb","twoSlashes":"ccc\\ddd","threeSlashes":"eee\\\fff"}}

The result of the query is:

{"oneSlash":"aaabbb","twoSlashes":"ccc\\ddd","threeSlashes":"eee\\fff"}

I understand the 'oneSlash' and 'threeSlashes' results, but why isn't 'twoSlashes' equal to "ccc\ddd"? After all, '\\' should be unescaped to '\'.

BTW, the query is:

SELECT get_json_object(escaping_test.data, '$.data') FROM escaping_test
zohar

1 Answer


It's because \b and \f are valid JSON escape sequences, whereas \d is not. There's a post that covers this in more detail: Where can I find a list of escape characters required for my JSON ajax return type?
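To see the escape rules in isolation, here is a minimal sketch using Python's standard json module. This is an assumption for illustration only: Hive uses its own JSON parser, which may be more lenient and may print control characters differently. The sample strings mirror the values from the question, and the "invalid" entry and the decode helper are hypothetical additions.

```python
import json

# Raw strings keep the backslashes exactly as they would appear in the file.
samples = {
    "oneSlash": r'"aaa\bbb"',        # \b is a valid escape (backspace)
    "twoSlashes": r'"ccc\\ddd"',     # \\ is a valid escape (one backslash)
    "threeSlashes": r'"eee\\\fff"',  # \\ followed by \f (form feed)
    "invalid": r'"ccc\ddd"',         # \d is not a JSON escape
}

def decode(raw):
    """Return the decoded string, or None if the JSON text is invalid."""
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        return None

decoded = {name: decode(raw) for name, raw in samples.items()}
for name, value in decoded.items():
    print(name, repr(value))

# Serializing the decoded value back to JSON escapes the backslash again,
# so the text once more contains two backslashes.
print(json.dumps(decoded["twoSlashes"]))
```

In this sketch the decoded 'twoSlashes' value holds a single backslash, but writing it back out as JSON doubles it again. That would be consistent with the query output if get_json_object returns the '$.data' object as re-serialized JSON text.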

the1plummie