Using rddfloat = rdd.map(lambda x: (float(x[0]), float(x[1]))), I converted the columns of an rdd into floats so that I could do math with them. Now that I'm finished with the math, I want to convert them back into their original StringType.

I've tried rddstr = rddfloat.map(lambda x: (str(x[0]), str(x[1]), str(x[2]))), and it does return a string, '40.745555', but that's not the same as the value in the original rdd, u'40.745555'. What is the difference between these, and how can I convert it back to how it was originally?

zero323
wheels
    The original string is actually a unicode object, so calling `unicode(x[0])` will give you what you want – Tom Ron Dec 10 '15 at 17:01

1 Answer

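I assume you are using Python 2.x. There, str is a byte string and unicode is a separate text type; the u prefix in u'40.745555' marks a unicode object. This means that if you want to produce a unicode string, you need to call unicode instead of str, like this: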

rddstr = rddfloat.map(lambda x: (unicode(x[0]), unicode(x[1]), unicode(x[2])))

For a better understanding of the difference, I suggest searching online, since this is a very common question. For example, some of the answers to the following questions may be helpful:

In particular, this answer might help you: https://stackoverflow.com/a/18034409/126125
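
As a rough sketch of the round trip, here is a plain-Python version that does not need Spark. The sample values and the names `rows`, `text_type`, `to_floats`, and `to_text` are made up for illustration and are not part of the original question. `type(u"")` evaluates to `unicode` on Python 2 and to `str` on Python 3, so the same lines should run on both.

```python
# Hypothetical sample data standing in for the rows of the original RDD.
rows = [(u"40.745555", u"-73.98")]

# `unicode` on Python 2, `str` on Python 3 (where every str is unicode).
text_type = type(u"")


def to_floats(row):
    """Convert every column of a row to float."""
    return tuple(float(col) for col in row)


def to_text(row):
    """Convert every column of a row back to a unicode string."""
    return tuple(text_type(col) for col in row)


as_floats = [to_floats(row) for row in rows]
as_text = [to_text(row) for row in as_floats]

# Compare the round-tripped rows with the originals.
print(as_text == rows)
```

With an RDD, the same two functions could be passed to `map`, for example `rdd.map(to_floats)` and then `rddfloat.map(to_text)`.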

Markon