I have a DataFrame in PySpark. I need to convert one of its columns to a string and then to a date type.

I can't find any method to convert this type to a string. I tried str() and .to_string(), but neither works. My code is below.

from pyspark.sql import functions as F

df = in_df.select('COL1')
>>> type(df)
<class 'pyspark.sql.dataframe.DataFrame'>

>>> df.printSchema()
 |-- COL1: offsetdatetimeudt (nullable = true)
Gaurang Shah
  • Can you please add the output of df.printSchema() to your question? – cronoik Jul 07 '19 at 14:46
  • Is this what you are looking for? https://stackoverflow.com/questions/38610559/convert-spark-dataframe-column-to-python-list Converting the df itself to a string is kind of pointless, since it is an entire column. – xiaodelaoshi Jul 07 '19 at 14:47
  • |-- COL1: offsetdatetimeudt (nullable = true) output of df.printSchema() –  Jul 07 '19 at 15:12
  • I need to convert each row to Date, therefore I need it to be a string. –  Jul 07 '19 at 15:13
  • Do your column values look like this: `2019-07-07T00:00:00.000Z`? – cronoik Jul 07 '19 at 16:40
  • Yes @cronoik, exactly. In SQL I convert them to a string and then use to_date; I want to do the same with PySpark. –  Jul 08 '19 at 04:13

1 Answer

It is straightforward to cast the column directly to a string:

from pyspark.sql.types import StringType  # needed for the cast below
df2 = df.withColumn('COL1', df['COL1'].cast(StringType()))
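Once the values are available as strings, the remaining string-to-date step can be sketched in plain Python. This is only an illustration, assuming the values look like `2019-07-07T00:00:00.000Z` as confirmed in the comments on the question; the helper name `parse_iso_date` and the constant `ISO_UTC_FORMAT` are hypothetical and not part of PySpark.

```python
from datetime import date, datetime
from typing import Optional

# Assumed layout of the column values, e.g. "2019-07-07T00:00:00.000Z".
ISO_UTC_FORMAT = "%Y-%m-%dT%H:%M:%S.%fZ"


def parse_iso_date(value: Optional[str]) -> Optional[date]:
    """Return the calendar date of a timestamp string, or None for a null value."""
    if value is None:
        return None
    # Parse the full timestamp, then keep only the date part.
    return datetime.strptime(value, ISO_UTC_FORMAT).date()


print(parse_iso_date("2019-07-07T00:00:00.000Z"))  # 2019-07-07
```

A function like this could probably be wrapped with `F.udf(parse_iso_date, DateType())` and applied to a string column, but whether it works on the `offsetdatetimeudt` column from the question is untested here.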
Duy Nguyen
  • Thanks a lot, but it fails: AnalysisException: u"cannot resolve 'unix_timestamp(... I tried all these things, I just need to convert the column into string. –  Jul 08 '19 at 05:23
  • Don't forget to import the SQL functions in your Spark job: from pyspark.sql import functions as F. A missing import causes the "cannot resolve" issue. – Duy Nguyen Jul 08 '19 at 05:26
  • Can anyone convert a single column to a string? I want to work on a column of strings. Isn't that trivial? –  Jul 08 '19 at 05:27
  • I import all required functions –  Jul 08 '19 at 05:27
  • I added a cast to String in my answer; check str_col1 and try it. (My DF column is a string by default, but it should work in your case.) – Duy Nguyen Jul 08 '19 at 05:30
  • No, it doesn't work. I need something like this: df.select('COL1').cast(StringType()) Can you help with that? –  Jul 08 '19 at 05:46
  • I removed all the redundant code and kept just the cast to StringType; check my answer again. – Duy Nguyen Jul 08 '19 at 05:50