
Creating a new column from a string expression

from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()
data = spark.createDataFrame([(1, 2, 5), (3, 4, 9)], ['Col_1', 'Col_2', 'Col_3'])
data.show()

+-----+-----+-----+
|Col_1|Col_2|Col_3|
+-----+-----+-----+
|    1|    2|    5|
|    3|    4|    9|
+-----+-----+-----+

tmp_str = "F.col('Col_1')"
print(type(tmp_str))  # <class 'str'> — a plain string, not a Column
data = data.withColumn('Col_11', tmp_str)



AssertionError: col should be Column

AssertionError                            Traceback (most recent call last)
<command-2932446311694149> in <module>
     46 print(type(col_temp))
     47 print(col_temp)
---> 48 data = data.withColumn('Col_11',tmp_str)
     49 data.show()
     50 

I gave a simple condition here, but my real one is a little more complex. I know about `expr`, but I would need to use it the same way. Is there some implicit that converts the string into a Column? Is there any way to pass `tmp_str` as a string and still have it evaluated?

mssr
  • `tmp_str` is a string and you are expecting it to evaluate as a statement. Remove the quotes and this will work (`tmp_str = F.col('Col_1')`) but it's unclear what you are trying to even do with this. I don't even want to talk about [`eval`](https://stackoverflow.com/questions/1832940/why-is-using-eval-a-bad-practice) even though I'm sure someone will suggest it. – pault Dec 01 '20 at 21:19
  • Thanks pault. I even tried without quotes and it's working. That F.col('Col_1') is coming from input mapping file. When I import the data it will show as string and that's the reason just put in string. I am creating columns dynamically and it should evaluate as a statement . – mssr Dec 01 '20 at 21:54
  • How are you creating the columns dynamically? – Srinivas Dec 02 '20 at 01:44
  • I will take col name and tmp_str from dictionary – mssr Dec 02 '20 at 01:59
  • Where does the dictionary come in? You need to create a [mcve] with your actual issue. Show how you read the input mapping file. Otherwise this is an [xy problem](http://xyproblem.info). – pault Dec 02 '20 at 12:51

0 Answers