I am trying to run the following code:
SparkSession spark = SparkSession
        .builder()
        .appName("test")
        .master("local")
        // .enableHiveSupport()
        .getOrCreate();
List<String> list = new ArrayList<String>();
list.add("HI");
list.add("HI");
list.add("HI");
Dataset<Row> dataDs = spark.createDataset(list, Encoders.STRING()).toDF();
List<String> list2 = new ArrayList<String>();
list2.add("1");
list2.add("2");
list2.add("3");
Dataset<Row> dataDs2 = spark.createDataset(list2, Encoders.STRING()).toDF().withColumnRenamed("value", "newvalue");
Column col = dataDs2.col("newvalue");
dataDs = dataDs.withColumn("newcol", col);
dataDs.show();
However, it fails with the following error:
Exception in thread "main" org.apache.spark.sql.AnalysisException: resolved attribute(s) newvalue#10 missing from value#1 in operator !Project [value#1, newvalue#10 AS newcol#13];;
!Project [value#1, newvalue#10 AS newcol#13]
When I searched online, the suggested cause was duplicate column names. However, my column names are different: dataDs has a column named 'value', while dataDs2 has a column named 'newvalue'. So I do not understand why the error still occurs. Can someone help me out?
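To make the goal concrete, here is a minimal plain-Java sketch (no Spark involved) of the result I expect. It assumes the intent is to pair the rows of the two lists by position, so that the i-th 'value' ends up next to the i-th 'newvalue'. The class name PairByPosition and the method pairByPosition are hypothetical names used only for illustration; they are not Spark APIs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class PairByPosition {
    // Hypothetical helper (not part of Spark): pairs the i-th element of
    // `values` with the i-th element of `newValues`, one pair per row.
    public static List<List<String>> pairByPosition(List<String> values, List<String> newValues) {
        if (values.size() != newValues.size()) {
            throw new IllegalArgumentException("Both lists must have the same length");
        }
        List<List<String>> rows = new ArrayList<>();
        for (int i = 0; i < values.size(); i++) {
            // Each row holds [value, newcol]
            rows.add(Arrays.asList(values.get(i), newValues.get(i)));
        }
        return rows;
    }

    public static void main(String[] args) {
        List<String> list = Arrays.asList("HI", "HI", "HI");
        List<String> list2 = Arrays.asList("1", "2", "3");
        // prints [[HI, 1], [HI, 2], [HI, 3]]
        System.out.println(pairByPosition(list, list2));
    }
}
```

In other words, the output I am hoping for from dataDs.show() is three rows, each with 'value' equal to HI and 'newcol' equal to 1, 2 and 3 respectively.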