I have a DataFrame with 18 rows and more than 20 columns. For example:
-----------
A B C
-----------
a b c
d e f

My list: ('N','N')
I also have a list with 18 values. I want to add this list to my DataFrame as a new column, so that each value in the list corresponds to one row.
That means the final result should look like this:
--------------
A B C D
--------------
a b c N
d e f N
Here is what I tried (from this link):
// C is a list of values
val rdd = sc.parallelize(C)
// joindf is my DataFrame with 20+ columns
val rdd_new = joindf.rdd.zip(rdd).map(r => Row.fromSeq(r._1.toSeq ++ Seq(r._2)))
sqlContext.createDataFrame(rdd_new,joindf.schema.add("CD",StringType)).show
This gives me the following error: Can't zip RDDs with unequal numbers of partitions: List(200,2)
Any help would be appreciated!
UPDATE
I'm still not sure why the zip fails on the partitions, but the comments suggested another way to do this. I simply copied the method from this link.
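For anyone hitting the same error: RDD.zip requires both RDDs to have the same number of partitions and the same number of elements in each partition. Here joindf.rdd has 200 partitions (most likely because it comes out of a join, which shuffles into spark.sql.shuffle.partitions = 200 by default), while sc.parallelize(C) produced only 2. Below is a minimal sketch of one way around this, not necessarily the method from the link: index both sides with zipWithIndex and join on the index. The object name AppendListColumn and the helper appendColumn are hypothetical names chosen for illustration, and the sketch assumes the row order of the DataFrame is the order you want the list matched against (Spark does not guarantee row order unless the data is explicitly sorted).

```scala
import org.apache.spark.sql.{DataFrame, Row}
import org.apache.spark.sql.types.{StringType, StructField}

object AppendListColumn {
  // Hypothetical helper: appends `values` to `df` as a new string column `colName`,
  // matching the i-th list value to the i-th row of the DataFrame.
  def appendColumn(df: DataFrame, values: Seq[String], colName: String): DataFrame = {
    val sc = df.sqlContext.sparkContext

    // Key every row by its position: (index, row)
    val indexedRows = df.rdd.zipWithIndex().map { case (row, idx) => (idx, row) }

    // Key every list value by its position: (index, value)
    val indexedValues = sc.parallelize(values).zipWithIndex().map { case (v, idx) => (idx, v) }

    // Join on the index; unlike zip, join does not care about partition counts.
    // sortByKey restores the original row order after the shuffle.
    val joinedRows = indexedRows
      .join(indexedValues)
      .sortByKey()
      .values
      .map { case (row, v) => Row.fromSeq(row.toSeq :+ v) }

    // Extend the original schema with the new column
    val newSchema = df.schema.add(StructField(colName, StringType, nullable = true))
    df.sqlContext.createDataFrame(joinedRows, newSchema)
  }
}
```

With the names from the question, the call would be AppendListColumn.appendColumn(joindf, C, "CD").show. Note that an inner join silently drops rows if the list is shorter than the DataFrame, so it is worth checking that both have the same length first.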