I have two DataFrames with the same number of rows, but the number of columns differs and is dynamic, depending on the source.
The first DataFrame contains all columns. The second DataFrame has been filtered and processed, so it does not contain all of them.
I need to pick a specific column from the first DataFrame and add/merge it into the second DataFrame.
val sourceDf = spark.read.load(parquetFilePath)
val resultDf = spark.read.load(resultFilePath)
val columnName: String = "Col1"
I have tried to add the column in several ways; here are a few of them:
val modifiedResult = resultDf.withColumn(columnName, sourceDf.col(columnName))
val modifiedResult = resultDf.withColumn(columnName, sourceDf(columnName))
val modifiedResult = resultDf.withColumn(columnName, labelColumnUdf(sourceDf.col(columnName)))
None of these work.
Can you please help me merge/add a column from the first DataFrame into the second DataFrame?
The example below is not the exact data structure I need, but a solution for it would cover my requirement.
Sample input and output:
Source DataFrame:
+--------+
|InputGas|
+--------+
|    1000|
|    2000|
|    3000|
|    4000|
+--------+
Result DataFrame:
+----+-------+-----+
|Time|CalcGas|Speed|
+----+-------+-----+
|   0|    111| 1111|
|   0|    222| 2222|
|   1|    333| 3333|
|   2|    444| 4444|
+----+-------+-----+
Expected Output:
+----+-------+-----+--------+
|Time|CalcGas|Speed|InputGas|
+----+-------+-----+--------+
|   0|    111| 1111|    1000|
|   0|    222| 2222|    2000|
|   1|    333| 3333|    3000|
|   2|    444| 4444|    4000|
+----+-------+-----+--------+
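
To make the example reproducible, here is a minimal sketch that builds the sample DataFrames in memory and shows one possible way to combine them by row position. It is only a sketch under assumptions: it assumes both DataFrames keep the same row order, which Spark does not guarantee in general. The helper `withRowIndex`, the temporary column name `_rowIdx`, and the local-mode session are hypothetical names and settings introduced for illustration, not part of my actual code.

```scala
import org.apache.spark.sql.{DataFrame, Row, SparkSession}
import org.apache.spark.sql.types.{LongType, StructField, StructType}

object AddColumnByPosition {

  // Hypothetical helper: appends a 0-based row index column named `indexName`.
  // Relies on the current row order of `df` (an assumption, see the lead-in).
  def withRowIndex(spark: SparkSession, df: DataFrame, indexName: String): DataFrame = {
    val indexedRows = df.rdd.zipWithIndex.map {
      case (row, idx) => Row.fromSeq(row.toSeq :+ idx)
    }
    val schema = StructType(
      df.schema.fields :+ StructField(indexName, LongType, nullable = false)
    )
    spark.createDataFrame(indexedRows, schema)
  }

  def main(args: Array[String]): Unit = {
    // Local session for the demo only.
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("add-column-by-position")
      .getOrCreate()
    import spark.implicits._

    // In-memory stand-ins for the two Parquet-backed DataFrames.
    val sourceDf = Seq(1000, 2000, 3000, 4000).toDF("InputGas")
    val resultDf = Seq((0, 111, 1111), (0, 222, 2222), (1, 333, 3333), (2, 444, 4444))
      .toDF("Time", "CalcGas", "Speed")

    val columnName = "InputGas"
    val idx = "_rowIdx" // temporary join key, dropped at the end

    // Index both sides, join on the index, restore the order, drop the key.
    val merged = withRowIndex(spark, resultDf, idx)
      .join(withRowIndex(spark, sourceDf.select(columnName), idx), Seq(idx))
      .orderBy(idx)
      .drop(idx)

    merged.show()
    spark.stop()
  }
}
```

The join is used here because `withColumn` can only reference columns of the DataFrame it is called on, which is presumably why the attempts with `sourceDf.col(columnName)` fail. If the two DataFrames share a real key column, joining on that key would be more robust than relying on row position.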