I am wondering whether it is possible to use another dataframe/table as a mapping table. Here is the best was I can describe it.
dfA (Original dataset)
Below is the original dataset. This is the dataset that we want to change based off a lookup
Name | TransactionID | Alert flag |
---|---|---|
John Smith | 93818 | Bad Math |
Jane Doe | 91982 | Invalid Row |
dfB (Lookup table)
Below is the table which will have references to the alert flag and what each alert flag should equal.
Alert_flag_OLD | Alert_flag_NEW |
---|---|
Bad Math | Incorrect Calculations |
Invalid Row | Invalid Transaction |
What I want to know is. Can we perform a look up. Something like
dfA = (
dfA
.withColumn('Alert_Flag', LOOKUP on dfB. if (dfA.ALERT_FLAG) in dfB.ALERT_FLAG_OLD then VALUE = ALERT_FLAG_NEW
the if (ALERT_FLAG)
is just saying if the alert flag exists in dfB then use find it's new value.
Is something like this possible?
Have not tried anything as of yet as I wouldn't really know how to start it