Currently, inside a transformation, I read one file and build a HashMap from it. The HashMap is stored in a static field so that it can be reused.
For every record, I need to check whether the HashMap contains the record's key. If it does, I get the corresponding value from the HashMap.
What is the best way to do this?
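A minimal sketch of the lookup described above, with the framework code left out. The class name, the String key and value types, and the sample entries are placeholders chosen for illustration; the real map is built from the file.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the static-field approach; all names and sample data are hypothetical.
class StaticLookupSketch {
    // Static field holding the lookup table, filled once and then reused.
    static final Map<String, String> LOOKUP = new HashMap<>();

    // Stand-in for reading the file; the real code would parse it here.
    static void load() {
        LOOKUP.put("k1", "v1");
        LOOKUP.put("k2", "v2");
    }

    // Per-record check: return the mapped value, or null if the key is absent.
    static String lookup(String recordKey) {
        return LOOKUP.containsKey(recordKey) ? LOOKUP.get(recordKey) : null;
    }

    public static void main(String[] args) {
        load();
        System.out.println(lookup("k1"));
        System.out.println(lookup("missing"));
    }
}
```

Note that a static field like this exists once per JVM, so each process that runs the transformation has its own copy and must load it itself.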
Should I broadcast this HashMap and use it inside the transformation? [HashMap or ConcurrentHashMap]
Does Broadcast make sure that the HashMap always contains the values?
Is there any scenario in which the HashMap becomes empty, so that we need to handle that check as well? [if it's empty, load it again]
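The "if it's empty, load it again" check could look roughly like the sketch below. It is only an illustration under assumptions: the class name, the sample entries, and the `loadCount` counter (there only to show how often a load happens) are hypothetical, and keys are assumed to be non-null because ConcurrentHashMap rejects null keys.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch of a static lookup that reloads itself when found empty (hypothetical names).
class ReloadingLookupSketch {
    static final Map<String, String> LOOKUP = new ConcurrentHashMap<>();
    static int loadCount = 0; // illustration only: counts how many loads ran

    // Synchronized so that concurrent callers do not load the data twice.
    static synchronized void loadIfEmpty() {
        if (LOOKUP.isEmpty()) {
            // Stand-in for reading the file again.
            LOOKUP.put("k1", "v1");
            LOOKUP.put("k2", "v2");
            loadCount++;
        }
    }

    // Per-record lookup: reload first if the map is empty, then read the value.
    static String lookup(String recordKey) {
        if (LOOKUP.isEmpty()) {
            loadIfEmpty();
        }
        return LOOKUP.get(recordKey); // null when the key is absent
    }

    public static void main(String[] args) {
        System.out.println(lookup("k1"));
        System.out.println(loadCount);
    }
}
```

The cheap `isEmpty()` test outside the synchronized method keeps the common path lock-free, and the second test inside it prevents a duplicate load.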
Update:
Basically, I need to use the HashMap as a lookup inside a transformation. What is the best way to do this: a broadcast or a static variable?
When I use a static variable, I do not get the correct value from the HashMap for a few records. The HashMap contains only 100 elements, but I am comparing it against 25 million records.