I have a dynamic dataset like the one below, which is updated every day. For example, on Jan 11 the data is:
Name | Id |
---|---|
John | 35 |
Marrie | 27 |
On Jan 12, the data is:
Name | Id |
---|---|
John | 35 |
Marrie | 27 |
MARTIN | 42 |
I need to take the count of the records and append it to a separate dataset. For example, on Jan 11 my output dataset is:
Count | Date |
---|---|
2 | 11-01-2023 |
On Jan 12 my output dataset should be:
Count | Date |
---|---|
2 | 11-01-2023 |
3 | 12-01-2023 |
and so on for all other days, whenever the code is run.

This has to be done using PySpark.

I tried using `semantic_version` in the incremental function, but it is not giving the desired result.