
This is a port of the question "Read range of files in pySpark" to Spark.

I have time series data in a data frame that looks like this:

Index Time Value_A Value_B
0     1    A       A
1     2    A       A
2     2    B       A
3     3    A       A
4     5    A       A
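
For reference, the sample frame can be reproduced like this (a minimal sketch; the column names simply mirror the table above):

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()

    # Recreate the sample data shown above.
    df = spark.createDataFrame(
        [(0, 1, "A", "A"),
         (1, 2, "A", "A"),
         (2, 2, "B", "A"),
         (3, 3, "A", "A"),
         (4, 5, "A", "A")],
        ["Index", "Time", "Value_A", "Value_B"],
    )
    df.show()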

I want to drop duplicates in the Value_A and Value_B columns such that duplicates are only dropped until a different pattern is encountered, i.e. only consecutive duplicates are removed. The result for this sample data should be:

Index Time Value_A Value_B
0     1    A       A
2     2    B       A
3     3    A       A
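
One direction I have been sketching (not a confirmed solution, and it assumes Index defines the intended row order) is to compare each row's Value_A/Value_B pair with the previous row via a window `lag` and keep a row only when the pair changes:

    from pyspark.sql import Window
    import pyspark.sql.functions as F

    # Ordering by Index over a single window; without partitionBy this pulls
    # all rows into one partition, which may not scale for large data.
    w = Window.orderBy("Index")

    deduped = (
        df.withColumn("prev_A", F.lag("Value_A").over(w))
          .withColumn("prev_B", F.lag("Value_B").over(w))
          .filter(
              F.col("prev_A").isNull()                      # keep the first row
              | (F.col("Value_A") != F.col("prev_A"))       # Value_A changed
              | (F.col("Value_B") != F.col("prev_B"))       # Value_B changed
          )
          .drop("prev_A", "prev_B")
    )
    deduped.show()

Is there a more idiomatic or scalable way to do this in Spark?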
