I want to do a train/test split on a PySpark DataFrame that is sorted by time. Say the first 300 rows go into the training set and the next 200 rows go into the test set.
I can select the first 300 rows with:
train = df.limit(300)  # df.show(300) only prints the rows and returns None
But how can I select the last 200 rows of the PySpark DataFrame?