I am using PySpark. I added a column to my DataFrame that holds each row's source file name with its full path:
from pyspark.sql.functions import input_file_name
data = data.withColumn("sourcefile", input_file_name())
I want to retrieve only the file name together with its parent folder from this column. Please help.
Example:
Inputfilename = "adl://dotdot.com/ingest/marketing/abc.json"
The output I am looking for is:
marketing/abc.json
Note: I can do string operations. The file path column is part of the DataFrame.
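
For illustration, here is a minimal sketch of the kind of operation I have in mind, assuming the path always uses "/" as the separator. The names parent_and_name, add_short_path and shortpath are hypothetical, made up for this example. The plain-Python helper shows the string logic; the second function is my guess at a column-level equivalent using substring_index from pyspark.sql.functions, which with a negative count keeps everything to the right of the Nth delimiter counted from the end.

```python
def parent_and_name(path):
    """Plain-Python version: keep the last two '/'-separated parts of a path."""
    return "/".join(path.split("/")[-2:])


def add_short_path(df, src="sourcefile", dst="shortpath"):
    """Column-level version (hypothetical helper, assumes '/' separators).

    substring_index(col, "/", -2) returns the part of the string after the
    second-to-last "/", i.e. parent folder plus file name.
    """
    # Imported here so the plain-Python helper above works without Spark.
    from pyspark.sql.functions import col, substring_index

    return df.withColumn(dst, substring_index(col(src), "/", -2))


# String logic on the example path from the question
print(parent_and_name("adl://dotdot.com/ingest/marketing/abc.json"))
# prints: marketing/abc.json
```

With the DataFrame above, this would be called as data = add_short_path(data), which should add a shortpath column next to sourcefile.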