I have a large TSV data file that contains the fact table and its dimension tables lumped together. Is it possible, using Spark, to split that single file into separate 'tables' and then join them so that the data ends up normalized?
Any help pointing me in the right direction would be awesome.