The Spark CSV readers are not as flexible as pandas.read_csv; for example, they do not seem able to parse dates that appear in several different formats. Is there a good way to convert pandas DataFrames to Spark DataFrames in an ETL map step? Spark's createDataFrame does not always work, possibly because its mapping from pandas dtypes to Spark types is not exhaustive. Paratext looks promising, but it is new and not yet widely used.
For an example of the problem, see: Get CSV to Spark dataframe
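To make the question concrete, here is a rough sketch of the kind of workaround I have in mind: normalise the awkward column to plain Python objects first, then give createDataFrame an explicit schema so it does not have to infer types. The column names (`name`, `event_date`), the list of date formats, and the helper names `parse_date` and `pandas_to_spark` are all made up for illustration, and I am assuming day-first order for the slash format.

```python
from datetime import date, datetime
from typing import Optional

# Hypothetical set of formats seen in the input; adjust to the real data.
# Assumption: slash-separated dates are day-first.
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%Y%m%d")


def parse_date(text) -> Optional[date]:
    """Try each known format in turn; return None if nothing matches.

    Non-string input (None, or the float NaN that pandas uses for
    missing values) also yields None, which Spark stores as null.
    """
    if not isinstance(text, str):
        return None
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date()
        except ValueError:
            continue
    return None


def pandas_to_spark(spark, pdf):
    """Convert a pandas DataFrame with 'name' and 'event_date' columns.

    `spark` is an existing SparkSession. Rows are rebuilt as tuples of
    plain Python objects, and the schema is passed explicitly as a DDL
    string rather than inferred from the pandas dtypes.
    """
    rows = [
        (name if isinstance(name, str) else None, parse_date(raw))
        for name, raw in zip(pdf["name"], pdf["event_date"])
    ]
    return spark.createDataFrame(rows, schema="name string, event_date date")
```

Since `parse_date` only uses the standard library, it could also be applied directly in an RDD map over raw CSV fields, skipping pandas altogether. It is not clear to me whether this is the recommended approach or whether there is something more general.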