I have a DataFrame with a text column. Some of the words are contractions, such as isn't and couldn't, which need to be expanded.
For example:
I'd -> I would
I'd -> I had
Below is the DataFrame:
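from pyspark.sql import SparkSession

# Reuse the active session if one exists (e.g. in the pyspark shell)
spark = SparkSession.builder.getOrCreate()

temp = spark.createDataFrame([
    (0, "Julia isn't awesome"),
    (1, "I wish Java-DL couldn't use case-classes"),
    (2, "Data-science wasn't my subject"),
    (3, "Machine")
], ["id", "words"])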
+---+----------------------------------------+
|id |words                                   |
+---+----------------------------------------+
|0  |Julia isn't awesome                     |
|1  |I wish Java-DL couldn't use case-classes|
|2  |Data-science wasn't my subject          |
|3  |Machine                                 |
+---+----------------------------------------+
I have searched for a PySpark library that does this but haven't found one. How can I achieve this?
Expected output:
+---+-----------------------------------------+
|id |words                                    |
+---+-----------------------------------------+
|0  |Julia is not awesome                     |
|1  |I wish Java-DL could not use case-classes|
|2  |Data-science was not my subject          |
|3  |Machine                                  |
+---+-----------------------------------------+
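For reference, one possible approach is a lookup table combined with a regular expression, wrapped in a Python UDF. This is only a minimal sketch under stated assumptions: the CONTRACTIONS mapping and the names expand_contractions and with_expanded_words are hypothetical, not part of any PySpark library, and the mapping covers only the words from the example. An ambiguous form such as I'd (I would / I had) cannot be resolved by a lookup alone, so the sketch arbitrarily picks one expansion. Replacements are taken from the mapping as-is, so the capitalization of a matched word (e.g. Isn't) is not preserved.

```python
import re

# Hypothetical, deliberately small mapping; extend it for real data.
# "i'd" is ambiguous (I would / I had); one expansion is chosen arbitrarily.
CONTRACTIONS = {
    "isn't": "is not",
    "couldn't": "could not",
    "wasn't": "was not",
    "i'd": "I would",
}

# One alternation over all keys, matched case-insensitively on word boundaries.
_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(k) for k in CONTRACTIONS) + r")\b",
    flags=re.IGNORECASE,
)


def expand_contractions(text):
    """Replace each known contraction in text with its expansion."""
    if text is None:  # keep null cells as null
        return None
    return _PATTERN.sub(lambda m: CONTRACTIONS[m.group(0).lower()], text)


def with_expanded_words(df, column="words"):
    """Return df with `column` rewritten by expand_contractions via a UDF."""
    # Imported lazily so the pure-Python helper above works without Spark.
    from pyspark.sql import functions as F
    from pyspark.sql.types import StringType

    expand_udf = F.udf(expand_contractions, StringType())
    return df.withColumn(column, expand_udf(F.col(column)))
```

With the DataFrame from the question, with_expanded_words(temp).show(truncate=False) is intended to print the expected table. A Python UDF processes rows outside the JVM, so on large data it may be slower than built-in column functions.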