I have a question about NLP in R. My data set is very large, so I need to reduce it before I can apply an SVM to it.
I have a document-term matrix like this:
Document WordY WordZ WordV WordU WordZZ
1 0 0 0 1 0
2 0 2 1 2 0
3 0 0 1 1 0
In this example I would like to drop the columns WordY and WordZZ, because they contain only zeros and so carry no information for this data frame. Is it possible to remove all columns that contain only zero values with a single command? My problem is that the data frame is far too large to delete each such column by name, one command at a time. It has something like 4.0000.0000 columns.
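To make clear what I am after, here is a minimal sketch on the toy example above. It assumes the matrix is held as an ordinary dense data frame; the names `dtm`, `keep` and `dtm_reduced` are made up for illustration, and I do not know whether this approach scales to my real data.

```r
# Toy version of the matrix shown above (hypothetical object name `dtm`).
dtm <- data.frame(
  Document = 1:3,
  WordY    = c(0, 0, 0),
  WordZ    = c(0, 2, 0),
  WordV    = c(0, 1, 1),
  WordU    = c(1, 2, 1),
  WordZZ   = c(0, 0, 0)
)

# Count the non-zero entries per column and keep only the columns
# that have at least one of them.
keep <- colSums(dtm != 0) > 0
dtm_reduced <- dtm[, keep, drop = FALSE]

names(dtm_reduced)
```

On the toy data this keeps Document, WordZ, WordV and WordU. If the real matrix is a sparse DocumentTermMatrix from the tm package, I assume something like `slam::col_sums()` would be needed instead, since converting it to a dense data frame may not fit in memory.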
Thank you in advance, guys. Cheers, Tom