I'm new to R. I'm mining data stored in a CSV file: report summaries in one column, the report date in another, and the reporting agency in the third. I need to investigate how terms associated with ‘fraud’ have changed over time or vary by agency. I've filtered the rows containing the term 'fraud' and saved them to a new CSV file.
How can I create a term-frequency matrix with years as rows and terms as columns, so that I can find the most frequent terms and do some clustering?
Input data (CSV):
**Year** **Summary** (around 300 words each)
1945 <text>
1985 <text>
2011 <text>
Desired output (term-frequency matrix):
term1 term2 term3 term4 .......
1945 3 5 7 8 .....
1985 1 2 0 7 .....
2011 . . .
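
To make the target shape concrete, here is a minimal base R sketch of one possible way to build such a matrix. It is only an illustration under assumptions: the toy `reports` data frame stands in for the filtered CSV, the column names `Year` and `Summary` are guessed from the layout above, and tokenization is a crude split on non-letters with no stop-word removal or stemming.

```r
# Toy stand-in for the filtered CSV; the column names Year and Summary
# are assumptions based on the layout shown above.
reports <- data.frame(
  Year    = c(1945, 1985, 1985, 2011),
  Summary = c("Fraud in wartime contracts",
              "Contract fraud and billing fraud",
              "Billing audit found fraud",
              "Online fraud and identity theft"),
  stringsAsFactors = FALSE
)

# Lowercase each summary, then split on anything that is not a letter.
tokens <- strsplit(tolower(reports$Summary), "[^a-z]+")

# Long format: one row per (year, term) occurrence.
long <- data.frame(
  Year = rep(reports$Year, lengths(tokens)),
  term = unlist(tokens),
  stringsAsFactors = FALSE
)
long <- long[nzchar(long$term), ]  # drop empty strings left by the split

# Cross-tabulate: years as rows, terms as columns.
tf <- unclass(table(long$Year, long$term))

# Most frequent terms across all years.
top_terms <- head(sort(colSums(tf), decreasing = TRUE), 10)

# Hierarchical clustering of terms by their per-year counts.
term_clusters <- hclust(dist(t(tf)))
```

Several summaries from the same year are pooled into one row by the cross-tabulation. Swapping `Year` for an agency column would give the per-agency version of the same matrix.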
Any help would be greatly appreciated.