I am currently working on the mathematics of machine learning (NLP, to be precise), and while on the task I have run into a problem. I want to write out the lines containing any of the following patterns:
1) fbchat
2) fb_timeline
3) Facebook Wall Post
into separate text files, one for each pattern listed above.
Then, in each of the resulting text files, I would like to sort the lines by the thread ID field of the database mentioned in the very first line of messaged.dmp. I am a theoretical person with very little programming experience.
The download link to the database dump is given below
Update:
This is the script I tried to write:
import re
from sys import argv
scrip, file_name = argv
dfile = open(file_name, 'r')
for line in dfile:
    if re.match("fbchat", line):
        print line
But the script prints nothing.
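One likely cause, assuming "fbchat" does not sit at the very start of each line: re.match only tests the beginning of the string, whereas re.search scans the whole line. Below is a minimal sketch of the whole task under some loud assumptions. It targets Python 3, the output file names are made up for illustration, and the dump is assumed to have tab-separated fields with the thread ID at a column index you would need to look up in the first line of messaged.dmp (thread_id_index is a hypothetical parameter, not something taken from the dump).

```python
import re

# Patterns from the question, mapped to hypothetical output file names.
PATTERNS = {
    "fbchat": "fbchat.txt",
    "fb_timeline": "fb_timeline.txt",
    "Facebook Wall Post": "facebook_wall_post.txt",
}


def split_lines(lines, patterns=PATTERNS):
    """Collect, for each pattern, the lines that contain it anywhere."""
    compiled = {name: re.compile(re.escape(name)) for name in patterns}
    buckets = {name: [] for name in patterns}
    for line in lines:
        for name, regex in compiled.items():
            # re.search scans the whole line; re.match only tests position 0.
            if regex.search(line):
                buckets[name].append(line)
    return buckets


def sort_by_field(lines, index, sep="\t"):
    """Sort lines by the text of the field at `index`.

    Assumes `sep`-separated fields; lines with too few fields sort first.
    """
    def key(line):
        fields = line.rstrip("\n").split(sep)
        return fields[index] if index < len(fields) else ""
    return sorted(lines, key=key)


def main(file_name, thread_id_index=0):
    """Write one sorted output file per pattern.

    thread_id_index is an assumption: replace it with the real column.
    """
    with open(file_name, "r") as dfile:
        buckets = split_lines(dfile)
    for name, matched in buckets.items():
        with open(PATTERNS[name], "w") as out:
            out.writelines(sort_by_field(matched, thread_id_index))
```

Calling main("messaged.dmp", thread_id_index=2) would, under these assumptions, produce the three files with lines ordered by the third field. The sort compares the field as text; if thread IDs are numeric, the key would need an int(...) conversion.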