I'm currently working on a small tool that analyses the usage of a group chat in WhatsApp.
I'm trying to do this using the WhatsApp log file. I managed to convert the raw .txt
into the following format so that I can work with the formatted text:
29. Jan. 12:01 - Random Name: message text
29. Jan. 12:22 - Random Name: message text
29. Jan. 12:24 - Random Name: message text
29. Jan. 12:38 - Random Name: message text
29. Jan. 12:52 - Random Name: message text
So far, so good. The problem is that a few messages spill over onto extra lines, like:
29. Jan. 08:42 - Random Name2: message text 1
additional text of the message 1
29. Jan. 08:43 - Random Name2: message text 2
or even worse:
15. Jan. 14:00 - Random Name: First part of the message
second part
third part
fourth part
fifth part
29. Jan. 08:43 - Random Name2: message text 2
I guess I need some kind of algorithm to solve this problem, but I'm pretty new to programming and can't come up with such a complex algorithm myself.
The same problem has been asked for Python: "parse a whatsApp conversation log"
[EDIT]
This is my code, which doesn't work. (I know it's pretty bad.)
    import java.io.BufferedReader;
    import java.io.BufferedWriter;
    import java.io.FileNotFoundException;
    import java.io.FileReader;
    import java.io.FileWriter;
    import java.io.IOException;

    public class FormatList {

        public static void main(String[] args) throws IOException {
            // TODO Auto-generated method stub
            FileReader fr = new FileReader("Whatsapp_formated.txt");
            BufferedReader br = new BufferedReader(fr);
            FileWriter fw = new FileWriter("Whatsapp_formated2.txt");
            BufferedWriter ausgabe = new BufferedWriter(fw);
            String line="";
            String buffer="";
            while((line = br.readLine())!=null)
            {
                System.out.println("\n"+line);
                if(line.isEmpty())
                {
                }
                else{
                    if(line.charAt(0)=='0'||line.charAt(0)=='1'||line.charAt(0)=='2'||line.charAt(0)=='3'||line.charAt(0)=='4'||line.charAt(0)=='5'||line.charAt(0)=='6'||line.charAt(0)=='7'||line.charAt(0)=='8'||line.charAt(0)=='9')
                    {
                        buffer = line;
                    }
                    else
                    {
                        buffer += line;
                    }
                    ausgabe.write(buffer);
                    ausgabe.newLine();
                    System.out.println(buffer);
                }
                ausgabe.close();
            }
        }
    }
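For comparison, here is a minimal sketch of how the merging step might look. It is only one possible approach. It assumes that every real message starts with a header shaped like `29. Jan. 12:01 - `, and that any other non-empty line belongs to the message before it. The class name `ChatLineMerger` and the method `merge` are made up for illustration. Unlike the attempt above, it emits a message only once the next header shows up (or the input ends), and it does not close anything inside the loop.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper: joins continuation lines onto the message they belong to.
final class ChatLineMerger {

    // Assumption: a message header looks like "29. Jan. 12:01 - ".
    // \p{L}+ allows month abbreviations with non-ASCII letters; the dot after it is optional.
    private static final Pattern MESSAGE_START =
            Pattern.compile("\\d{1,2}\\. \\p{L}+\\.? \\d{1,2}:\\d{2} - ");

    static List<String> merge(List<String> lines) {
        List<String> merged = new ArrayList<>();
        StringBuilder current = null; // message currently being assembled

        for (String line : lines) {
            if (line.isEmpty()) {
                continue; // skip blank lines, as the original attempt does
            }
            if (MESSAGE_START.matcher(line).lookingAt()) {
                // A new header: the previous message is complete, so store it.
                if (current != null) {
                    merged.add(current.toString());
                }
                current = new StringBuilder(line);
            } else if (current != null) {
                // No header: treat the line as a continuation of the current message.
                current.append(' ').append(line);
            }
            // A continuation line before any header has no owner and is dropped.
        }
        if (current != null) {
            merged.add(current.toString()); // do not forget the last message
        }
        return merged;
    }
}
```

The input could come from `Files.readAllLines(Paths.get("Whatsapp_formated.txt"))` and the result could be written back with `Files.write(...)`. Matching the whole header rather than just a leading digit is a deliberate choice here, so that a continuation line that happens to start with a number is not mistaken for a new message.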
[EDIT 2]
In the end I want to read the file and analyse each line:
29. Jan. 12:01 - Random Name: message text
I can tell when it was sent, who sent it, and what and how much they wrote.
If I now get the following line:
additional text of the message 1
I can tell neither when it was written nor who sent it.
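To make that goal concrete, here is a rough sketch of how a single complete line could be split into its parts. It assumes the layout `<day>. <month>. <hh:mm> - <sender>: <text>` shown above and that sender names contain no colon. The names `ChatMessageParser`, `ChatMessage` and `parse` are hypothetical. A line without a header, such as the continuation line above, simply yields an empty result.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical parser for one line of the formatted log.
final class ChatMessageParser {

    // Assumed layout: "29. Jan. 12:01 - Random Name: message text"
    // group 1 = timestamp, group 2 = sender (no colon allowed), group 3 = text
    private static final Pattern LINE = Pattern.compile(
            "(\\d{1,2}\\. \\p{L}+\\.? \\d{1,2}:\\d{2}) - ([^:]+): (.*)");

    // Simple value holder for the three parts of a message.
    static final class ChatMessage {
        final String timestamp;
        final String sender;
        final String text;

        ChatMessage(String timestamp, String sender, String text) {
            this.timestamp = timestamp;
            this.sender = sender;
            this.text = text;
        }
    }

    // Returns the parsed message, or Optional.empty() if the line has no header.
    static Optional<ChatMessage> parse(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.matches()) {
            return Optional.empty();
        }
        return Optional.of(new ChatMessage(m.group(1), m.group(2), m.group(3)));
    }
}
```

With a parsed message, `text.length()` gives a simple measure of how much was written, and counting messages per `sender` gives the per-person usage.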