I have a CSV file with translation pairs. It has the following format:
text language 1;text language 2
text language 1;text language 2
text language 1;text language 2
and so on. The problem is that sometimes the text is very long, or contains \n or even multiple quotation marks, like this:
"Very long long long long long long long long long long long long long long long long long long long text";"Very long long long long long long long long long long text2"
text;text2
My problem is that I can't figure out the right regex pattern to split the word or sentence pairs correctly, especially when a long block contains \n or even \r\n. In these cases, however, each half of the pair is enclosed in quotation marks, if that's any help. Similar to this:
"Long text with lines\r\nmore lines\nand another line\nAnd yet another";"Long text with lines\r\nmorelines\nand another line\nAnd yet another"\r\n
word1;word2
So I assume I need to split the word pairs wherever there is either a "\r\n, a \r\n", or a ; ? Sadly, I'm not experienced with regular expressions.
I uploaded the csv here: http://s000.tinyupload.com/?file_id=11646241007071639575
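To make the desired result concrete, here is a minimal sketch of the splitting I'm after. It assumes Python and its standard csv module instead of a hand-written regex, and the helper name parse_pairs and the sample string are made up for illustration (the real data is in the uploaded file). The idea is that a CSV parser already treats ; as the delimiter and keeps line breaks inside quoted fields, so only the unquoted \r\n ends a pair.

```python
import csv
import io


def parse_pairs(text):
    """Split semicolon-separated CSV text into (language 1, language 2) tuples."""
    # newline="" keeps \r\n and \n untranslated so the csv module can decide
    # which line breaks are inside quotes and which ones end a record.
    stream = io.StringIO(text, newline="")
    reader = csv.reader(stream, delimiter=";", quotechar='"')
    # Skip completely empty lines; every other row becomes one pair.
    return [tuple(row) for row in reader if row]


# Hypothetical sample mirroring the structure described above.
sample = (
    '"Long text with lines\r\nmore lines\nand another line";'
    '"Long text with lines\r\nmorelines\nand another line"\r\n'
    'word1;word2\r\n'
)

pairs = parse_pairs(sample)
for lang1, lang2 in pairs:
    print(repr(lang1), "->", repr(lang2))
```

With the default settings of csv.reader, a doubled quotation mark ("") inside a quoted field is read as one literal quotation mark, which might also cover the lines with multiple quotation marks.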