I am trying to import a .csv file into pandas (Python) as follows:
dataframe = pd.read_csv(inputfile, sep=delimiter, header=None)
However, each line of the (huge) input file consists of an integer followed by some string, like this:
1234 this string % might; contain 눈 anything
The result should be a two-column dataframe with the integer in the first column and the rest of the line in the second.
Since any character can occur in the string, I am unable to use a single character as a separator. Using a long, highly unlikely string such as "khlKiwVlZdsb9oVKq5yG" as a delimiter feels like a dirty workaround, may not be 100% reliable, and also triggers the following warning:
ParserWarning: Falling back to the 'python' engine because the 'c' engine does not support regex separators (separators > 1 char and different from '\s+' are interpreted as regex); you can avoid this warning by specifying engine='python'.
So my question is: is there a better way to deal with my problem? Maybe some option to tell pandas to ignore any further delimiters once the first one in a line has been encountered?
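To make the desired behaviour concrete, here is a minimal sketch that produces it outside of `read_csv`. It assumes the integer and the string are separated by a single space, as in the example line above. The sample data and the names `sample`, `rows`, `number` and `text` are made up for illustration and are not part of my real code:

```python
import io

import pandas as pd

# Hypothetical sample data standing in for the real (huge) input file.
sample = io.StringIO(
    "1234 this string % might; contain 눈 anything\n"
    "42 another, line with; more delimiters\n"
)

rows = []
for line in sample:
    # Split only at the first space; everything after it is kept untouched.
    number, _, text = line.rstrip("\n").partition(" ")
    rows.append((int(number), text))

dataframe = pd.DataFrame(rows, columns=["number", "text"])
```

This gives the two columns I am after, but it bypasses the pandas parser entirely, so I do not know how well it holds up on a very large file.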
Thank you for any suggestions!