I want to output a DataFrame to a string with a delimiter between the columns, but haven't found a way to do so (neither with to_string nor with any other method). Doing it via regex, for example by substituting whitespace with the preferred delimiter, does not work because my data includes strings (sentences). Example:
import pandas as pd
data = pd.DataFrame({
    'a': [1, 'This is some text', 3, 4, 5],
    'b': [1, 2, 3, 'This is also some text', 5]
})
string = data.to_string(header=False)
string
'0 1 1\n1 This is some text 2 \n2 3 3\n3 4 This is also some text\n4 5 5'
Replacing whitespace with the preferred delimiter inserts the delimiter between every word in each sentence, which is not what I want. Is there a way to output a DataFrame to a string while specifying a delimiter between the variables?
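One thing that might be worth checking here: DataFrame.to_csv accepts a sep argument and returns the text as a string when no path is passed. The sketch below reuses the example data; the '|' delimiter and the variable name delimited are arbitrary choices for illustration.

```python
import pandas as pd

data = pd.DataFrame({
    'a': [1, 'This is some text', 3, 4, 5],
    'b': [1, 2, 3, 'This is also some text', 5]
})

# With no path given, to_csv returns the delimited text instead of writing a file.
# '|' is an arbitrary delimiter chosen for illustration.
delimited = data.to_csv(sep='|', header=False)

# splitlines() avoids depending on the platform's line terminator.
for line in delimited.splitlines():
    print(line)
```

The index is included as the first field, as in the to_string output above; passing index=False should drop it.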
My end goal here is to concatenate two DataFrames of different sizes (and with different variables), basically binding them row-wise, one on top of the other, and then output the combined result to a .csv file. The reason I am not doing this directly on the DataFrames (e.g. via pandas.concat) is that this forces every row of the combined DataFrame to have the same columns (with NaN values for variables not included in one or the other DataFrame). This is obviously the preferred behaviour in general, but when writing to .csv it produces empty fields (e.g. ,,, where the NaN values in the DataFrame would be). I need to provide a .csv file that does not include any "blanks", and am thus trying to achieve it through the above method.
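A minimal sketch of the issue, using two made-up frames (top and bottom are hypothetical stand-ins, not the real data). It also shows one possible workaround under the assumption that each block may keep its own header line: serialise each DataFrame separately with to_csv and join the resulting text.

```python
import pandas as pd

# Hypothetical example frames with different columns and sizes.
top = pd.DataFrame({'a': [1, 2], 'b': ['x', 'y']})
bottom = pd.DataFrame({'c': [10, 20, 30]})

# concat aligns the columns, so the missing values become empty fields in the CSV.
combined_csv = pd.concat([top, bottom]).to_csv(index=False)

# Serialising each frame on its own and joining the text avoids that padding.
stacked_csv = top.to_csv(index=False) + bottom.to_csv(index=False)

print(combined_csv)
print(stacked_csv)
```

The joined text could then be written to a file with ordinary file I/O. Passing header=False to either to_csv call should drop that block's header line if it is not wanted.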
Any suggestions on how to achieve this are very welcome!