Pandas DataFrames: how to wrap text with no whitespace

Question

I'm viewing a Pandas DataFrame in a Jupyter Notebook, and my DataFrame contains URL request strings that can be hundreds of characters long without any whitespace separating characters.

Pandas seems to wrap text in a cell only where there's whitespace.

If there isn't whitespace, the string is displayed on a single line. When there isn't enough room, my options are either to see a '...' or to set display.max_colwidth to a huge number, which leaves me with a hard-to-read table and a lot of scrolling.

Is there a way to force Pandas to wrap text, say, every 100 characters, regardless of whether there is whitespace?

Take a look at http://pandas.pydata.org/pandas-docs/stable/generated/pandas.Series.str.wrap.html , specifically the parameter `break_long_words`. — Shovalt, Dec 20 '16 at 09:20

paulo.filip3 · Answer 1 · 2017-11-23T13:02:55.813

29

You can set

import pandas as pd
pd.set_option('display.max_colwidth', 0)

and then each column will be just as big as it needs to be in order to fully display its content. It will not wrap the text content of the cells, though (unless they contain spaces).
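To keep the change from leaking into the rest of the notebook, the same option can be scoped with a context manager. This is a minimal sketch, assuming pandas 1.0 or later, where None is the documented "no limit" value for display.max_colwidth. The DataFrame and its column name are made up for illustration.

```python
import pandas as pd

# Hypothetical data: one long string with no whitespace.
long_url = "https://example.com/search?q=" + "x" * 200
df = pd.DataFrame({"request": [long_url]})

# Default settings truncate long cells with "...".
truncated = repr(df)

# Inside the block the column is as wide as its content; the previous
# setting is restored automatically on exit.
with pd.option_context("display.max_colwidth", None):
    full = repr(df)

print(long_url in truncated)
print(long_url in full)
```

As with set_option, this only widens the column; it does not wrap anything.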

edited Nov 23 '17 at 13:02

answered Nov 23 '17 at 12:55

paulo.filip3



Just thought I'd add the equivalent context-specific way: `with pd.option_context('display.max_colwidth', 0):` – matanster May 20 '19 at 17:55

What does this have to do with the question??? Where's your **wrapping**? – Apostolos Aug 08 '20 at 17:02
@matanster +1 ! – Ashish Gulati Feb 28 '21 at 00:16
Wow! +1 - I am amazed. – Jakub Mar 12 '21 at 23:27
@paulo.filip3 is there a similar way to instead ensure wrapped text display? Not changing the data itself in any way like adding \n. – Anirban Chakraborty Jun 30 '21 at 13:29

score 7 · Answer 2 · answered Dec 06 '17 at 15:32


You can use the str.wrap method:

df['user_agent'] = df['user_agent'].str.wrap(100) #to set max line width of 100
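Here is a small check of what str.wrap does with a string that has no whitespace at all. The method takes the same keyword parameters as Python's textwrap.TextWrapper, whose break_long_words option defaults to True, so over-long runs are cut at the width limit. The sample data is invented. Note too that the wrapped value contains newline characters, which the notebook's HTML table may not render as line breaks on its own.

```python
import pandas as pd

# Hypothetical column: 250 characters with no whitespace or hyphens.
df = pd.DataFrame({"user_agent": ["a" * 250]})

# break_long_words=True is already the textwrap default; it is spelled
# out here only to make the behavior explicit.
wrapped = df["user_agent"].str.wrap(100, break_long_words=True)

# Inspect the length of each resulting line.
lines = wrapped.iloc[0].split("\n")
print([len(line) for line in lines])
```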

answered Dec 06 '17 at 15:32

O.Suleiman


Julian · Answer 3 · 2021-11-30T21:36:09.087

Try wrapping the text first, then run the function below. The top-voted answer does not wrap the text effectively.

With pd.set_option('display.max_colwidth', 0) alone, the result looks like this:

Example 1

The following code, by contrast, wraps the text to any column width:

from IPython.display import display, HTML

def wrap_df_text(df):
    return display(HTML(df.to_html().replace("\\n","<br>")))

df['user_agent'] = df['user_agent'].str.wrap(30)
wrap_df_text(df)
display(df)

Example 2 - Better Execution
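The function above replaces the two-character sequence backslash-n in the generated HTML, so it relies on how to_html() happens to render newlines inside cells. As an alternative that does not depend on that, the line breaks can be produced by a per-column formatter. This is only a sketch: wrapped_html and to_cell are hypothetical helper names, the sample data is invented, and escape=False turns off HTML escaping for the whole table, so it assumes the remaining cells are trusted.

```python
import html

import pandas as pd


def wrapped_html(df, column, width):
    """Return an HTML table in which `column` breaks every `width` characters."""

    def to_cell(value):
        # Escape the text ourselves, then join fixed-size chunks with <br>.
        text = str(value)
        chunks = [text[i:i + width] for i in range(0, len(text), width)]
        return "<br>".join(html.escape(chunk) for chunk in chunks)

    # Lift the column-width limit so the formatted cell is not truncated,
    # and pass escape=False so the <br> tags survive.
    with pd.option_context("display.max_colwidth", None):
        return df.to_html(formatters={column: to_cell}, escape=False)


df = pd.DataFrame({"user_agent": ["x" * 70]})
table = wrapped_html(df, "user_agent", 30)
```

In a notebook, the result can be shown with display(HTML(table)), using the same IPython imports as the answer's code. The stored data is left unchanged because the chunking happens only while rendering.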

score 1 · Answer 4 · answered Oct 30 '17 at 19:05


You can create a new column with the first 100 characters of the data

data['new_column'] = [i[:100] for i in data['old_column']]
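The same truncation can be written with the vectorized .str accessor, which passes missing values through instead of raising. Note that this discards everything after the first 100 characters rather than wrapping it. The column contents below are placeholders.

```python
import pandas as pd

# Hypothetical data, including a missing value.
data = pd.DataFrame({"old_column": ["a" * 150, "short", None]})

# Slice each string to at most 100 characters; missing values stay missing.
data["new_column"] = data["old_column"].str[:100]

print(data["new_column"].str.len().tolist())
```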

answered Oct 30 '17 at 19:05

Pato Navarro


score 1 · Answer 5 · answered May 17 '22 at 10:53

For DataFrame visualization in Jupyter Notebook, I would recommend using the Styler class. It leverages CSS, which allows a lot of flexibility out of the box.

As you need to apply a style to all rows, you can use the Styler.set_properties method, which sets the same properties on all cells.

Here is an example with CSS styles I've taken from the Mozilla web docs on text wrapping.

import pandas as pd

df = pd.DataFrame(
    [['Mozilla/5.0 (Windows NT 6.3; Trident/7.0; rv:11.0; '
      'GomezAgent 3.0) like Gecko']], 
    columns=['user_agent']
)

df.style.set_properties(
    **{
        'inline-size': '10px',
        'overflow-wrap': 'break-word',
    }, 
    subset='user_agent'
)

You can find more examples of how to control pandas DataFrame styling here: https://pandas.pydata.org/docs/user_guide/style.html

score 0 · Answer 6 · answered Aug 07 '17 at 13:24

If you don't mind solving this before you put the whole thing into a DataFrame, you can do it as described here. In your particular case, if you'd like each line to be 10 characters long, you would have:

import pandas as pd

# Input
line = 'Mozilla/5.0 (Windows NT 6.3; Trident/7.0; rv:11.0; GomezAgent 3.0) like Gecko'
n = 10

# Split into consecutive chunks of n characters
line = [line[i:i+n] for i in range(0, len(line), n)]

# The rest is easy
df = pd.DataFrame(line)
print(df)

Without the white spaces, you'll get:

By the way, the white space at the beginning of the last row occurs because there are not 10 characters to fill the row, as there are in the preceding rows. In Jupyter you could remedy this by using df.style.set_properties(**{'text-align': 'left'}).

score 0 · Answer 7 · edited Apr 22 '21 at 19:49


If you're only in this for ad-hoc, temporary display purposes in Jupyter, you can simply insert whitespace every 100 characters:

chunk_size = 100

data['new_column'] = [' '.join(val[i:i + chunk_size] for i in range(0, len(val), chunk_size)) for val in data['old_column']]

Though it looks like the reason this is a problem in the first place is that multiple features are collapsed into a single column. It's hard to say without seeing your larger dataset, but if they all follow the same pattern, I'd strongly suggest splitting this out into multiple features (browser, browser version, OS, OS version, etc.), which will make any additional work with this dataset easier.

edited Apr 22 '21 at 19:49

Derek O


answered Dec 19 '17 at 00:57

mr_snuffles



1) The whole `data['new_column'] = ` line produces syntax error 2) 'string' is undefined! Do you check your code before publishing it ??? This answer merits 100 downvotes, but I don't like downvoting. – Apostolos Aug 08 '20 at 17:31
+ to @Apostolos comment – rjurney Dec 24 '20 at 19:57
