Identify cells with only whitespace

Question

I want to apply this function

df.column.str.split(expand = True)

but the problem is that some cells are "empty". By "empty" I mean that the cell contains only whitespace, for example 6 spaces. Moreover, this runs inside an iteration, so sometimes the cells contain 2 spaces instead.

How can I identify these "empty cells"?

P.S.:

df[df.column != '(6 spaces inside)']

works only for a particular case when there are 6 spaces.

EDIT 1: df.column has dtype object and holds people's names (one name or several, and sometimes erroneous entries).

EDIT 2: The idea is to remove these cells (rows) so that the "str.split" function can be applied successfully. This runs inside an iteration, so sometimes I have cells with 6 spaces and other times cells with 2 spaces.

EDIT 3: I can't remove all whitespace, because then I wouldn't be able to split the strings (I have names like "Jean Carlo" that I want to separate).

FINAL SOLUTION: I solved the problem with the post that was flagged as a duplicate, adding only a '+' to the pattern because some cells contain several whitespace characters.

Solution:

import numpy as np
df = df.replace([r'^\s+$'], np.nan, regex=True)  # raw string avoids the invalid '\s' escape warning
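
As a rough end-to-end sketch of this solution: whitespace-only cells are turned into NaN, those rows are dropped, and the remaining names are split. The sample data and the names `clean` and `parts` are made up for illustration and are not from the original post.

import numpy as np
import pandas as pd

# Hypothetical sample data: names plus whitespace-only cells of different lengths
df = pd.DataFrame({"column": ["Jean Carlo", "      ", "Ana", "  ", "Luis Miguel Perez"]})

# Cells made up entirely of whitespace become NaN; cells containing names are untouched
df = df.replace(r"^\s+$", np.nan, regex=True)

# Drop the rows that were whitespace-only, then split the remaining names on whitespace
clean = df.dropna(subset=["column"])
parts = clean["column"].str.split(expand=True)
print(parts)

Note that `^\s+$` does not match a truly empty string; `^\s*$` would cover that case as well, if it can occur in the data.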

Would be cool if there was a function that `strip`s whitespace from strings so they shrink to empty strings... not sure if that would be a solution to your comparison – Patrick Artner, Mar 19 '18 at 19:58

score 0 · Answer 1 · edited Mar 19 '18 at 20:19

df['Col1'] = df['Col1'].map(lambda x: x.strip())

This will remove all leading and trailing whitespace from every value in df['Col1'].
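
A missing value (NaN) is a float and has no `.strip()` method, so the lambda above raises on columns that contain NaN. A possible NaN-safe variant, sketched here with made-up data and the hypothetical names `is_blank` and `kept`, uses the `.str` accessor only to build a mask, so the other strings are left unmodified:

import numpy as np
import pandas as pd

# Hypothetical data, including a real NaN, which is what breaks x.strip()
df = pd.DataFrame({"Col1": ["Jean Carlo", "      ", np.nan, "  ", " Ana "]})

# .str.strip() passes NaN through instead of raising; the stripped result
# is only compared against "" to build a boolean mask
is_blank = df["Col1"].str.strip().eq("")

# Keep the rows that are not whitespace-only; the original strings are unchanged
kept = df[~is_blank]
print(kept)

This mask also flags truly empty strings, and it keeps NaN rows; add `.dropna()` if those should go as well.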

edited Mar 19 '18 at 20:19 by philshem

answered Mar 19 '18 at 20:06 by It_is_Chris

This can have unintended consequences. The OP only suggests that they want to identify cells that are entirely made up of whitespace, not affect whitespace on other strings – roganjosh Mar 19 '18 at 20:12
@roganjosh this is also a good point. It would probably be better to map a function that uses a regex to normalize any multi-space value by returning a single space (or whatever you want your flag character to be), then subselect on the flag character – zyd Mar 19 '18 at 20:14
@zyd the duplicate I have proposed takes care of this – roganjosh Mar 19 '18 at 20:16
@roganjosh - it does seem to, thanks for pointing that out. – zyd Mar 19 '18 at 20:18
I tried zyd's solution and I get "'float' object has no attribute 'strip'" – Agus Velazquez Mar 19 '18 at 20:20
@roganjosh You are correct. That is my mistake. I suppose an alternative option would be to use replace: `df['Col1'] = df['Col1'].replace(' ','6 spaces')` – It_is_Chris Mar 19 '18 at 20:22
