I have a dataframe and want to add a blank column. If it's for numbers, I would use `df["new_column"] = pd.np.nan`. But what if I want the column to (a) hold strings, and (b) be filterable with `pd.isnull()`? Is there a better idea than `df["new_column"] = ""`?

Dimitri Shvorob
-
You can use the `None` keyword – The Guy Feb 19 '20 at 18:05
-
`pd.isnull` plays nicely with strings (unlike `np.isnan`, which throws an error). Why not just use `np.nan`? – Fortunato Feb 19 '20 at 19:38
-
@Fortunato, because Python will throw an exception when you try to insert a string into a column initialized with `pd.np.nan` – Dimitri Shvorob Feb 20 '20 at 20:08
-
@DimitriShvorob, I'm probably misunderstanding what you are trying to do. Can you provide an example? This seems to work fine for me: `a=pd.DataFrame([[1,2],[3,4]], columns=['col1', 'col2']); a['new_col'] = pd.np.nan; a['new_col'].iloc[0] = 'p'` – Fortunato Feb 20 '20 at 23:10
1 Answer
As far as I know, Python doesn't have a null string. It does have a `None` value, but `None` isn't a string: it is a separate value that can be assigned to any variable, not just one that was originally defined as a string.
You could fill the column with `None`, but its dtype would then be `object` rather than a string type. You could still insert strings into the column, but converting the column to a string type would simply turn the `None` values into the string 'None'.
Please see this too: What is the difference between NaN and None?
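import numpy as np
import pandas as pd
# np.nan is broadcast to every row; None marks the missing strings
test_df = pd.DataFrame(data={'numbers_column': np.nan,
                             'strings_column': [None, None, None, 'random_str']},
                       index=[1, 2, 3, 4])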
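To connect this back to the question, here is a minimal sketch of adding a `None`-filled column to an existing dataframe and filtering it with `pd.isnull()`. The frame and the column names are made up for illustration, and the string conversion behavior noted in the last comment is what I would expect on pandas 1.x and 2.x; other versions may differ.

```python
import pandas as pd

# Hypothetical frame standing in for the asker's dataframe
df = pd.DataFrame({"numbers_column": [1.0, 2.0, 3.0]})

# A scalar None is broadcast to every row of the new column
df["new_column"] = None

# Strings can be inserted into the column afterwards
df.loc[0, "new_column"] = "random_str"

# pd.isnull() flags the rows that still hold None
missing = df[pd.isnull(df["new_column"])]

# Converting to str afterwards is where the caveat applies: on pandas
# 1.x/2.x the None entries are expected to become the text 'None'
as_text = df["new_column"].astype(str)
```

Filtering on `pd.isnull()` before any string conversion avoids the 'None' pitfall described above.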
Hope this helps.

emiljoj
-
There is, but it is considered experimental: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.StringDtype.html – timmey Jul 07 '21 at 12:31
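A minimal sketch of what the `StringDtype` route from the comment above might look like, assuming pandas 1.0 or later (where the `"string"` dtype alias and `pd.NA` exist). The frame and column names are hypothetical.

```python
import pandas as pd

# Hypothetical frame standing in for the asker's dataframe
df = pd.DataFrame({"numbers_column": [1.0, 2.0, 3.0]})

# Build an all-missing column that uses the dedicated string dtype
df["new_column"] = pd.array([pd.NA] * len(df), dtype="string")

# Strings can be assigned; the column keeps its string dtype
df.loc[0, "new_column"] = "random_str"

# Missing entries are still detected by pd.isnull()
missing_mask = pd.isnull(df["new_column"])
```

Here the column satisfies both requirements from the question: it is typed for strings, and its empty cells are picked up by `pd.isnull()`.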