I am struggling to work out why my code using split is not working. I have a column like this:
RegionName
Alabama[edit]
Auburn (Auburn University)
Florence(University of North Alabama)
Jacksonville
.
.
.
and so on..
The entries above show the kinds of values in the column. For entries that are state names, such as Alabama[edit], I want the value to become NaN. The remaining entries are regions within that state, and I want to clean them where needed. If an entry needs no cleaning, it should stay as it is. I am using the code below:
    for x in Town['RegionName']:
        if re.match(r"\s*\(",x):
            x.split('(').strip()
        elif re.match(r"\d+\[",x):
            x = np.NaN
        else:
            x
The code runs without any error but all the entries stay intact. The desired output is -
RegionName
NaN
Auburn
Florence
Jacksonville
.
.
.
The cleaning required is to remove everything from the opening parenthesis onward. There may be a space between the region name and the parenthesis, so that has to be removed as well.
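To make the intended rule concrete, here is a minimal sketch of the per-entry behaviour described above, written in plain Python. The helper name clean_region is hypothetical, and the sketch assumes that state rows can be recognised by a trailing "[edit]", which is what the sample data suggests.

```python
import math
import re

def clean_region(name):
    """Hypothetical helper: NaN for state rows, cleaned name otherwise."""
    # Assumption: state rows end with "[edit]", e.g. "Alabama[edit]".
    if re.search(r"\[edit\]\s*$", name):
        return math.nan
    # Keep only the text before the first "(" and trim surrounding spaces.
    # Entries without a parenthesis pass through unchanged (apart from trimming).
    return name.split("(", 1)[0].strip()

samples = [
    "Alabama[edit]",
    "Auburn (Auburn University)",
    "Florence(University of North Alabama)",
    "Jacksonville",
]
cleaned = [clean_region(s) for s in samples]
print(cleaned)  # [nan, 'Auburn', 'Florence', 'Jacksonville']
```

On a pandas column, the same function could presumably be applied with Town['RegionName'] = Town['RegionName'].map(clean_region). The result has to be assigned back, because rebinding the loop variable x inside a for loop does not modify the column.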
Please advise.