-1

I have a dataframe like:

df['Web']

I just want to check whether the first four characters of df['Web'] are 'http' or not.

I don't want to check whether df['Web'] is in URL format.

And how can I use an if condition like:

if (firstfour=='http'):
    print("starts with http")
else:
    print("doesn't start with http")
Atom Store

2 Answers

3

You can use str.startswith(). However, you should note that it also matches https.

You could use a regex to match http but not https.

import pandas as pd

df = pd.DataFrame({'Web': ['htt', 'http', 'https', 'www']})
df['match'] = df.Web.apply(lambda x: x.startswith('http'))

     Web  match
0    htt  False
1   http   True
2  https   True
3    www  False

Regex

df['match'] = df['Web'].str.match(r'^http(?!s)')


     Web  match
0    htt  False
1   http   True
2  https  False
3    www  False
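
To tie this back to the if/else from the question, here is a minimal sketch that compares the first four characters directly and then branches row by row. The sample URLs and the 'message' column are made up for illustration; only the 'Web' column comes from the question.

```python
import pandas as pd

# Hypothetical sample data; only the column name 'Web' is from the question
df = pd.DataFrame({'Web': ['htt', 'http://a.example', 'https://b.example', 'www']})

# Slice the first four characters of every value and compare with 'http'
df['match'] = df['Web'].str[:4] == 'http'

# Row-by-row if/else, mirroring the condition in the question
messages = []
for is_http in df['match']:
    if is_http:
        messages.append("starts with http")
    else:
        messages.append("doesn't start with http")
df['message'] = messages

print(df)
```

Note that a plain `if df['match']:` on the whole column raises a ValueError, because a Series has no single truth value; the condition has to be applied per row, as in the loop above, or reduced with .any() or .all().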
PacketLoss
3

Use Series.str.startswith:

df['match'] = df.Web.str.startswith('http')

Or use Series.str.contains with ^ for start of string:

df['match'] = df.Web.str.contains('^http')
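
If the column may contain missing values, the .str methods return a missing value for those rows instead of True or False. A small sketch, assuming you want missing entries treated as non-matches (the sample data is hypothetical):

```python
import pandas as pd

# Hypothetical sample data with one missing value
df = pd.DataFrame({'Web': ['http://a.example', None, 'www']})

# na=False makes missing entries count as "does not start with http"
df['match'] = df['Web'].str.startswith('http', na=False)

print(df)
```

The same na argument is accepted by Series.str.contains and Series.str.match.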
jezrael