1

I have a very long string (its length is not fixed), and I want to extract the substring lying between 'email:' and '@gmail.com'.

Suppose it is

xhxjcndjcnkjcnd cjkjcdckjncx email:substring@gmail.comjndhcjkdhcnchjdccb djc

I want to extract 'substring' from the string. Can I do this with a regular expression, using the sed tool?

ALBI

6 Answers

3
perl -lne 'print $1 if(/email:(.*?)\@gmail\.com/)'

Tested below:

> echo "xhxjcndjcnkjcnd cjkjcdckjncx email:substring@gmail.comjndhcjkdhcnchjdccb djc" | perl -lne 'print $1 if(/email:(.*?)\@gmail\.com/)'
substring
>
Vijay
1

VALUE="xhxjcndjcnkjcnd cjkjcdckjncx email:substring@gmail.comjndhcjkdhcnchjdccb djc"

echo "$VALUE" | awk -F":" '{print $2}' | cut -d@ -f1

Community
hek444
  • The `sed` does the same job as `awk`, so why not use two `awk` calls, like `awk -F: '{print $2}' | awk -F@ '{print $1}'`, or two `sed` calls? – Jotne Sep 20 '13 at 11:59
1

Using sed:

INPUT="xhxjcndjcnkjcnd cjkjcdckjncx email:substring@gmail.comjndhcjkdhcnchjdccb djc"
USERNAME=$(sed -n 's/.*email:\(.*\)@gmail\.com.*/\1/p' <<< "$INPUT")
echo "$USERNAME"
Benoit Blanchon
  • This returns `substring@gmail.com`, not the expected `substring`. You should also change the backticks to parentheses: `$(data)`. Example: `MAIL=$(sed -n "s/.*\email:\(.*@gmail\.com\).*/\\1/p" <<< $INPUT)` – Jotne Sep 20 '13 at 11:26
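The sed approach captures with a greedy `.*`, so a line containing more than one `@gmail.com` would yield text up to the last one. A minimal sketch of one way around that, assuming the wanted name never contains an `@`; the function name `extract_gmail_user` is made up for illustration:

```shell
# Hypothetical helper: print the text between "email:" and "@gmail.com".
extract_gmail_user() {
  # [^@]* cannot cross an "@", so the capture ends at the nearest "@gmail.com"
  # instead of running on to a later one as a greedy .* would.
  printf '%s\n' "$1" | sed -n 's/.*email:\([^@]*\)@gmail\.com.*/\1/p'
}

extract_gmail_user 'xhxjcndjcnkjcnd cjkjcdckjncx email:substring@gmail.comjndhcjkdhcnchjdccb djc'
```

Because of `-n` and the `p` flag, nothing is printed when the line has no match.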
1

Another awk

awk -F":" '{split($2,a,"@");print a[1]}' file
substring

If you have many lines to search for gmail addresses:

awk -F":" '/gmail\.com/ {split($2,a,"@");print a[1]}'
substring
Jotne
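Splitting on `:` takes whatever follows the first colon on the line, which may not be the address if an earlier colon appears. A sketch of an alternative using awk's `match()` with `RSTART` and `RLENGTH`, assuming the name contains no `@`; `awk_gmail_user` is a hypothetical name:

```shell
# Hypothetical helper: read lines on stdin, print the name from each line
# that contains "email:<name>@gmail.com".
awk_gmail_user() {
  awk 'match($0, /email:[^@]*@gmail\.com/) {
    # Skip the 6-character "email:" prefix and drop the 10-character
    # "@gmail.com" suffix from the matched span.
    print substr($0, RSTART + 6, RLENGTH - 16)
  }'
}

echo 'time 10:30 email:substring@gmail.com rest' | awk_gmail_user
```

Lines without a match are skipped, so this also works as a filter over a whole file.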
1

The shell can handle this:

$ line='xhxjcndjcnkjcnd cjkjcdckjncx email:substring@gmail.comjndhcjkdhcnchjdccb djc'
$ name=${line#*email:}       # remove the prefix ending with "email:"
$ name=${name%@gmail.com*}   # remove the suffix starting with "@gmail.com"
$ echo "$name"
substring
glenn jackman
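The two parameter expansions can be wrapped in a small function. The sketch below is one possible variant, not the answer's exact code: it uses `%%` (longest suffix) so the result stops at the first `@gmail.com` when there are several, and it returns a non-zero status when the markers are missing. The name `between_email_and_gmail` and the variable `rest` are made up for illustration.

```shell
# Hypothetical helper using only POSIX parameter expansion (no external tools).
between_email_and_gmail() {
  # Bail out unless both markers appear in the expected order.
  case $1 in
    *email:*@gmail.com*) ;;
    *) return 1 ;;
  esac
  rest=${1#*email:}                      # drop everything up to the first "email:"
  printf '%s\n' "${rest%%@gmail.com*}"   # drop everything from the first "@gmail.com"
}

between_email_and_gmail 'xhxjcndjcnkjcnd cjkjcdckjncx email:substring@gmail.comjndhcjkdhcnchjdccb djc'
```

Note that `rest` is a global variable here, since `local` is not part of POSIX sh.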
0

I think grep (with positive lookahead and positive lookbehind) is the correct tool for the job:

$ grep -oP '(?<=email:).*?(?=@gmail\.com)' <<< "xhxjcndjcnkjcnd cjkjcdckjncx email:substring@gmail.comjndhcjkdhcnchjdccb djc"
substring
user000001
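Since `-o` prints every match on its own line, the lookaround pattern can also list several addresses from one input. A sketch, assuming a GNU grep built with PCRE support (`-P` is not available in every grep) and names that contain no `@`; `list_gmail_users` is a hypothetical name:

```shell
# Hypothetical helper: read text on stdin and print each name found between
# "email:" and "@gmail.com", one per line.
list_gmail_users() {
  # The lookbehind and lookahead keep the markers out of the printed match.
  grep -oP '(?<=email:)[^@]*(?=@gmail\.com)'
}

printf '%s\n' 'a email:foo@gmail.com b' 'c email:bar@gmail.com d' | list_gmail_users
```

grep exits with status 1 when nothing matches, which makes the function usable directly in an `if`.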