-2

How can I extract only the domains from text content?

For example, I have some text content like this:

dropwox.com N/A     $ 8.95  1 day ago
lute.info   N/A     $ 8.95  1 week ago
zolpidem4sleep.com  N/A     $ 8.95  1 week ago
youredmedsinfo.com  N/A     $ 8.95  1 week ago
youngsmhs.com   N/A     $ 8.95  1 week ago
jsntcj.com  N/A     $ 8.95  1 week ago
fioricetdirect2k.com    13,133,796      $ 8.95  1 week ago
dapoxetinebuynow.com    N/A     $ 8.95  1 week ago
86620000.com    N/A     $ 8.95  1 week ago
spidvid.com 1,884,910       $ 480.00    1 week ago
titsforall.com  20,318,475      $ 8.95  1 week ago

I just need to extract the domains and get a list like this:

dropwox.com
lute.info
zolpidem4sleep.com
youredmedsinfo.com
youngsmhs.com

Is there any tool or online converter that does this?

Help me

Mindus

2 Answers

0

If a shell solution is OK, you can do something like this:

cut -d' ' -f1 file | sort | uniq
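
As a minimal sketch of that pipeline on a small sample: the temporary file and the `domains` variable are hypothetical names used only for illustration, and the data is assumed to be space-separated with the domain in the first column. `sort -u` is used here as a shorthand for `sort | uniq`.

```shell
# Hypothetical sample input: the first column is the domain, one line is a duplicate.
sample=$(mktemp)
cat > "$sample" <<'EOF'
lute.info   N/A     $ 8.95  1 week ago
dropwox.com N/A     $ 8.95  1 day ago
lute.info   N/A     $ 8.95  1 week ago
EOF

# Keep the first space-separated field, then sort and drop duplicates.
domains=$(cut -d' ' -f1 "$sample" | sort -u)
printf '%s\n' "$domains"

rm -f "$sample"
```

This prints `dropwox.com` and `lute.info`, each once. If the original order of the lines matters, leave out the sort step.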
Community
fvu
  • Any text editor or some specific one? Look [here](http://superuser.com/questions/520372/how-to-keep-only-the-first-word-in-a-line-using-notepad) for some solutions using Notepad++, probably usable as a start even if you're using another editor, provided that it supports replace based on regex. – fvu Jul 11 '15 at 21:54
0

This is an old question, but why not answer it for future readers? If you use macOS or Linux, there are a bunch of tools:

$ cat full_data.txt
dropwox.com N/A     $ 8.95  1 day ago
lute.info   N/A     $ 8.95  1 week ago
zolpidem4sleep.com  N/A     $ 8.95  1 week ago
...

You may use any of the following:

sed: remove everything from the first space onward:
$ sed 's/ .*//' full_data.txt > domains.txt

grep: with a regular expression, match everything from the beginning of the line (^) up to the first whitespace character:
$ grep -o "^\S\+" full_data.txt > domains.txt

cut: pick the first field, using a space as the delimiter:
$ cut -d' ' -f1 full_data.txt > domains.txt

awk: my beloved awk. It splits each line on whitespace and prints the first field:
$ awk '{print $1}' full_data.txt > domains.txt

Also Perl: the same idea, printing the first field of each line:
$ perl -lane 'print $F[0]' full_data.txt > domains.txt
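
One difference between these tools may matter if the real file separates columns with tabs rather than spaces (an assumption; the question does not say which it uses). The sketch below uses a hypothetical two-line sample and hypothetical variable names to compare `cut -d' '` with `awk`, which by default treats any run of spaces or tabs as a single separator.

```shell
# Hypothetical sample: one space-separated line and one tab-separated line.
sample=$(mktemp)
printf 'dropwox.com N/A $ 8.95\n'   > "$sample"
printf 'lute.info\tN/A\t$ 8.95\n' >> "$sample"

# cut splits only on the single delimiter it is given (a space here),
# so the tab-separated line is cut at the wrong place.
with_cut=$(cut -d' ' -f1 "$sample")

# awk's default field splitting handles both spaces and tabs.
with_awk=$(awk '{print $1}' "$sample")

printf 'cut:\n%s\n\nawk:\n%s\n' "$with_cut" "$with_awk"

rm -f "$sample"
```

With this sample, `awk` yields just the two domains, while `cut` leaves the tab and the following columns attached to `lute.info`. The `sed 's/ .*//'` variant has the same limitation, since it also looks only for a space.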