I need to extract only those strings that contain an underscore, using Notepad++. My file looks like this:

T-cell_stimulation
transcription_factor
NF-kappa_B
kappa_B_site
HIV-1_long_terminal_repeat
HIV-1
HIV-2_enhancer
HIV-2
monocyte
T_cell
cis-acting_element
kappa_B_site
purine-rich_binding_site

and my desired output is

T-cell_stimulation
transcription_factor
NF-kappa_B
kappa_B_site
HIV-1_long_terminal_repeat
HIV-2_enhancer
T_cell
cis-acting_element
kappa_B_site
purine-rich_binding_site
JSON C11
Shaheen Gul
  • See this answer: http://stackoverflow.com/questions/29537982/notepad-completely-remove-lines-that-contains-question-mark-via-regex/29543486#29543486, but modify the method to search for `_` and, as the last step, use **Remove unmarked lines**. – AdrianHHH Apr 11 '15 at 13:39

2 Answers

Look into Notepad++'s regex search.

Something like the following pattern matches every line that contains an underscore:

.*_.*
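
If you want to check the pattern outside Notepad++, here is a minimal Python sketch that applies the same regex to a few lines taken from the question. The function name `keep_underscore_lines` and the `sample` list are made up for illustration, and the sketch assumes each line has already had its trailing newline stripped.

```python
import re

# Same pattern as above: "." does not match a newline, so the pattern
# describes one whole line with an underscore somewhere in it.
UNDERSCORE_LINE = re.compile(r".*_.*")


def keep_underscore_lines(lines):
    """Return only the lines that contain at least one underscore."""
    return [line for line in lines if UNDERSCORE_LINE.fullmatch(line)]


# A few entries copied from the question's input.
sample = ["T-cell_stimulation", "HIV-1", "HIV-2_enhancer", "monocyte", "T_cell"]
print(keep_underscore_lines(sample))
# prints ['T-cell_stimulation', 'HIV-2_enhancer', 'T_cell']
```

For a simple test like this, the plain check `"_" in line` would do the same job; the regex is used here only to mirror the pattern above.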
Huey

I solved my problem with Python. The code is:

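import re

in_path = "C:/Python26/test.txt"
out_path = "rzlt.txt"

# Keep a line only if it contains at least one underscore.
# Hyphen-only entries such as HIV-1 must not be written out.
pattern = re.compile(r"_")

count = 0
with open(in_path, "r") as rf:
    with open(out_path, "w") as wf:
        for line in rf:
            # search() scans the whole line; match() would only test its start.
            if pattern.search(line):
                wf.write(line)
                count += 1

# Number of lines written to the output file.
print(count)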
Shaheen Gul