
I am parsing a file that contains HTML tags (with the angle brackets escaped as &lt; and &gt;) and converting them into LaTeX commands.

cat text

  <Text>A &lt;strong&gt;ASDFF&lt;/strong&gt; is a &lt;em&gt;cerebrovafdfasfscular&lt;/em&gt; condifasdftion caufadfsed fasdfby tfdashe l
 ocfsdafalised &lt;span style="text-decoration: underline;"&gt;ballooning&lt;/span&gt; or difdaslation of an arfdatery in thdfe bfdasrai
 n. Smadfsall aasdneurysms may dadisplay fdasno ofadsbvious sdfasigns (&lt;span style="text-decoration: underline;"&gt;&lt;em&gt;&lt;str
 ong&gt;asymptomatic&lt;/strong&gt;&lt;/em&gt;&lt;/span&gt;) bfdasut lfdsaarger afdasneurysms maydas besda asfdsasociated widfth sdsfudd

  sed -e 's|&lt;strong&gt;\(.*\)&lt;/strong&gt;|\\textbf{\1}|g' test

cat out

 <Text>A \textbf{ASDFF&lt;/strong&gt; is a &lt;em&gt;cerebrovafdfasfscular&lt;/em&gt;    condifasdftion caufadfsed fasdfby tfdashe locfsda
    falised &lt;span style="text-decoration: underline;"&gt;ballooning&lt;/span&gt; or    difdaslation of an arfdatery in thdfe bfdasrain. Sma
      dfsall aasdneurysms may dadisplay fdasno ofadsbvious sdfasigns (&lt;span style="text-decoration: underline;"&gt;&lt;em&gt;&lt;strong&gt
      ;asymptomatic}&lt;/em&gt;&lt;/span&gt;) bfdasut lfdsaarger afdasneurysms maydas besda   asfdsasociated widfth sdsfudd

The expected output is \textbf{ASDFF}, but what I observe is \textbf{ASDFF .........}, with the closing brace placed only after the last &lt;/strong&gt; on the line. How can I get the expected result?

Regards

Manish

1 Answer


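sed has no non-greedy quantifier, so .* always matches as much as it can. You may want to use a Perl regex instead, where .*? matches as little as possible:

perl -pe 's|&lt;strong&gt;(.*?)&lt;/strong&gt;|\\textbf{$1}|g'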

Your problem is similar to the one in the question on non-greedy regex matching in sed. Next time, you may also want to reduce your case to the smallest input that shows the real problem. For example, instead of pasting the raw HTML, use something like this:

fooTEXT1barfooTEXT2bar

Update

If greedy matching is actually what you want, you can ignore this answer.

Bin Wang
  • Thanks. I have multiple expressions. How can I apply all the replacements in one line in perl, something like perl -pe -pe ....? – Manish Jan 17 '13 at 07:46
  • You may want a pipe instead, e.g. `cat file | perl -pe '' | perl -pe ''` – Bin Wang Jan 17 '13 at 07:52
  • But with a pipe I need to repeat it many times. My file is very big, around 100 GB. Is there any way to do all the replacements at once, so that the file does not have to be scanned several times? – Manish Jan 18 '13 at 01:54
  • @user15662: The best way to solve it may be to learn a little perl. I don't know whether there is a perl one-liner that does all of this. Perl is easy to learn, so you could give it a try. – Bin Wang Jan 18 '13 at 02:45
  • Thanks for your suggestion. Is there any way to do the same if the tags span multiple lines? – Manish Jan 18 '13 at 02:47
  • @user15662: Yes, perl can do it. – Bin Wang Jan 18 '13 at 04:08