I have news text that contains HTML attributes I don't need. How can I delete fragments like the following in Ruby?

img width="750" alt="4.jg" c="/unload/medialiy/df6/4.jg" height="499" title=4.jg"

img width="770" alt="5.jg" c="/unload/medialiy/ty6/5.jg" height="499" title=5.jg"

So I need a regex, something like news.sub('/img*jg"/, ''), but it doesn't work.

  • _"a text with news where i got html attributes"_ – what does that mean? Do you have HTML or text containing HTML? Why are the angle brackets missing? What does your actual input (i.e. `news`) look like, and what is your expected output? – Stefan Sep 11 '17 at 11:34

2 Answers

I would use:

img .*\.jg"

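To say "any characters, in any quantity" in a regex, use .* The dot matches any single character (except a newline, by default), and the star means zero or more repetitions of it. A literal dot has to be escaped with a backslash, as in \.jg above.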
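
As a minimal sketch of that pattern used with String#sub: the variable name news comes from the question, while the sample string and the surrounding words "Before" and "after" are made up for illustration.

```ruby
# Sample input modeled on the question (angle brackets omitted, as in the post).
# The words "Before" and "after" are illustrative only.
news = 'Before img width="750" alt="4.jg" c="/unload/medialiy/df6/4.jg" height="499" title="4.jg" after'

# In the question's attempt, `img*` means "im" followed by zero or more "g".
# Here `.*` matches any run of characters and `\.` matches a literal dot.
cleaned = news.sub(/img .*\.jg"/, '')

puts cleaned
```

Because .* is greedy, the match runs to the last .jg" on the line, so two images on the same line would be removed together with everything between them.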

But are you sure you don't want to include the angle brackets?

<img .*\.jg">

As an aside, what if the order of the attributes changes? Then the pattern above fails to match the img tag. What we really need is any img tag that contains the substring .jg" somewhere inside it:

<img [^>]*\.jg"[^>]*>
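
A short sketch of how this last pattern could be applied with String#gsub, assuming the input really does contain the angle brackets. The sample markup below is invented for illustration, and the second tag has its attributes in reverse order on purpose.

```ruby
# Hypothetical input: two img tags with different attribute order.
news = <<~HTML
  <p>First story</p>
  <img width="750" alt="4.jg" c="/unload/medialiy/df6/4.jg" height="499" title="4.jg">
  <p>Second story</p>
  <img title="5.jg" height="499" c="/unload/medialiy/ty6/5.jg" alt="5.jg" width="770">
HTML

# [^>]* cannot cross the closing bracket, so each match stays inside one tag.
# gsub replaces every match, whereas sub replaces only the first one.
cleaned = news.gsub(/<img [^>]*\.jg"[^>]*>/, '')

puts cleaned
```

Both tags are removed regardless of where the .jg" value appears, and the surrounding paragraphs are left untouched.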

Gangnus
  • _"Dot means any symbol"_ – are you sure that you want `.jg` then? ;-) – Stefan Sep 11 '17 at 11:39
  • Oh! How foolish of me to omit back slashes! Practically, it was a longer substitution for '.?'. It was a correct regex, so tests passed, but not good for explanation, thank you! – Gangnus Sep 12 '17 at 06:35
In your particular case you can do this:

element = '<img width="750" alt="4.jg" c="/unload/medialiy/df6/4.jg" height="499" title="4.jg">'

puts element.gsub(/(width|alt)=\"[^ ]+\" ?/, '')

You can also play around with this regex here.
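
One possible variation, offered as a sketch rather than a definitive fix: [^ ]+ stops at the first space, so an attribute value that contains a space is not matched. Using [^"]* instead matches everything up to the closing quote. The alt value "photo 4.jg" below is a made-up example to show the difference.

```ruby
# Hypothetical element whose alt value contains a space.
element = '<img width="750" alt="photo 4.jg" c="/unload/medialiy/df6/4.jg" height="499" title="4.jg">'

# \b keeps the attribute name from matching inside a longer name;
# [^"]* matches the whole quoted value, spaces included.
stripped = element.gsub(/\b(?:width|alt)="[^"]*" ?/, '')

puts stripped
```

This still assumes double-quoted attribute values, so it remains a pattern for known, regular input rather than arbitrary HTML.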

But if you need a more robust solution, take a look at the Nokogiri gem. This SO question can help.

Psylone