I am confused about how to write regular expressions. For example, I need a regular expression for strings like these:

fetch/https://upload.wikimedia.org/wikipedia/w_100/commons/f/f9/Phoenicopterus_ruber_in_São_Paulo_Zoo.jpg
fetch/https://upload.wikimedia.org/wikipedia/w_100,h_100,fl_progressive,dpr_2.0/commons/f/f9/Phoenicopterus_ruber_in_São_Paulo_Zoo.jpg

I want to match only w_100 from the first string and w_100,h_100,fl_progressive,dpr_2.0 from the second string, and likewise for other URLs. This is what I tried:

var regex = /[a-z]_[0-9a-z]/g;
var found = string.match(regex);

It gives me output like this:

["w_1","s_r","r_i","n_s"]

I want something like this:

["w_100","h_100","fl_progressive","dpr_2.0"]

Can anyone suggest a regular expression for this?

Keval
  • `[0-9a-z]` should be `[0-9a-z]+`. You need 1 or more. Ideally both should have one or more option: `/[a-z]+_[0-9a-z]+/g;` – Rajesh Feb 12 '20 at 08:00
  • should be `/[a-z]+_([a-z]+|[0-9]+\.(?=[0-9]+)/` – AZ_ Feb 12 '20 at 08:06
  • @Rajesh your answer gives me like this ["w_100", "hoenicopterus_ruber"] I don't want this. – Keval Feb 12 '20 at 08:13
  • @AZ_ Invalid regular expression: /[a-z]+_([a-z]+|[0-9]+\.(?=[0-9]+)/: Unterminated group – Keval Feb 12 '20 at 08:16
  • @Keval `/[a-z]+_([a-z]+|[0-9]+\.(?=[0-9]+))/` there was a typo. – AZ_ Feb 12 '20 at 08:48
  • @AZ_ can you please check but it's only giving me ["fl_progressive",dpr_2.] for this rg/w_100,h_100,fl_progressive,dpr_2.0/wikipedia/commons/f/f9/Zoo.jpg' – Keval Feb 12 '20 at 09:03

2 Answers

You can use the following regex:

let str = '/https://upload.wikimedia.org/wikipedia/w_100,h_100,fl_progressive,dpr_2.0,ar_3:4,quality_auto:good,effect_auto_brightness,effect_auto_color:50,effect_green:-30/commons/f/f9/Phoenicopterus_ruber_in_São_Paulo_Zoo.jpg'

console.log(str.match(/([a-z]+_([a-z]+|[0-9]+(\.[0-9]+)?)(?=(,|\/)))/g))

Update for the added requirement (values containing ':' or ':-'):

console.log(str.match(/([a-z]+(?:_|:|:-)?([a-z]+|[0-9]+(\.[0-9]+)?)(?=(,|\/)))+/g))
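Since the comments report trouble with the updated pattern, here is an alternative sketch that avoids one large regex. It is not the approach used above: it splits the URL on '/' and returns the first path segment in which every comma-separated part looks like key_value. The helper name extractTransformations, the set of characters allowed in a value (letters, digits, '_', '.', ':' and '-'), and the rule that the last segment is always the file name are all assumptions drawn from the example strings in this thread.

```javascript
// Hypothetical helper (not from the original answer): returns the
// comma-separated key_value tokens from the first path segment that
// consists only of such tokens.
function extractTransformations(url) {
  // Assumed token shape: lowercase key, underscore, then a value made of
  // lowercase letters, digits, and the punctuation seen in the examples.
  const token = /^[a-z]+_[a-z0-9_.:-]+$/;

  // Assumption: the last segment is the file name, so it is never checked.
  // This keeps a name such as name_like_this.jpg from being returned.
  const segments = url.split('/').slice(0, -1);

  for (const segment of segments) {
    if (segment === '') continue; // e.g. the empty piece inside "https://"
    const parts = segment.split(',');
    if (parts.every((part) => token.test(part))) {
      return parts;
    }
  }
  return [];
}

const longUrl = '/https://upload.wikimedia.org/wikipedia/w_100,h_100,fl_progressive,dpr_2.0,ar_3:4,quality_auto:good,effect_auto_brightness,effect_auto_color:50,effect_green:-30/commons/f/f9/Phoenicopterus_ruber_in_São_Paulo_Zoo.jpg';

console.log(extractTransformations(longUrl));
// [ 'w_100', 'h_100', 'fl_progressive', 'dpr_2.0', 'ar_3:4',
//   'quality_auto:good', 'effect_auto_brightness',
//   'effect_auto_color:50', 'effect_green:-30' ]
```

Checking whole segments rather than scanning the full string is what keeps pieces such as hoenicopterus_ruber out of the result. If real URLs can carry other characters in a value, the token pattern would need to be widened.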
AZ_
  • see "hoenicopterus_ruber" is also there. thanks for the answer. I'll sort it out. – Keval Feb 12 '20 at 09:12
  • what's the rule behind including `fl_progressive` but not `hoenicopterus_ruber`? – AZ_ Feb 12 '20 at 09:14
  • fl_progressive is a query parameter in the URL string, so I want to extract it. By the way, thank you. Now it's working fine! – Keval Feb 13 '20 at 06:13
  • If I change the string to https://upload.wikimedia.org/wikipedia/w_100,h_100,fl_progressive,dpr_2.0,ar_3:4,quality_auto:good,effect_auto_brightness,effect_auto_color:50,effect_green:-30/commons/f/f9/Phoenicopterus_ruber_in_São_Paulo_Zoo.jpg, can you please change the regex to match it? – Keval Feb 14 '20 at 05:52
  • The updated regex is not working for the example above. Can you please check and fix it? – Keval Feb 25 '20 at 09:55

The reason each match is only three characters long is that square brackets '[]' match a single character from a set of possible characters. So your regex currently looks for one letter in a-z, followed by an underscore, followed by exactly one more character in 0-9 or a-z.

To match multiple characters, you can add a plus '+' quantifier, which says that the piece of regex it follows should match one or more times.

So to get the full matches, you can use the following: /[a-z]+_[0-9a-z.]+/g

However, this will also match hoenicopterus_ruber which is in the URLs.
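To make the difference concrete, here is a small sketch that runs the original pattern and the '+' version against the second URL from the question. The variable names are illustrative only.

```javascript
const sampleUrl = 'fetch/https://upload.wikimedia.org/wikipedia/w_100,h_100,fl_progressive,dpr_2.0/commons/f/f9/Phoenicopterus_ruber_in_São_Paulo_Zoo.jpg';

// One character on each side of the underscore.
const single = sampleUrl.match(/[a-z]_[0-9a-z]/g);
console.log(single);
// [ 'w_1', 'h_1', 'l_p', 'r_2', 's_r', 'r_i' ]

// One or more characters on each side; '.' is allowed in the value.
const multi = sampleUrl.match(/[a-z]+_[0-9a-z.]+/g);
console.log(multi);
// [ 'w_100', 'h_100', 'fl_progressive', 'dpr_2.0', 'hoenicopterus_ruber' ]
```

The last entry of the second result comes from the file name: the capital P is skipped because the pattern only accepts lowercase letters, and the match stops at the next underscore.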

To get around this issue, you can use a positive lookbehind:

(?<=wikipedia\/|,)[a-z]+_[0-9a-z.]+

NOTE: I've made the assumption that the URLs you are parsing are all wikipedia ones. If they are not, then you will need another solution for the second issue.
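As a rough check, the lookbehind pattern can be run against both URLs from the question. The third URL below is made up for illustration (it has no parameters and an underscore-separated file name). Lookbehind needs an engine with ES2018 support, and match returns null when nothing is found, hence the fallback to an empty array.

```javascript
const testUrls = [
  'fetch/https://upload.wikimedia.org/wikipedia/w_100/commons/f/f9/Phoenicopterus_ruber_in_São_Paulo_Zoo.jpg',
  'fetch/https://upload.wikimedia.org/wikipedia/w_100,h_100,fl_progressive,dpr_2.0/commons/f/f9/Phoenicopterus_ruber_in_São_Paulo_Zoo.jpg',
  // Hypothetical URL, not from the question.
  'fetch/https://upload.wikimedia.org/wikipedia/commons/f/f9/name_like_this.jpg',
];

// A token only counts if it directly follows "wikipedia/" or a comma.
const pattern = /(?<=wikipedia\/|,)[a-z]+_[0-9a-z.]+/g;

const results = testUrls.map((u) => u.match(pattern) || []);
console.log(results);
// [ [ 'w_100' ],
//   [ 'w_100', 'h_100', 'fl_progressive', 'dpr_2.0' ],
//   [] ]
```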

Shahzad
  • It's just a format for my request URL; the file name could be anything like name.jpg, but if the image name is like name_like_this.jpg, it should not be matched by the regex. – Keval Feb 12 '20 at 09:07