Regex to detect that the URL doesn't end with an extension

Question

I'm using this regular expression to detect whether a URL ends with .jpg:

var exp = /(\b(https?|ftp|file):\/\/[-A-Z0-9+&@#\/%?=~_|!:,.;]*[-A-Z0-9+&@#\/%=~_|]*^\.jpg)/ig;

It detects a URL such as http://www.blabla.com/sdsd.jpg

But now I want to detect that the URL doesn't end with a .jpg extension. I tried this:

var exp = /(\b(https?|ftp|file):\/\/[-A-Z0-9+&@#\/%?=~_|!:,.;]*[-A-Z0-9+&@#\/%=~_|]*[^\.jpg]\b)/ig;

but it only matches http://www.blabla.com/sdsd

Then I used this:

var exp = /(\b(https?|ftp|file):\/\/[-A-Z0-9+&@#\/%?=~_|!:,.;]*[-A-Z0-9+&@#\/%=~_|]*[^\.jpg]$)/ig;

It works if the URL is alone, but it doesn't work if the text is, e.g.:

http://www.blabla.com/sdsd.jpg text

score 2 · Accepted Answer · edited May 23 '17 at 10:24

Try using a negative lookahead.

(?!\.jpg)

What you have now, [^\.jpg], is a character class: it matches any single character except a period or the letters j, p, or g. It does not exclude the sequence ".jpg".
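To make the difference concrete, here is a minimal sketch. The names `endsWithoutJpg` and `findNonJpgUrls` are made up for illustration; the URL pattern is the one from the question. It anchors the lookahead at the start of the string so that it inspects the whole ending, then applies that test to each URL found in a longer text.

```javascript
// A character class excludes single characters, not the sequence ".jpg".
var charClass = /[^\.jpg]$/i;

// Negative lookahead: succeeds only if the string does NOT end in ".jpg".
var notJpg = /^(?!.*\.jpg$).+$/i;

// Hypothetical helper: true when the URL does not end with .jpg.
function endsWithoutJpg(url) {
  return notJpg.test(url);
}

// URL pattern taken from the question (capture groups made non-capturing).
var urlRegex = /\b(?:https?|ftp|file):\/\/[-A-Z0-9+&@#\/%?=~_|!:,.;]*[-A-Z0-9+&@#\/%=~_|]/ig;

// Hypothetical helper: find every URL in the text, keep the non-.jpg ones.
function findNonJpgUrls(text) {
  return (text.match(urlRegex) || []).filter(endsWithoutJpg);
}

console.log(charClass.test("http://www.blabla.com/sdsd.png"));  // false, because "g" is excluded
console.log(endsWithoutJpg("http://www.blabla.com/sdsd.png"));  // true
console.log(endsWithoutJpg("http://www.blabla.com/sdsd.jpg"));  // false
console.log(findNonJpgUrls("http://www.blabla.com/sdsd.jpg text http://www.blabla.com/page"));
```

Finding the URLs first and testing the extension afterwards avoids having to express "not followed by .jpg" inside the URL pattern itself.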

EDIT Here's an answer using negative look ahead and file extensions.

Update

Knowing this is a "url finder" now, here's a better solution:

// parseUri 1.2.2
// (c) Steven Levithan <stevenlevithan.com>
// MIT License
// --- http://blog.stevenlevithan.com/archives/parseuri
function parseUri (str) {
    var    o   = parseUri.options,
        m   = o.parser[o.strictMode ? "strict" : "loose"].exec(str),
        uri = {},
        i   = 14;

    while (i--) uri[o.key[i]] = m[i] || "";

    uri[o.q.name] = {};
    uri[o.key[12]].replace(o.q.parser, function ($0, $1, $2) {
        if ($1) uri[o.q.name][$1] = $2;
    });

    return uri;
};
parseUri.options = {
    strictMode: false,
    key: ["source","protocol","authority","userInfo","user","password","host","port","relative","path","directory","file","query","anchor"],
    q:   {
        name:   "queryKey",
        parser: /(?:^|&)([^&=]*)=?([^&]*)/g
    },
    parser: {
        strict: /^(?:([^:\/?#]+):)?(?:\/\/((?:(([^:@]*)(?::([^:@]*))?)?@)?([^:\/?#]*)(?::(\d*))?))?((((?:[^?#\/]*\/)*)([^?#]*))(?:\?([^#]*))?(?:#(.*))?)/,
        loose:  /^(?:(?![^:@]+:[^:@\/]*@)([^:\/?#.]+):)?(?:\/\/)?((?:(([^:@]*)(?::([^:@]*))?)?@)?([^:\/?#]*)(?::(\d*))?)(((\/(?:[^?#](?![^?#\/]*\.[^?#\/.]+(?:[?#]|$)))*\/?)?([^?#\/]*))(?:\?([^#]*))?(?:#(.*))?)/
    }
};//end parseUri

function convertUrls(element){
    var urlRegex = /(\b(https?|ftp|file):\/\/[-A-Z0-9+&@#\/%?=~_|!:,.;]*[-A-Z0-9+&@#\/%=~_|])/ig
    element.innerHTML = element.innerHTML.replace(urlRegex,function(url){
        if (parseUri(url).file.match(/\.(jpg|png|gif|bmp)$/i))
            return '<img src="'+url+'" alt="'+url+'" />';
        return '<a href="'+url+'">'+url+'</a>';
    });
}

I used a parseUri method and a slightly different RegEx for detecting the links. Between the two, you can go through and replace the links within an element with either a link or the image equivalent.

Note that my version checks most image types using /\.(jpg|png|gif|bmp)$/i; however, this can be altered to explicitly capture only jpg using /\.jpg$/i. A demo can be found here.

The usage should be pretty straightforward: pass the function the HTML element you want parsed. You can obtain it using any number of JavaScript methods (getElementById, getElementsByTagName, ...). Hand it off to this function, and it will take care of the rest.

You can also alter it and add it to the String prototype so it can be called natively. This version could be written like so:

String.prototype.convertUrls = function(){
    var urlRegex = /(\b(https?|ftp|file):\/\/[-A-Z0-9+&@#\/%?=~_|!:,.;]*[-A-Z0-9+&@#\/%=~_|])/ig
    return this.replace(urlRegex,function(url){
        if (parseUri(url).file.match(/\.(jpg|png|gif|bmp)$/i))
            return '<img src="'+url+'" alt="'+url+'" />';
        return '<a href="'+url+'">'+url+'</a>';
    });
}
function convertUrls(element){
    element.innerHTML = element.innerHTML.convertUrls();
}

(Note the logic has moved to the prototype function and the element function just calls the new string extension)

This working revision can be found here

This won't help if the URL has a fragment or a query or if any of the characters in the extension is percent encoded. — Mike Samuel, Mar 01 '11 at 15:18
@MikeSamuel: I don't see their regex doing the same. I was unaware this had to be a bullet-proof solution. — Brad Christie, Mar 01 '11 at 15:20
@Brad, not being able to handle fragments seems a rather serious shortcoming. That's not just not bulletproof, but fragile. — Mike Samuel, Mar 01 '11 at 15:25
@MikeSamuel: Unless you know the OP's intent and where they are using the solution, this may or may not be necessary and (at this point) is speculative. Either way, the solution is to use a negative look-ahead, "proper url formatting" aside. — Brad Christie, Mar 01 '11 at 15:29
(?!\.jpg)....$ works, but if I have "http://www.blabla.com/sdsd.jpg text" it doesn't detect it. (It's part of a script that replaces URLs with links and image URLs with images.) — Daniel Flores, Mar 01 '11 at 15:34
@Brad, agreed mostly and if you were answering a question in email I would agree, but this question is on a public forum so the asker is not the only one with the question. I don't think negative look-ahead is the solution. I think the solution is to invert the test and to avoid ad-hoc URL parsing using regular expressions. — Mike Samuel, Mar 01 '11 at 15:50
@devnieL: Using a function i found for URI parsing, here's what I've come up with: http://jsfiddle.net/bradchristie/nVFQh/ — Brad Christie, Mar 01 '11 at 16:08

score 0 · Answer 2 · answered Oct 01 '14 at 09:18

Generally, you can check for all the extensions with something like this (for pictures):

([^\s]+(\.(?i)(jpg|jpeg|png|gif|bmp))$)
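The pattern above uses the inline modifier (?i), which, as far as I know, a JavaScript regex literal does not accept. A rough JavaScript equivalent moves case-insensitivity to the `i` flag; the variable name `imageExt` is only for illustration.

```javascript
// Same idea as the pattern above, with the (?i) modifier replaced by the "i" flag.
var imageExt = /\S+\.(?:jpg|jpeg|png|gif|bmp)$/i;

console.log(imageExt.test("http://www.blabla.com/sdsd.JPG"));      // true
console.log(imageExt.test("http://www.blabla.com/sdsd"));          // false
console.log(imageExt.test("http://www.blabla.com/sdsd.jpg text")); // false, the string does not end with the extension
```

Because of the `$` anchor, this only works on a string that is exactly one URL, not on a URL embedded in surrounding text.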

user1079877

score 0 · Answer 3 · answered Mar 01 '11 at 15:22

Define the URL regex from the RFC 3986 appendix:

function hasJpgExtension(myUrl) {
  var urlRegex = /^(([^:\/?#]+):)?(\/\/([^\/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?/;
  var match = myUrl.match(urlRegex);
  if (!match) { return false; }

Whitelist the protocol:

  if (!/^https?$/i.test(match[2])) { return false; }

Grab the path portion so that you can filter out the query and the fragment.

  var path = match[5];

Decode it to normalize any %-encoded characters in the path.

  path = decodeURIComponent(path);

And finally, check that it ends with the appropriate extension:

  return /\.jpg$/i.test(path);
}
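Put together in one piece, the steps above might look like the sketch below. Two details are my additions rather than part of the answer: the `$` in the scheme test, so that only exactly http or https passes, and the try/catch, on the assumption that a path with malformed %-encoding should count as "not a jpg".

```javascript
function hasJpgExtension(myUrl) {
  // Generic URI-splitting regex from RFC 3986, Appendix B.
  var urlRegex = /^(([^:\/?#]+):)?(\/\/([^\/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?/;
  var match = myUrl.match(urlRegex);
  if (!match) { return false; }

  // Whitelist the scheme (group 2); it is undefined when the URL has none.
  if (!/^https?$/i.test(match[2] || "")) { return false; }

  // Group 5 is the path, without the query or the fragment.
  var path = match[5];
  try {
    path = decodeURIComponent(path); // normalize %-encoded characters
  } catch (e) {
    return false; // assumption: malformed encoding is treated as no match
  }
  return /\.jpg$/i.test(path);
}

console.log(hasJpgExtension("http://www.blabla.com/sdsd.jpg?x=1#top")); // true
console.log(hasJpgExtension("http://www.blabla.com/sdsd%2Ejpg"));       // true
console.log(hasJpgExtension("http://www.blabla.com/sdsd.png"));         // false
console.log(hasJpgExtension("ftp://www.blabla.com/sdsd.jpg"));          // false
```

To answer the original question ("does not end with .jpg"), negate the result: `!hasJpgExtension(url)`.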

score 0 · Answer 4 · answered Mar 01 '11 at 19:33

This is a simpler solution based on @Brad's post that doesn't need the parseUri function:

function convertUrls(text){
    var urlRegex = /((\b(https?|ftp|file):\/\/|www)[-A-Z0-9+&@#\/%?=~_|!:,.;]*[-A-Z0-9+&@#\/%=~_|])/ig;
    var result = text.replace(urlRegex,function(url){
        if (url.match(/\.(jpg|png|gif|bmp)$/i))
            return '<img width="185" src="'+url+'" alt="'+url+'" />';
        else if(url.match(/^(www)/i))
            return '<a href="http://'+url+'">'+url+'</a>';
        return '<a href="'+url+'">'+url+'</a>';
    });

    return result;
}

The same result:

http://jsfiddle.net/dnielF/CC9Va/

I don't know if this is the best solution, but it works for me :D Thanks!

Indeed, you don't need to. And I could have just extracted the regex, but I felt more than a little obligated to give the original author credit. Also, I don't think yours will detect images in a link with GET variables (`foo.jpg?bar=foobar` for instance, case in point: [see this](http://jsfiddle.net/bradchristie/CC9Va/1/)), but if it works for you then so be it. ;-) — Brad Christie, Mar 01 '11 at 21:04
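One way to cover the query-string case without parseUri is to cut the URL at the first `?` or `#` before testing the extension. This is only a sketch; `isImageUrl` is a hypothetical helper name, and it does not handle %-encoded extensions.

```javascript
// Hypothetical helper: test the extension on the part before any query or fragment.
function isImageUrl(url) {
  var pathOnly = url.split(/[?#]/)[0];
  return /\.(jpg|png|gif|bmp)$/i.test(pathOnly);
}

console.log(isImageUrl("http://example.com/foo.jpg?bar=foobar")); // true
console.log(isImageUrl("http://example.com/foo.jpg#top"));        // true
console.log(isImageUrl("http://example.com/page?file=foo.jpg"));  // false
```

Inside convertUrls, `url.match(/\.(jpg|png|gif|bmp)$/i)` could then be swapped for `isImageUrl(url)`.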
