Friday, April 8, 2011

A regex that validates a web address and matches an empty string?

The current expression validates a web address (HTTP). How do I change it so that an empty string also matches?

(http|https):\/\/[\w\-_]+(\.[\w\-_]+)+([\w\-\.,@?^=%&:/~\+#]*[\w\-\@?^=%&/~\+#])?
From stackoverflow
  • Put the whole expression in parentheses and mark it as optional (the “?” quantifier: zero or one repetition):

    ((http|https):\/\/[\w\-_]+(\.[\w\-_]+)+([\w\-\.,@?^=%&:/~\+#]*[\w\-\@?^=%&/~\+#])?)?
    
    Peter Morris : Down voted because the suggested expression returns True for IsMatch("asd");
    Gumbo : Your expression didn’t consider this either.
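The objection in the comments above can be reproduced with a short sketch. It assumes Python's `re` module and a search-style match (one that may start anywhere in the input, which is how the `IsMatch` call in the comment appears to behave). The helper `is_match` is a hypothetical name introduced only for illustration.

```python
import re

# The question's pattern wrapped in one optional group, with no anchors.
OPTIONAL_URL = re.compile(
    r"((http|https):\/\/[\w\-_]+(\.[\w\-_]+)+"
    r"([\w\-\.,@?^=%&:/~\+#]*[\w\-\@?^=%&/~\+#])?)?"
)


def is_match(text: str) -> bool:
    """Hypothetical helper: True if the pattern matches anywhere in text."""
    return OPTIONAL_URL.search(text) is not None


match = OPTIONAL_URL.search("asd")
print(is_match("asd"))        # True
print(repr(match.group(0)))   # ''
print(match.span())           # (0, 0)
```

The match succeeds because the optional group is allowed to match nothing at position 0, so the empty match counts as a hit for any input.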
  • Use Expr?, where Expr is your URL matcher (grouped in parentheses so that the ? applies to all of it), just as I would write https? to cover both http and https. The ? is known as a quantifier; you can look it up. From Wikipedia:

    ? The question mark indicates there is zero or one of the preceding element.
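As a small illustration of the quantifier, the sketch below (Python's `re` module, chosen here only as an example language) checks that `https?` accepts both schemes, because the `?` applies to the single preceding element `s`.

```python
import re

# "s?" means zero or one "s", so both "http" and "https" are accepted.
SCHEME = re.compile(r"https?")

for candidate in ("http", "https", "htt", "httpss"):
    # fullmatch requires the whole candidate to match the pattern.
    print(candidate, SCHEME.fullmatch(candidate) is not None)
```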

  • If you want to modify the expression to match either an entirely empty string or a full URL, you will need to use the anchor metacharacters ^ and $ (which match the beginning and end of a line respectively).

    ^(|https?:\/\/[\w\-_]+(\.[\w\-_]+)+([\w\-\.,@?^=%&:/~\+#]*[\w\-\@?^=%&/~\+#])?)$
    

    As dirkgently pointed out, you can simplify your match for the protocol a little, so I've included that for you too.

    Though, if you are using this expression from within a program or script, it may be simpler for you to use the language's own means of checking if the input is empty.

    // in no particular language...
    if input.length > 0 then
        if input matches <regex> then
            input is a URL
        else
            input is invalid
    else
        input is empty
    
    Peter Morris : Accepted as the answer because you were the only person to mention that the ^ and $ are required; without them, simply adding the ? made any input match. Thanks!
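Both suggestions from the accepted answer can be tried side by side in a minimal sketch, again assuming Python's `re` module. `URL_OR_EMPTY` is the anchored pattern from the answer; `URL_ONLY` and `classify` are hypothetical names introduced here to mirror the pseudocode, not part of the original thread.

```python
import re

# Anchored pattern from the answer: an empty string or a full URL.
URL_OR_EMPTY = re.compile(
    r"^(|https?:\/\/[\w\-_]+(\.[\w\-_]+)+"
    r"([\w\-\.,@?^=%&:/~\+#]*[\w\-\@?^=%&/~\+#])?)$"
)

# Assumed variant without the empty alternative, for the explicit check.
URL_ONLY = re.compile(
    r"^https?:\/\/[\w\-_]+(\.[\w\-_]+)+"
    r"([\w\-\.,@?^=%&:/~\+#]*[\w\-\@?^=%&/~\+#])?$"
)


def classify(text: str) -> str:
    """Test for emptiness in code first, then apply the URL pattern."""
    if len(text) > 0:
        if URL_ONLY.search(text):
            return "url"
        return "invalid"
    return "empty"


for sample in ("", "http://example.com", "asd"):
    print(repr(sample), URL_OR_EMPTY.search(sample) is not None, classify(sample))
```

With the anchors in place, "asd" no longer matches, while the empty string and a complete URL both do. The explicit length check gives the same accept/reject outcome and additionally tells the caller which of the three cases it saw.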
