Simple Effective PHP REGEX URL Validate
How about this one, which I made yesterday:
THE CODE - one liner
$urlregex = "^(https?|ftp)\:\/\/([a-z0-9+!*(),;?&=\$_.-]+(\:[a-z0-9+!*(),;?&=\$_.-]+)?@)?[a-z0-9+\$_-]+(\.[a-z0-9+\$_-]+)*(\:[0-9]{2,5})?(\/([a-z0-9+\$_-]\.?)+)*\/?(\?[a-z+&\$_.-][a-z0-9;:@/&%=+\$_.-]*)?(#[a-z_.-][a-z0-9+\$_.-]*)?\$";
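// eregi() was removed in PHP 7.0; preg_match() with the i (case-insensitive) flag replaces it.
// "~" is the delimiter because it never occurs in the pattern; D makes $ match only at the very end.
if (preg_match("~" . $urlregex . "~iD", $url)) {echo "good";} else {echo "bad";}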
(OPTIONAL: READ BELOW FOR EXPLANATION)
It validates all of these kinds of URLs:
// valid urls
$url = "https://user:pass@www.somewhere.com:8080/login.php?do=login&style=%23#pagetop";
$url = "http://user@www.somewhere.com/#pagetop";
$url = "https://somewhere.com/index.html";
$url = "ftp://user:****@somewhere.com:21/";
$url = "http://somewhere.com/index.html/"; //this is valid!!
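A minimal sketch for checking the list above, assuming PHP 7 or later (where eregi() no longer exists). The helper name is_valid_url and the "~" delimiter are my own choices for illustration, not part of the original code; the pattern itself is the one-liner, unchanged.

```php
<?php
// Hypothetical helper (the name is illustrative): wraps the one-liner for preg_match().
function is_valid_url(string $url): bool
{
    $urlregex = "^(https?|ftp)\:\/\/([a-z0-9+!*(),;?&=\$_.-]+(\:[a-z0-9+!*(),;?&=\$_.-]+)?@)?[a-z0-9+\$_-]+(\.[a-z0-9+\$_-]+)*(\:[0-9]{2,5})?(\/([a-z0-9+\$_-]\.?)+)*\/?(\?[a-z+&\$_.-][a-z0-9;:@/&%=+\$_.-]*)?(#[a-z_.-][a-z0-9+\$_.-]*)?\$";
    // "~" never occurs in the pattern, so it is safe as a delimiter.
    // i = case-insensitive (like eregi), D = "$" matches only at the very end.
    return preg_match("~" . $urlregex . "~iD", $url) === 1;
}

// The URLs listed above, copied as-is.
$valid = [
    "https://user:pass@www.somewhere.com:8080/login.php?do=login&style=%23#pagetop",
    "http://user@www.somewhere.com/#pagetop",
    "https://somewhere.com/index.html",
    "ftp://user:****@somewhere.com:21/",
    "http://somewhere.com/index.html/",
];

foreach ($valid as $url) {
    echo $url, " => ", is_valid_url($url) ? "good" : "bad", "\n";
}
```

Each of the five URLs should come out as "good".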
THE CODE - broken into sections for easy editing and understanding:
// SCHEME
$urlregex = "^(https?|ftp)\:\/\/";
// USER AND PASS (optional)
$urlregex .= "([a-z0-9+!*(),;?&=\$_.-]+(\:[a-z0-9+!*(),;?&=\$_.-]+)?@)?";
// HOSTNAME OR IP
$urlregex .= "[a-z0-9+\$_-]+(\.[a-z0-9+\$_-]+)*"; // http://x = allowed (ex. http://localhost, http://routerlogin)
//$urlregex .= "[a-z0-9+\$_-]+(\.[a-z0-9+\$_-]+)+"; // http://x.x = minimum
//$urlregex .= "([a-z0-9+\$_-]+\.)+[a-z0-9+\$_-]{2,3}"; // http://x.xx(x) = minimum (needs + rather than * so that at least one dot is required)
//use only one of the above
// PORT (optional)
$urlregex .= "(\:[0-9]{2,5})?";
// PATH (optional)
$urlregex .= "(\/([a-z0-9+\$_-]\.?)+)*\/?";
// GET Query (optional)
$urlregex .= "(\?[a-z+&\$_.-][a-z0-9;:@/&%=+\$_.-]*)?";
// ANCHOR (optional)
$urlregex .= "(#[a-z_.-][a-z0-9+\$_.-]*)?\$";
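// check
// eregi() was removed in PHP 7.0; preg_match() with the i (case-insensitive) flag replaces it.
// "~" is the delimiter because it never occurs in the pattern; D makes $ match only at the very end.
if (preg_match("~" . $urlregex . "~iD", $url)) {echo "good";} else {echo "bad";}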
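To see what the choice of hostname line changes in practice, here is a small sketch comparing the first two variants. It assumes PHP 7+ and preg_match(); the helper name url_regex_with_host is made up for this example, and the other pieces are copied from the sections above.

```php
<?php
// Hypothetical helper: builds the full pattern around a given hostname piece.
function url_regex_with_host(string $host): string
{
    return "~^(https?|ftp)\:\/\/"
        . "([a-z0-9+!*(),;?&=\$_.-]+(\:[a-z0-9+!*(),;?&=\$_.-]+)?@)?"
        . $host
        . "(\:[0-9]{2,5})?"
        . "(\/([a-z0-9+\$_-]\.?)+)*\/?"
        . "(\?[a-z+&\$_.-][a-z0-9;:@/&%=+\$_.-]*)?"
        . "(#[a-z_.-][a-z0-9+\$_.-]*)?\$~iD";
}

$loose  = url_regex_with_host("[a-z0-9+\$_-]+(\.[a-z0-9+\$_-]+)*"); // http://x allowed
$dotted = url_regex_with_host("[a-z0-9+\$_-]+(\.[a-z0-9+\$_-]+)+"); // http://x.x minimum

foreach (["http://localhost", "http://somewhere.com"] as $url) {
    echo $url,
        ": loose=", preg_match($loose, $url),
        " dotted=", preg_match($dotted, $url), "\n";
}
```

With the first variant both URLs match; with the second, http://localhost is rejected because the hostname has no dot.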
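Any line in the code above except the hostname can be removed if you don't want to allow that URL segment (for example, to disallow GET queries, just comment out the respective $urlregex .= ....), but do not reorder them. One caveat: the scheme line carries the leading ^ and the anchor line carries the trailing $, so if you remove either of those lines, move the ^ or $ onto the new first or last piece, otherwise the pattern is no longer anchored at that end.
"(optional)" means that the part MAY be present, but the URL is still valid without it (see the valid URLs above).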
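Dropping segments can also be done in code instead of by commenting lines out. This is only a sketch under my own assumptions: build_url_regex and its $skip parameter are hypothetical names, the pieces are copied from the sections above, and the ^ and $ are added around the result so they survive whichever pieces are left out.

```php
<?php
// Hypothetical helper: assembles the pattern from the pieces in their fixed
// order, leaving out the segments named in $skip. The hostname is always kept.
function build_url_regex(array $skip = []): string
{
    $parts = [
        'scheme'   => "(https?|ftp)\:\/\/",
        'userpass' => "([a-z0-9+!*(),;?&=\$_.-]+(\:[a-z0-9+!*(),;?&=\$_.-]+)?@)?",
        'host'     => "[a-z0-9+\$_-]+(\.[a-z0-9+\$_-]+)*",
        'port'     => "(\:[0-9]{2,5})?",
        'path'     => "(\/([a-z0-9+\$_-]\.?)+)*\/?",
        'query'    => "(\?[a-z+&\$_.-][a-z0-9;:@/&%=+\$_.-]*)?",
        'anchor'   => "(#[a-z_.-][a-z0-9+\$_.-]*)?",
    ];

    $regex = "";
    foreach ($parts as $name => $piece) {
        if ($name === 'host' || !in_array($name, $skip, true)) {
            $regex .= $piece;
        }
    }

    // Delimiter "~", anchors, and flags: i = case-insensitive, D = strict "$".
    return "~^" . $regex . "\$~iD";
}

$noQuery = build_url_regex(['query']);
var_dump(preg_match($noQuery, "http://somewhere.com/index.html"));     // int(1)
var_dump(preg_match($noQuery, "http://somewhere.com/index.html?a=b")); // int(0)
```

Keeping the pieces in one ordered array makes it hard to reorder them by accident.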
syntax:
<http[s]|ftp> :// [user[:pass]@] hostname [:port] [/path] [?getquery] [#anchor]
-taking into account allowed safe characters
-assuming .. (dot dot) is never allowed in hostname or path
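The "dot dot" rule can be checked with a few inputs. A quick sketch, again assuming PHP 7+ with preg_match() and a "~" delimiter (both my additions); the test URLs are made up for illustration.

```php
<?php
// The one-liner from the top, unchanged.
$pattern = "^(https?|ftp)\:\/\/([a-z0-9+!*(),;?&=\$_.-]+(\:[a-z0-9+!*(),;?&=\$_.-]+)?@)?[a-z0-9+\$_-]+(\.[a-z0-9+\$_-]+)*(\:[0-9]{2,5})?(\/([a-z0-9+\$_-]\.?)+)*\/?(\?[a-z+&\$_.-][a-z0-9;:@/&%=+\$_.-]*)?(#[a-z_.-][a-z0-9+\$_.-]*)?\$";
$urlregex = "~" . $pattern . "~iD";

// Illustrative inputs: one clean URL, then ".." in the hostname and in the path.
$tests = [
    "http://somewhere.com/a/b",
    "http://somewhere..com/",
    "http://somewhere.com/a/../b",
    "http://somewhere.com/file..txt",
];

foreach ($tests as $url) {
    echo $url, " => ", preg_match($urlregex, $url) ? "good" : "bad", "\n";
}
```

Only the first URL passes. The hostname part allows a single dot between labels, and the path part allows at most one dot after each character, so two dots in a row never match.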
FEEDBACK IS APPRECIATED