I launched Phillymetal.com years ago. Its features and layout were extremely basic, a consequence of both time (I wanted it launched fast) and ability. The most common request was to fix the lack of clickable links. If someone posted a link, you’d have to copy and paste it into your browser. What a drag. This past weekend, I decided that enough was enough and set out to fix it.

I based my goal on the way Facebook handles URLs. I didn’t want to have to put in any goofy HTML or mock-HTML similar to what you’d find in UBB; I wanted it to just see a URL and know what to do with it. Like Facebook, I decided to have it look for URLs that start with http://, https://, or www. Anything else won’t match. Easier said than done! Here’s how it all worked out.

Define a regular expression that matches anything starting with http://, https://, or www, followed by the rest of the URL.

Use preg_replace to insert “http://” before everything it finds, even the strings that already have the protocol handler. We have to do this to catch the ones that just start with www, even though it adds duplicates.

Create a function that strips out the duplicates and then use preg_replace_callback to call that function.
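To make the duplicate problem concrete, here is a small trace of the second and third steps on a made-up input. It is only a sketch: the sample text, the simplified pattern, and the variable names are assumptions for illustration, not the code the site actually runs.

```php
<?php
// Hypothetical sample input: one bare www link and one that already has a protocol.
$text = 'www.example.com and https://example.org/page';

// Simplified pattern (an assumption): a required prefix, a host label, a dot, the rest.
$pattern = '{(https?://|www\.)[a-zA-Z0-9-]+\.[a-zA-Z0-9/?=#._-]*[a-zA-Z0-9/=#]}';

// Step 2: prefix every match with http://, which doubles up the second link.
$step2 = preg_replace($pattern, 'http://$0', $text);

// Step 3: collapse the doubled protocol, keeping the one the user typed.
$step3 = preg_replace_callback(
    '{http://(https?://)}',
    function ($m) {
        return $m[1]; // $m[1] is the protocol that was originally in the text
    },
    $step2
);

echo $step2, "\n"; // http://www.example.com and http://https://example.org/page
echo $step3, "\n"; // http://www.example.com and https://example.org/page
```

The www link picks up its missing protocol in step 2, and step 3 only touches the link that ended up with two.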

preg_replace_callback was the saving grace here. I’m pretty sure I could be doing things a bit more cleanly but I’m new to these functions and this gets the job done very quickly. It’s worked every time, so far. Code is below.

Can you help me improve this? Please, tell me what I did wrong! If you’re curious, I use { and } as my delimiters when defining the regular expressions because I needed characters that wouldn’t appear inside of URLs. There’s probably a better way to do that…
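On the delimiter question, PHP's PCRE functions accept any non-alphanumeric, non-backslash, non-whitespace character as a delimiter, including bracket pairs like { and }. The delimiter only needs escaping when it shows up inside the pattern itself, so the choice mostly comes down to readability. A quick sketch with a made-up subject string:

```php
<?php
// Hypothetical subject string, used only to show that all three forms match.
$subject = 'see https://example.com';

$withBraces = preg_match('{https?://}', $subject);   // bracket-style delimiters
$withTilde  = preg_match('~https?://~', $subject);   // no escaping needed either
$withSlash  = preg_match('/https?:\/\//', $subject); // slashes must be escaped here

var_dump($withBraces, $withTilde, $withSlash); // int(1) three times
```

Braces and tildes both avoid the backslash clutter that the usual / delimiter causes in URL patterns.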

function cleanURL($matches)
{
    // preg_replace_callback passes an array of matches; $matches[0] is the doubled protocol
    $url = preg_replace('{http://http://}', 'http://', $matches[0]);
    $url = preg_replace('{http://https://}', 'https://', $url);

    return $url;
}

function urlify($text)
{
    //Define the regex: must start with http://, https://, or www. (gotta be one of them)
    $regexURL = '{((https?://)|(www\.))([a-zA-Z0-9-])+\.([a-zA-Z0-9\/?=#!-._])+([a-zA-Z0-9\/=#])}';

    if (preg_match($regexURL, $text))
    {
        $foundURL = preg_replace($regexURL, '<a href="http://$0">$0</a>', $text); //wrap it up
        $handler = '{(https?://https?://)}'; //matches a doubled protocol handler
        $foundURL = preg_replace_callback($handler, 'cleanURL', $foundURL);
    }
    else
    {
        $foundURL = $text;
    }

    return $foundURL;
}
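
One possible cleanup is to let the callback see each URL as it is matched, so the protocol is only added when it is missing and no duplicates ever need stripping. This is a sketch rather than a drop-in replacement: linkify() is a hypothetical name, the pattern is a simplified variant, and it assumes the incoming text has not already been HTML-escaped.

```php
<?php
// Hypothetical single-pass alternative; linkify() is not part of the original code.
function linkify($text)
{
    // Required prefix (captured as group 1), a host label, a dot, then the rest of the URL.
    $pattern = '{(https?://|www\.)[a-zA-Z0-9-]+\.[a-zA-Z0-9/?=&%#!._-]*[a-zA-Z0-9/=#]}';

    return preg_replace_callback($pattern, function ($m) {
        // $m[0] is the whole URL; add a protocol only when the match began with www.
        $href = ($m[1] === 'www.') ? 'http://' . $m[0] : $m[0];

        // Escape both pieces before embedding them in HTML (assumes raw, unescaped input).
        $safeHref = htmlspecialchars($href, ENT_QUOTES, 'UTF-8');
        $safeText = htmlspecialchars($m[0], ENT_QUOTES, 'UTF-8');

        return '<a href="' . $safeHref . '">' . $safeText . '</a>';
    }, $text);
}

echo linkify('Check www.example.com or https://example.org/a?b=1');
```

Text with no URLs comes back unchanged, so the preg_match guard is not needed in this version.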