And with this, I declare the blog fully operational.

January 28th, 2008 (last updated February 22nd, 2008)

Alright! No doubt I'll keep making updates from time to time, but I think I have a nearly fully functional blog now, complete with the ability for users to submit comments! It's pretty straightforward, but I'm going to break down some of the work I did, just for the hell of it. In the meantime, I invite you all to leave your thoughts. Some basic notes: you're welcome to put a link to your website in the website field, but both it and the [url] BBCode will carry a rel="nofollow" attribute, which tells search engines not to count the link. Hopefully this will discourage anyone from posting just to improve their PageRank, and will help fight spam. Also, your email address will never be displayed to the general public, though I may use it to contact you if I feel your comment is worthy of a follow-up.

To go into more depth about how I actually built this: it was pretty straightforward for the most part, and the most interesting thing (for me) was building a BBCode interpreter. My interpreter is very basic. It only allows [url], [url=], [b], [code], and [codeBlock] (code being inline, and codeBlock being its own block), and it does not handle nesting, meaning you cannot wrap tags inside other tags. Of course, you shouldn't really need to with this very limited set of tags, but it's a limitation nonetheless.

I know I should have just gone out and found a BBCode parser, but I've always preferred writing my own code, so I spent a few hours scraping my own parser together. I'm really not very pleased with it, even though I believe it does work properly and fail gracefully. It's a pretty ugly piece of code too, so no doubt I'll upgrade in the future, but in the meantime, here it is. Don't worry, I'll walk you through it:

// Processes a string for BBCode markup.
// Allowed markup: [url], [url=], [b], [code], [codeBlock]
function process_markup($string) {
    // Split into separate segments; tagged segments start with '['
    $array = split_string_bb($string);
    $newArray = array();
    foreach($array as $line) {
        if(substr($line, 0, 1) == '[') {
            // A complete, valid tag pair: convert it to HTML
            $newArray[] = process_line_bb($line);
        }
        else {
            // Plain text (or an escaped bracket): pass through untouched
            $newArray[] = $line;
        }
    }
    return implode($newArray);
}

This is the main function, which takes the input, parses it, and returns the new string. It splits the string into segments: any time it finds a valid BBCode tag, it puts that tag and its contents in its own index in the array. For instance, $array[2] = '[b]How are you doing?[/b]';. From there, it iterates through the array and parses the BBCode in any index that has it. Any index that doesn't start with [ will not have any BBCode in it.

// Returns an array of strings. Only strings starting with [ should be processed.
function split_string_bb($string) {
    $firstBrac = strpos($string, '[');
    if($firstBrac === false) return array($string);

    // Everything before the first bracket is plain text
    $array[] = substr($string, 0, $firstBrac);
    $string = substr($string, $firstBrac);

    // Read the tag name between '[' and ']'
    $end = strpos($string, ']');
    $tag = substr($string, 1, $end-1);
    if(substr($tag, 0, 4) == 'url=') $tag = 'url';

    if($tag == 'url' || $tag == 'b' || $tag == 'code' || $tag == 'codeBlock') {
        $endTag = '[/'.$tag.']';
        $endTagPos = strpos($string, $endTag);
        if($endTagPos !== false) {
            // Matching close tag found: keep the whole pair as one segment
            $endTagEnd = $endTagPos + strlen($endTag);
            $array[] = substr($string, 0, $endTagEnd);
            $string = substr($string, $endTagEnd);
            $array = array_merge($array, split_string_bb($string));
        }
        else {
            // No close tag: escape the bracket as its HTML entity so it is never parsed
            $nextBrac = strpos(substr($string, 1), '[');
            $array[] = '&#91;' . substr($string, 1, $nextBrac);
            $array = array_merge($array, split_string_bb(substr($string, $nextBrac+1)));
        }
    }
    else {
        // Not an allowed tag: escape the bracket and keep scanning the rest
        $string = '&#91;' . substr($string, 1);
        $array = array_merge($array, split_string_bb($string));
    }
    return $array;
}

This function works through the string recursively, finding each BBCode tag in the string and putting it into its own index in an array, then appending the result of the same function applied to the remainder of the string. Text which is not inside BBCode, or which has invalid code such as [coded]me[/code], will be treated in a moderately haphazard way, but by the end of execution the [ in front of coded will have been converted to its HTML entity equivalent so that it will not be parsed. This is probably the messiest method of the set, with lots of obtuse and inelegant code. But hey, it gets the job done.
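
For illustration, the same kind of segmentation can be sketched with a regular expression instead of recursion. This is not the function above: split_segments is a made-up name, and unlike split_string_bb it leaves unmatched or invalid brackets alone rather than escaping them. It only shows what the resulting array of segments looks like.

```php
<?php
// Hypothetical regex-based splitter (an assumption, not the post's code).
// Each complete tag pair becomes its own segment; plain text sits in between.
function split_segments($text) {
    $tags = array('url', 'b', 'code', 'codeBlock');
    $alternatives = array();
    foreach ($tags as $tag) {
        $name = preg_quote($tag, '~');
        // [url] may also be written as [url=...]
        $open = ($tag === 'url') ? '\[url(?:=[^\]]*)?\]' : '\[' . $name . '\]';
        // Non-greedy match up to the first close tag, like strpos() above
        $alternatives[] = $open . '.*?\[/' . $name . '\]';
    }
    $pattern = '~(' . implode('|', $alternatives) . ')~s';
    // DELIM_CAPTURE keeps the matched tag pairs in the result
    return preg_split($pattern, $text, -1, PREG_SPLIT_DELIM_CAPTURE);
}

print_r(split_segments('Hi [b]there[/b], see [url=example.com]this[/url]!'));
// Segments: 'Hi ', '[b]there[/b]', ', see ', '[url=example.com]this[/url]', '!'
```

As in the original, only the segments that start with [ would then be handed on for conversion to HTML.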

// Parses the BBCode
function process_line_bb($string) {
    // Read the tag name between '[' and ']'
    $end = strpos($string, ']');
    $tag = substr($string, 1, $end-1);
    if(substr($tag, 0, 4) == 'url=') $tag = 'url=';

    switch($tag) {
        case 'url':
            // [url]address[/url]: the tag contents are the address itself
            $string = replaceFirst($string, '[url]', '');
            $string = replaceLast($string, '[/url]', '');
            if(isURL($string))
                $string = '<a href="'.$string.'" rel="nofollow">[link]</a>';
            elseif(isURL('http://'.$string))
                $string = '<a href="http://'.$string.'" rel="nofollow">[link]</a>';
            else
                $string = '[url]'.$string.'[/url]';
            break;
        case 'url=':
            // [url=address]text[/url]: pull out the address up to the first ']'
            $string2 = replaceFirst($string, '[url=', '');
            $parts = explode(']', $string2);
            $string2 = $parts[0];
            if(isURL($string2)) {
                $string = replaceFirst($string, '[url=', '<a href="');
                $string = replaceFirst($string, ']', '" rel="nofollow" >');
                $string = replaceLast($string, '[/url]', '</a>');
            }
            elseif(isURL('http://'.$string2)) {
                $string = replaceFirst($string, '[url=', '<a href="http://');
                $string = replaceFirst($string, ']', '" rel="nofollow" >');
                $string = replaceLast($string, '[/url]', '</a>');
            }
            break;
        case 'b':
            $string = replaceFirst($string, '[b]', '<strong>');
            $string = replaceLast($string, '[/b]', '</strong>');
            break;
        case 'code':
            $string = replaceFirst($string, '[code]', '<code>');
            $string = replaceLast($string, '[/code]', '</code>');
            break;
        case 'codeBlock':
            $string = replaceFirst($string, '[codeBlock]', '<code class="blockLevel">');
            $string = replaceLast($string, '[/codeBlock]', '</code>');
            break;
    }
    return $string;
}

This method is a bit long, but it does the job in a fairly straightforward way. It's passed a string of text which (theoretically) starts with a BBCode tag and ends with the close for that tag. From there it just replaces the BBCode with the equivalent HTML. The URL tags are more complex, since I tried to prevent invalid URLs from being converted to links.
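
The helpers replaceFirst, replaceLast, and isURL are called above but not listed in this post. The following is only a guess at what they might look like, so the code can be tried out: the argument order is taken from the calls above, and the URL check (PHP's filter_var plus an http/https scheme test) is an assumption, not necessarily the check used on this site.

```php
<?php
// Hypothetical stand-ins for helpers the post uses but does not show.

// Replace only the first occurrence of $search in $subject.
function replaceFirst($subject, $search, $replace) {
    $pos = strpos($subject, $search);
    if ($pos === false) return $subject;
    return substr_replace($subject, $replace, $pos, strlen($search));
}

// Replace only the last occurrence of $search in $subject.
function replaceLast($subject, $search, $replace) {
    $pos = strrpos($subject, $search);
    if ($pos === false) return $subject;
    return substr_replace($subject, $replace, $pos, strlen($search));
}

// Assumed rule: a syntactically valid URL with an http or https scheme.
function isURL($candidate) {
    if (filter_var($candidate, FILTER_VALIDATE_URL) === false) return false;
    $scheme = strtolower((string) parse_url($candidate, PHP_URL_SCHEME));
    return in_array($scheme, array('http', 'https'), true);
}

echo replaceFirst('[b]bold[/b]', '[b]', '<strong>'), "\n";
// prints: <strong>bold[/b]
```

A check like this only looks at the shape of the address, so anything placed into an href should still be escaped (for example with htmlspecialchars) before it is output.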

And that's about it! On the off chance anyone actually wants to use this code, please just link back to my site. If you have any suggestions, or know of any particularly good free BBCode parsers, let me know in the comments!


Guess what?

by Michael! on January 28th, 2008

First post!

What if I want to italicize?

by Eric on February 4th, 2008

You allow url, url=, b, code, and codeBlock.... but not <i> and <u>? Blasphemy!

Too Much Work

by Michael on February 6th, 2008

If I allowed more than one of the basic formatting tags, I'd have to implement a different (and I think more complex) parsing algorithm which would deal with tags stacked on top of each other. The current method only parses the top layer; therefore, if you tried [ url=something][ b]text[/b][/url], it would only render the url tag.
