John Gruber’s SmartyPants script is a great tool for converting a body of text into a more typographically correct and visually appealing version. Chad Miller’s Python port makes it really easy to implement this functionality as a template tag, but I wanted to take it a step further and use SmartyPants as a Markdown pre-processor, just as I did with Pygments.
The solution is very simple, but the regular expression it relies on took me a while to put together. The problem was to ignore code blocks properly and apply the SmartyPants filter only to the text outside the code examples. Here is the final Markdown pre-processor that I came up with:
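import re

# Assumes Chad Miller's port is importable as the smartypants module
from smartypants import smartyPants


class SmartyPantsPreprocessor(TextPreprocessor):
    """
    A Markdown preprocessor that implements SmartyPants for converting plain
    ASCII punctuation characters into typographically correct versions
    """
    # Group 1: prose; group 2: a @@ ... @@end code block, or the end of the text
    pattern = re.compile(r'(.+?)(@@.+?@@end|$)', re.S)

    def run(self, lines):
        def repl(m):
            # Filter only the prose and pass the code block through untouched
            return smartyPants(m.group(1)) + m.group(2)
        return self.pattern.sub(repl, lines)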
The pattern matches each piece of text that comes before a code block (delimited by @@ and @@end) or before the end of the content, filters that piece through SmartyPants, and passes the code block through unchanged. Finally, I make sure to register this processor before my Pygments pre-processor:
md = Markdown()
md.textPreprocessors.insert(0, SmartyPantsPreprocessor())
md.textPreprocessors.insert(1, CodeBlockPreprocessor())
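To see what the pattern does without installing anything, here is a minimal sketch that runs the same regular expression on a small sample. The fake_smarty function is a hypothetical stand-in for smartyPants that only replaces double hyphens with an em dash entity, and the process helper and sample strings are made up for illustration:

```python
import re

# Same pattern as in the preprocessor above
PATTERN = re.compile(r'(.+?)(@@.+?@@end|$)', re.S)


def fake_smarty(text):
    # Hypothetical stand-in for smartyPants(): only turns "--" into an em dash entity
    return text.replace('--', '&#8212;')


def process(text):
    def repl(m):
        # Group 1 is prose and gets filtered; group 2 is a code block (or empty)
        return fake_smarty(m.group(1)) + m.group(2)
    return PATTERN.sub(repl, text)


sample = 'Prose -- with dashes.\n@@ python\nx = a -- b\n@@end\nMore -- prose.'
result = process(sample)

# Edge case: the text starts directly with a code block
leading = process('@@ python\na -- b\n@@end')

print(result)
print(leading)
```

In the first sample only the prose is changed and the line inside the @@ block keeps its double hyphen. The second sample shows a limitation worth knowing about: because group 1 needs at least one character, a document that opens directly with a @@ block has that first block filtered as if it were prose.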
One last thing I did was switch from the old python-markdown library to Markdown2. The new version is much faster and backwards compatible, so the import can fall back to the old library when Markdown2 is not installed:
try:
    from markdown2 import Markdown, TextPreprocessor
except ImportError:
    from markdown import Markdown, TextPreprocessor
Good typography makes me happy!
On one hand, I was going to suggest you also look into typogrify, since it applies SmartyPants as well as a few other useful typographical improvements. But since you’re also interested in Markdown and seem to have already shown interest in Pygments, I wonder if you’ve seen James Bennett’s typygmentdown snippet.
Marty, these are both excellent suggestions, and I didn’t know about either! typogrify is especially interesting since my next step was going to be implementing Widon’t and I’ll definitely use the template tags that this library provides.
On the other hand, since I would like to cache the results and parse only content that’s outside the code blocks, I still like my implementation better in some ways.
Thanks a lot for the thoughtful comment!
© Copyright 2001-2010 Taylan Pince. All rights reserved.