Editor's Note: This blog post contains jargon that my reader(s) may find unsuitable. Reader discretion is advised.
- Cruft
- Cruft
- Cruft
- Cruft
- Cruft
I recently went through the arduous process of removing all the posterous cruft that had accrued in my blog since I started using their service. Cruft is the little bits of unused data that build up in computers over time. I am using it here to mean crummy markup that doesn't serve a purpose other than taking up bandwidth.
HTML is simple. Note the code below:
<p>This is a paragraph in a paragraph tag</p>
<blockquote>This is a block of quoted text</blockquote>
The above code would output:
This is a paragraph in a paragraph tag
This is a block of quoted text
Simple, right? Except posterous inserts all kinds of meaningless tags and classes, and even invalid markup like <p/>, which as far as I know means nothing to nobody.
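A cleanup like this could be scripted along the following lines. This is only a minimal sketch using Python's standard html.parser module, not a record of how the cleanup was actually done: the CruftStripper class, the strip_cruft function and the example_class name are made up for illustration. As a side effect it also drops comments and doctype declarations, which is fine for post bodies but not for whole pages.

```python
from html import escape
from html.parser import HTMLParser


class CruftStripper(HTMLParser):
    """Re-emit HTML, dropping class attributes and self-closing <p/> tags."""

    def __init__(self):
        # convert_charrefs=False keeps entities such as &amp; intact in text.
        super().__init__(convert_charrefs=False)
        self.parts = []

    def _render(self, tag, attrs, closing=">"):
        # Keep every attribute except class; re-escape values for output.
        kept = [(k, v) for k, v in attrs if k != "class"]
        text = "".join(
            f" {k}" if v is None else f' {k}="{escape(v)}"' for k, v in kept
        )
        return f"<{tag}{text}{closing}"

    def handle_starttag(self, tag, attrs):
        self.parts.append(self._render(tag, attrs))

    def handle_startendtag(self, tag, attrs):
        if tag == "p":
            return  # a self-closing <p/> carries no content, so drop it
        self.parts.append(self._render(tag, attrs, closing=" />"))

    def handle_endtag(self, tag):
        self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        self.parts.append(data)

    def handle_entityref(self, name):
        self.parts.append(f"&{name};")

    def handle_charref(self, name):
        self.parts.append(f"&#{name};")


def strip_cruft(html):
    """Return html with class attributes and empty <p/> tags removed."""
    parser = CruftStripper()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts)


if __name__ == "__main__":
    messy = '<p class="example_class">Hello</p><p/><blockquote>Quote</blockquote>'
    print(strip_cruft(messy))
```

A real parser is used here instead of a regular expression so that the word "class" appearing in the text of a post is left alone.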
Another downside of posterous is that they host all your images. When I sent them a post with a pic, they stored the image on their servers and hotlinked to it from my site, even though the MetaWeblog API supports uploading images. This is potentially good, as it saves on bandwidth, but as I noted in my post on cloud computing, the downside is that when posterous disappears, so do your images. And this happened too: recently posterous suffered a DoS attack that lasted for SIX DAYS.
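Finding out which images live on someone else's server is easy to automate. The sketch below, again plain Python standard library, lists every image in a post whose URL points at a host other than your own. The ImageCollector and remote_images names and the example hostnames are hypothetical, and actually downloading and re-uploading the files is left out.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class ImageCollector(HTMLParser):
    """Collect the src attribute of every <img> tag."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <img ... />.
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def remote_images(html, own_host):
    """Return image URLs in html that are hosted somewhere other than own_host."""
    collector = ImageCollector()
    collector.feed(html)
    collector.close()
    # Relative URLs have no hostname and therefore count as local.
    return [
        src for src in collector.sources
        if urlparse(src).hostname not in (None, own_host)
    ]


if __name__ == "__main__":
    post = (
        '<img src="/img/a.png">'
        '<img src="http://cdn.example.net/b.jpg"/>'
        '<img src="http://myblog.example/c.png">'
    )
    for url in remote_images(post, "myblog.example"):
        print(url)
```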
And yet one more reason why posterous sucks is that they steal your link love. Google gives credit for original content, so when posterous double-posts your content to their site, you can incur a duplicate content penalty. And since posterous even adds a link back to their site, you pass PageRank to the posterous site.
To be fair, my intent in using posterous as a way to update my blog without having to go through the interface may not be what the posterous peeps intended. In the end, without page views you can't monetize, so posterous is being smart by sending all the traffic back to their site (although they aren't upfront about this). Also, there could be other revenue models. (Maybe insert ads in my feed while being upfront about it, and perhaps even share the revenue.)
So I cleaned up the cruft. Got my images back and hosted safely on my site, and learned my lesson. In the end I've only myself to blame. Ultimately, if you don't control your own data then you don't really own it. (See also: Facebook.)