

Weblog Entry

Not a Test

January 03, 2009

It’s a new year, so it’s time for a slight change of direction. You may have noticed your feed reader of choice just barfed up a few dozen posts from these here parts. I’m hoping that little bit of necessary unpleasantness will be one time only.

I’ve come to realize that my content creation has become a lot more distributed, which means the long-form post format of this site has been seeing less and less love in recent years. Much has been written about Twitter killing the urge to write longer blog posts, and I won’t dispute that as a cause. I liked Andy Budd’s take on why his site has been suffering; I can relate to a lot of those reasons.

So for the past month I’ve been working on a way of piecing together content I produce on other sites and funneling the relevant bits into a stream that I can present on this site.

You’re now seeing the result. I’m merging my traditional posts with links from Delicious and Google Reader (which is what I was up to when I wrote about the latter’s API), photos from Flickr, and Twitter posts (or tweets, if you prefer). The home page, archives, and primary Atom feed all work on this new system.

Totally nuts, right? The volume will be too high, and nobody wants to see every photo I upload or hear every inane thought I come up with while out for dinner. So that’s why I’m exercising editorial control and only bringing over the bits and pieces I’ve hash- or machine-tagged.

Photos on Flickr, for example, can be tagged as either mezzoblue:post=description or mezzoblue:post=photo. Both tags will show the photos here, in slightly different configurations (see the bottom of the August 2008 archive for both).
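For the curious, a machine-tag filter maps fairly directly onto an API call: flickr.photos.search accepts a machine_tags parameter. Here is a minimal sketch in Python of building such a request (the site’s actual scripts are PHP, and the helper name below is made up for illustration):

```python
from urllib.parse import urlencode

# Endpoint, method, and parameter names come from the public Flickr API;
# the helper itself is a hypothetical sketch, not the author's code.
FLICKR_REST = "https://api.flickr.com/services/rest/"

def machine_tag_search_url(api_key, machine_tag):
    """Build a flickr.photos.search request filtered to one machine tag."""
    params = {
        "method": "flickr.photos.search",
        "api_key": api_key,
        "machine_tags": machine_tag,
    }
    return FLICKR_REST + "?" + urlencode(params)

# e.g. machine_tag_search_url("YOUR_KEY", "mezzoblue:post=photo")
```

The same call with mezzoblue:post=description would feed the other layout.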

On Twitter I’m using a hash tag (#mb) which shows up in the original, but I’m stripping from the on-site version. Google Reader pulls in shared items tagged with mezzoblue. And I’m just throwing in everything from Delicious for now, since I got into the habit of using it for the now-deprecated mezzoblue Dailies.
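Stripping a marker tag like #mb before display is a small transformation; a sketch of the idea in Python (the real scripts are PHP, and the function name is hypothetical):

```python
import re

def strip_marker_tag(tweet, tag="#mb"):
    """Remove the marker hash tag from a tweet and tidy leftover whitespace."""
    cleaned = re.sub(re.escape(tag) + r"\b", "", tweet)
    # Collapse any double spaces the removal leaves behind.
    return re.sub(r"\s{2,}", " ", cleaned).strip()

print(strip_marker_tag("Shipped the new archives today #mb"))
# → Shipped the new archives today
```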

I’m still not sure if I’m going to write up the scripts I built to make this happen, or package it up into some kind of actual open source release. I think the latter way would be more interesting, but there’s a lot of work that would have to happen to get to something even slightly worthy of putting out there for public consumption.

I do realize that not everyone will want this, of course, so the way this site used to work isn’t gone. You can follow the clutter-free, post-only feed, or browse just my original posts on the traditional archive pages. Both are accessible from the main archives page and will continue to exist. Only the defaults have changed; you can go ahead and ignore all the new stuff if you want.

Expect a few bugs as I stress-test my scripts live over the next few weeks, and let me know if you find anything horribly wrong.

Update: the first major bug has been found: the full Atom feeds weren’t ready for prime time at all. For now I’ve backed out and made the default feed post-only again until I can figure out what’s causing old items to duplicate. Sorry about the collateral damage to your feed reader.


January 04, 00h

That’s similar to what I’m doing on my relaunched site, except I’m running cron jobs on the server to make the API calls (Tumblr/Delicious, All Consuming, Flickr, and Twitter in my case) and then inserting the data directly into my site database. Different RSS feeds then give visitors the choice of ‘blog only’ or ‘blog plus links’.

The reason for importing the data into my own database is that it affords the opportunity to present the link (and my comments) in context, and invite comments (similar to how Jeff Croft’s site handles his linklog).

I hadn’t considered including tweets in my RSS - I don’t think I’ve ever said anything that interesting on Twitter…

January 04, 00h

I’m not much of a developer. (Read: Scripting way outta my league)

…But I’d LOVE a way to do a daily digest of my Tumblr posts. Since I have my Tumblr feeding into everything, all I need is a way to do a digest of that one RSS feed.

Geoff says:
January 04, 00h

Very cool. I’ve wanted to do the same for a long time now, and just recently found a tool that may make this a bit easier if anyone else is interested in checking it out.

It’s still very beta, but it may be a good starting point for some enterprising php hackers:

http://www.sweetcron.com/

It lets you add a bunch of RSS feeds and allows post-processing of them, so you can make the results look any way you like, and it’s pretty easy to set up.

Kenneth says:
January 04, 04h

Thanks for the post-only feed! Your link, though, seems to have some draft text stuck inside the (non-functional) link.

Ethan says:
January 04, 06h

Given my own quasi-tumblelog canoodlings I suppose I’m a bit biased, but I quite like having all the various bits you’ve collected in one place. Good stuff, sir.

Beto says:
January 04, 07h

I think all of us who have been on the web for a while are ending up doing this sort of thing in one way or another. The personal site is no longer the end-all, be-all of content it used to be, but rather a hub where we can concentrate all the bits and pieces we generate that now end up scattered all over the web. Zeldman summarized this brilliantly a while ago ( http://is.gd/9nb ).

That’s precisely what I am doing on my own site as well. On the home page, besides the main menu and doodles, there are the latest three Twitter posts, the latest 15 or so Flickr photos, and the Delicious links I personally choose to surface by means of a special tag.

Since I’ve never been much of a programmer, I’m pulling all this in with a few customized WordPress plugins, and I wish I knew a way to select the tweets I want to show on the site by means of a tag, as you suggest above (I’d rather not have replies and other irrelevant messages show up there). It will be interesting to see how your work on this turns out.

Mike D. says:
January 04, 10h

It sounds like this will be the first recognizable web trend of 2009. I’m in the process of making a similar conversion myself, but will be doing so using Matthew’s logic above. I still want Mike Industries to be the permanent home for all of that stuff… not some random third-party services I happen to be using at the time. You’ve probably set yours up so they live in your database, which is great, but I think the ability to comment on everything is essential (although some people I know feel the opposite way and are turning comments off on all of their stuff).

I like your machine-tagging approach to what goes on the site. Hadn’t thought of that. I ran a poll ( http://www.mikeindustries.com/blog/archive/2008/12/what-should-go-in-a-default-rss-feed ) to see what people wanted in the default feed, and Twitter clearly *didn’t* make the cut (not a surprise), but with your machine-tagging approach, you can selectively put that stuff in there which is great.

Now more than ever am I glad I made the switch to WordPress last year, as WP-O-Matic seems to do everything I need to meta-publish from RSS feeds with a simple plug-in.

Dave S. says:
January 04, 11h

@Matthew Pennell - I took a slightly different approach. I initially tried to avoid storing anything locally and run it all off RSS feeds, but realized I had to use the APIs in order to get anything past the most recent 15 or so items.

In the end I gave in and went with a file-system-based cache instead of a database. Nothing’s running live unless the cache has expired (some entries live for 15 minutes, older ones live forever). It’s good enough for now, and gave me enough control to write out custom Atom feeds with and without the extra stuff.
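The expire-or-serve behaviour described above can be sketched as a read-through file cache with a per-file lifetime. This Python version is an illustration only (the actual implementation is PHP, and all names here are hypothetical):

```python
import os
import time

FRESH_TTL = 15 * 60  # recent data expires after 15 minutes

def cache_is_stale(path, ttl=FRESH_TTL):
    """True when the cached file is missing or older than its TTL.
    ttl=None marks a file that never expires (archived data)."""
    if not os.path.exists(path):
        return True
    if ttl is None:
        return False
    return time.time() - os.path.getmtime(path) > ttl

def read_through(path, fetch, ttl=FRESH_TTL):
    """Serve from cache, calling fetch() for fresh data only when stale."""
    if cache_is_stale(path, ttl):
        data = fetch()  # the live API request happens only here
        os.makedirs(os.path.dirname(path) or ".", exist_ok=True)
        with open(path, "w") as f:
            f.write(data)
        return data
    with open(path) as f:
        return f.read()
```

Checking expiry at page-request time, as described, avoids needing a cron job at all: the first visitor after the TTL lapses pays the cost of the refresh.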

@Joey Baker - if Tumblr has an API that can return an RSS/Atom/other XML file for a specific date range (the date range is essential for the way I’ve got things set up) then it could fairly easily be adapted to work with the scripts I’m using.

@Geoff - hadn’t heard of Sweetcron, but it looks close to what I’m doing here. Probably even closer to what Matthew Pennell describes in his comment.

@Kenneth - fixed.

@Ethan - and thank you sir. I was a bit leery that the increased volume would be a problem, but so far the response seems to be positive. Having opt-out ability is pretty key, I’d say.

@Beto - the Twitter Search API is what I used. Getting the data in from the API actually wasn’t terribly difficult with a bit of PHP knowledge and a few minutes reading the documentation, it was the processing and doing something useful with it that took a bit more work. I’m using a PHP library called MiniXML to parse the query results, and then my own custom stuff to do the rest. Not sure if that helps a non-programmer, but I don’t really consider myself much of one and I figured it out, so hopefully it’s not that hard. I’ll see what I can do about providing some more concrete code that will help you out.
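MiniXML is a PHP library, but the parsing step is the same idea in any language. Here is a rough Python equivalent using the standard library’s ElementTree, applied to an Atom document of the shape the Twitter Search API returned at the time (the function name is made up):

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"  # Atom namespace, Clark notation

def entries_from_atom(xml_text):
    """Pull (title, published) pairs out of an Atom search result."""
    root = ET.fromstring(xml_text)
    return [
        (entry.findtext(ATOM + "title", ""),
         entry.findtext(ATOM + "published", ""))
        for entry in root.iter(ATOM + "entry")
    ]
```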

@Mike D. - I thought about permalinks and comments for all the off-site content here on mezzoblue. I didn’t do permalinks, but that’s next on my list.

I chose not to do comments, though. I used to run a linkblog (Dailies) that was first its own Movable Type blog, then lived in Delicious. In the first incarnation comments were open, but after a few hundred links there were only a handful of useful ones. So I’m of the mind that comments on links are pretty much pointless. Photo comments could be a bit more meaningful, but I’m just passing those through to Flickr, because it doesn’t make sense to have them live in two different places. (Maybe one day I’ll do some slick back-and-forth to post comments to Flickr directly from a form on here, but I’ve thrown back enough Tylenol already for now.)

January 04, 16h

Definitely a trend for 2009. I’m also looking to do this with cron jobs (actually, I’m looking for an easier way: messing with the WP admin so that it controls where everything should go from its own interface).

Let’s see how this works :)

And maybe we should start a list or something to help others do this… even if we don’t completely open our sources…

January 04, 18h

While I did frown a bit on hearing that another of the blogs in my feed reader will be posting more often this new year, I’ll hopefully get used to the idea of Twitter-as-a-blog-post.

I’m sure I won’t be disappointed; if the twitter, flickr, and delicious pieces pass the same quality control tests as the other posts on this site, I’ll be in for an enjoyable ride.

January 04, 20h

I like that you cached it locally in the file system. I do it differently on a few sites, but I mostly cache to a local DB, which allows the data to be used as part of a bigger picture. It’s somewhat of a two-layer cache: pulling in the API data (by cron + rake task), then page/action caching the different elements. That still gives me the speed factor, while also giving me the flexibility to use the data models however I see fit.

Not sure this is a new trend (per Mike D.) - but I do see more and more people moving towards pulling their content into their own domain. I like to push and pull the content, but ultimately it’s about really interacting with the APIs.

I love that you go above and beyond the typical JS widgets - those seem almost useless to me. The real power comes when you can work, form, and shape that data into a bigger system. Well done :)

January 05, 00h

Another option to hash-tagging tweets is to fave your own tweets and then use the API to pull from your fave list. You could include just your own tweets, or everybody’s if you feel like sharing.

January 05, 09h

I’d be very excited to get ANY sort of peek behind the curtain, especially to see your technique for Flickr. I’m another of the masses who are starting to tackle this right now, and I’d hate to re-invent the wheel.

Beyond the news, though, I just thought I’d issue a more general compliment for your entire site. I’ve visited many times over the years, but for some reason I’ve never really taken in the completeness of it all. It’s a very well-cared-for little home on the web.

January 05, 11h

I’ve been looking at doing this on my site for some time now, but scripting it from scratch seems to be well over my head. I’ve looked into some EE plugins, but haven’t found anything suitable yet. Perhaps 2009 will be the year I get it figured.

Dave S. says:
January 05, 12h

@Spencer - that’s certainly the goal. I’m planning on using them sparingly, and only if they feel on-topic and relevant for this site.

@Nate - I started out actually using a PHP caching library for the raw source, but realized that the hit of parsing XML on every page request made that a little silly. In the end I ditched the library and just wrote everything directly to PHP files.

And yeah, the JS widgets never really felt like a good solution in my mind. Tack-on content is a second-class citizen, and I’d like to do this in a way that makes everything I choose to post here first class.

@Jonathan Snook - good call, never thought of that. Actually I don’t think I knew you could even favourite your own, ala Flickr. The big downside I can see is that other services make assumptions about why someone Favourites (cough Favrd cough) that a) wouldn’t be true in that case, and b) make me look like a self-promoting tool. Hash tags on tweets are lame, but for now seem like the best fit.

@punkassjim - alright, I’ll see what I can do. I might write up the various stages (data acquisition, parsing, caching, and display) in separate posts, but I’d have to modify my source so that each stage builds off the previous one. Still, that might be the best way to get this out there. For now I’d recommend taking a look at the Flickr REST API; it’s well-documented, and armed with a bit of PHP knowledge and the MiniXML library you can probably go further than you’d think in a few hours of tinkering.

@Jason Landry - yeah, I wouldn’t say it’s a small undertaking; there are a lot of individual little problems to solve along the way that can be head-scratching. You might want to check out the sweetcron.com tool mentioned in a previous comment; it’s still in beta, but might be what you need.

Dave S. says:
January 05, 12h

Oh, and a general note in case you missed the post update – the full Atom feed wasn’t working out, so I’ve backed out to post-only for now. Every time a source updated, all the posts from that source refreshed in the feed, which is more than a little annoying.

Mike D. says:
January 05, 12h

Can I just say that I really like punkassjim’s name? kthxbye

January 05, 14h

Hahaha, that’s the first positive feedback I’ve gotten for my screen name in probably 10 years. Usually, I just get a leery glance from family members.

@Dave - Thanks a ton for even considering posting about it. I’ll look into the REST API reference though…it’ll be good to see how much of it I can slog through on my own.

Matt says:
January 05, 15h

Has anyone tried the Lifestream plugin for WordPress to accomplish something like this? http://wordpress.org/extend/plugins/lifestream/

January 05, 19h

@Matt - It’s a good resource, and it actually gets me about 80% to where I want to be. But, two things: 1) I’d like to keep the number of ready-made plugins down, and 2) since it’s just based on RSS feeds, there isn’t as much information to play around with, so it kinda limits the possibilities. For example, in Dave’s system, with every Flickr post, it shows how many comments that photo has at any given time.

January 06, 10h

Now I’m curious: what’s the behavior model for the Flickr comment links? Do you run a cron job or something to occasionally update the number of comments on all linked photos?

Dave S. says:
January 06, 11h

@punkassjim - nope, they’re coming straight out of the API along with the rest of the data. You have to run two queries, one to search for photo IDs that meet your criteria, then a second to get the info from those photos. The second one is when the comments show up, then I just plug ‘em in with everything else.
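The two-query flow (search for matching photo IDs, then fetch full details per ID) can be sketched generically. Below, search and get_info stand in for the flickr.photos.search and flickr.photos.getInfo calls; the function and its structure are a hypothetical illustration, not the site’s actual PHP:

```python
def fetch_photo_records(search, get_info):
    """Two-pass fetch: pass 1 finds matching photo IDs, pass 2 pulls
    full details (title, comment count, ...) for each ID. The two API
    calls are injected as plain functions so the flow stays testable."""
    photo_ids = search()
    return [get_info(pid) for pid in photo_ids]
```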

In terms of how often they’re updated, well, that’s still a problem I’m working through. Relatively recent data is being refreshed once an hour (you could cron it, or you could just check to see if the cache has expired when someone loads the page and run the request live if it has, which is what I’m doing).

But the flaw in my plan is that Twitter’s Search API promises to only show results for the last six months. For anything older than that, if I refresh my cache to get Flickr comments (or even blog comments, though those are usually closed after six months anyway), I lose the Twitter posts for that range. Right now everything older than six months is permanently cached. The only way I can see around this is putting it all into a database, which I so didn’t want to have to do.