Python and RSS

Python really needs a good (i.e. full-featured) RSS library. Perl has one (XML::RSS). Java has one (Informa). Python has two: one with a great API that doesn’t do everything it should, and one that should do everything but has a less rich API.

FeedParser is an RSS library for Python that does a great job of parsing RSS feeds into an object hierarchy. Its simplicity is its greatest strength. The object hierarchy you get is a pretty straight translation of the XML, so as long as you are comfortable with the RSS XML format, you should have no problem using FeedParser. FeedParser is also a pretty liberal parser: it understands all the RSS formats that are out there, and will even try to make sense of poorly formed feeds. It is written in straight Python and requires no additional libraries to be installed.

But FeedParser is lacking one major feature that I consider paramount in an RSS library: any way of writing an object hierarchy back out into RSS format. This is what I would call the missing FeedWriter portion.
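
To make the missing piece concrete, here is a minimal sketch of what a "FeedWriter" could look like, using only the standard library's xml.etree.ElementTree. The function name write_rss2 and the dict layout (title, link, description, items) are assumptions made up for this illustration; they are not part of FeedParser or any other library, and the sketch covers only the basic RSS 2.0 channel and item elements.

```python
import xml.etree.ElementTree as ET


def write_rss2(feed):
    """Serialize a plain dict hierarchy to an RSS 2.0 string.

    `feed` is a hypothetical structure invented for this sketch:
    a dict with "title", "link", "description" and an optional
    "items" list of dicts using the same three keys.
    """
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")

    # The three channel-level elements used in this sketch.
    for tag in ("title", "link", "description"):
        ET.SubElement(channel, tag).text = feed[tag]

    # Each entry becomes an <item>; missing keys are simply skipped.
    for entry in feed.get("items", []):
        item = ET.SubElement(channel, "item")
        for tag in ("title", "link", "description"):
            if tag in entry:
                ET.SubElement(item, tag).text = entry[tag]

    return ET.tostring(rss, encoding="unicode")


example_feed = {
    "title": "Example feed",
    "link": "http://example.com/",
    "description": "A feed used only for illustration",
    "items": [
        {"title": "First post", "link": "http://example.com/1"},
        {"title": "Second post", "link": "http://example.com/2"},
    ],
}

xml_text = write_rss2(example_feed)
print(xml_text)
```

A real writer would also need to handle dates, enclosures, namespaces and the other RSS flavours, which is exactly the work a full-featured library should take off your hands.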

On the other hand, we have RSS.py. It claims to be able to read most RSS feed formats out there and to write RSS 1.0. I was unable to test these claims under Windows because RSS.py complained about signal.SIGALRM (I am guessing because that signal doesn’t exist under Windows). I didn’t have much interest in tracking down the problem, since this was all for a fun project whose scope I wanted to limit (so it doesn’t eat up all my time).
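
For what it's worth, the guess about the signal is right: Python's signal module only defines SIGALRM on Unix-like systems. The sketch below shows one common way to guard against that. It is not RSS.py's code, and it assumes (without having checked) that the alarm is being used as a timeout; call_with_timeout is a hypothetical helper invented for this example.

```python
import signal


def call_with_timeout(func, seconds):
    """Call func() with a SIGALRM-based timeout where one is available.

    Hypothetical helper: on platforms without SIGALRM (e.g. Windows)
    it falls back to calling func() with no timeout at all.
    """
    if not hasattr(signal, "SIGALRM"):
        # No alarm signal on this platform, so run unguarded.
        return func()

    def on_alarm(signum, frame):
        raise TimeoutError("call exceeded %d seconds" % seconds)

    # Install our handler, remembering the old one so it can be restored.
    previous = signal.signal(signal.SIGALRM, on_alarm)
    signal.alarm(seconds)
    try:
        return func()
    finally:
        signal.alarm(0)  # cancel any pending alarm
        signal.signal(signal.SIGALRM, previous)


result = call_with_timeout(lambda: "fetched", 5)
print(result)
```

Note that signal handlers can only be installed from the main thread, so this approach is a sketch for simple scripts rather than a general-purpose timeout.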

Setting aside my problems with getting it to run, RSS.py looks like it has some potential. Its likely problem area is support for the different RSS formats: I gather that its parsing isn’t as robust as FeedParser’s. In addition, it would be nice if it supported writing other formats like RSS 2.0 or even Atom.

In the end, I’ll probably settle on using Java and Informa to do my RSS parsing, no matter how unhappy I am with having to compile my code (I really wanted to do this in a scripting language to avoid that step…).