Getting rid of old feeds

So. I’m learning Perl and needed a simple project to try it out on. My rss2email list has more than 100 feeds that are checked hourly. Unfortunately I have no way of knowing which feeds are broken or no longer being updated. That seemed like a simple enough thing to script. Enter check_feeds.pl. This simple script takes a single file as input (the file contains RSS feed URLs, one per line). Running it lists the broken feeds and the feeds that have not been updated in the past two months.
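To give a rough idea of the two pieces described above (reading the feed list and deciding whether a feed is too old), here is a minimal sketch. It is not the actual check_feeds.pl. The names parse_feed_list, is_stale and MAX_AGE_DAYS are made up for illustration, the 60-day cutoff is my reading of "two months", and skipping lines that start with '#' is an assumption about how commented-out feeds look in the list.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Assumed cutoff: "two months" taken as 60 days.
use constant MAX_AGE_DAYS => 60;

# Return the feed URLs found in a list of lines, one URL per line.
# Blank lines and lines starting with '#' are skipped (an assumed
# convention for commented-out feeds).
sub parse_feed_list {
    my @lines = @_;
    my @urls;
    for my $line (@lines) {
        $line =~ s/^\s+|\s+$//g;    # trim surrounding whitespace
        next if $line eq '' or $line =~ /^#/;
        push @urls, $line;
    }
    return @urls;
}

# True if the last update (epoch seconds) is more than MAX_AGE_DAYS
# before $now_epoch.
sub is_stale {
    my ($last_update_epoch, $now_epoch) = @_;
    return ($now_epoch - $last_update_epoch) > MAX_AGE_DAYS * 24 * 60 * 60;
}

# When a file name is given, print the URLs that would be checked.
# With no argument this sketch does nothing.
if (@ARGV) {
    my $file = shift @ARGV;
    open my $fh, '<', $file or die "cannot open $file: $!\n";
    my @urls = parse_feed_list(<$fh>);
    close $fh;
    print "$_\n" for @urls;
}
```

In a real run, the timestamp passed to is_stale would come from the newest entry of each parsed feed, and time() would supply the current epoch.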

I used XML::Feed to fetch and parse the feeds. It crashed on at least a dozen feeds in my list, so I regrettably had to comment those out of the list for now. A future version should handle these failures so the script is more usable. Here is what a sample run looks like:

thaum ~/code/perl$ ./check_feeds.pl feedlist.txt
Feeds older than60 days  are:
http://teddziuba.com/atom.xml
http://feeds.feedburner.com/emacsblog
http://emacs.wordpress.com/feed/
http://acmel.wordpress.com/feed/
http://feeds2.feedburner.com/LinuxSystemAdminsBlog
<snip>

Dead feeds that could not be read are:
#bad#http://www.debian.org/News/weekly/dwn.en.rdf
#bad#http://stevehanov.ca/blog/index.php?atom
#bad#http://www.nintendolife.com/feeds/reviews
<snip>
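
One possible way to deal with the feeds that crash the script is to wrap each fetch in an eval block, so that a failing feed ends up in the dead list instead of stopping the run. The sketch below is not what check_feeds.pl does today. safe_fetch and fetch_with_xml_feed are hypothetical names, and this only helps if the crashes are Perl exceptions (die); it cannot trap a hard crash of the interpreter itself. XML::Feed->parse and XML::Feed->errstr are the module's documented calls, and the module is loaded lazily so the helper can be tried without it.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Call $fetch->($url) and trap any exception it throws.
# Returns ($result, undef) on success or (undef, $error) on failure.
sub safe_fetch {
    my ($fetch, $url) = @_;
    my $result = eval { $fetch->($url) };
    if (!defined $result) {
        my $err = $@ || 'fetch returned nothing';
        chomp $err;
        return (undef, $err);
    }
    return ($result, undef);
}

# Assumed fetcher built on XML::Feed; dies with the parser's error
# message when the feed cannot be read.
sub fetch_with_xml_feed {
    my ($url) = @_;
    require XML::Feed;
    require URI;
    my $feed = XML::Feed->parse(URI->new($url))
        or die((XML::Feed->errstr // 'unknown error') . "\n");
    return $feed;
}

# Check every URL given on the command line; unreadable feeds are
# printed with the same '#bad#' prefix as in the sample run.
for my $url (@ARGV) {
    my ($feed, $err) = safe_fetch(\&fetch_with_xml_feed, $url);
    print "#bad#$url\n" unless defined $feed;
}
```

Passing the fetcher in as a code reference keeps the error handling separate from XML::Feed, so a different parser could be swapped in later.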


And here is the actual file itself.
Update: Ok, I finally signed up on GitHub, so the script is available there as well. Hopefully a few months down the line the quality of my code will improve :)
