Feed2JS and Spam

By Ken Varnum on October 28, 2006

Feed2JS is a great tool for reusing RSS feeds on web pages. (See my May 2005 post, It’s Not Stealing, It’s Syndicating, for an overview.)
However — there’s always a ‘however,’ isn’t there — there is a fixable problem. If you run your own copy of Feed2JS on your own server (rather than using Feed2JS’s public version), unscrupulous folks can borrow your script — and your bandwidth — to repurpose other RSS feeds from other sites without your knowledge or permission.
I learned this the hard way when a copy of Feed2JS I manage at my workplace was “borrowed” by someone running a fake weblog designed to sell Google ads. The owner of this revenue-driven site was taking feeds from other blogs and using my copy of Feed2JS to reproduce them on his site. I was the unwitting intermediary in an unscrupulous, and possibly illegal, reuse of content. (Ironically, I first became aware of this when someone else, who ran a commercial site devoted to hair-loss remedies, complained to me that my Feed2JS was misappropriating his weblog content on a competitor’s blog…)
So how do you tell if your own copy of Feed2JS has been borrowed? Look in the feed2js/magpie/cache/ and feed2js/magpie/cache_utf8/ directories. The files there have inscrutable names like “ad1cb3ddb313d3f10f9b7d50ec8da638,” and there should be one for each RSS feed your script is monitoring. If you use Feed2JS to monitor three RSS feeds, there will be three files in the cache directory. If there are more files than there should be, your script has likely been borrowed.
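The check described above can be sketched as a small PHP script run from the directory that contains feed2js/. This is only a rough illustration: the helper name count_cache_files and the $expectedFeeds value are hypothetical, and it assumes the cache directories hold nothing but feed cache files.

```php
<?php
// Hypothetical helper (not part of Feed2JS): count the regular files in a directory.
function count_cache_files(string $dir): int
{
    if (!is_dir($dir)) {
        return 0; // directory missing: nothing to count
    }
    $count = 0;
    foreach (scandir($dir) as $entry) {
        // Skip ".", "..", and subdirectories; count regular files only.
        if (is_file($dir . DIRECTORY_SEPARATOR . $entry)) {
            $count++;
        }
    }
    return $count;
}

// Assumption: the number of feeds you knowingly run through your copy.
$expectedFeeds = 3;

// Paths as named in the post, relative to where the script is run.
foreach (['feed2js/magpie/cache', 'feed2js/magpie/cache_utf8'] as $dir) {
    $found = count_cache_files($dir);
    if ($found > $expectedFeeds) {
        echo "$dir: $found files, expected at most $expectedFeeds\n";
    }
}
```

If the script prints anything, the directory holds more cache files than you expected and is worth a closer look.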
Feed2JS.org offers directions for restricting Feed2JS to the feeds you want to be reused. With a bit of extra tinkering with the PHP, you can allow feeds from more than one server to be repurposed through your script.
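One possible way to do that tinkering is to keep the permitted hosts in a list and compare the host part of the requested feed URL against it. The sketch below is an illustration, not the code Feed2JS.org provides: the function name is_allowed_feed and the $allowedHosts list are hypothetical, and the host names are placeholders. Matching on the parsed host, rather than searching the whole URL for a domain name, avoids accepting a URL that merely mentions an allowed domain somewhere in its path or query string.

```php
<?php
// Hypothetical allowlist: hosts whose feeds this copy of Feed2JS may fetch.
$allowedHosts = ['some.domain.com', 'another.domain.org'];

// Return true only when the URL's host exactly matches an allowed host.
function is_allowed_feed(string $src, array $allowedHosts): bool
{
    $host = parse_url($src, PHP_URL_HOST);
    if (!is_string($host)) {
        return false; // malformed URL, or no host component
    }
    return in_array(strtolower($host), $allowedHosts, true);
}
```

In the Feed2JS script, the call that fetches the feed would then be wrapped in a test such as if (is_allowed_feed($src, $allowedHosts)), assuming the requested feed URL is held in a variable named $src.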

Entry filed under: RSS Tools.


2 Comments

  • 1. Andy  |  October 31, 2006 at 5:24 pm

    The “Workaround” makes Feed2JS almost useless. The reason we’re using feeds from other domains is because there is nothing worthwhile on ours.

  • 2. Ken Varnum  |  October 31, 2006 at 5:39 pm

    In my version of the workaround, I edited the PHP script provided by Feed2JS to check for multiple domains: all the ones whose feeds I want my Feed2JS script to reuse. Mine has a series of “if… elseif… elseif…” statements, one for each domain. To amend the “if” clause Feed2JS proposes, first find this comment in the script:

    // This is where it all happens!
    

    Just below that, add the following. Replace some.domain.com, another.domain.org, and still-another.domain.edu with the sites that *host* the feeds that you want to be reused (by you or anyone else). You can add more “elseif” clauses as needed.

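    // Allow only feeds whose URL contains one of these host names.
    // strpos() returns 0 for a match at position 0, so compare against false.
    if (strpos($src, 'some.domain.com') !== false) {
        $fail = "no";
    } elseif (strpos($src, 'another.domain.org') !== false) {
        $fail = "no";
    } elseif (strpos($src, 'still-another.domain.edu') !== false) {
        $fail = "no";
    } else {
        $fail = "yes";
    }
    // Fetch the feed only when the source passed the check above;
    // the original script continues from here.
    if ($fail == "no") {
        $rss = @fetch_rss( $src );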
    

    This example allows any RSS feed hosted by some.domain.com, another.domain.org, or still-another.domain.edu to be reused — but blocks all others.
    This is manageable for the small number of sources that we harvest feeds from — lots from our own server, and a handful from outside. If you’re reusing lots of feeds from lots of sources, then a better solution might be needed.

