I was browsing the web as usual this week when I read that A List Apart is taking responses on what you love or hate about the web right now. Without a doubt, the most annoying trend on the World Wide Web is the number of junk sites. At least an hour or two of my day is devoted to technical problems, and the web is my number one resource for technical solutions. Google and Yahoo are my default search tools, but more and more searches are turning up sites that do nothing but list partial RSS feeds from other sites. Each of these sites takes at least five minutes of my time to assess whether it actually has anything of value, so the more of them that come up, the more time I waste. In some cases I have given up completely and tried other search terms to see if I could hit a real site instead. All this work reduces productivity and makes the web useless to me.
In the past, most scraper sites got their content the hard way: they scraped other sites using Perl scripts. These sites were plentiful, but they never ranked very well on search engines, so their existence did not really bother me. Eventually RSS came along, and tons of blogs began to syndicate their content. This made it easier for webmasters to put together a site made entirely of RSS feeds from legitimate sites. The end result was a tide of splogs and scraper sites that now rank very well in search engines and clog most search results.
Some would argue that all these splogs exist because online ads make them a good business to be in, and I really cannot argue with that. What matters to me is that, as with most other Internet-related problems, no one party is responsible. The search engines are hesitant to rank these sites lower, which leads one to wonder whether doing so would conflict with their own advertising business. Then there are the sploggers themselves, who have taken it upon themselves to abuse the ideal of the blog. Their chief argument is that Google does something similar with Google News, so why can’t they? All of this leads us to the current state of the web: RSS pollution has made Internet searching a less effective resource for all of us.