WyBlog, the best thing about New Jersey since the invention of the 24 hour diner.
Chris Wysocki
Caldwell, NJ
The nine most terrifying words in the English language are "I'm from the government and I'm here to help." - Ronald Reagan

Technorati is indexing me again! They had to make a code change to fix the problem with my blog getting stuck in their queue. Kudos to Eric M. and the guys at GetSatisfaction.com where they have "community powered support for Technorati".
Well, they're "sorta, kinda" indexing me anyway. It's on a 24-hour tape delay or something. So I never get picked up by Memeorandum, because they pull from Technorati, and Technorati has stuff I posted yesterday listed as my latest blog entry. And that's old news to Memeorandum.
Wankers.
"This site contains copyrighted material the use of which has not always been specifically authorized by the copyright owner. It is being made available in an effort to advance the understanding of environmental, political, human rights, economic, democracy, scientific, social issues, etc. It is believed that this constitutes a 'fair use' of any such copyrighted material as provided for in section 107 of the US Copyright Law. In accordance with Title 17 U.S.C. Section 107, the material on this site is distributed without profit for research and educational purposes."
The Yahoo Slurp search engine crawler has gone berserk. It's bombarding my blog with hundreds of page requests per minute. As a result my server is running slower than a 286 with one floppy drive.
Something is seriously broken in Yahoo-land. I understand why they'd want to index my blog. But every other search engine manages to do that in a completely unobtrusive fashion. I don't have 50+ Google bots pounding me night and day! Even Microsoft Bing got it right.
Yahoo's "support" web page says to put a time delay into robots.txt. OK, I did that. Slurp is completely ignoring the setting their own support page says to use. I'm at my wit's end.
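For anyone who wants to rule out a typo on their own side before blaming the crawler, here is a minimal sketch that checks whether a robots.txt delay is at least parseable. It assumes the directive in question is `Crawl-delay` and that the crawler's user-agent token is `Slurp`; the 10-second value and the file contents are made up for illustration and are not my actual robots.txt.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents. The 10-second delay is an assumed value.
ROBOTS_TXT = """\
User-agent: Slurp
Crawl-delay: 10

User-agent: *
Disallow:
"""


def crawl_delay_for(robots_txt, user_agent):
    """Return the Crawl-delay that applies to user_agent, or None if none is set."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.crawl_delay(user_agent)


if __name__ == "__main__":
    print("Slurp delay:", crawl_delay_for(ROBOTS_TXT, "Slurp"))
    print("Googlebot delay:", crawl_delay_for(ROBOTS_TXT, "Googlebot"))
```

If the parser reports the delay you expect, the file is fine and the problem is on the crawler's end. A well-formed directive is only a request, so nothing forces a bot to honor it.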
What I really don't get is that Slurp issues multiple simultaneous requests for the same page! Three, five, six, even eight Yahoo bots all try to grab the same blog posting within 0.01 second of each other.
Then, they all time out because my poor server is overloaded by their constant pounding. So, they regroup and issue the same request again and again. Hour after hour. Day after day.
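The duplicate-request pattern described above can be spotted mechanically. This is a rough sketch, assuming the access log has already been reduced to (timestamp in milliseconds, path) pairs; the function name `find_bursts`, the sample data, and the thresholds are all hypothetical.

```python
from collections import defaultdict


def find_bursts(requests, window_ms=10, min_hits=3):
    """Find pages requested at least min_hits times within window_ms.

    requests: iterable of (timestamp_ms, path) pairs.
    Returns a list of (path, burst_start_ms, hit_count) tuples.
    """
    times_by_path = defaultdict(list)
    for timestamp_ms, path in requests:
        times_by_path[path].append(timestamp_ms)

    bursts = []
    for path, times in times_by_path.items():
        times.sort()
        i = 0
        while i < len(times):
            # Extend j while later requests fall inside the window opened at times[i].
            j = i
            while j + 1 < len(times) and times[j + 1] - times[i] <= window_ms:
                j += 1
            hits = j - i + 1
            if hits >= min_hits:
                bursts.append((path, times[i], hits))
                i = j + 1  # skip past the burst just recorded
            else:
                i += 1
    return bursts


# Made-up sample: four hits on one post inside 10 ms, plus unrelated traffic.
SAMPLE = [
    (1000, "/post-a"), (1002, "/post-a"), (1005, "/post-a"), (1009, "/post-a"),
    (5000, "/post-a"), (1000, "/post-b"), (3000, "/post-b"),
]

if __name__ == "__main__":
    for path, start, hits in find_bursts(SAMPLE):
        print(f"{path}: {hits} requests starting at {start} ms")
```

Integer milliseconds are used instead of fractional seconds so the window comparison is exact.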
I emailed Yahoo support to ask for help. They responded with a suggestion to read the support FAQ, where it says to use robots.txt. Uh, gee, thanks. Been there, done that.
So, it's time to go nuclear. Most of the bot requests are coming from one IP address - 72.30.161.248 - so I've blackholed it. Yes, I know that will stop Yahoo from including my blog in search results. C'est la vie. Their crawler has gone rogue, and if it's overwhelming my site so nobody else can reach my blog, what good is having it listed in their index?
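Finding the address worth blackholing is a simple tally. This is a sketch, assuming a log where the client IP is the first whitespace-separated field on each line (as in the common Apache-style format); the helper name `top_clients` and the sample lines are invented for illustration, apart from the one address mentioned above.

```python
from collections import Counter


def top_clients(log_lines, n=3):
    """Count requests per client IP, assuming the IP is the first field of each line."""
    counts = Counter()
    for line in log_lines:
        fields = line.split()
        if fields:  # ignore blank lines
            counts[fields[0]] += 1
    return counts.most_common(n)


# Made-up log excerpt; 192.0.2.x and 198.51.100.x are documentation-only addresses.
SAMPLE_LOG = [
    '72.30.161.248 - - "GET /post-a HTTP/1.0" 200',
    '192.0.2.10 - - "GET /index.html HTTP/1.0" 200',
    '72.30.161.248 - - "GET /post-a HTTP/1.0" 200',
    '72.30.161.248 - - "GET /post-b HTTP/1.0" 200',
    '198.51.100.7 - - "GET /post-a HTTP/1.0" 200',
]

if __name__ == "__main__":
    for ip, hits in top_clients(SAMPLE_LOG):
        print(f"{ip}: {hits} requests")
```

How the top offender then gets dropped depends on the server's operating system and firewall, so that step is left out of the sketch.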
Posted at 13:00 by Chris Wysocki
[/misc]
Technorati Tags: Yahoo, Slurp, DOS