Using htaccess to stop Bad Bots from stealing bandwidth and crashing your server


A few days ago my site was hit by a bunch of really bad bots that crawled it continuously until they overloaded my web server. So I'm publishing a way to block these so-called bad robots before their aggressive crawling ruins your website too.

Assuming you are using the Apache HTTP Server, create a .htaccess file and append the following lines to it.

CODE:
  # Only apply these rules when mod_rewrite is available
  <IfModule mod_rewrite.c>
  RewriteEngine On
  # Block guestbook-spam referrers
  RewriteCond %{HTTP_REFERER} q=Guestbook [NC,OR]
  # Block known scrapers, e-mail harvesters and offline downloaders by User-Agent
  RewriteCond %{HTTP_USER_AGENT} ^BlackWidow [OR]
  RewriteCond %{HTTP_USER_AGENT} ^Bot\ mailto:craftbot@yahoo\.com [OR]
  RewriteCond %{HTTP_USER_AGENT} ^CherryPicker [OR]
  RewriteCond %{HTTP_USER_AGENT} ^ChinaClaw [OR]
  RewriteCond %{HTTP_USER_AGENT} ^Crescent [OR]
  RewriteCond %{HTTP_USER_AGENT} ^Custo [OR]
  RewriteCond %{HTTP_USER_AGENT} ^DISCo [OR]
  RewriteCond %{HTTP_USER_AGENT} ^Download\ Demon [OR]
  RewriteCond %{HTTP_USER_AGENT} ^eCatch [OR]
  RewriteCond %{HTTP_USER_AGENT} ^EirGrabber [OR]
  RewriteCond %{HTTP_USER_AGENT} ^EmailCollector [OR]
  RewriteCond %{HTTP_USER_AGENT} ^EmailSiphon [OR]
  RewriteCond %{HTTP_USER_AGENT} ^EmailWolf [OR]
  RewriteCond %{HTTP_USER_AGENT} ^Express\ WebPictures [OR]
  RewriteCond %{HTTP_USER_AGENT} ^ExtractorPro [OR]
  RewriteCond %{HTTP_USER_AGENT} ^EyeNetIE [OR]
  RewriteCond %{HTTP_USER_AGENT} ^FlashGet [OR]
  RewriteCond %{HTTP_USER_AGENT} ^GetRight [OR]
  RewriteCond %{HTTP_USER_AGENT} ^GetWeb! [OR]
  RewriteCond %{HTTP_USER_AGENT} ^Go!Zilla [OR]
  RewriteCond %{HTTP_USER_AGENT} ^Go-Ahead-Got-It [OR]
  RewriteCond %{HTTP_USER_AGENT} ^GornKer [OR]
  RewriteCond %{HTTP_USER_AGENT} ^GrabNet [OR]
  RewriteCond %{HTTP_USER_AGENT} ^Grafula [OR]
  RewriteCond %{HTTP_USER_AGENT} ^HMView [OR]
  RewriteCond %{HTTP_USER_AGENT} HTTrack [NC,OR]
  RewriteCond %{HTTP_USER_AGENT} ^Image\ Stripper [OR]
  RewriteCond %{HTTP_USER_AGENT} ^Image\ Sucker [OR]
  RewriteCond %{HTTP_USER_AGENT} Indy\ Library [NC,OR]
  RewriteCond %{HTTP_USER_AGENT} ^InterGET [OR]
  RewriteCond %{HTTP_USER_AGENT} ^Internet\ Ninja [OR]
  RewriteCond %{HTTP_USER_AGENT} ^Irvine [OR]
  RewriteCond %{HTTP_USER_AGENT} ^Java [OR]
  RewriteCond %{HTTP_USER_AGENT} ^LWP [OR]
  RewriteCond %{HTTP_USER_AGENT} ^lwp [OR]
  RewriteCond %{HTTP_USER_AGENT} ^JetCar [OR]
  RewriteCond %{HTTP_USER_AGENT} ^JOC\ Web\ Spider [OR]
  RewriteCond %{HTTP_USER_AGENT} ^larbin [OR]
  RewriteCond %{HTTP_USER_AGENT} ^LeechFTP [OR]
  RewriteCond %{HTTP_USER_AGENT} ^Mass\ Downloader [OR]
  RewriteCond %{HTTP_USER_AGENT} ^Microsoft.URL [OR]
  RewriteCond %{HTTP_USER_AGENT} ^MIDown\ tool [OR]
  RewriteCond %{HTTP_USER_AGENT} ^Mister\ PiX [OR]
  RewriteCond %{HTTP_USER_AGENT} ^Mozilla.*NEWT [OR]
  RewriteCond %{HTTP_USER_AGENT} ^Navroad [OR]
  RewriteCond %{HTTP_USER_AGENT} ^NearSite [OR]
  RewriteCond %{HTTP_USER_AGENT} ^NetAnts [OR]
  RewriteCond %{HTTP_USER_AGENT} ^NetSpider [OR]
  RewriteCond %{HTTP_USER_AGENT} ^Net\ Vampire [OR]
  RewriteCond %{HTTP_USER_AGENT} ^NetZIP [OR]
  RewriteCond %{HTTP_USER_AGENT} ^NICErsPRO [OR]
  RewriteCond %{HTTP_USER_AGENT} ^Octopus [OR]
  RewriteCond %{HTTP_USER_AGENT} ^Offline\ Explorer [OR]
  RewriteCond %{HTTP_USER_AGENT} ^Offline\ Navigator [OR]
  RewriteCond %{HTTP_USER_AGENT} ^omniexplorer_bot [NC,OR]
  RewriteCond %{HTTP_USER_AGENT} ^PageGrabber [OR]
  RewriteCond %{HTTP_USER_AGENT} ^Papa\ Foto [OR]
  RewriteCond %{HTTP_USER_AGENT} ^pavuk [OR]
  RewriteCond %{HTTP_USER_AGENT} ^pcBrowser [OR]
  # Parentheses are escaped so they match literally instead of forming a regex group
  RewriteCond %{HTTP_USER_AGENT} dloader\(NaverRobot\) [OR]
  #RewriteCond %{HTTP_USER_AGENT} ^puf [NC,OR]
  #RewriteCond %{HTTP_USER_AGENT} ^RealDownload [OR]
  RewriteCond %{HTTP_USER_AGENT} ^ReGet [OR]
  RewriteCond %{HTTP_USER_AGENT} ^SearchExpress [OR]
  RewriteCond %{HTTP_USER_AGENT} ^SiteSnagger [OR]
  RewriteCond %{HTTP_USER_AGENT} ^SmartDownload [OR]
  RewriteCond %{HTTP_USER_AGENT} ^SuperBot [OR]
  RewriteCond %{HTTP_USER_AGENT} ^SuperHTTP [OR]
  RewriteCond %{HTTP_USER_AGENT} ^Surfbot [OR]
  RewriteCond %{HTTP_USER_AGENT} ^Siphon [OR]
  RewriteCond %{HTTP_USER_AGENT} ^tAkeOut [OR]
  RewriteCond %{HTTP_USER_AGENT} ^Twiceler [OR]
  RewriteCond %{HTTP_USER_AGENT} ^Teleport\ Pro [OR]
  RewriteCond %{HTTP_USER_AGENT} ^VoidEYE [OR]
  RewriteCond %{HTTP_USER_AGENT} ^Web\ Image\ Collector [OR]
  RewriteCond %{HTTP_USER_AGENT} ^Web\ Sucker [OR]
  RewriteCond %{HTTP_USER_AGENT} ^WebAuto [OR]
  RewriteCond %{HTTP_USER_AGENT} ^WebBandit [OR]
  RewriteCond %{HTTP_USER_AGENT} ^WebCopier [OR]
  RewriteCond %{HTTP_USER_AGENT} ^WebFetch [OR]
  RewriteCond %{HTTP_USER_AGENT} ^WebGo\ IS [OR]
  RewriteCond %{HTTP_USER_AGENT} ^WebLeacher [OR]
  RewriteCond %{HTTP_USER_AGENT} ^WebReaper [OR]
  RewriteCond %{HTTP_USER_AGENT} ^WebSauger [OR]
  RewriteCond %{HTTP_USER_AGENT} ^Website\ eXtractor [OR]
  RewriteCond %{HTTP_USER_AGENT} ^Website\ Quester [OR]
  RewriteCond %{HTTP_USER_AGENT} ^WebStripper [OR]
  RewriteCond %{HTTP_USER_AGENT} ^libwww [OR]
  RewriteCond %{HTTP_USER_AGENT} ^WebWhacker [OR]
  RewriteCond %{HTTP_USER_AGENT} ^WebZIP [OR]
  #RewriteCond %{HTTP_USER_AGENT} ^Wget [OR]
  RewriteCond %{HTTP_USER_AGENT} ^Widow [OR]
  RewriteCond %{HTTP_USER_AGENT} ^WWWOFFLE [OR]
  RewriteCond %{HTTP_USER_AGENT} ^Xaldon\ WebSpider [OR]
  RewriteCond %{HTTP_USER_AGENT} ^Zeus [OR]
  # The last condition must not carry [OR]
  RewriteCond %{HTTP_USER_AGENT} ^ZyBorg
  # Answer every matching request with 403 Forbidden and stop processing
  RewriteRule .* - [F,L]
  </IfModule>

This makes Apache answer with 403 Forbidden whenever a request's User-Agent or referrer matches one of the conditions above, so badly written crawler bots can no longer fetch your pages. That saves your precious bandwidth and your server's CPU resources.
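
If you want to check which User-Agent strings the rules would catch before you upload the file, the matching can be approximated in a few lines of Python. Treat this as a rough sketch only: it copies just a handful of the patterns, models the [NC] flag with re.IGNORECASE and the [OR] chain with a loop, and assumes Python's re module behaves like Apache's regex engine for expressions this simple. The names BLOCKED_AGENT_PATTERNS and is_blocked are made up for this illustration.

```python
import re

# Illustrative subset of the patterns from the .htaccess rules.
# Each entry is (regex, case_insensitive); case_insensitive mirrors the [NC] flag.
# Apache's "\ " (escaped space) is written as a plain space here.
BLOCKED_AGENT_PATTERNS = [
    (r"^BlackWidow", False),
    (r"HTTrack", True),
    (r"Indy Library", True),
    (r"^Java", False),
    (r"^libwww", False),
    (r"^WebZIP", False),
]


def is_blocked(user_agent: str) -> bool:
    """Return True if any pattern matches, like a chain of [OR] conditions."""
    for pattern, case_insensitive in BLOCKED_AGENT_PATTERNS:
        flags = re.IGNORECASE if case_insensitive else 0
        # re.search scans the whole string; "^" anchors a pattern to the start.
        if re.search(pattern, user_agent, flags):
            return True
    return False


if __name__ == "__main__":
    samples = [
        "Mozilla/4.5 (compatible; HTTrack 3.0x; Windows 98)",
        "Java/1.6.0_04",
        "Mozilla/5.0 (X11; Linux x86_64) Firefox/120.0",
    ]
    for agent in samples:
        verdict = "blocked" if is_blocked(agent) else "allowed"
        print(f"{verdict}: {agent}")
```

Patterns that start with ^ only match when the bot's name comes first in the User-Agent string, while unanchored ones such as HTTrack match anywhere in it. Keep in mind that the User-Agent header is sent by the client, so these rules only stop bots that identify themselves honestly.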


4 smashing comments for this post.

  1. Irwan Said:

    I'll put this method into practice if my blog ever shows the same symptoms.

  2. Azmeen Said:

    I think you’re missing the last line. Maybe something like:

    RewriteRule .* - [F,L]

  3. mypapit Said:

    I forgot to include it. Thanks!

  4. Bryan A. Smith Said:

    I have just “installed” your suggested *htaccess file at the website I maintain, and have had NO problems (500 server errors).

    I attempted to contact you via your “contact form”, as I didn’t want to clog your blog with a bunch of blather about my situation with “bad bots” — but found it confusing and was unable to get the form to take my message.

    Please advise.

    Regards and thanks
