###################################
# This is a smart robots.txt which logs the IP and user agent of every visitor.
# Due to compatibility issues between different bots and whether they support
# wildcards (*), multiple user-agents and end-anchors ($), I am providing different
# blocks for some.
#
# Detected Spider/Bot: None
#
# Headers Sent:
# Content-Type: text/plain
#

Sitemap: http://degreesofzero.com/index.php?zc=.xml;type=sitemap;get=blogs

# Google - Most Important bot
# Unfortunately, robots.txt only stops it from crawling certain URLs; it does NOT stop
# it from adding URLs it comes across to its index. So we're relying on a meta noindex tag.
User-agent: Googlebot
# Don't index mobile versions
Disallow: /forum/index.php?*;wap
Disallow: /forum/index.php?*;wap2
Disallow: /forum/index.php?*;imode
# Default zCommunity Folders
Disallow: /forum/Languages/
Disallow: /forum/lib/
Disallow: /forum/Plugins/
Disallow: /forum/Sources/
Disallow: /forum/Themes/
Disallow: /forum/Tutorials/
# Default zCommunity Actions
Disallow: /forum/index.php?zc=approvearticle
Disallow: /forum/index.php?zc=approvecomment
Disallow: /forum/index.php?zc=bcp
Disallow: /forum/index.php?zc=deletecomment
Disallow: /forum/index.php?zc=deletearticle
Disallow: /forum/index.php?zc=deletepoll
Disallow: /forum/index.php?zc=deletedraft
Disallow: /forum/index.php?zc=lockarticle
Disallow: /forum/index.php?zc=lockpoll
Disallow: /forum/index.php?zc=notify
Disallow: /forum/index.php?zc=pollvote
Disallow: /forum/index.php?zc=post
Disallow: /forum/index.php?zc=printpage
Disallow: /forum/index.php?zc=reporttm
Disallow: /forum/index.php?zc=help
# Now allow bits and then disallow bits
Allow: /forum/robots.txt$
Allow: /forum/index.php$
Allow: /forum/index.php?article=*.0$
Allow: /forum/index.php?article=*.*0$
Allow: /forum/index.php?article=*.*5$
Allow: /forum/index.php?blog=*.0$
Allow: /forum/index.php?blog=*.*0$
Allow: /forum/index.php?blog=*.*5$
# But don't allow these
Disallow: /forum/index.php?*.comment
Disallow: /forum/index.php?article=*.comment*0$
Disallow: /forum/index.php?article=*.comment*5$
Disallow: /forum/index.php?*.new

# Bad bot - Often ignores robots.txt - Waste of bandwidth
# Despite the claim on their website to be a search engine in development,
# I suspect they are a harvester pretending to be a search engine.
User-agent: Twiceler
Disallow: /

User-agent: W3C-checklink
Disallow: /

# Stop following PHPSESSIDs
User-agent: MJ12bot
Disallow: /forum/index.php?PHPSESSID
# Default zCommunity Folders
Disallow: /forum/Languages/
Disallow: /forum/lib/
Disallow: /forum/Plugins/
Disallow: /forum/Sources/
Disallow: /forum/Themes/
Disallow: /forum/Tutorials/
# Default zCommunity Actions
Disallow: /forum/index.php?zc=approvearticle
Disallow: /forum/index.php?zc=approvecomment
Disallow: /forum/index.php?zc=bcp
Disallow: /forum/index.php?zc=deletecomment
Disallow: /forum/index.php?zc=deletearticle
Disallow: /forum/index.php?zc=deletepoll
Disallow: /forum/index.php?zc=deletedraft
Disallow: /forum/index.php?zc=lockarticle
Disallow: /forum/index.php?zc=lockpoll
Disallow: /forum/index.php?zc=notify
Disallow: /forum/index.php?zc=pollvote
Disallow: /forum/index.php?zc=post
Disallow: /forum/index.php?zc=printpage
Disallow: /forum/index.php?zc=reporttm
Disallow: /forum/index.php?zc=help

# Catch all (remainder)
# Will be followed by any bots other than the ones identified above.
# Uses BASIC robots.txt directives without wildcards, end-anchors etc.,
# so spiders should understand these (including MSNBOT).
User-agent: *
# Default zCommunity Folders
Disallow: /forum/Languages/
Disallow: /forum/lib/
Disallow: /forum/Plugins/
Disallow: /forum/Sources/
Disallow: /forum/Themes/
Disallow: /forum/Tutorials/
# Default zCommunity Actions
Disallow: /forum/index.php?zc=approvearticle
Disallow: /forum/index.php?zc=approvecomment
Disallow: /forum/index.php?zc=bcp
Disallow: /forum/index.php?zc=deletecomment
Disallow: /forum/index.php?zc=deletearticle
Disallow: /forum/index.php?zc=deletepoll
Disallow: /forum/index.php?zc=deletedraft
Disallow: /forum/index.php?zc=lockarticle
Disallow: /forum/index.php?zc=lockpoll
Disallow: /forum/index.php?zc=notify
Disallow: /forum/index.php?zc=pollvote
Disallow: /forum/index.php?zc=post
Disallow: /forum/index.php?zc=printpage
Disallow: /forum/index.php?zc=reporttm
Disallow: /forum/index.php?zc=help