How about adding “Disallow” in Robots.txt? Wouldn’t that be good enough?
This is one of the most common mistakes people make. Adding a “Disallow” rule in Robots.txt only stops search engines from crawling the pages (or directories) you specify when they visit your site. If someone has already created a backlink to your web pages on their website, search engines can still discover those URLs and index them even without crawling the content.
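For reference, a Disallow rule looks like this (the path is a placeholder — substitute the directory you actually want to block):

```
User-agent: *
Disallow: /private/
```

This tells all crawlers not to fetch anything under /private/, but it does not prevent the URLs themselves from appearing in search results if they are linked from elsewhere.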
Additionally, even if Google respects the Robots.txt rules and doesn’t crawl your site, Google can still pull information from other sources, such as open directories, and list your site in its search results. (Watch the video below for more information.)
Are there other ways to prevent search engines from indexing my web pages?
I found a really good video by Matt Cutts in which he explains multiple methods along with their pros and cons.
According to him, password-protecting pages (or directories) with an .htaccess file is the most reliable way to prevent search engines from crawling them.
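As a rough sketch, Apache basic authentication via .htaccess looks like the following (the AuthUserFile path and realm name are examples — adjust them for your server):

```
# .htaccess in the directory you want to protect
AuthType Basic
AuthName "Restricted Area"
AuthUserFile /path/to/.htpasswd
Require valid-user
```

Because crawlers cannot authenticate, they never see the content at all — unlike Robots.txt, which merely asks them not to look.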
Conclusion for Robots.txt vs. Noindex Robots Meta Tag
If you want to be sure your web pages do not get indexed by search engines, combining Robots.txt with a Noindex / Nofollow robots meta tag is the way to go. You can also use the “Remove URLs” tool in Google Webmaster Tools. Bing Webmaster Tools provides a URL removal tool as well if you want to be thorough.
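The robots meta tag goes in the `<head>` of each page you want excluded:

```
<meta name="robots" content="noindex, nofollow">
```

“noindex” tells search engines not to show the page in results, and “nofollow” tells them not to follow the links on it.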
I am an SEO consultant who happens to be interested in affiliate marketing.
Unlike general SEO, websites built for affiliate marketing require a completely different mindset and strategy.
Join my journey to generate passive income using SEO.