Step 4) Excluding Pages
Using the Meta Robots plugin from Yoast.com, be sure to noindex certain sections of your site, because if you don’t:
- Search engines will find duplicate content everywhere and will not know which copy is the primary one to show in search results, which means they will have to guess (never ideal).
- Your link popularity will very likely be spread out amongst the various copies of your article, because each copy is available for people to find and link to. Since link popularity (links to your content) is a major factor in attaining top search engine rankings, you will be spreading yourself too thin. Just imagine how much more authority that same piece of content would have if all of the links pointed to just one copy!
- Often the parts of your site where a copy resides are not all that intuitive for people to navigate, at least in terms of relevance to the article. For example, what if the copy of your article within your date archive received the most visits? Is a date archive really the best place for visitors to enter and stay engaged enough to surf the rest of your site?
Highlighted: I highlighted “The login and register pages” because I highly recommend preventing the indexing of that area. Unfortunately, my system or theme is glitching and that setting will not hold at the moment.
Commentary on RSS Feeds:
- Noindex the comment RSS feeds: definitely do this to take away any allure that your site could be a haven for comment spam.
- Noindex all RSS feeds: I think this should be left open because RSS feeds are an excellent mechanism for getting news out quickly to news aggregation engines and even the search engines. Feeds are expected to be timely by nature so if your site is at all busy you can bet your feed will be watched closely for updates.
Commentary on “Prevent Indexing”:
I chose to prevent indexing of 3 of the 4 archives. The exception is the “Category Archives”, because categories will be the main route for the search engines to index this site’s content. This is particularly true on my company’s blog, where the categories are the key navigational element. I may, however, choose to open up the tag archives as well, since they provide a lot of context for search engines to index, but I am not ready to do that at the moment.
A note about the “noarchive meta tag”: leave this alone unless you have some reason to want to deny the search engines the opportunity to keep a copy of how your page looked when they indexed it.
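As a rough illustration of what these choices amount to in the page markup, here is a minimal Python sketch that picks a robots meta tag per page type. The page-type names are hypothetical labels rather than identifiers from the plugin, the set assumes the three blocked archives are the author, date, and tag archives, and the exact tag the plugin emits may differ.

```python
# Hypothetical page-type labels (not the plugin's own identifiers).
# Assumption: the three blocked archives are author, date, and tag.
NOINDEX_PAGE_TYPES = {
    "author_archive",
    "date_archive",
    "tag_archive",
    "search_results",
    "login",
    "register",
}


def robots_meta_tag(page_type: str) -> str:
    """Return a robots meta tag for the given (hypothetical) page type."""
    if page_type in NOINDEX_PAGE_TYPES:
        # Keep the page out of the index but still let spiders follow its links.
        return '<meta name="robots" content="noindex,follow">'
    return '<meta name="robots" content="index,follow">'


print(robots_meta_tag("date_archive"))
print(robots_meta_tag("category_archive"))
```

Category archives are deliberately missing from the set, which matches the decision above to leave them open for indexing.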
Commentary on “DMOZ and Yahoo! Directory”:
These settings should be enabled to stop search engines from replacing your own title and description in search results with the ones listed for your site in the Open Directory Project (DMOZ) and the Yahoo! Directory. This is a wise move because, frankly, those directory listings often do a horrible job of describing what your site is about.
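These two settings correspond to the `noodp` and `noydir` robots directives. The sketch below shows how such directives might be combined into a single robots value; the function name and its parameters are my own illustration, not part of the plugin.

```python
def robots_content(noindex: bool = False, noodp: bool = True, noydir: bool = True) -> str:
    """Build the content value for a robots meta tag (illustrative helper)."""
    directives = ["noindex" if noindex else "index", "follow"]
    if noodp:
        # Ask engines not to use the Open Directory Project listing.
        directives.append("noodp")
    if noydir:
        # Ask engines not to use the Yahoo! Directory listing.
        directives.append("noydir")
    return ",".join(directives)


print(f'<meta name="robots" content="{robots_content()}">')
```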
Commentary on “Archive Settings”:
- Disable the author archives: I blocked author archives because I have no reason to want them indexed. That said, if your site has a great many authors, they may appreciate it if you leave this open so they can get some additional attention for their contributions to your site.
- Disable the date-based archives: unless your site is a newspaper-type website where the date plays a huge role in categorizing content, I think it is best to block indexing of this archive.
- Redirect search results pages when referrer is external: my experience with this setting is minimal, so I am testing it right now. From what I understand, there is no real need to enable it since search results are already blocked in the “Prevent Indexing” section.
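For readers wondering what the last setting checks, here is a hedged Python sketch of the decision as I understand it: the request is a search results page and the visitor arrived from another site. The function name is hypothetical, the host comparison is deliberately naive, and the plugin's actual logic may differ.

```python
from urllib.parse import parse_qs, urlparse


def should_redirect_search(request_url: str, referrer: str, site_host: str) -> bool:
    """Return True for a search results page reached from an external referrer."""
    query = parse_qs(urlparse(request_url).query, keep_blank_values=True)
    is_search = "s" in query  # WordPress search URLs carry an "s" parameter
    if not is_search or not referrer:
        return False
    referrer_host = urlparse(referrer).hostname
    # Naive comparison: treats "www.example.com" and "example.com" as different hosts.
    return referrer_host is not None and referrer_host != site_host


print(should_redirect_search(
    "https://example.com/?s=seo",
    "https://www.google.com/search?q=seo",
    "example.com",
))
```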
Commentary on “Internal nofollow Settings”:
In this application, nofollow attributes are used to tell search engines not to follow a link to a section of your site, usually because the section has no merit being indexed (such as a login page). Over the years it also became well known that nofollow attributes could be used to conserve Google PageRank and funnel it towards more important links and content. That technique no longer works the way it once did. I can’t get into the details, but here is some reading to start you off if you want to delve into this somewhat advanced SEO subject. At any rate, applying the nofollow attribute is far less powerful than it once was, so I chose to apply it sparingly instead of going hog wild with it. Whatever you do is a personal choice at this point, except with comment links, where I STRONGLY recommend applying nofollow so that your site isn’t targeted by comment spammers.
- Nofollow category listings on pages: categories are a key content stream for this site and for the StepForth Web Marketing Blog so I chose to leave this open.
- Nofollow category listings on single posts: I see the point in this from the old nofollow perspective but I am choosing to leave this open, at least for now.
- Nofollow outbound links on the frontpage: this is often enabled to limit the bleed of PageRank from the home page. Instead of nofollowing every outbound link on my home page, I chose to nofollow only particular ones, one by one.
- Nofollow the links to your tag pages: I have chosen to add the nofollow to tags for now while I build out that avenue of my website.
- Nofollow login and registration links: I see no reason why anyone would want to leave these open to search spiders so definitely nofollow these.
- Nofollow comment links: EXTREMELY IMPORTANT! Comment spammers are like piranhas in a feeding frenzy when they find sites that have comment sections that pass PageRank. Definitely add nofollow here.
- Replace the META widget with a nofollowed one: this is important only if you plan on keeping the Meta widget visible in your theme. If you are, then it makes perfect sense to block the links since they offer no benefit to search engines.
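To make the nofollow settings above concrete, here is a small Python sketch that adds `rel="nofollow"` to the anchor tags in a snippet of HTML, such as a comment. It is a regex-based illustration under simplifying assumptions (double-quoted attributes, no `>` inside attribute values), not how the plugin itself does it.

```python
import re

# Matches an opening <a ...> tag and captures its attribute text.
A_TAG = re.compile(r"<a\b([^>]*)>", re.IGNORECASE)
# Matches a double-quoted rel attribute inside that attribute text.
REL_ATTR = re.compile(r'\brel\s*=\s*"([^"]*)"', re.IGNORECASE)


def add_nofollow(html: str) -> str:
    """Add rel="nofollow" to every <a> tag that does not already carry it."""

    def fix(match: re.Match) -> str:
        attrs = match.group(1)
        rel = REL_ATTR.search(attrs)
        if rel is None:
            # No rel attribute yet: append one.
            return f'<a{attrs} rel="nofollow">'
        values = rel.group(1).split()
        if "nofollow" in (value.lower() for value in values):
            return match.group(0)  # already nofollowed, leave untouched
        new_rel = 'rel="' + " ".join(values + ["nofollow"]) + '"'
        return "<a" + attrs[:rel.start()] + new_rel + attrs[rel.end():] + ">"

    return A_TAG.sub(fix, html)


print(add_nofollow('<a href="http://example.com">my site</a>'))
```

For anything beyond a quick illustration, a real HTML parser would be a safer choice than a regular expression.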