
4 Search Engine Optimization Questions Answered!

Enough of myth-busting! :D Today I will answer a few search engine optimization questions, some received from my NuttieZine subscribers and others taken from public forums. Now, if you think my answers are not up to par, feel free to kick my butt by posting a comment below! ;)

On to the Q’s and A’s:

Q1:  Should I make sitemaps "noindex,nofollow"?

A1: Once upon a time, that is exactly what I used to do, based on the suggestion of a forum member who claimed to do just that to get "Google Juice". The guy believed that if you don’t forbid search engines from indexing and following the sitemap links, you will suffer a duplicate content penalty.

Over time I thought it over and realized that while forbidding search engines from indexing the sitemap seems alright (because it is not a page worth indexing; I mean, there are more important pages that need to be given higher indexing priority), "no following" the whole sitemap is self-defeating. I mean, if you don’t want search engines to find and follow the links in your sitemap, then why create it at all?

From that time onwards I changed the meta tag of the "Head" section of my sitemap page (for those new to meta tags, check this article) to the following:

<meta name="ROBOTS" content="noindex, follow">
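If you want to double-check what a page actually tells the robots, here is a minimal Python sketch that reads the robots meta tag out of a chunk of HTML. The class and function names (RobotsMetaParser, robots_directives) are my own inventions for illustration, and the sample page is made up; it only looks at the meta tag, not at robots.txt or HTTP headers.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives from any <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # HTMLParser lowercases attribute names; values keep their case.
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() == "robots":
            for part in attr.get("content", "").split(","):
                part = part.strip().lower()
                if part:
                    self.directives.add(part)


def robots_directives(html):
    """Return the set of robots directives found in the given HTML."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


# Hypothetical sitemap page head, using the tag shown above.
page = '<html><head><meta name="ROBOTS" content="noindex, follow"></head></html>'
found = robots_directives(page)
print(sorted(found))
```

If "nofollow" ever shows up in that output for your sitemap page, you know what to fix. ;)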

I use this free sitemap generator tool to create sitemaps, and my choice seems to be a good one because Google™ recommends this tool in its list of recommended sitemap generators. From the generated sitemaps, I download only the .xml and .html files; the .xml file comes in handy for submitting my sitemap to Google, and the .html file is what I link to from my website’s homepage!
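For the curious, the .xml file is nothing magical. Below is a rough Python sketch of a bare-bones XML sitemap, assuming you already have a list of your page URLs. This is not how the generator tool works internally (I have no idea about that); the function name build_sitemap and the example.com URLs are just placeholders.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a minimal XML sitemap (one <url><loc> entry per URL) as a string."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


# Placeholder URLs; use your own pages, all in your chosen URL format.
sitemap_xml = build_sitemap([
    "http://www.example.com/",
    "http://www.example.com/about.html",
])
print(sitemap_xml)
```

A real generator also crawls the site for you and can add optional fields such as priority, which this sketch leaves out.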

Far from suffering from a "duplicate content penalty" (which is, by the way, a myth), I have gotten more of my useful pages indexed in Google! It seems that back when my sitemap page was marked "nofollow", Google did not follow the links on that page and naturally was unable to index some of my pages (because there was no other way to find them except through the sitemap).

It is important to understand that search engines find your webpages in this manner:

Home page=>Internal and External Links in Homepage=>Search Engine Spider Follows these Links, Finds Out and Indexes the Other Pages!

Now, there might be some pages that you don’t link to from your homepage, and search engine robots would have no way to find them except through a sitemap. Therefore, the sitemap MUST be linked to from your homepage!
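To see why this matters, here is a toy Python model of a spider following links, starting from the homepage. The site structure (SITE_LINKS) is entirely made up, and real search engine crawlers are far more complicated; the point is only that a page reachable solely through the sitemap disappears once the sitemap is "nofollow".

```python
from collections import deque

# Hypothetical site: each page maps to the pages it links to.
SITE_LINKS = {
    "/": ["/about", "/sitemap"],
    "/about": ["/"],
    "/sitemap": ["/about", "/old-article"],
    "/old-article": [],
}


def crawl(start, links, nofollow=()):
    """Breadth-first walk from `start`; links on pages in `nofollow` are ignored."""
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        if page in nofollow:
            continue  # the page is found, but its links are not followed
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


followed = crawl("/", SITE_LINKS)
blocked = crawl("/", SITE_LINKS, nofollow={"/sitemap"})
print(sorted(followed))
print(sorted(blocked))
```

In this made-up site, "/old-article" is found in the first run and missing in the second, which is pretty much what happened to some of my own pages.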

Sure, if you have a "duplicate sitemap" anywhere else on your website, or say, a page full of reciprocal links, you can safely forbid search engines from following and indexing them!

Q2: When is the right time to add content to or build backlinks for my website? Should I wait for Google Pagerank Updates?

A2: Here is the real story about Google Pagerank Updates

So, it does not make much sense to bother yourself with it, does it? A Google PR update hardly means anything to real entrepreneurs; in my humble opinion, its scope and power are limited to the green bar you see on your Google toolbar! ;)

Let your site’s pagerank go up or down, but you should keep adding content and backlinks consistently!   

Yeah, yeah, there used to be a time when I regarded a pagerank boost as the greatest reward for my efforts, but no more. You live and learn, so to speak! :P

Q3: How do search engines see my website?

A3: There is a lot of speculation going around about it. Some people say that search engines can read JavaScript, Flash, multimedia, etc., while others recommend building only "text-based" websites! Since I am an ordinary guy and not an SEO guru, I prefer to go by what Google says in its webmaster guidelines. Google says that "Most spiders see your site much as Lynx would. If features such as JavaScript, cookies, session IDs, frames, DHTML, or Macromedia Flash keep you from seeing your entire site in a text browser, then spiders may have trouble crawling it."

However, in this article Google includes "Flash" among the list of file-types it can index! Carefully read the following two paragraphs:

"In general, however, search engines are text based. This means that in order to be crawled and indexed, your content needs to be in text format. (Google can now index text content contained in Flash files, but other search engines may not.)

This doesn’t mean that you can’t include rich media content such as Flash, Silverlight, or videos on your site; it just means that any content you embed in these files should also be available in text format or it won’t be accessible to search engines. The examples below focus on the most common types of non-text content, but the guidelines are similar for any other types: Provide text equivalents for all non-text files."

This article further stresses Google’s ability to index Flash files.

Now the question remains: should you add multimedia to your website or limit yourself to just text content? This question is probably best answered by this line: "Provide text equivalents for all non-text files"!

If you are still unsure, I think going the "trial and error" way is your best choice, especially if you cannot afford to have a "text-based" site. ;)

BTW, if you don’t want to install the Lynx browser, there are two options:

a) Here is a free online tool that emulates the Lynx browser, and it is very hands-free: http://www.delorie.com/web/lynxview.html

The only catch is that you would need to upload a special file called delorie.htm or delorie.gif on your web server to prove your ownership of the respective website. The file can be empty.

While it is perfectly understandable that the webmaster has done this to prevent abuse of his resources, I wish he had made it clear on the very first page.

b) Web Developer Toolbar for Firefox: With the help of this Firefox addon, you can see your site just as you would through Lynx. Here is how: you can disable all non-textual content such as images, CSS, cookies, JavaScript, forms, popups, etc., from showing up in your browser (I don’t see a way to disable Flash content, but since I don’t use Flash on my websites, I have never felt the lack of it)!

What remains at the end is text content! Once you are happy with your website’s layout, you can re-enable all the disabled options! ;)

I have used this addon with good results. I had originally downloaded it as an alternative to Lynx (I am terrified of installing any more software programs on my machine since there is already a boatload of them), but now I love it for more than one reason. Download it and you will know why! ;)

This (along with Firebug) is probably the best friend of a web designer or programmer! It should be in everyone’s toolbox, irrespective of whether you are an expert web developer or a plain Nuttie Guru like me! :D
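If you would rather script a rough "text browser" check yourself, here is a minimal Python sketch that keeps only the text of a page and throws away scripts and styles. It is not Lynx and certainly not what Google runs; the class name TextOnly, the function text_view and the sample page (including the intro.swf file) are all made up for illustration.

```python
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Keeps text nodes and drops anything inside <script> or <style>."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and not self._skip_depth:
            self.chunks.append(text)


def text_view(html):
    """Return the text content of the page, one chunk per line."""
    parser = TextOnly()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)


# Hypothetical page with a script and an embedded Flash movie.
sample = """<html><head><title>Demo</title><script>var x = 1;</script></head>
<body><h1>Welcome</h1><object data="intro.swf"></object><p>Plain text here.</p></body></html>"""
print(text_view(sample))
```

Notice that nothing from the embedded Flash object survives in the output, which is exactly why a text equivalent is a good idea.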

Q4: What are canonical URLs and what is their importance in SEO?

A4: A canonical URL is the ONE preferred URL you pick out of the several URLs that lead to the same page of your website. Let me give you some examples:

The following URLs may not look different to you:
 
www.flexiblewriter.com
http://flexiblewriter.com
http://flexiblewriter.com/index.php
www.flexiblewriter.com/index.php

But search engines DO differentiate between them! Search engines, unlike humans, see them as four different URLs! When building your site links, it is important to decide on your site’s canonical URL (this in turn would influence your website’s internal linking structure as well as the links in your Google sitemap). For example, if you use links starting with http://yourdomain.com on your site, don’t expect the search engines to index your site as http://www.yourdomain.com; the reverse is also true!

It is also important that you remain consistent with your choice of canonical URL. For example, if you have decided to go without the "www" part, stay with that choice. Don’t have 50% of your site’s links start with "http://domain.com" and the other 50% with "http://www.domain.com"!
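To make the idea concrete, here is a small Python sketch that takes the four example URLs from above and squeezes them into one canonical form. The function name canonicalize and its rules are my own assumptions (no "www", and index.php treated as the same page as the root, which depends on how your server is set up), so treat it as an illustration and not as what search engines do.

```python
from urllib.parse import urlsplit, urlunsplit

VARIANTS = [
    "www.flexiblewriter.com",
    "http://flexiblewriter.com",
    "http://flexiblewriter.com/index.php",
    "www.flexiblewriter.com/index.php",
]


def canonicalize(url, use_www=False, index_files=("index.php", "index.html")):
    """Map URL variants of the same page onto one chosen form (assumed rules)."""
    if "://" not in url:
        url = "http://" + url  # assume http when no scheme is given
    parts = urlsplit(url)
    host = parts.netloc.lower()
    bare = host[4:] if host.startswith("www.") else host
    host = "www." + bare if use_www else bare
    path = parts.path or "/"
    for name in index_files:
        # Assumption: the index file serves the same page as its folder.
        if path.endswith("/" + name):
            path = path[: -len(name)]
            break
    return urlunsplit((parts.scheme, host, path, parts.query, ""))


print(len(set(VARIANTS)))                     # four different strings
print({canonicalize(u) for u in VARIANTS})    # one canonical URL
```

Pass use_www=True if you have settled on the "www" form instead; what matters is that you pick one and use it everywhere on your site.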

But what if you have made the "mistake" of not using the "www" part in your URLs and now want to have it? There is an easy way out. Create a 301 redirect using your website’s .htaccess file, so that all traffic (human and search engine spiders) coming to http://domain.com would be automatically redirected to http://www.domain.com.

Just open your website’s .htaccess file (it is usually located in the root folder of your site, but if you cannot see it there, just set your FTP program to show all the hidden files of your server) and add the following lines to it:

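Options +FollowSymLinks
RewriteEngine On
# Match requests for the bare domain only (dot escaped, case-insensitive)
RewriteCond %{HTTP_HOST} ^domain\.com$ [NC]
# Permanently redirect to the www host, keeping the requested path
RewriteRule ^(.*)$ http://www.domain.com/$1 [R=301,L]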

Obviously, you should replace domain.com with YOUR domain name!
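If the rewrite lines look like gibberish, here is a rough Python model of what they are meant to do: check the host, and if it is the bare domain, answer with a 301 pointing at the "www" version of the same path. This is only my simplified imitation, not Apache itself; the function name redirect_for is made up, the host pattern with an escaped dot and an anchored end is an assumption of this sketch, and query strings are ignored.

```python
import re

# Same idea as the RewriteCond: the bare domain, matched case-insensitively.
HOST_PATTERN = re.compile(r"^domain\.com$", re.IGNORECASE)


def redirect_for(host, path):
    """Return (status, location) if the rule would fire, else None."""
    if not HOST_PATTERN.search(host):
        return None
    # In an .htaccess file the rule pattern sees the path without its leading slash.
    captured = re.match(r"^(.*)$", path.lstrip("/")).group(1)
    return 301, "http://www.domain.com/" + captured


print(redirect_for("domain.com", "/articles/seo.html"))
print(redirect_for("www.domain.com", "/articles/seo.html"))
```

Requests that already use the "www" host fall through untouched, so the redirect cannot loop.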

In my opinion, PSPAD is one of the best editors for editing files such as .htaccess, .ini, etc.; it is what I mainly use for this purpose!

Once you are done editing, upload the .htaccess file back to your server, then test your redirection using this tool (note that I have not used the tool myself)!

Note that this type of redirection only works on Apache servers (the usual setup on Linux hosting) that have the mod_rewrite module enabled! If you are not sure, check with your web host first! Most standard web hosts do have mod_rewrite enabled by default!

 

Resources:

More redirection options can be found here

More information on canonical URLs can be found on Matt Cutts’s blog (for the uninitiated, if you care about Google, you should care about what Matt Cutts says, for in many ways he is the "human representative" of Google ;) )

So that’s it for now! I have received way more questions than I (and possibly you) can handle. On another note, I am trying hard to keep my articles short so as to make them less boring for you! ;)

Expect to receive answers to the other questions in my future article(s)!

BTW, I know this is irrelevant, but I have a plan of creating a "black hat" website soon, lol, using article rewriter software and all that. That is not what I plan to build my business on, which is one reason why the site will run under a pen name! I would not be doing it to make cash! It is just to get a little fun out of the monotonous IM work ;)

You might recall that in my previous article I said that "black hat seo methods, if used smartly, can still work! But unless you have a lot of time and money to spend on them, it is best to stick to white-hat SEO!" I still stand by it; if you need to make a few dollars fast then being a good guy/gal is your best option. Once you establish yourself as Dr. Jekyll, becoming Mr. Hyde won’t be much of a problem ;) (kidding)

If you didn’t get the analogy, read this story! :D

Anyway, if I ever get around to doing it, I will let you know of my results (short of mentioning my site’s domain name, that is ;) ), provided of course that you don’t unsubscribe from my list(s) in the meantime! :D

Now, how about a nice comment? :) (btw, you can also post your SEO questions in the comments section below and I will try to answer them IF I have the answer ;) )

6 Comments

  1. Dating Books

    that’s a highly useful article, particularly for webmasters who might not be aware of the importance of the canonical URL in their marketing efforts. i think most people probably stick to one form or the other out of habit without realizing the importance of doing that. but being aware of the issue can definitely help you avoid wasting some precious marketing effort when you might inadvertently go with a different URL for one reason or the other. – Stephen

  2. Teris

    Okay, I went to the free sitemap generator, entered my url of a site that is 3 yrs old, page rank 3, and it could not crawl it, says an error occurred. So, I went to submit a ticket, and went to the forum and found a whole lot of other people who had big problems with this site. It may be free, but it’s not very reliable, maybe find another source. Darn..was looking forward to getting a good site map.

    1. Arindam

      Hi Teris,

      Not sure. Could be a temporary server problem on their or your end? Their tool works because just this morning I generated a sitemap for one of my sites.

      I would suggest u try again later. Of course there are several other tools that Google recommends, but I have not used any of them except the free tool I talk about, and I have been using it much before the big G started recommending it ;)

      May I know the inputs you used to build the sitemap? Did u select automatic priority?

  3. Chandan

    Thanks for the great article Arindam. I have a doubt. If I use my canonical url without www, will it affect my page ranking? I am a subscriber of Anjela and for backlinks I am using http://mydomain.com format. Can I continue with this format or should I change it to http://www.mydomain.com. Hope you will help me..

    1. Arindam

      No it is fine. Just stick with it. If you ever do change your mind you know what to do, hehe :D I seldom use www. part myself, because there is always a chance to forget it. But you cannot forget to add the http:// to ur link part because without it your links would simply not function ;)

      EDIT: I thought you were asking about onsite SEO. The above article is fine for onsite seo. But when u r building backlinks (offsite seo), I do recommend using variations of the URL to make it look all natural. Here is an article on that topic:

      http://flexiblewriter.com/in-link-building-sometimes-it-pays-to-be-different

      So, for onsite seo, just stick with one canonical URL: either http:// or http://www. But for offsite SEO, I would vary it. This makes the link building look all natural. I must confess tho, I use the http://domain.com format more often than http://www.domain.com. ;)

      There is however no hard and fast rule in SEO. I am just telling you what *I* do. Feel free to do your own experiments as you may have completely different results! :)

  4. Post Room sms

    When I entered my sitemaps in Google Webmaster Tools, initially everything was fine. When I later returned to check, I had an error on my sitemap. This was due to a canonical URL conflict, as the pages in my site varied and were not consistent with one format. This confirms what you have mentioned in your article.