Posts Tagged ‘ search engines ’

Although extremely hard to pronounce, canonicalization is a hot topic right now. Google's latest and greatest idea, canonicalization is the process of consolidating all duplicate URLs into one original, canonical version. If a lot of URLs lead to pretty much the same page, you make the search engines work extra hard and spend far more time crawling all the different URLs. Oftentimes, this means that they'll miss the important pages of your website because the time they spend crawling your site is limited.

Here are some times when canonicalization is an issue:

1. When you have not redirected www and non-www versions of the website and they both resolve
Both versions serve the same content, but you need a 301 (permanent) redirect from one to the other in order to deliver the best possible results. Without this simple server redirect in place, you basically have two websites that will be indexed by the search engines, which spells bad news for your results.
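Here's a rough sketch of the www/non-www redirect decision in Python. It assumes the www version is the one you want to keep; CANONICAL_HOST and the function name are made up for illustration, and on a real site this logic normally lives in your web server configuration rather than in application code.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption: the www host is the preferred (canonical) version of the site.
CANONICAL_HOST = "www.example.com"


def canonical_redirect(url):
    """Return (301, target_url) if the request should be redirected, else None."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host == CANONICAL_HOST:
        return None  # already on the canonical host
    # Only fold the bare domain into the www host; leave other subdomains alone.
    if "www." + host == CANONICAL_HOST:
        target = urlunsplit(
            (parts.scheme, CANONICAL_HOST, parts.path, parts.query, parts.fragment)
        )
        return 301, target
    return None
```

A request for http://example.com/shoes would get a 301 pointing at the same path on www.example.com, while requests that already use the www host pass through untouched.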

2. You've changed your URL structure, but your content still exists at both the new and old URLs
Obviously, you don't want to lose the links and visitors that still point to the old URLs. In this situation, a 301 redirect is usually the best way to send that traffic to the new location: both web browsers and search engine spiders follow it, and it tells the search engines that the move is permanent. (A 302 redirect signals only a temporary move, so the search engines may keep the old URL in their index.)

3. Your URL structure generates infinite URLs
If you have a dynamically generated URL structure that could generate an infinite number of URLs, you'll be in trouble! This generally happens on large e-commerce websites that have tons of product listings that can be sorted by price, size, closest to you, color, etc. If the website generates a different URL for each of these results, the same content ends up under many different addresses. Most often, it is set up this way so that your marketing department can add a tracking code to the URL to keep track of its campaigns.

Here's an example: let's say you have a new shoe campaign and your marketing department is sending out direct mail pieces and running an email marketing campaign and a blogger relationship database, on top of ordinary search engine traffic. If the email URL is "www.example.com/email", the direct mail URL is "www.example.com/dmail", and so on, then you end up with a bunch of URLs for the same content.

If a spider suspects that the page can load with infinite URL variations, it can fall into a "spider trap" and stop indexing your website. Since the spiders have limited resources for crawling your website, important content may be left uncrawled. When this happens, it's a great idea to use a canonical link element (rel="canonical") or the parameter handling tool in Google Webmaster Tools.
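To make the canonical idea concrete, here is a small Python sketch that strips tracking and sorting parameters from a URL and builds the matching canonical link element. The parameter names in IGNORED_PARAMS are hypothetical; swap in whatever your own site and marketing department actually use.

```python
from html import escape
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical parameter names that change the URL but not the content.
IGNORED_PARAMS = {"campaign", "sort", "color", "size"}


def canonical_url(url):
    """Drop ignored query parameters and order the rest consistently."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in IGNORED_PARAMS
    ]
    kept.sort()  # same parameters in any order map to one URL
    return urlunsplit(
        (parts.scheme, parts.netloc.lower(), parts.path, urlencode(kept), "")
    )


def canonical_link_tag(url):
    """Build the link element to place in the page's head section."""
    return '<link rel="canonical" href="%s">' % escape(canonical_url(url), quote=True)
```

Every campaign or sort variation of a product page then points the search engines at a single address instead of dozens.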

4. Your pages are blocked by the robots exclusion file
As you probably know, the robots.txt exclusion file helps you block search engines from crawling the parts of your website that you don't want them to index. While it's good practice to use this file on occasion, it's very easy to accidentally block the spiders from pages that are relevant and helpful. If your website isn't being properly indexed, this is the first place I'd look.

About the Author: Andrew Hallinan is the owner of Tampa search engine optimization company, and is Tampa Bay's leading Search Marketing Specialist. Andrew Hallinan has more free tips and advice at his blog.

Could you imagine what it would be like if you could make Google bow to your every whim and desire? What if you could be at the top of the search engines for every search term that you wanted to be on top for – without spending a dime on PPC? You could rule the business and informational world! Alas, we all bow down to Mother Google and Queen Yahoo!, but there are some ways that we can direct the search giants (including Bing) in how they view your website and what functions to perform while visiting. This article will discuss the fine points of your robots.txt file, your robots meta tag, and even the new nofollow HTML attribute.

1. Don't use the Whitehouse.gov's website as an example for your robots.txt file

Nowadays, if you don't disallow the search engines from indexing your website, they will default to indexing it. There's no real need to "allow" Google, Yahoo!, or Bing to access your website: they will by default.

If, however, you wish to keep the robots off of your website (maybe during development, a redesign, or whatever), use the following in your robots.txt file:

User-agent: Googlebot
Disallow: /

This will block Google from indexing your entire website.

To block all of the search engines:

User-agent: *
Disallow: /

Now, in some cases you'll want to block the search engines from indexing only a specific directory or subdirectory on your website. Here's what that'll look like in your robots.txt file:

User-agent: *
Disallow: /yoursubdirectorynamehere

Make sense?
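If you'd like to double-check what a set of rules actually blocks before you upload it, Python's standard library includes a robots.txt parser. This is just a sketch: the /private directory and the example.com URLs are placeholders, and real crawlers may interpret edge cases differently than this parser does.

```python
from urllib.robotparser import RobotFileParser

# Placeholder rules in the same shape as the examples above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if the given rules let this user agent fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)
```

Calling is_allowed(ROBOTS_TXT, "Googlebot", "http://www.example.com/private/report.html") returns False, while a URL outside that directory returns True.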

Now, here's a hint: if I were you, I'd create "allow" and "no_allow" folders on my server before developing my application. This doesn't necessarily benefit the search engines, but the alternative is a 200 line robots.txt file like the stupid one that used to live on Whitehouse.gov. (http://web.archive.org/web/20070217205444/www.whitehouse.gov/robots.txt)

2. Disallowing a cached version of your website

If you don't want the search engines to send a searcher to their stored, cached version of your website, you would use the following meta tag:

<meta name="robots" content="noarchive">

Generally, webmasters use this when a page is constantly updated and refreshed, and the previous copy that might have been stored on the search engine is no longer applicable to the searcher.
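If you want to confirm which robots directives a page is really sending, a few lines of Python can read them back out of the HTML. This is a sketch using only the standard library; it assumes the directives live in a meta tag named "robots" (for example the standard noarchive value) and ignores engine-specific tags and HTTP headers.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives found in <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            self.directives.update(
                part.strip().lower() for part in content.split(",") if part.strip()
            )


def robots_directives(html):
    """Return the set of robots meta directives present in an HTML document."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives
```

Checking "noarchive" in robots_directives(page_html) tells you whether the cached copy has been switched off for that page.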

3. Controlling Snippets
I bet you don't know what a snippet is, do you? A snippet is the couple of lines of text displayed under your website's title in the search results. Sometimes it comes from your meta description tag, but oftentimes it comes from a passage of your page that Google deems relevant. When Google produces these snippets, people tend to click your link more often because they can see how relevant the website will be.

However, there are times when you simply want Google to obey and stop generating its own snippet for your listing. For example, if you are a newspaper and your website is updated several times a day, Google may not have time to re-index your website, and so outdated snippets would be detrimental to your click-through rates.

In that case, here's what the meta tag would look like:

<meta name="robots" content="nosnippet">

About the Author: Andrew Hallinan is the owner of Tampa search engine optimization company, and is Tampa Bay's leading Search Marketing Specialist. Andrew Hallinan has more free tips and advice at his blog.

Google, Yahoo!, and Bing all have heavy-duty ways to crawl each and every website on the internet. This is a huge job and it takes up tons of resources, which is why they want to make sure that they don't "over" crawl any one website. They simply don't want to overburden their already resource-intensive crawls. For that reason, most search engines spend only a limited amount of time crawling any one website. Here are some factors that can influence your crawl allocation:

1. Server response times
The search engines, with Google leading the pack, are trying now more than ever to increase the speed of the internet as a whole. If your server is slower than your competition's, and your website responds to requests too slowly, the search engine spiders may slow their crawls of your website down to make sure that they are not overloading the server.

2. Page load times
It's simple, really – the faster your individual pages load, the more pages of your website the spiders can crawl! If you have a 100,000 page website and the crawler takes a second per page, that's way too long. You can actually monitor your own page load times in your Google Webmaster Tools accounts.
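To put a number on "way too long", here's a back-of-the-envelope calculation in Python. It assumes the crawler fetches one page at a time with no pauses, which is a simplification; real crawlers fetch in parallel and also throttle themselves.

```python
def full_crawl_hours(page_count, seconds_per_page):
    """Hours needed to fetch every page once, one page at a time."""
    return page_count * seconds_per_page / 3600


# 100,000 pages at one second each is roughly 27.8 hours for a single pass.
slow_site = full_crawl_hours(100_000, 1.0)

# At a quarter of a second per page, the same pass takes about 6.9 hours.
fast_site = full_crawl_hours(100_000, 0.25)
```

Since the spiders only give your site a limited slice of time, the faster site simply gets more of its pages seen.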

3. Your content
Your content MUST be unique. Autoblogs, automatic RSS feeds, and other forms of dynamically distributed content are great, but if they are the only way your website gets traffic, you'll never dominate the search engines. You must have unique content that is relevant for the searcher and the search phrase. If there is too much duplicate content, or you have too many pages with thin content, the search engines won't crawl your website very often.

4. URLs, redirects, and missing pages
For whatever reason, there can be issues with the crawler crawling your website. It can get stuck in a redirect loop or have any other number of problems with crawling your website. You can view your crawl report and diagnose/troubleshoot problems from your Google Webmaster Tools account. Chances are pretty good that if Google has had problems crawling your website, Yahoo! and Bing will have problems crawling as well.

5. Server efficiency
You can reduce the server resources that the spiders consume by serving compressed files and supporting If-Modified-Since requests on the server. This is a great way to reduce your bandwidth. It isn't a concern for small websites, but for a website with 100,000 pages or unique products, the bandwidth can be very costly. If your server honors the If-Modified-Since header, it will return a 304 (Not Modified) response to the bot when it requests a web page that has not been modified since the last time its contents were indexed. You can find out much more about this by visiting http://janeandrobot.com/library/managing-robots-access-to-your-website
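Here's a minimal sketch of the If-Modified-Since decision in Python, in case you're curious what the server is doing. Most web servers handle this for static files on their own; the function name is made up, and it assumes you can supply each page's last modification time as a timezone-aware datetime.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def conditional_status(last_modified, if_modified_since):
    """Return 304 if the page is unchanged since the header's date, else 200."""
    if not if_modified_since:
        return 200  # no conditional header: send the full page
    try:
        since = parsedate_to_datetime(if_modified_since)
    except (TypeError, ValueError):
        return 200  # unreadable header: play it safe and send the full page
    if since.tzinfo is None:
        since = since.replace(tzinfo=timezone.utc)
    # HTTP dates only have one-second resolution, so drop microseconds.
    unchanged = last_modified.replace(microsecond=0) <= since
    return 304 if unchanged else 200
```

A 304 response carries no page body, which is where the bandwidth savings come from.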

6. Bot efficiency
You can adjust the crawl times of both Bing's and Yahoo!'s bots by using a crawl-delay setting in your robots.txt file. If either of them seems to crawl too slowly, check whether such an entry exists in this file. Google's own crawl speed can also be a good indicator of whether the other bots are crawling too slowly.
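To see whether a crawl-delay entry is already lurking in your robots.txt, you can read it back with Python's built-in parser. This is only a sketch: "Slurp" is the name Yahoo!'s crawler identifies itself with, and the 5 second value is just an example.

```python
from urllib.robotparser import RobotFileParser

# Example file: slow Yahoo!'s crawler down, leave everyone else unrestricted.
ROBOTS_TXT = """\
User-agent: Slurp
Crawl-delay: 5

User-agent: *
Disallow:
"""


def crawl_delay_for(robots_txt, user_agent):
    """Return the crawl delay in seconds for a user agent, or None if unset."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.crawl_delay(user_agent)
```

With the file above, crawl_delay_for(ROBOTS_TXT, "Slurp") returns 5, and a bot with no delay configured gets None.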

About the Author: Andrew Hallinan is the owner of Tampa search engine optimization company, and is Tampa Bay's leading Search Marketing Specialist. Andrew Hallinan has more free tips and advice at his blog.

What is SEO and why is it so important?

SEO is an abbreviation for search engine optimization. It describes the process of redesigning your website so that it appears higher in search engine rankings. Once a website has been optimized, it can bring much greater success to the site and the business.

You can't assume that a beautiful, eye-catching website will automatically be picked up by an SE (search engine). Your website can be pumped full of attractive content, whether it be video, photos or audio files. However, it is all useless if the world's most important web 'spiders' just see straight past it!

Benefits of SEO:

SEO provides numerous benefits:

Your website will be easily found by users – there is no point in having a website if nobody can find it.

A website that's optimised for SE's is also optimised for users. Some of the most basic aspects of SE optimisation include:

Finding words and phrases (keywords) that Internet users are looking for when they type something into Google. Target the wrong keywords and your efforts will be in vain.

Placing these important words and phrases in prominent places on each page, for example page title, headings, and links.

Writing good, efficient HTML code that SE's such as Google can easily search through.

Providing good content that other websites will then want to link to.

This article was submitted by Dom Bracher, a copywriter for Somerset SEO Experts Blaze Concepts

To achieve any goal, you first need to know everything there is to know about your goal and its influencing environment. That way you can formulate a plan of attack that will work, and avoid time-wasting activities that will not. This applies to everything: running a business, waging a war, winning a race, and of course, marketing your web site. Information is power.

Here, we will focus on some interesting facts on search engines and the Web. We shall see how we can use these facts to promote our sites through the search engines more effectively. Of course, there are many more ways you can market your web site, but the most effective both in results and in costs is getting included in the search engines and getting a good rank in searches for your products or services. That's because getting listed in a search engine is free, but if you are placed well, the traffic from an engine is literally what will feed you. Search engines are the most popular tools that web users use to find new information on the Net.

- General Facts

Forrester Research estimates that there are 500 to 600 million pages on the Internet. That number is growing fast. However, the largest search engine, AltaVista, only has about 150 million pages indexed (about 27% of the whole Web), with Excite and Lycos at only about 50 million indexed (about 10% of the whole Web)! From September 1996 until September 1997, none of the search engines increased size significantly, despite the fact that the web continued to grow! To a webmaster, these are shocking statistics! The two main reasons why relatively few pages are indexed are (1) the Web is growing faster than the engines can keep up with, and (2) many webmasters do not know how to design and submit their pages correctly. Getting and staying indexed well in a search engine needs a little more work than most people assume it does. You need a four-step approach.

The first thing you need to do is make sure that all your web pages can be reached from your home page within three clicks. Most engines will only crawl to three levels deep when indexing your site. Also, make sure all your pages have TITLE tags and META description and keyword tags as most engines now use these. It is also highly advisable to have META category, language, and robot revisit tags, and ALT tags on all your images. Don't just slap these into your pages. Put some thought into them. For example, the text in the TITLE tag for a particular page should start with a word that summarizes the entire page (a keyword). Say you have a page that mostly has information on vacations in Cancun, Mexico. Your TITLE tag should read something like 'Cancun vacations, tours, and travels in Mexico. Packages include diving…' The word 'Cancun' starts the sentence, and the rest of the sentence is made up of keywords that are related to the content of the page. This goes a long way in getting you better rankings. Same thing with the META tag text. If you use frames on your site, make sure you use good NOFRAMES tags since not all major engines support frames. If you don't, your pages simply will not be indexed by those engines. If you use image maps, make sure you have a text links navigation bar somewhere on the same page too as not all major engines support image maps either. Quick note: the TITLE tag text should be only up to 200 characters long, with the first 80 characters being the most important as these are the ones most engines focus on in ranking and results display. Do not simply repeat keywords in the title tag. Make some grammatical sense out of the sentences but ensure that the keywords feature early and are not diluted by too many 'junk' words.
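The three-click rule above is easy to check with a breadth-first walk over your site's link structure. The sketch below assumes you already have a map of which page links to which (the links dictionary is a hypothetical input you would build from your own site); it reports every page that sits more than three clicks from the home page or cannot be reached at all.

```python
from collections import deque


def click_depths(links, home):
    """Breadth-first search: fewest clicks needed to reach each page from home."""
    depths = {home: 0}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


def too_deep(links, home, max_clicks=3):
    """List pages deeper than max_clicks, including pages with no path at all."""
    depths = click_depths(links, home)
    pages = set(links) | {t for targets in links.values() for t in targets}
    return sorted(p for p in pages if depths.get(p, max_clicks + 1) > max_clicks)
```

Any page this returns needs a link from somewhere closer to the home page.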

The second thing to do is to submit only your home page and perhaps one other major page and let the engines crawl your site. I will explain this in depth below. The only exception is Infoseek. Infoseek does not crawl so you must submit every page on your site to it manually.

Because the engines are so overwhelmed, you need a third step – you must monitor your submission and re-submit your home page every couple of weeks. The engine may have taken your submission but dropped it later (happens a lot with Excite), gone to your site and found it unavailable at the time, or just not indexed your site due to a technical error on their part. Resubmitting and checking on your submission every two weeks will ensure that you will eventually get in and stay in the index.

You also need to get as many people linking to your site as possible. Visit related sites and ask for a link to your site. There is a trend among the engines to increasingly use link popularity and traffic as an indicator of relevancy. What this means is that the more people link to your page relative to your competitors' pages, the higher you will rank on the engines. Not only will getting many incoming links get you a better rank on the engines, but it will also get you a lot of traffic (following links is the second most popular way people find new sites). Furthermore, on Excite, HotBot, and Lycos, link popularity also determines whether the engine will crawl deep into your site and index more pages or not. Do not ignore this fourth step, no matter how hard it sounds!

For the major engines, do not leave the submission process to automated programs and services. The major search engines are too important and the automated services sometimes do it wrong. You are only submitting the home page and one other major page to Excite, Lycos, AltaVista, Infoseek, Northern Light, and HotBot – that is not much work to do manually every two weeks!

- About Spamdexing

Because the search engines are so overwhelmed, they are coming up with more ways to make their job easier and weed out pages they feel are not worth indexing. One of the new developments is that most engines now insist or highly recommend that you only submit your home page to them and let the engine crawl through your site and index the pages it finds. If you decide to go against this recommendation and submit a whole bunch of pages through the online submission forms, you will risk being tagged as a "spamdexer" (index spammer). There is also an indication that engines like AltaVista give a higher ranking to crawled pages than submitted pages. So for your own interests, you want your pages crawled so that they have a higher score. Other engines like Excite will take the same amount of time to add your pages to their index whether you submit them manually or let it crawl to them from your home page. So not only will you be wasting your time submitting each and every page you have to Excite, but you will risk spamming that engine. Conclusion: submit only your home page and one other major page and let the engines crawl your site. The only exception is Infoseek. Infoseek does not crawl so you must submit every page on your site to it manually. You can make a list of URLs to your pages and email that to Infoseek if you have more than 50 pages you wish to submit (see their submission page for more details).

There are a few other things to watch out for to avoid having your pages excluded from the engines. The following are things that make an engine tag a particular page as spam and therefore not index it. Make sure that none of your pages has any of these.

1. Keyword stuffing. This is the repeated use of a word to increase its frequency on a page. Search engines have the ability to analyze a page and determine whether the frequency is above a "normal" level in proportion to the rest of the words in the document.

2. Invisible text. Some webmasters stuff keywords at the bottom of a page and make their text color the same as that of the page background. This is also detectable by the engines.

3. Tiny text. Same as invisible text but with tiny, illegible text.

4. Page redirects. Some engines, especially Infoseek, do not like pages that take the user to another page without his or her intervention, e.g. using META refresh tags, cgi scripts, Java, JavaScript, or server side techniques. If you use redirection, it should have a delay of about 7 seconds.

5. META tags stuffing. Do not repeat your keywords in the META tags more than 1 to 3 times, and do not use keywords that are unrelated to the content of your site.

6. Do not submit the same page more than once on the same day to the same search engine.

7. Do not submit virtually identical pages, i.e. do not simply duplicate a web page, give the copies different file names, and submit them all. That will be interpreted as an attempt to flood the engine.
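For item 1 in the list above, you can get a rough feel for keyword frequency yourself. The sketch below computes the share of a page's words taken up by a single keyword; how the engines actually measure this, and what level they consider "normal", is not public, so treat the result as a relative indicator only.

```python
import re
from collections import Counter


def keyword_density(text, keyword):
    """Fraction of the words in text that are exactly the given keyword."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if not words:
        return 0.0
    return Counter(words)[keyword.lower()] / len(words)
```

Comparing the figure for your page against a few well-ranked competitor pages is a simple way to spot a keyword that has been repeated far too often.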

Below are several useful facts and tips for each major search engine that you can use to improve your search engine marketing.

- AltaVista (www.altavista.com) Facts

Pages in index in millions: 150
Time it takes to index a submitted page: 1-2 days
Time it takes to index crawled pages (may take longer than indicated): 1 day to 1 month
How to check if your page is on the index: In the search box, type: '+url: yourcompany.com/yourpage.htm'.
How to check how many pages link to your site: In the search box, type: 'link:yourcompany.com'. You can narrow your search to a particular directory or page like: 'link:yourcompany.com/ourpage.htm'. To eliminate from the results all the pages within your own domain that link to each other, use the -url command like: 'link:yourcompany.com -url:yourcompany.com'
Supports frame pages: Yes
Supports image maps: Yes

- HotBot (www.hotbot.com) Facts

Pages in index in millions: 110
Time it takes to index a submitted page: 2 days to 2 weeks
Time it takes to index crawled pages (may take longer than indicated): About 2 weeks
How to check if your page is on the index: Select the advanced search options and enter your page's URL.
How to check how many pages link to your site: In the search box, type: 'linkdomain:yourcompany.com'. To eliminate from the results all the pages within your own domain that link to each other, use the -domain command like: 'linkdomain:yourcompany.com -domain:yourcompany.com'. These methods get you all the pages linking to your domain. To find the links to only a particular page, enter your URL into the search box, then choose the "links to this URL" option.
Supports frame pages: No
Supports image maps: No

- Infoseek (www.infoseek.com) Facts

Pages in index in millions: 75
Time it takes to index a submitted page: 1 day for pages submitted online, 7 days for email submissions.
Time it takes to index crawled pages (may take longer than indicated): Rarely spiders, if it does then 1 – 2 months
How to check if your page is on the index: In the search box, type: 'URL: http://www.yourcompany.com/page.htm'.
How to check how many pages link to your site: In the search box, type: 'link:yourcompany.com'. You can narrow your search to a particular directory or page like: 'link:yourcompany.com/ourpage.htm'. To eliminate from the results all the pages within your own domain that link to each other, use the -url command like: 'link:yourcompany.com -url:yourcompany.com'
Supports frame pages: No
Supports image maps: Yes

- Excite (www.excite.com) Facts

Pages in index in millions: 55
Time it takes to index a submitted page: About 2 weeks
Time it takes to index crawled pages (may take longer than indicated): Up to 6 weeks
How to check if your page is on the index: In the search box, type in the full URL of the page.
How to check how many pages link to your site: N/A
Supports frame pages: No
Supports image maps: No

- Lycos (www.lycos.com) Facts

Pages in index in millions: 50
Time it takes to index a submitted page: 2-4 weeks
Time it takes to index crawled pages (may take longer than indicated): 2-4 weeks
How to check if your page is on the index: Not available.
How to check how many pages link to your site: N/A
Supports frame pages: No (limited)
Supports image maps: No

- Yahoo! (www.yahoo.com) Facts

Yahoo is the most popular directory on the web. Many people have problems getting their site listed. A rough estimate is that only 1 out of every 10 submissions gets listed, if that. Moreover, it takes an estimated 4 to 15 weeks to be listed for those who actually get listed! Those who got listed had to resubmit their site an estimated 4 times over several weeks or months before getting listed (resubmitting too often is spamming, by the way). One thing is for sure: you must get into Yahoo! For some sites, Yahoo actually brings them over 50% of their business. By the way, Yahoo now has an express submission service whereby you pay $199 for a response to your submission within 7 days. It doesn't guarantee that you will be listed with them, but at least you get to know within 7 days whether you are in or, if not, why. Here is a set of links that you need to visit to learn how to successfully get into Yahoo.

http://help.yahoo.com/help/search/url/
http://howto.yahoo.com/chapters/10/1.html
http://searchenginewatch.com/sereport/9903-yahoo.html

About the Author
Article by David Gikandi (support@searchpositioning.com), of Search Positioning.com.

©2009 Virtual-Environments