
Identical URLs in Google










Posted: August 3, 2003 11:20 PM

Importance: Low

A member describes a case of near-identical URLs found in Google search results. GoogleGuy responds.

GoogleGuy Says:

Interesting case, killroy. Thanks for passing it back to me via stickymail. I think the difference is that one url has a trailing slash and one url doesn't.
I can practically hear folks asking "But isn't www.foo.com/path the same as www.foo.com/path/"? In practice, they almost always are the same, but technically according to the HTTP standards I don't think that they have to be the same.

I've got a few minutes free, so let's go into detective mode for a bit. Most webservers are configured to append the "/" automatically via a 301 redirect. For example, if you try to fetch www.google.com/webmasters, our web server will do a permanent 301 redirect to the canonical page, which is www.google.com/webmasters/ (note the trailing slash).

Just to illustrate the point, let's use the same imitate-the-browser-using-telnet technique that I posted about in http://www.webmasterworld.com/forum3/15750.htm. It's a really good debugging technique. What actually happens when you request a directory without the trailing slash looks like this:


telnet www.google.com 80
Trying 216.239.33.99...
Connected to www.google.com (216.239.33.99).
Escape character is '^]'.
GET /webmasters HTTP/1.0
HTTP/1.0 301 Moved Permanently
Connection: Keep-Alive
Date: Sun, 03 Aug 2003 22:11:43 GMT
...
Location: http://www.google.com/webmasters/
Content-Type: text/html
Server: GWS/2.1
Content-length: 163

301 Moved
301 Moved
The document has moved here.



So the server basically said "Instead of fetching this page, try it again with a trailing slash." That's why it's ever-so-slightly faster if you go to "www.webmasterworld.com/forum3/" instead of "www.webmasterworld.com/forum3"--because your browser doesn't have to get the redirect and do another fetch of the new url.
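
For anyone who would rather script the telnet session than type it by hand, here is a minimal Python sketch of the same idea: open a socket, send a bare GET, and look at the status code and the "Location:" header without following the redirect. The function names (build_request, parse_head, fetch_head) are made up for this illustration, and the Host header is an addition that the telnet transcript above does not send; many servers today expect it.

```python
import socket


def build_request(host: str, path: str) -> bytes:
    """Build a bare HTTP/1.0 GET like the one typed into telnet.

    The Host header is an assumption added for this sketch; the
    original transcript sends only the GET line.
    """
    return f"GET {path} HTTP/1.0\r\nHost: {host}\r\n\r\n".encode("ascii")


def parse_head(raw: bytes):
    """Return (status_code, headers) from the start of a raw HTTP response."""
    head = raw.split(b"\r\n\r\n", 1)[0].decode("iso-8859-1")
    lines = head.split("\r\n")
    status_code = int(lines[0].split()[1])  # e.g. "HTTP/1.0 301 Moved ..." -> 301
    headers = {}
    for line in lines[1:]:
        name, sep, value = line.partition(":")
        if sep:  # skip anything that is not a "Name: value" line
            headers[name.strip().lower()] = value.strip()
    return status_code, headers


def fetch_head(host: str, path: str, port: int = 80, timeout: float = 10.0):
    """Send the request over a plain socket and parse the response headers."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(build_request(host, path))
        raw = b""
        while b"\r\n\r\n" not in raw:  # read until the end of the headers
            chunk = sock.recv(4096)
            if not chunk:
                break
            raw += chunk
    return parse_head(raw)


# Example (needs network access; what a live server returns may differ):
#   status, headers = fetch_head("www.google.com", "/webmasters")
#   print(status, headers.get("location"))
```

Because nothing here follows the redirect, a 301 shows up as a 301, which is exactly what you want when debugging this kind of problem.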

So to make a long story not quite as long, I noticed that the webserver for this domain returns a 301, but it looks like it doesn't add the trailing slash correctly in either the "Location:" field in the HTTP headers or in the text of the page. So that's the main thing I'd check on your web server.
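
The check described above can be sketched in a few lines of Python: given the url that was requested and the "Location:" value the server sent back, decide whether the redirect really points at the same url plus a trailing slash. The helper names are hypothetical, and this is only one reasonable way to compare the two; it treats a relative Location value as relative to the requested url.

```python
from urllib.parse import urljoin, urlsplit


def expected_slash_url(requested_url: str) -> str:
    """Return the requested url with a trailing slash added to its path."""
    parts = urlsplit(requested_url)
    path = parts.path if parts.path.endswith("/") else parts.path + "/"
    return parts._replace(path=path).geturl()


def redirect_adds_slash(requested_url: str, location: str) -> bool:
    """True if the Location value is the requested url plus a trailing slash."""
    target = urljoin(requested_url, location)  # resolves a relative Location
    return target == expected_slash_url(requested_url)


# A server that redirects /path to /path (no slash added) fails the check:
#   redirect_adds_slash("http://www.foo.com/path", "http://www.foo.com/path")
```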

On the other hand, even if we get duplicate content for two nearly identical urls, we have heuristics that normally detect that sort of thing. That's why the search collapses those two urls together unless you do "&filter=0". So the duplicate content filter was cleaning things up in this case. I think if you switch the webserver to do the 301 to the trailing-slash url, you should be in good shape in the future too.
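
To make the suggested server-side behavior concrete, here is a toy sketch using Python's standard http.server module. It is not how any particular production webserver is configured; the DIRECTORIES set, the redirect_target helper, and the handler class are all hypothetical names invented for this example. The idea is simply that a request for a known directory without the trailing slash gets a 301 whose "Location:" header carries the slash.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Optional

# Hypothetical list of paths that are directories on this toy server.
DIRECTORIES = {"/path", "/forum3"}


def redirect_target(path: str, host: str) -> Optional[str]:
    """Return the trailing-slash url to redirect to, or None if no redirect."""
    if path in DIRECTORIES:
        return f"http://{host}{path}/"
    return None


class SlashRedirectHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        host = self.headers.get("Host", "localhost")
        target = redirect_target(self.path, host)
        if target is not None:
            # Permanent redirect to the canonical trailing-slash url.
            self.send_response(301)
            self.send_header("Location", target)
            self.send_header("Content-Length", "0")
            self.end_headers()
            return
        body = b"ok\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


# To try it locally (port 8000 is an arbitrary choice):
#   HTTPServer(("127.0.0.1", 8000), SlashRedirectHandler).serve_forever()
```

Keeping the decision in a small pure function makes it easy to check the redirect logic without starting a server.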


