Google search engine crawling experiment



Recently I ran an experiment with the way Google crawls a site. I had a client site that had not been spidered in spite of being submitted to Google a good while back. I looked at the site and saw nothing amiss. There was plenty of text on the page and everything looked good. I found their page in Google, but there was no cached text.


After a bit of searching I found others with similar problems, and the speculation was that the site had been banned for “invisible text”. Some developers have used that technique to load keywords into pages to take advantage of the search engines. Now, I hadn’t done anything of the sort, but I did write the pages in PHP and I had a few PHP comments to remind myself what I had done and why I had done it.

I removed the comment text, and within a few days the main index page was spidered with searchable text and the summary was showing up in Google. So I went about building a sitemap and submitting it to get the rest of the pages crawled. At that point the pages were all in the main directory with .php file extensions. I submitted the sitemap and waited and watched. A couple of weeks went by and the sub pages still hadn’t been crawled.

In that time, I had been posting frequently on this site and found Google spidering all over the place. WordPress uses easy-to-remember permalinks that look like directory paths, for instance http://www.averyjparker.com/2005/08/15/pcbsd-configuration/ instead of http://www.averyjparker.com/20050815pcbsd-configuration.php or something similar. Google seemed to like this, as it was spidering and caching the various posts with about a 2 or 3 day delay.

About this time I found an interesting writeup on search engine optimization, and specifically about Google. Among other things, the author noted that he had seen this same behavior. No one at Google could explain it to him, but it seemed that directories got spidered more quickly than specific document files. So contact.php, aboutus.php and skills.php might not get indexed before domain.com/contact/, domain.com/aboutus/ and domain.com/skills/.
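To make the two URL styles concrete, here is a minimal Python sketch of the mapping from a flat page URL to its directory-style equivalent. The function name `to_directory_url` is my own invention for illustration, and the rule it applies (only root-relative .php pages are rewritten, index.php is left alone) is an assumption, not something Google or the writeup specifies.

```python
import posixpath
from urllib.parse import urlsplit, urlunsplit


def to_directory_url(url: str) -> str:
    """Map a flat page URL (/contact.php) to a directory URL (/contact/).

    Hypothetical helper: URLs that do not end in .php, and index.php
    itself, are returned unchanged.
    """
    parts = urlsplit(url)
    directory, filename = posixpath.split(parts.path)
    stem, ext = posixpath.splitext(filename)
    if ext != ".php" or stem == "index":
        return url
    # The page name becomes a directory name with a trailing slash.
    new_path = posixpath.join(directory, stem) + "/"
    return urlunsplit(
        (parts.scheme, parts.netloc, new_path, parts.query, parts.fragment)
    )


for page in ("contact.php", "aboutus.php", "skills.php"):
    print(to_directory_url("http://domain.com/" + page))
```

Something like this can be handy for rewriting the entries of an existing sitemap or menu in one pass rather than by hand.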

I tested this theory out on the site in question. I moved each subpage of the site to its own directory and renamed it to index.php (so that it would display automatically when the directory was requested), and I updated my sitemap and the site’s menu to reflect that. Within 2 days the Googlebot had spidered each of the sub-directories, after a wait of several weeks with the old layout.
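I did the move by hand, but for a site with many pages the same reshuffle could be scripted. The following is a rough Python sketch, not what I actually ran. The function name `move_pages_to_directories` and the `site_root` argument are made up for illustration, and it assumes every page is a .php file sitting directly in the site's main directory.

```python
import shutil
from pathlib import Path
from typing import List


def move_pages_to_directories(site_root: str) -> List[str]:
    """Move each top-level page.php to page/index.php.

    Hypothetical helper. Returns the new directory-style paths
    (for example "/contact/") so they can go into the sitemap and menu.
    """
    root = Path(site_root)
    new_paths = []
    # sorted() takes a snapshot of the listing before any file is moved.
    for page in sorted(root.glob("*.php")):
        if page.stem == "index":
            continue  # the main index page stays where it is
        target_dir = root / page.stem
        target_dir.mkdir(exist_ok=True)
        target = target_dir / "index.php"
        if target.exists():
            continue  # never overwrite an existing index.php
        shutil.move(str(page), str(target))
        new_paths.append("/" + page.stem + "/")
    return new_paths
```

Calling `move_pages_to_directories("/path/to/site")` on a copy of the site first is a good idea. Any relative links, images or includes inside the moved pages will now resolve one directory deeper, so they may need adjusting after the move.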

So my best advice, if you’re eager to get Google spidering your site, is to plan on giving separate pages their own directories with relevant names.
