Internet Technology Lessons for SEO - Fetching Website Content from a Web Server
Okay, once the DNS system holds the name server records for a website, any time a user visits that website, the ISP's DNS resolver can find out where to fetch the content -- the name server records point to the name servers of the website's hosting company, and those name servers supply the website's IP address.
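As a rough illustration of that lookup step, the sketch below asks the operating system's resolver (which normally forwards the question to your ISP's DNS server) for the IP addresses behind a host name. The helper name `resolve` is made up for this example, and the result depends entirely on your own network and resolver settings.

```python
import socket

def resolve(hostname):
    """Return the sorted, de-duplicated IP addresses the local resolver
    reports for a host name (hypothetical helper for illustration)."""
    # getaddrinfo returns one tuple per address/socket-type combination;
    # the address string is the first element of the sockaddr field.
    infos = socket.getaddrinfo(hostname, None)
    return sorted({info[4][0] for info in infos})

if __name__ == "__main__":
    # www.example.com is only a placeholder domain.
    print(resolve("www.example.com"))
```

Passing a plain IP address instead of a host name simply returns that address, since there is nothing left to resolve.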
The browser will then communicate with the website's hosting server, asking for content. If the requested page is found on the server, the server returns a 200 OK status along with the content -- otherwise, if it's not found, it returns a 404 Not Found status. If a server error occurs, it returns a 500 Internal Server Error status. You might have seen these common errors when visiting websites.
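If you want to see these status codes for yourself, here is a minimal sketch using only Python's standard library. The function names `describe_status` and `fetch_status` are invented for this example, and the URL in the usage line is a placeholder, not a site discussed in this article.

```python
from http import HTTPStatus
import urllib.error
import urllib.request

def describe_status(code):
    """Turn a numeric status code into text such as '404 Not Found'."""
    status = HTTPStatus(code)
    return f"{status.value} {status.phrase}"

def fetch_status(url, timeout=10):
    """Request a URL and return the HTTP status code the server sent back."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises an exception for 4xx and 5xx responses;
        # the status code is still available on the exception object.
        return err.code

if __name__ == "__main__":
    print(describe_status(fetch_status("http://www.example.com/")))
```

Note that `fetch_status` only handles HTTP-level errors; a failed DNS lookup or a refused connection raises a different exception, because in that case no status code was ever returned.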
Once you receive the content, your client browser will then render the HTML source code (received from the website hosting server) so you can read the content properly.
A web server can also be configured to provide content depending on the requesting party. The requesting party can be identified by its IP address (and by the host name that a reverse DNS lookup returns for that address). This is why, when a website is hacked, a website administrator can often trace the attacker by looking at the server logs and picking out the IP addresses associated with malicious activity.
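To give an idea of what "looking at the server logs" can mean in practice, the sketch below counts requests per IP address. It assumes the log uses the widely used Common Log Format, where the client IP is the first field on each line; your server's format may differ. The sample lines are made up, and the addresses come from ranges reserved for documentation.

```python
from collections import Counter

def count_requests_by_ip(log_lines):
    """Count how many requests each client IP made.

    Assumes the client IP is the first whitespace-separated field,
    as in the Common Log Format.
    """
    counts = Counter()
    for line in log_lines:
        fields = line.split()
        if fields:  # skip blank lines
            counts[fields[0]] += 1
    return counts

# Hypothetical log entries, for illustration only.
sample_log = [
    '203.0.113.9 - - [10/Oct/2010:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326',
    '203.0.113.9 - - [10/Oct/2010:13:55:37 +0000] "POST /admin/login.php HTTP/1.1" 404 512',
    '198.51.100.4 - - [10/Oct/2010:13:56:01 +0000] "GET /about.html HTTP/1.1" 200 1804',
]

if __name__ == "__main__":
    for ip, hits in count_requests_by_ip(sample_log).most_common():
        print(ip, hits)
```

An address with an unusually high number of requests, or with many requests to pages that do not exist, is a reasonable place to start investigating, though it is not proof of an attack on its own.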
One abusive SEO practice is known as "cloaking." It involves serving different content depending on who is making the request. This is common on hacked websites -- a hacker may configure a website to return Viagra-related content if the requester is Googlebot, but otherwise present normal content to normal visitors.
Luckily, you can diagnose this kind of problem by using "Fetch as Googlebot" in Google Webmaster Tools. It simulates what Google sees by making Googlebot visit your website server and then fetch content so that you can analyze whether it has been altered or not.
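For a quick local check, you can also request the same page twice, once with a browser-style User-Agent header and once with Googlebot's, and compare the results. The sketch below is only a rough approximation: cloaking scripts that look at the requester's IP address rather than its User-Agent will not be fooled by it, and pages with rotating ads or timestamps will differ slightly even when nothing is wrong. The function names and the browser User-Agent string are assumptions made for this example.

```python
import difflib
import urllib.request

GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
BROWSER_UA = "Mozilla/5.0 (X11; Linux x86_64)"  # simplified placeholder string

def fetch_as(url, user_agent, timeout=10):
    """Fetch a page while sending the given User-Agent header."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")

def similarity(text_a, text_b):
    """Return a score from 0.0 (nothing in common) to 1.0 (identical)."""
    return difflib.SequenceMatcher(None, text_a, text_b).ratio()

if __name__ == "__main__":
    url = "http://www.example.com/"  # placeholder; use your own site
    score = similarity(fetch_as(url, BROWSER_UA), fetch_as(url, GOOGLEBOT_UA))
    print(f"Similarity between the two versions: {score:.2f}")
```

A score well below 1.0 does not prove cloaking, but it is a good reason to look at the two versions side by side.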
Below is the final diagram depicting how the Internet works, from client request to web server, with the DNS resolving the domain names into IP addresses. It also shows the involvement of Googlebot in the process:
Google has its own DNS servers (http://code.google.com/speed/public-dns/docs/intro.html), which are now publicly available. You can even use them to replace your own ISP DNS server for a faster browsing experience, especially if your ISP DNS server is loaded and appears slow.
What are Class C IP Blocks?
Sometimes, in a shared hosting environment, a lot of websites are hosted on a single Class C IP block. For example, for the IP address 209.35.17.17, the Class C IP block is 209.35.17.
If 1000+ backlinks come from websites whose IP addresses all share the same first three numbers, the search engines see that they all originate from a single Class C IP address block, and therefore that series of backlinks can only count as one instead of 1000+.
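To check how many distinct Class C blocks a set of backlinks really comes from, you can strip the last number from each IP address and count what remains. This is a minimal sketch; the function names are made up for this example, and the addresses other than 209.35.17.17 are invented to illustrate the idea.

```python
import ipaddress

def class_c_block(ip):
    """Return the first three numbers of an IPv4 address, e.g. '209.35.17'."""
    # IPv4Address validates the input and raises ValueError if it is malformed.
    address = ipaddress.IPv4Address(ip)
    return ".".join(str(address).split(".")[:3])

def count_unique_blocks(ips):
    """Count how many different Class C blocks a list of IP addresses covers."""
    return len({class_c_block(ip) for ip in ips})

if __name__ == "__main__":
    # Hypothetical IP addresses of sites linking to you.
    backlink_ips = ["209.35.17.17", "209.35.17.18", "209.35.17.200", "209.35.18.5"]
    print(class_c_block("209.35.17.17"))
    print(count_unique_blocks(backlink_ips))
```

In this made-up list, four linking sites cover only two blocks, which is the situation described above on a smaller scale.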
It is very expensive to host a lot of websites on hosting servers with different Class C addresses, as you will discover if you look at some hosting packages: http://www.page1hosting.com/packages.html
I hope you've found my introduction to Internet technology helpful to your understanding of SEO.