Internet Technology Lessons for SEO

This is an introductory lesson for new SEOs who are not familiar with how the Internet and websites work. An SEO with a clear understanding of Internet technology will find it easier to follow the technical processes involved in diagnosing SEO-related issues.

If you are ready, let’s get started.

Technical definition of Internet

Let’s start with the most basic lesson about Internet technology. The Internet is a network made up of a vast number of computers. The computers are connected either by wired links (cables) or by wireless links (radio antennas).

Each computer on the Internet is assigned an identifier that makes it unique in the network. This is called an IP address. An example of an IP address is 209.35.17.17. It is a numerical addressing convention that has existed since the birth of the Internet.
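To see why an IP address is described as numerical, here is a minimal Python sketch using the standard `ipaddress` module. It shows that the dotted form of the example address above is a readable way of writing a single 32-bit number. The function name `describe_ip` is made up for this illustration.

```python
import ipaddress

def describe_ip(text):
    """Return the four octets and the 32-bit integer behind an IPv4 address."""
    addr = ipaddress.IPv4Address(text)   # raises ValueError on a malformed address
    octets = [int(part) for part in text.split(".")]
    return octets, int(addr)

octets, as_integer = describe_ip("209.35.17.17")
print(octets)       # [209, 35, 17, 17]
print(as_integer)   # the same address written as one integer
```

Each of the four numbers fits in one byte (0 to 255), which is why no part of an IPv4 address can be larger than 255.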

Using an IP address, you can determine the following information:

  • The computer’s ISP (Internet service provider).
  • The computer’s approximate geographical location.

For example, you can use the following tool to determine your computer’s IP address on the Internet: http://www.whatismyip.com/

Once you have your IP address, you can input the result into http://ip-lookup.net/. That tool will tell you your ISP hostname and your country of origin.

This is also how Google can estimate the geographical location of a website’s server: it uses the server’s IP address.

So what is an ISP? An ISP’s main job is to provide Internet connections to its subscribers. In return, subscribers pay a recurring fee (usually monthly) to keep their connections.

To keep things fast, especially when a large number of subscribers are online at the same time, the ISP runs its own DNS (Domain Name System) server. This DNS server works as a “cache”: temporary storage for domain name to IP address mappings.

Since the Internet is a network of computers, each identified by its IP address, websites use an alias known as a “domain name” instead of an IP to make things easy for users to remember (it is easier to remember a word than a number).

A DNS server is used to convert these domain names into their IP address equivalents. So a DNS server holds the following information (for example):

Domain Name == IP Address
seochat.com == 209.35.17.17
ibm.com == 129.42.38.1
apple.com == 17.251.200.70

The Internet’s TCP/IP protocols work with numbers, not names, and the IP address is one of the pieces of information they use to exchange data. So when a client browser visits a website, the ISP’s DNS server first translates the requested domain name into an IP address.
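The translation step can be pictured as a simple table lookup. The sketch below is only an illustration: it uses the example table from this lesson (real DNS data changes over time), and the names `DNS_TABLE` and `resolve` are invented here.

```python
# Example table from this lesson; real DNS data changes over time.
DNS_TABLE = {
    "seochat.com": "209.35.17.17",
    "ibm.com": "129.42.38.1",
    "apple.com": "17.251.200.70",
}

def resolve(domain, table=DNS_TABLE):
    """Translate a domain name to an IP address, or None if it is unknown."""
    # Domain names are case-insensitive, so normalize before the lookup.
    return table.get(domain.strip().lower().rstrip("."))

print(resolve("SEOchat.com"))   # 209.35.17.17
print(resolve("example.org"))   # None
```

For a live lookup, Python’s standard library offers `socket.gethostbyname`, which asks the resolver configured on your own computer (normally your ISP’s DNS server).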

The Hosting Name Server

Every website on the Internet is stored on a computer (much like the one you use at home). These computers are connected to and identified on the Internet by an IP address, as discussed previously. Since websites use a domain name instead of an IP address, a “name server” holds the authoritative information for the domain, such as the website’s IP address (the A record) and the mail exchange server (the MX record). Other record types, such as CNAME (an alias from one name to another), can also be found there.
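As a rough picture of what a name server holds, the sketch below models a handful of records as a Python dictionary and looks them up, following CNAME aliases along the way. The zone data, the `ZONE` name and the `lookup` function are all hypothetical; a real name server stores much more and speaks the DNS protocol rather than exposing a dictionary.

```python
# Hypothetical zone data, for illustration only.
ZONE = {
    ("example.com", "A"): ["203.0.113.10"],
    ("example.com", "MX"): ["mail.example.com"],
    ("www.example.com", "CNAME"): ["example.com"],
    ("mail.example.com", "A"): ["203.0.113.25"],
}

def lookup(name, record_type, zone=ZONE, max_hops=5):
    """Return the records of the given type, following CNAME aliases."""
    for _ in range(max_hops):
        records = zone.get((name, record_type))
        if records:
            return records
        alias = zone.get((name, "CNAME"))
        if not alias:
            return []            # no such record and no alias to follow
        name = alias[0]          # follow the alias and try again
    return []                    # give up on overly long alias chains

print(lookup("www.example.com", "A"))   # ['203.0.113.10']
print(lookup("example.com", "MX"))      # ['mail.example.com']
```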

If you update a name server, the information about the website (its IP address, etc.) will be “propagated” throughout the Internet, reaching the DNS servers at different Internet service providers as their cached copies expire. This process, called "DNS propagation," can take up to 48 hours to complete. This is why web developers often advise SEOs to start onsite work only after the DNS update is complete.

Once the DNS server has this information, it stores a copy locally. When a client asks the DNS server for a certain website, the server looks up the matching IP address, which the client then uses to fetch content from the hosting server.

Up to this point, we’ve covered the following (shown in communication diagrams):

The arrow pointing from the ISP DNS to the Website Name server and vice versa means that the DNS server fetches/updates information about the website’s IP address, etc. The information is cached in the DNS server for a period of time.

This makes communication a little faster, as mentioned previously. If a web surfer sends a query about a domain, the DNS has this information cached and can quickly communicate further to retrieve the website’s content (to be discussed in the next section).
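The caching behavior described above can be sketched in a few lines of Python. This is a simplified model, not how any particular DNS server is implemented: the `DnsCache` class, the `upstream` callback (standing in for a query to the website’s name server) and the 300-second default lifetime are all assumptions made for the example.

```python
import time

class DnsCache:
    """Cache answers from an upstream lookup for ttl_seconds."""

    def __init__(self, upstream, ttl_seconds=300, clock=time.monotonic):
        self.upstream = upstream        # function: domain -> IP address
        self.ttl_seconds = ttl_seconds
        self.clock = clock
        self.entries = {}               # domain -> (ip, expiry time)

    def resolve(self, domain):
        now = self.clock()
        cached = self.entries.get(domain)
        if cached and cached[1] > now:
            return cached[0]            # still fresh: answer from the cache
        ip = self.upstream(domain)      # missing or expired: ask the name server
        self.entries[domain] = (ip, now + self.ttl_seconds)
        return ip
```

The same mechanism explains why a change made at the name server takes time to show up everywhere: each DNS server keeps handing out its cached answer until that entry expires.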

The name server record is configured using your domain registrar control panel. Two of the most popular domain registrars are Go Daddy and Network Solutions. So if you register a domain name with a registrar and have a hosting account for your website, the hosting company will provide you with their name server information.

Examples of name servers are:

NS1.AGILITYHOSTER.COM

NS2.AGILITYHOSTER.COM

Once you have the name server information from your web hosting company, you will need to update that information with your domain registrar. If you switch to a different web hosting company, then you will also need to update the name server in your domain registrar control panel.

Once the ISP’s DNS server can obtain the name server records for a website, it knows where that website’s content lives any time a user visits, because the records published by the name server include the website’s IP address at the web hosting company.

The client browser then contacts the website’s hosting server and asks for content. If the content is found, the server returns a 200 OK status along with it; if it is not found, the server returns a 404 Not Found status. If the server itself runs into an error, it returns a 500 status. You have probably seen these common errors when visiting websites.
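The three status codes mentioned above belong to larger families, grouped by their first digit. As a small illustration, the sketch below uses Python’s standard `http.HTTPStatus` to label a code; the helper name `describe_status` is made up for this example.

```python
from http import HTTPStatus

def describe_status(code):
    """Return a short label such as '404 Not Found (client error)'."""
    status = HTTPStatus(code)           # raises ValueError for unknown codes
    if 200 <= code < 300:
        kind = "success"
    elif 300 <= code < 400:
        kind = "redirect"
    elif 400 <= code < 500:
        kind = "client error"
    elif code >= 500:
        kind = "server error"
    else:
        kind = "informational"
    return f"{code} {status.phrase} ({kind})"

for code in (200, 404, 500):
    print(describe_status(code))
# 200 OK (success)
# 404 Not Found (client error)
# 500 Internal Server Error (server error)
```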

Once you receive the content, your client browser will then render the HTML source code (received from the website hosting server) so you can read the content properly.

A web server can also be configured to serve different content depending on who is asking. The requesting party is identified by its IP address. This is why, when a website is hacked, an administrator can often trace the attack by looking through the server logs for IP addresses with malicious activity.
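To give a feel for what looking through the server logs involves, here is a minimal sketch that counts failed requests per IP address. The log lines are invented, and they assume the widely used “common log format”; real log formats depend on how the server is configured, so the parsing would need adjusting.

```python
from collections import Counter

# Hypothetical access-log lines; real formats vary by server configuration.
LOG_LINES = [
    '198.51.100.7 - - [10/Oct/2011:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326',
    '203.0.113.9 - - [10/Oct/2011:13:55:40 +0000] "POST /wp-login.php HTTP/1.1" 404 512',
    '203.0.113.9 - - [10/Oct/2011:13:55:41 +0000] "POST /wp-login.php HTTP/1.1" 404 512',
    '203.0.113.9 - - [10/Oct/2011:13:55:42 +0000] "POST /admin.php HTTP/1.1" 404 512',
]

def failed_requests_by_ip(lines):
    """Count requests per client IP that ended in a 4xx or 5xx status."""
    counts = Counter()
    for line in lines:
        ip = line.split(" ", 1)[0]
        # The status code is the first field after the quoted request.
        status = line.rsplit('"', 1)[1].split()[0]
        if status[0] in "45":
            counts[ip] += 1
    return counts

print(failed_requests_by_ip(LOG_LINES).most_common(1))  # [('203.0.113.9', 3)]
```

An address that produces an unusual number of failed requests is a reasonable place to start an investigation, though it is not proof of an attack on its own.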

One abusive SEO practice is known as “cloaking.” It involves serving different content depending on who is making the request. This is common on hacked websites: a hacker may configure a website to return Viagra-related content if the requester is Googlebot, but otherwise present normal content to normal visitors.

Luckily, you can diagnose this kind of problem by using “Fetch as Googlebot” in Google Webmaster Tools. It simulates what Google sees by making Googlebot visit your website server and then fetch content so that you can analyze whether it has been altered or not.
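The idea behind this kind of check can also be sketched in code: request the same page twice, once as a browser and once with Googlebot’s user agent, and compare the results. In the sketch below, `fetch` is a hypothetical function you would supply (for example, a thin wrapper around an HTTP library), and `hacked_site` is a stand-in that imitates a compromised server so the example runs without a network.

```python
BROWSER_UA = "Mozilla/5.0"
GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

def looks_cloaked(fetch, url):
    """Return True if a page differs between a browser and a crawler user agent."""
    browser_html = fetch(url, BROWSER_UA)
    crawler_html = fetch(url, GOOGLEBOT_UA)
    return browser_html != crawler_html

def hacked_site(url, user_agent):
    # Stand-in for a compromised server that singles out Googlebot.
    if "Googlebot" in user_agent:
        return "<html>spam content</html>"
    return "<html>normal content</html>"

print(looks_cloaked(hacked_site, "http://example.com/"))  # True
```

This is a rough check only. Pages with dynamic content can differ between any two requests, and cloaking that keys on the requester’s IP address rather than the user agent will not be caught this way, which is why a fetch performed by Google itself is the more reliable test.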

Below is the final diagram depicting how the Internet works, from client request to web server, with the DNS resolving the domain names into IP addresses. It also shows the involvement of Googlebot in the process:

Google has its own DNS servers, which are now publicly available (http://code.google.com/speed/public-dns/docs/intro.html). You can even use them in place of your ISP’s DNS server for a faster browsing experience, especially if your ISP’s DNS server is overloaded and slow.

What are Class C IP Blocks?

Sometimes, in a shared hosting environment, a lot of websites are hosted on a single Class C IP block. For example, in the IP address 209.35.17.17, the Class C IP block is 209.35.17 (the first three numbers).
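In current networking notation, what SEOs call a Class C block corresponds to a “/24” network: the first three numbers are fixed and the last one varies. A minimal sketch with Python’s standard `ipaddress` module, using a helper name (`class_c_block`) invented for this example:

```python
import ipaddress

def class_c_block(ip):
    """Return the /24 network that contains the given IPv4 address."""
    # strict=False lets us pass a host address rather than the network address.
    return ipaddress.ip_network(f"{ip}/24", strict=False)

block = class_c_block("209.35.17.17")
print(block)                  # 209.35.17.0/24
print(block.num_addresses)    # 256
```

So a single Class C block, in this sense, covers 256 addresses, from 209.35.17.0 to 209.35.17.255.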

The significance is that it is very difficult to have IP addresses on several different Class C blocks if you are hosting all of your websites on a single hosting server (http://www.soulcast.com/post/show/408636/Importance-of-Class-C-IP-address-for-SEO).

This fact is used by search engines in detecting spam, so if thousands of backlinks come from the following IP addresses:

121.3.45.2
121.3.45.100
121.3.45.89
121.3.45.65
121.3.45.71

and so on, the search engines can see that they all originate from a single Class C IP address block, and the whole series of backlinks may therefore count as one instead of 1,000+.
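The grouping described here is easy to reproduce. The sketch below groups the addresses listed above by their first three numbers and counts the distinct blocks. It illustrates the idea only and does not claim to be how any search engine actually weighs links; the function name `group_by_class_c` is made up.

```python
from collections import defaultdict

def group_by_class_c(ips):
    """Group IPv4 addresses by their first three octets."""
    groups = defaultdict(list)
    for ip in ips:
        block = ip.rsplit(".", 1)[0]    # "121.3.45.2" -> "121.3.45"
        groups[block].append(ip)
    return dict(groups)

backlink_ips = ["121.3.45.2", "121.3.45.100", "121.3.45.89", "121.3.45.65", "121.3.45.71"]
groups = group_by_class_c(backlink_ips)
print(len(backlink_ips), "links from", len(groups), "block(s)")  # 5 links from 1 block(s)
```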

It is very expensive to host a lot of websites on a hosting server with different Class C addresses, as you will discover if you look at some hosting packages: http://www.page1hosting.com/packages.html.

I hope you’ve found my introduction to Internet technology helpful to your understanding of SEO. 
