Imagine being a content developer for a website. You write a bunch of clever and informative articles, which should deliver a good dose of new visitors and ranking potential to the site. You submit them to the IT department for publishing online, and wait for good things to happen. But instead, it all falls flat. A look at your web analytics tools reveals that the number of site visitors has not increased since your new material was published. Further research reveals that your new content is not even in the search engine indexes! To quote the mighty Fred Willard, “Wha’ happened?” Perhaps some common site errors prevented your new content from being added to the index.
Just like a house, good web content needs a sturdy, reliable platform on which to reside. What good is a gorgeous, million-dollar home if it’s sitting on a foundation of rickety 2x4s? No housing inspector would ever climb into such an unstable house to review it. And when a search engine crawler (aka bot) comes across a website littered with coding errors and serious problems with structure and design, it, too, may abandon its effort to crawl it. If that happens, no matter how good and compelling the content might be, it will never make it into the index.
So how do you know if your site has a rock-solid foundation or is just barely standing up? You need to get into your code. You can use a few good tools to help detect problems, but ultimately you’ll need to understand what the tools are saying when they indicate things are broken so you can fix them. Let’s get into a few of the site errors that are either pretty common or pretty important, and cover what you need to know to avoid their deleterious effects.
Invalid mark-up code
If your page mark-up code is bad, you’re bound to have crawling problems. But you might not know that the problems exist if your testing merely consists of, “How does it look in my PC’s browser?” Modern desktop browsers are pretty adept at turning what you probably meant to do into a workable, on-screen presentation. They can often deal with code that is footloose and fancy-free when it comes to standards compliance. But the search engine bots are not as flexible as desktop browsers, and code problems can trip them up and bring the crawling of your site to a halt. On top of that, mobile device browsers are not likely to be as accommodating with poorly written code as desktop browsers are. Anything you can do to make your code solid and standards compliant is good, for both your users and the bots.
To see where your code stands, use a good mark-up code validator. Most good development environments will offer either a built-in validator or references to such tools online. A particularly detailed validator is the W3C Markup Validation Service, a free, online HTML validator from the folks who bring you the coding language standards. It doesn’t validate entire websites recursively, just one page at a time, but it is still a very good source for detecting and identifying the issues behind coding errors.
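If you would rather script the check than paste pages into a web form, here is a minimal Python sketch. It rests on assumptions that are not part of this article: it targets the W3C's newer Nu HTML Checker endpoint (`https://validator.w3.org/nu/?out=json`) rather than the classic Markup Validation Service form, and the helper names `build_check_request`, `summarize`, and `check_page` are made up for illustration. Treat it as a starting point, not an official client.

```python
import json
import urllib.request

# Assumption: the W3C Nu HTML Checker endpoint with JSON output,
# not the classic Markup Validation Service form.
CHECKER_URL = "https://validator.w3.org/nu/?out=json"


def build_check_request(html_text):
    """Build (but do not send) a POST request carrying one page of mark-up."""
    return urllib.request.Request(
        CHECKER_URL,
        data=html_text.encode("utf-8"),
        headers={
            "Content-Type": "text/html; charset=utf-8",
            # Hypothetical client name; identify your own script here.
            "User-Agent": "markup-check-sketch/0.1",
        },
        method="POST",
    )


def summarize(report):
    """Turn the checker's JSON report into short, readable lines."""
    lines = []
    for msg in report.get("messages", []):
        kind = msg.get("type", "?")
        line_no = msg.get("lastLine", "?")
        text = msg.get("message", "")
        lines.append(f"{kind} (line {line_no}): {text}")
    return lines


def check_page(html_text):
    """Send one page to the checker and return the summarized messages.

    Needs network access; like the online form, it checks one page at a time.
    """
    request = build_check_request(html_text)
    with urllib.request.urlopen(request, timeout=30) as response:
        return summarize(json.load(response))
```

Calling `check_page(open("index.html", encoding="utf-8").read())` would return one line per issue reported. Be considerate with a shared public service and pause between requests if you loop over many pages.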
Examine the results of the validator scan. What did it find? Check to see if you have some of these more common coding problems in your pages:
- Does your file contain a document type declaration (the DOCTYPE statement)? (It’s not absolutely required for early versions of HTML, but you’ll need to have it for XHTML documents.) And remember to follow the coding practices for the type of document you specify. The requirements for XHTML and HTML are similar, but not identical.
- Are all of your tags closed properly? All of your paired tags must have corresponding openers and closers. The paragraph tag, <p>, for example, is one whose closing tag is often omitted. And if you are using XHTML, are you closing single (aka empty) tags correctly? Empty tags, like the line break tag, need to include a forward slash before the closing greater-than sign, as in <br />.
- Are your tags written in lower case letters? It’s not required for HTML, but it is for XHTML, so it’s now considered a best practice.
- Are all of the tag attribute values, even numerals, as in <table border="1">, enclosed in quotes? While this is not required for earlier versions of HTML, it’s certainly a best practice and is a requirement in XHTML for creating well-formed code.
- Are the tag attributes used in your code valid? HTML has changed over the years, and some attributes have been deprecated with the introduction of the HTML 4.01 and XHTML 1.1 specifications. An example of this is <table align="left">, where the standards dictate that the newer style attribute should be used now instead of align. If you are unsure, check with a reliable mark-up tag reference, such as the Tag Reference for HTML/XHTML.
- Are you using deprecated tags? Again, changes to the standards have seen some tags become obsolete. For example, the <u> tag for underlining text was deprecated in HTML 4.01 and is not supported at all in XHTML using the Strict DTD.
- Are your tags positioned in the right place in your code? For example, <meta> tags can only be used within the <head> tag. Make sure you are placing your tags correctly.
- Are your tags nested correctly? If <tag 1> precedes <tag 2>, </tag 2> must be closed before </tag 1>. Remember this: first opened, last closed.
- Are you using the character entity &amp; for the ampersand character in your href attribute URL values? Some long, dynamic URLs may include the ampersand character. But to keep your code compliant, replace each ampersand in those URLs with its entity, &amp;.
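Two of the checks above, tag closing and nesting and ampersand escaping, are easy to sketch with Python's standard library. The names `NestingChecker`, `find_nesting_problems`, and `escape_href` are made up for this illustration, and the nesting check is deliberately strict in the XHTML sense: it flags an omitted </p> even where plain HTML would allow it. It is a rough local aid, not a substitute for a real validator.

```python
from html import escape
from html.parser import HTMLParser

# Empty (void) elements that never take a separate closing tag.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class NestingChecker(HTMLParser):
    """Tracks open tags on a stack: first opened, last closed."""

    def __init__(self):
        super().__init__()
        self.open_tags = []  # stack of (tag name, line number)
        self.problems = []

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self.open_tags.append((tag, self.getpos()[0]))

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        line = self.getpos()[0]
        if self.open_tags and self.open_tags[-1][0] == tag:
            self.open_tags.pop()  # correctly nested
            return
        # Look further down the stack for the matching opener.
        for i in range(len(self.open_tags) - 1, -1, -1):
            if self.open_tags[i][0] == tag:
                inner = self.open_tags[-1][0]
                self.problems.append(
                    f"line {line}: </{tag}> closed before </{inner}>")
                del self.open_tags[i]
                return
        self.problems.append(f"line {line}: </{tag}> has no opening tag")


def find_nesting_problems(markup):
    """Return a list of closing and nesting problems found in the mark-up."""
    checker = NestingChecker()
    checker.feed(markup)
    checker.close()
    for tag, line in checker.open_tags:
        checker.problems.append(f"line {line}: <{tag}> is never closed")
    return checker.problems


def escape_href(url):
    """Escape a raw URL for use inside an href attribute (& becomes &amp;).

    Pass in the raw URL only; an already-escaped URL would be escaped twice.
    """
    return escape(url, quote=True)


print(find_nesting_problems("<p><b>bold</p></b>"))
# ['line 1: </p> closed before </b>']
print(escape_href("page.php?id=1&cat=2"))
# page.php?id=1&amp;cat=2
```

The stack mirrors the "first opened, last closed" rule directly: a closing tag is only correct when it matches the tag on top of the stack.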
Tip: Test your pages in multiple browsers. One may be far more tolerant than another, and you really need to accommodate the least tolerant browser to allow the highest portion of your site’s visitors to have a good experience.