Chapter 4
Information Architecture

A garden is finished when there is nothing left to remove.
—Zen aphorism

In the context of web site design, information architecture (often referred to in web parlance as IA) describes the overall conceptual models and general designs used to plan, structure, and assemble a site. Every web site has an information architecture, but information architecture techniques are particularly important to large, complex web sites, where the primary aims of IA are to:

Organize the site content into taxonomies and hierarchies, systems of classification that proceed from the general to the specific. Often this hierarchy becomes the basis for both browse navigation and search systems.
Create controlled vocabularies for the major categories of content, so that similar things are labeled consistently throughout the site.
Communicate conceptual overviews and the overall content and site organization to the design team and clients.
Research and design the core site navigation systems.
Test proposed site organization and navigation concepts with representative users, often using techniques like card sorting.
Set standards and specifications for the handling and structure of content in content management systems and databases.
Define appropriate metadata standards for content (for example, controlled descriptions and keywords that describe the content).
Define standards for accessibility-related metadata (“alt” text attributes for images, captioning standards for video, alternate navigation standards).
Design and implement search engine optimization (SEO) standards and strategies.

The information architecture of a site is much more than the details of how the content is organized and subdivided. A good information architecture process looks holistically at the total user experience, how business and cultural context affects information seeking, and what the users want the site to deliver to them. In this larger view the site content is just one aspect of a good information architecture process.

Information Architecture in Site Development

Information architecture is one of a broad range of design and planning disciplines, and the boundaries across information architecture, technical design, user interface, and graphic design are necessarily blurred by the need for all of these communities of practice to cooperate to produce a cohesive, coherent, and consistent experience for the site user. Information architecture probably overlaps the most with content strategy, as both are concerned with planning for the proper structure and deployment of content. The core of content strategy, however, concerns the creation of useful and appropriate content that supports the overall goals and messages of the site, whereas information architecture is primarily concerned with how that body of content is structured and categorized within the site to support successful navigation and search.

What’s important to remember about closely related professional fields like content strategy and information architecture is that these are not just job titles. Information architecture and content strategy are tasks that need to be performed for all site designs, regardless of the job title of the person doing the work. We estimate that about 95 percent of web projects are small and straightforward enough that content strategy and information architecture will be done by the same team member, and these days that person will probably be called a “content strategist.” A site with a huge body of content to organize, however, will need an experienced information architect, probably with a library science background, because of the complex organizational and structural challenges of such large pools of information.

Architecture is an appropriate metaphor for the assembling of complex multidimensional information spaces shared by many different users and readers, where the underlying structure of information must first be framed out before more specific disciplines such as interface and graphic design can operate effectively. The user interface and visual design of the site may be much more visible to the user initially, but if the underlying organization of the site and its content is poorly constructed, visual or interactive design cannot fix the structural and conceptual problems.

Many of the most prominent information architects have backgrounds in library science, a discipline built upon centuries of knowledge on how to categorize large bodies of information. However, in many projects the information architecture of the site will become a joint project among the design, editorial, and technical teams. Regardless of how the role is filled, the information architecture tasks form the crucial planning bridge between your general discussions of site goals and audiences and the specific design, user interface, and technical solutions you’ll use in the finished site designs.

Methods for Information Architecture

Our day-to-day professional and social lives rarely demand that we create detailed architectures of what we know and how those structures of information are linked. Yet without a solid and logical organizational foundation, your web site will not function well even if your basic content is accurate, attractive, and well written.

There are five basic steps in organizing your information:

Inventory your content: What do you have already? What do you need?
Establish a hierarchical outline of your content and create a controlled vocabulary so the major content, site structure, and navigation elements are always identified consistently.
Chunking: Divide your content into logical units with a modular structure.
Draw diagrams that show the site structure and rough outlines of pages with a list of core navigation links.
Analyze your system by testing the organization interactively with real users, through card-sorting exercises, paper prototyping, and other user research techniques; revise as needed.

Inventorying and auditing content

A content inventory is a detailed listing of basic information about all the content that exists in a site to be redesigned or, in some cases, a site to be newly created from existing content resources. Although a content inventory is often tedious and time-consuming to create, it is an essential component of any rational scope planning for a web project. Content inventories are most useful in the initial project planning and information architecture phases, but a detailed content inventory will be useful throughout the project for both planning and build-out of the site. The work of moving through an existing site and recording information on each page is detailed, but it’s also easy to divide among team members who work through different subsections or directories of the site. The team members making the site inventory must both have access to the site pages in a web browser and be able to view the site structure within the CMS or on the server to ensure that all sections of content are inventoried.

Web content inventories of existing sites commonly take the form of a spreadsheet file with multiple worksheets, containing long listings of every page in the site, along with such essential characteristics as the page title, URL, people responsible for the content, and so on. Each page typically gets a row on the spreadsheet, with columns listing such basic information as:

Unique ID number for project purposes
Page name
Page template or type
Section name
URL
Short description
Date of last update
Content owner

An inventory is an important starting point. However, a strategic approach means focusing efforts on content that meets project goals and is relevant to the target audience. To aid in decision making and support moving content forward, include action-oriented columns in the content inventory document, such as:

Action (create, edit, move, delete)
Priority (high, medium, low)
Person responsible
Date due
Status (to be done, in process, published)
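
The descriptive and action-oriented columns above map naturally onto a simple record. What follows is a minimal sketch in Python of how one inventory row might be modeled and exported as CSV for use in a spreadsheet. The field names, default values, and sample pages are all hypothetical, chosen only to mirror the columns listed above.

```python
import csv
import io
from dataclasses import dataclass, asdict, fields


@dataclass
class InventoryRow:
    """One page in the content inventory (field names are illustrative)."""
    page_id: str
    page_name: str
    template: str
    section: str
    url: str
    description: str
    last_updated: str            # ISO date string
    owner: str
    action: str = "edit"         # create, edit, move, delete
    priority: str = "medium"     # high, medium, low
    responsible: str = ""
    due: str = ""
    status: str = "to be done"   # to be done, in process, published


def inventory_to_csv(rows):
    """Return the inventory as CSV text, one page per row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=[f.name for f in fields(InventoryRow)])
    writer.writeheader()
    for row in rows:
        writer.writerow(asdict(row))
    return buffer.getvalue()


# Hypothetical sample data, for illustration only.
rows = [
    InventoryRow("1.0", "Home", "home", "Top level", "/", "Site home page",
                 "2015-03-01", "Web team", action="edit", priority="high"),
    InventoryRow("2.1", "Staff directory", "listing", "About", "/about/staff",
                 "Names and contact details", "2013-11-12", "HR",
                 action="move", priority="low"),
]

csv_text = inventory_to_csv(rows)
print(csv_text)
```

Keeping the action columns in the same record as the descriptive columns makes it easy to filter the inventory later, for example to list only the high-priority pages.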

Site analysis applications like SEO Spider can crawl existing sites and automatically produce a spreadsheet-based listing of page headings and URLs for each page of the site. Such tools also report on broken links, suboptimal heading and markup issues, and (as you might guess) how page content relates to search engine optimization (SEO). This kind of report is not a substitute for a content inventory, but it can be a way to speed the process of gathering information.

Hierarchies and taxonomies

Hierarchical organization is a virtual necessity on the web. Most sites depend on hierarchies to create their high-level navigation categories, moving from the broadest overview of the site (the home page), down through increasingly specific submenus and content pages. In information architecture you create categories for your information and rank the importance of each piece of information by how general or specific that piece is relative to the whole. General categories become high-ranking elements of the hierarchy of information; specific chunks of information are positioned lower in the hierarchy. Chunks of information are ranked in importance and organized by relevance to one of the major categories. Once you have determined a logical set of priorities and relations in your content outlines, you can build a hierarchy from the most important or general concepts down to the most specific or detailed topics.
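
A general-to-specific hierarchy like the one described above can be sketched as a simple tree. The following Python example is only an illustration: the `Node` class, the `outline` function, and the sample categories are assumptions, not part of any particular tool.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """A category or page in the site hierarchy."""
    label: str
    children: List["Node"] = field(default_factory=list)


def outline(node, depth=0):
    """Return the hierarchy as indented outline lines, general to specific."""
    lines = ["  " * depth + node.label]
    for child in node.children:
        lines.extend(outline(child, depth + 1))
    return lines


# Hypothetical site: the home page is the broadest overview.
site = Node("Home", [
    Node("Products", [Node("Laptops"), Node("Tablets")]),
    Node("Support", [Node("Downloads"), Node("Contact us")]),
])

print("\n".join(outline(site)))
```

The indentation of each line reflects its rank: general categories sit near the left margin, and specific chunks of content are nested beneath them.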

Taxonomies and controlled vocabularies

Taxonomy is the science and practice of classification. In information architecture, a taxonomy is a hierarchical organization of content categories, using a specific, carefully designed set of descriptive terms and labels. As any experienced editor or librarian can tell you, one of the biggest challenges of organizing large amounts of information is developing a system for consistently referring to the same things the same way: a controlled vocabulary, in library science parlance. One of the most important jobs of the information architect is producing a consistent set of names and terms to describe the chief site content categories, the key navigation site links, and major terms to describe the interactive features of the site. This controlled vocabulary becomes a foundational element of the content organization, the user interface, the standard navigation links seen on every page of the site, and the file and directory structure of the site itself.
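
One way to picture a controlled vocabulary is as a lookup that maps every variant label onto a single preferred term. The sketch below assumes a hypothetical vocabulary and hypothetical function names; a real project would draw its terms from user research and editorial review.

```python
# Preferred term -> variant labels it should replace (hypothetical examples).
CONTROLLED_VOCABULARY = {
    "Contact us": ["contact", "get in touch", "reach us"],
    "Staff directory": ["people", "our team", "staff list"],
}


def build_lookup(vocabulary):
    """Index every preferred term and variant by its case-folded form."""
    lookup = {}
    for preferred, variants in vocabulary.items():
        lookup[preferred.casefold()] = preferred
        for variant in variants:
            lookup[variant.casefold()] = preferred
    return lookup


def normalize_label(label, lookup):
    """Return the preferred term, or None if the label is not in the vocabulary."""
    return lookup.get(label.strip().casefold())


LOOKUP = build_lookup(CONTROLLED_VOCABULARY)

print(normalize_label("  Our Team ", LOOKUP))
print(normalize_label("Careers", LOOKUP))
```

Labels that return `None` are worth reviewing: they are either new concepts that need a preferred term or inconsistencies that have crept into the site.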

Organizing content

When designing a new web site or extensively overhauling an existing one, it can be useful to step back from the details of the content inventory and take a fresh look at both how your information is organized and the underlying paradigms that drive conversations about content and site organization.

Some common underlying paradigms for site organization are:

Identity sites: Dominated by projected organizational identity and marketing. Most general corporate sites fall into this category.
Navigation sites: Dominated by navigation and links, usually for sites with very large bodies of information, like news or reference sites.
Novelty or entertainment sites: Dominated by news and “what’s new,” like Buzzfeed or the Onion.
The org chart site: Designed around the organization of the enterprise. Department sites are often organized this way, and as long as they are not heavily used service sites this may make sense. Often the basis of confused or poor site organization (see below).
Service sites: Organized around service, content, or product categories. Fast access to services should always dominate here, as in IT help desk sites or enterprise human resources sites.
Visual identity sites: Use interaction and visual flash to define the identity of a brand and draw an audience mostly through visual sensation. Many restaurant and luxury consumer brand sites fall into this category.
Tool-oriented sites: Organized around a tool or service technology. Google or Bing search engines are obvious examples, but popular online software services like Basecamp, Dropbox, and Evernote are other tool-oriented sites.

In a given context some paradigms or site themes are clearly better than others: it’s rarely wise to fall in love with a particular site organization before you have a clear rationale for using it, or for projecting your identity to the extent that you subordinate the motivations and concerns of your potential readers and users. Good sites balance meeting your users’ needs with delivering your message to the world. There is no formula for finding the right organizational paradigm, but in the early planning you should always examine your standing prejudices and explicitly justify them.

Clumsy “org chart sites” arranged solely by how the organization is managed are a standing joke among web designers, but are much less amusing to users who can’t find what they’re looking for because they don’t understand—or care—how your management is organized. The vast majority of users want products, information, or services from your web site, but many management structures don’t follow this service-oriented organization.

In some special situations users really do want to know how you are organized and will find contact information and content more easily with navigation based on business units. For example, in business-to-business (B2B) relationships a buyer or salesperson might really want an understanding of who manages what parts of an organization. But more often than not “org chart sites” reflect a poor understanding of what your readers and users need from you.

If you see these underlying mindsets and management silos driving or distorting early site organization discussions, put them on the table for discussion and brainstorming. Everyone has mental models, favorite paradigms, and blind spots. Be sure you’ve acknowledged and examined your underlying assumptions and biases and have chosen the best organizing theme for your site.

Five hat racks: Themes to organize information

In his book Information Anxiety Richard Saul Wurman posits that there are five fundamental ways to organize information: the “five hat racks” on which you can hang information.

Category

Organization by the similarity or relatedness of items, as with the subject sections of a bookstore or library or the departments of a grocery store. Categories are particularly useful when the items being organized are of roughly equal importance.

Time

Organization by timeline or history, where elements are presented in a sequential, step-by-step manner. This approach is commonly used in training. Other examples include television listings, a history of specific events, and measuring the response times of different systems.

Location

Organization by spatial or geographic location, most often used for orientation and direction. This most graphic of the categories obviously lends itself to maps but is also used extensively in training, repair, and user manual illustrations and other instances where information is tied to a place.

Alphabetic

Organization based on the initial letter of the names of items. Obvious examples are telephone and other name-oriented directories, dictionaries, and thesauri, whose users know the word or name they are seeking. Alphabetic systems are simple to grasp and familiar in everyday life. This method of organization is less effective for short lists of unrelated things but is powerful for long lists.

Continuum

Organization by the quantity of a measured variable over a range, such as price, score, size, or weight. Continuum organization is most effective when organizing many things that are all measured or scored the same way. Examples include rankings and reviews of all kinds, such as the U.S. News and World Report ranking of colleges and universities, the best movies in a given year, darkest or lightest items, and other instances for which a clear weight or value can be assigned to each item.
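
The same set of items can be hung on more than one of these racks. As a rough illustration, the Python sketch below arranges a small, entirely hypothetical list of workshops by time, by location, alphabetically, and along a continuum of price.

```python
from collections import defaultdict
from datetime import date

# Hypothetical workshop listings, used only for illustration.
workshops = [
    {"name": "Usability Testing", "city": "Boston", "starts": date(2016, 5, 2), "price": 150},
    {"name": "Card Sorting", "city": "Austin", "starts": date(2016, 6, 14), "price": 200},
    {"name": "Content Audits", "city": "Boston", "starts": date(2016, 4, 8), "price": 300},
]

# Time: earliest event first.
by_time = sorted(workshops, key=lambda w: w["starts"])

# Alphabetic: by the initial letters of the name.
by_alphabet = sorted(workshops, key=lambda w: w["name"].casefold())

# Continuum: ordered by a measured variable, here price.
by_continuum = sorted(workshops, key=lambda w: w["price"])

# Location: grouped by the place each item is tied to.
by_location = defaultdict(list)
for workshop in workshops:
    by_location[workshop["city"]].append(workshop["name"])

for label, ordering in [("time", by_time), ("alphabet", by_alphabet), ("price", by_continuum)]:
    print(label, [w["name"] for w in ordering])
print(dict(by_location))
```

Each ordering answers a different user question, which is why the choice of scheme should follow from what your audience is trying to find.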

Content mapping

Even if the major categories of your content organization are clear to the design team, it still may be hard to sort through where each piece of content belongs or what organization scheme will seem most intuitive and predictable to your users. User research can be crucial in building controlled vocabularies for labeling and navigation.

For example, in large biomedical research hospitals there are many varieties of “doctors” (people with doctoral degrees), so in professional language medical doctors are often distinguished as “physicians.” But most laypeople seeking medical help are not looking for “physicians”; they want a doctor. Both “physician” and “doctor” are appropriate terms, but how and where you use each term will depend on what your audience will expect and understand.

Card sorting—also called content mapping—is a common technique for creating and evaluating content organization and web site structures, and for clarifying and refining controlled vocabularies. In classic card-sorting techniques, index cards are labeled with the names of major and secondary content categories, and individual team members or potential site users are then asked to sort through the cards and organize them in a way that seems intuitive and logical. Users may also be asked to suggest new or better names for categories. The resulting content outline from each participant is recorded, usually in a spreadsheet, and all the individual content schemes are compared for commonalities and areas of major disagreement. The best card-sorting data come from individual sessions with representative current or potential users of your site. If you have enough participants, combining the results of each card-sorting session produces a powerful “wisdom of crowds” aggregation of many individual judgments about what content organization makes sense. These user-derived (or user-informed) category taxonomies are sometimes called “folksonomies,” a neologism coined by information architect Thomas Vander Wal that combines “folk” and “taxonomy.”

Card-sorting exercises come in a few varieties, and the exercises may be done with groups of participants working together, or with individual participants each working on his or her own. In open card sorting, subjects are asked to create their own names for major categories and subcategories of the site. The subjects typically start with blank index cards and a written description of the site and what its purpose and likely content will be, but are encouraged to name major categories as they see fit, and to group subcategories using their own logic about where things belong within each major category. The information architect running the card-sorting research then combines the most popular category names to form a taxonomy. This is often helpful when you suspect that your internal team might not use the same vocabulary used by your target audiences.

The more common closed card sorting uses preprinted index cards, and a full set of major and subcategory cards is given to each participant. The participant is then asked to select a few major categories—or create her or his own names for major categories if the existing names don’t seem right—and to place the remaining cards logically within each category to form a site taxonomy that makes sense to that participant. The cards should be carefully printed by hand for maximum legibility, or you could use Avery index card sheets compatible with laser printers to make the cards from your computer. This is often the best option when you need to create large groups of cards for testing.

The resulting taxonomies—one from each participant—are then compared both statistically and informally to the taxonomy that was created by the site designers or the information architect. More often than not the summary of the research participants’ taxonomies will be similar to the design team’s taxonomy, but the primary value of card sorting is to find those instances where the design team’s logic differs from the target audience’s about where to find a particular category within the site’s navigation.
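
A simple way to make this comparison concrete is to count how often participants placed each pair of cards in the same group, and then flag the pairs that the design team grouped together but most participants did not. The Python sketch below is one possible approach, not a standard method prescribed here; the card names, the sample sorts, and the 50 percent threshold are all assumptions.

```python
from collections import Counter
from itertools import combinations

# Each participant's sort: group name -> cards placed in it (hypothetical data).
sorts = [
    {"Care": ["Find a doctor", "Clinics"], "About": ["Careers", "News"]},
    {"Patients": ["Find a doctor", "Clinics", "News"], "Work here": ["Careers"]},
    {"Services": ["Find a doctor", "Clinics"], "Hospital": ["Careers", "News"]},
]

# The design team's proposed grouping (also hypothetical).
team_taxonomy = {
    "Patient care": ["Find a doctor", "Clinics", "News"],
    "About us": ["Careers"],
}


def grouped_pairs(taxonomy):
    """Every pair of cards that shares a group in one taxonomy."""
    return {pair
            for cards in taxonomy.values()
            for pair in combinations(sorted(cards), 2)}


def pair_agreement(sorts):
    """Fraction of participants who put each pair of cards in the same group."""
    counts = Counter()
    for sort in sorts:
        counts.update(grouped_pairs(sort))
    return {pair: n / len(sorts) for pair, n in counts.items()}


def disputed_pairs(taxonomy, agreement, threshold=0.5):
    """Pairs the team grouped together but fewer than `threshold` of participants did."""
    return sorted(pair for pair in grouped_pairs(taxonomy)
                  if agreement.get(pair, 0.0) < threshold)


agreement = pair_agreement(sorts)
print(disputed_pairs(team_taxonomy, agreement))
```

In this made-up data every participant agrees that the two clinical cards belong together, while the team's decision to place "News" with them is the kind of disagreement card sorting is meant to surface.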

In a reverse card sort, a taxonomy of major categories and subcategories is laid out in front of the participant, and the participant is given a sample task to perform by finding the category card that most closely represents the place in the taxonomy where the participant can complete the task. Reverse card sorts are usually done late in the research phase, as the primary value of reverse card sorts is to evaluate the effectiveness of a proposed taxonomy, not to generate a new one.
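
The results of a reverse card sort can be summarized as a success rate per task: the share of participants who chose the category where the task can actually be completed. The following Python sketch assumes hypothetical task names, category names, and results.

```python
from collections import defaultdict

# Where each task can actually be completed in the proposed taxonomy (hypothetical).
expected = {
    "Pay a bill": "Billing",
    "Book an appointment": "Appointments",
}

# (task, category card the participant chose); hypothetical session results.
results = [
    ("Pay a bill", "Billing"),
    ("Pay a bill", "Billing"),
    ("Pay a bill", "Patient portal"),
    ("Book an appointment", "Appointments"),
    ("Book an appointment", "Find a doctor"),
    ("Book an appointment", "Find a doctor"),
]


def success_rates(results, expected):
    """Fraction of attempts per task that landed on the expected category."""
    attempts = defaultdict(int)
    hits = defaultdict(int)
    for task, chosen in results:
        attempts[task] += 1
        if chosen == expected[task]:
            hits[task] += 1
    return {task: hits[task] / attempts[task] for task in attempts}


rates = success_rates(results, expected)
for task, rate in rates.items():
    print(f"{task}: {rate:.0%}")
```

A low rate for a task suggests that the category label or its position in the taxonomy does not match where users expect to find it.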

For smaller or less formal site projects, you can have group whiteboard sessions with techniques similar to card sorting. Participants are asked to sort through cards or sticky notes labeled with the names of major content elements, which are then posted on a whiteboard and sorted by the group until there is consensus about what overall organization or taxonomy makes the most sense. In most cases you’ll achieve quick consensus on the major categories of content and navigation, and the whiteboard organization becomes a useful first look at the site org chart that can help the group resolve the more problematic questions of what content belongs in which category. Use your phone to take snapshots of the various steps along the way, and of the finished whiteboard. Post-it has a smartphone app that can be useful in recording and sharing whiteboard Post-it sessions.

Some practical tips for card sorting:

Name the major categories as clearly as possible, without duplications or redundancies in terminology.
If category names are not obvious or are ambiguous, try using an “open” card sort early in your research, to allow the users to create their own names for categories.
Have a complete inventory of all your major categories and subcategories of content, with each category on its own card.
Limit the total number of cards to about forty. If your site is large and complex, divide the content into more manageable chunks of no more than forty category cards each.
Use real card stock, not cut-up paper. Paper “cards” won’t last more than a session or two.
Prepare thorough instructions for individual card-sorting sessions.
Assure all participants that there are no “wrong” answers, and that they have complete freedom to arrange things and rename things as they see fit.
Not every card will find a place in the organization. Tell participants that if they can’t find a logical place for a particular card, they should just set the card aside and move on.
Refrain from prompting or coaching participants.
Never discourage an idea from a user—even if you think it is a mistaken one—and allow free brainstorming.
Have plenty of supplies for new categories and improved terminology.
Bring a good digital camera with enough resolution to record the proposed card-sort organizations and whiteboard layouts. Make sure to check your photos for the legibility of all labels and notes.

While small projects with a few dozen categories and subcategories might not require software tools, larger projects with extensive content and/or many research participants to record and analyze may benefit from card-sorting software. xSort (Macintosh only) and UXSort (Windows only) are free but quite capable applications for designing and conducting card-sorting exercises. OptimalSort is a professional web-based card-sorting application well suited to complex bodies of information and large numbers of research participants, particularly where the research participants are scattered geographically. OptimalSort allows you to build a preliminary information architecture and controlled vocabulary, design card-sorting exercises, and coordinate communication with test participants via the web and email. The main value of tools like OptimalSort is probably in the analysis and reporting phase, where it can process the data and produce finished charts that summarize the results of all of your card-sorting sessions.

Regardless of the exact methods that you use, card sorting can be an inexpensive but valuable means to test your ideas on representative members of your potential audience. You don’t need huge numbers of research participants to get useful data. Methods research has shown that you can get about 80 percent of the value of user testing from as few as five test participants, and virtually 100 percent of the value of research with as few as fifteen test users. Card-sorting techniques have been used for many years, and if you carefully choose representative or potential users of your site, they offer “real-world” validation of ideas from your sponsors, stakeholders, and team members.
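
The text above does not name the research behind these figures. One widely cited model that produces numbers in this range is Nielsen and Landauer's estimate for usability testing, in which the share of problems found by n participants is 1 - (1 - L)^n, with L (the share a single participant reveals) averaging about 0.31 in their data. Treating that model as the basis for the figures here is our assumption, and it was developed for usability testing in general, not for card sorting specifically. A short Python sketch of the arithmetic:

```python
def share_found(participants, per_participant_rate=0.31):
    """Estimated share of problems found, using the 1 - (1 - L)^n model.

    The default rate of 0.31 is the average reported by Nielsen and Landauer;
    your own projects may differ considerably.
    """
    return 1 - (1 - per_participant_rate) ** participants


for n in (1, 5, 15):
    print(n, round(share_found(n), 3))
```

The curve flattens quickly, which is the reasoning behind running several small rounds of research and revising between them, not one large round.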

Segmenting information

Most information on the web is gathered in short reference documents that are intended to be read nonsequentially. This is particularly true of sites where the contents are mostly technical or administrative documents. Long before the web was invented, technical writers discovered that readers appreciate short chunks of information on pages that can be quickly scanned for titles, subtitles, and bulleted lists. This method of presenting information translates well to the web for several reasons:

Few web readers who are hunting for information will read long unstructured passages of text onscreen. Visual scanning aids like lots of titles and subtitles, lists, and tables help readers to quickly home in on relevant information.
Discrete chunks of information lend themselves to web links. The user of a web link typically expects the link to provide a specific unit of relevant information, not a book’s worth of general content.
Chunking can help organize and present information in a modular layout that is consistent throughout the site. This allows users not only to apply past experience with a site to future searches and explorations but to predict how an unfamiliar section of a web site will be organized.
Concise chunks of information are better suited to the computer screen, which provides a limited view of long documents. The limited viewports of mobile devices like tablets and smartphones make it even more important to keep your content concise and carefully designed to highlight major topics and keywords.

Content chunks

In linked hypertext systems like the web, content is often organized in modular, consistently organized “chunks” (also called “rhetorical clusters” by some authors) of information on specific topics. When you click on a web link about a particular topic, you expect the link to take you to a specific piece of content and not to the home page of Wikipedia. For example, you’d expect a web link for “chicken saltimbocca” to take you to a recipe or a short article on the dish, not to the first page of a whole book on Italian cooking. In print volumes such specific linking or footnoting is done using the basic unit of print, a numbered page. On the web a “page” can be of any arbitrary length, but if a web link on saltimbocca brings you to the top of a five thousand–word web page that (somewhere) includes the term “chicken saltimbocca,” you would feel misled. Why didn’t the link bring you to specific information on chicken saltimbocca? To meet these user expectations for specific blocks of information, neither too large nor cluttered with irrelevancies, your content must ideally be “chunked” into modular units structured and organized to meet the expectations of web users.

The concept of a chunk of information must be flexible and consistent with common sense, logically organized in the context of your topics, and convenient to use. Let the nature of the content suggest how it should be subdivided and organized. Although short carefully structured web pages are in general better than long rambling pages, it sometimes makes little sense to divide a long document arbitrarily into multiple short pages, particularly if you want users to be able to print easily or save the document in one step.

Your content management system (CMS) may also be structured to allow a set of content “chunks.” For example, a longer article or blog post often has both a main article field and a separate field for short or abstract versions of the article, as well as keywords and links to illustrations. CMS software like Drupal also allows you to create custom article structures and controlled vocabularies of keywords if your content needs more structure than the typical title–short version–full article–keywords default configurations for CMS content. Content is much more powerful and adaptable when it is consistently organized in modular formats that can be deployed flexibly across web sites, mobile apps, and social media platforms, all from the same core database.
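
To illustrate why consistently structured chunks are adaptable, here is a minimal Python sketch of an article stored as separate fields and then rendered two ways. The `Article` class, its field names, and the two rendering functions are hypothetical and do not reflect the data model of Drupal or any other CMS.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Article:
    """A content chunk stored as separate, reusable fields."""
    title: str
    summary: str
    body: str
    keywords: List[str] = field(default_factory=list)


def as_web_page(article):
    """Full version for the web site."""
    return f"{article.title}\n\n{article.body}"


def as_teaser(article, limit=140):
    """Short version for a listing page or social post, trimmed to `limit` characters."""
    text = f"{article.title}: {article.summary}"
    if len(text) <= limit:
        return text
    return text[:limit - 3].rstrip() + "..."


# Hypothetical sample content.
recipe = Article(
    title="Chicken Saltimbocca",
    summary="A quick weeknight version of the Roman classic.",
    body="Pound the chicken thin, layer with sage and prosciutto, and pan-fry.",
    keywords=["chicken", "Italian", "main course"],
)

print(as_teaser(recipe))
print(as_web_page(recipe))
```

Because the title, summary, and body are stored separately, each channel can pick the fields it needs without anyone rewriting the content.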

Information Architecture Design

When confronted with a new and complex information system, users build mental models. They use these models to assess relations among topics and to guess where to find things they haven’t seen before. The success of the organization of your web site will be determined largely by how well your site’s information architecture matches your users’ expectations. A logical, consistently named site organization allows users to make successful predictions about where to find things. Consistent methods of organizing and displaying information permit users to extend their knowledge from familiar pages to unfamiliar ones. If you mislead users with a structure that is neither logical nor predictable, or if you use inconsistent or ambiguous terms to describe site features, users will be frustrated by the difficulties of getting around and understanding what you have to offer. You don’t want your user’s mental model of your web site to look like Figure 4.9.

Supporting browse and search

Once you have created your site in outline form, analyze its ability to support browsing by testing it interactively, both within the site development team and with small groups of real users. Efficient web site design is largely a matter of balancing the relation of major menu or home pages with individual content pages. The goal is to build a hierarchy of menus and content pages that feels natural to users and doesn’t mislead them or interfere with their use of the site.

Web sites with too shallow an information hierarchy depend on massive menu pages that can degenerate into a confusing laundry list of unrelated information. Menu schemes can also be too deep, burying information beneath too many layers of menus, requiring too many “clicks” on links. Having to navigate through layers of nested menus before reaching real content is frustrating.

Although it is always tempting to limit your top-level content categories to as few as possible, beware of creating too deep a site hierarchy. Deep hierarchies tend to confuse users, who prefer a broad range of choices to survey over just a few necessarily vague or generic categories at the top of a deep hierarchy. A deep hierarchy with many nested categories also forces the user to remember more as she clicks down through the layers. A wide range of content categories also provides more “scent of information,” the ability to quickly survey many organizational categories and accurately guess where the desired information or product might be located.

If your web site is actively growing, the proper balance of menus and content pages is a moving target. Feedback from users (and analyzing your own use of the site) can help you decide whether your menu scheme has outlived its usefulness or has weak areas. Complex document structures require deeper menu hierarchies, but users should never be forced into page after page of menus if direct access is possible. With a well-balanced, functional hierarchy you can offer users menus that provide quick access to information and reflect the organization of your site.
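
If your site outline is available as structured data, you can keep an eye on this balance by measuring how deep and how broad the hierarchy has become. The Python sketch below uses a hypothetical site represented as nested dictionaries. The function names are assumptions, and the right depth or breadth for a real site is a judgment call that no single number can settle.

```python
# Hypothetical site outline: each key is a page, each value its submenu.
site = {
    "Home": {
        "Products": {"Laptops": {}, "Tablets": {}, "Phones": {}},
        "Support": {"Downloads": {"Drivers": {}, "Manuals": {}}},
        "About": {},
    }
}


def depth(tree):
    """Number of levels in the hierarchy (an empty tree has depth 0)."""
    if not tree:
        return 0
    return 1 + max(depth(children) for children in tree.values())


def max_breadth(tree):
    """Largest number of choices offered on any single menu."""
    if not tree:
        return 0
    return max(len(tree), max(max_breadth(children) for children in tree.values()))


levels = depth(site)
print("levels:", levels)
print("clicks from home to deepest page:", levels - 1)
print("widest menu:", max_breadth(site))
```

Rerunning a check like this as the site grows can show when one branch has become much deeper than the rest and may deserve promotion to a higher-level category.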

If your site has more than a few dozen pages, your users will expect web search options to find content in the site. In a larger site, with maybe hundreds or thousands of pages of content, web search is the only efficient means to locate particular content pages or to find all pages that mention a keyword or search phrase. Browse interfaces composed of major site and content landmarks are essential in the initial phases of a user’s visit to your site. However, once the user has decided that your site may offer the sought-after information, he or she crosses a threshold of specificity that only a search engine can help with.

No browse interface of links can assure the user that he or she has found all instances of a given keyword or search phrase.

Search is the most efficient means to reach specific content, particularly if that content is not heavily visited by other users and is therefore unlikely to appear as a link in a major navigation page.
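To see why search can promise completeness where a browse interface cannot, consider a minimal sketch of a keyword index. This is not how any particular search engine works; it is a toy inverted index, assuming page text is already available as plain strings, and the page paths and function names are invented for illustration.

```python
import re
from collections import defaultdict


def build_index(pages: dict[str, str]) -> dict[str, set[str]]:
    """Map each lowercase word to the set of page paths that contain it."""
    index: dict[str, set[str]] = defaultdict(set)
    for path, text in pages.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(path)
    return index


def search(index: dict[str, set[str]], query: str) -> set[str]:
    """Return every page that contains all of the words in the query.

    This matches keywords in any order; it is not an exact-phrase search.
    """
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


# Hypothetical page contents, for illustration only.
PAGES = {
    "/methods": "Card sorting helps test site organization.",
    "/search": "Search engines index every page.",
    "/testing": "Sorting results by date; card games.",
}
print(sorted(search(build_index(PAGES), "card sorting")))
```

Because the index records every page on which each word occurs, a query returns all matching pages, including rarely visited ones that no navigation page links to.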

As with popular books at the library or the hit songs on iTunes, content usage on large web sites is a classic “long-tail” phenomenon: a few items get 80 percent of the attention, and the rest get dramatically less traffic. As the user’s needs get more specific than a browse interface can handle, the search engine is the means to find content out there in the long tail, where it might otherwise remain undiscovered.
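If you have page-view counts from your own analytics, you can check how closely your site follows this pattern. The sketch below is a rough illustration with invented numbers; the function name, the 80 percent default, and the sample paths are assumptions, not data from any real site.

```python
def pages_covering_share(views: dict[str, int], share: float = 0.8) -> list[str]:
    """Return the most-visited pages that together account for at least
    `share` of all page views, most popular first."""
    total = sum(views.values())
    if total == 0:
        return []
    head: list[str] = []
    covered = 0
    ranked = sorted(views.items(), key=lambda item: item[1], reverse=True)
    for path, count in ranked:
        if covered >= share * total:
            break  # the pages collected so far already reach the target share
        head.append(path)
        covered += count
    return head


# Invented view counts for illustration only.
VIEWS = {
    "/": 500, "/products": 350, "/about": 50, "/faq": 40,
    "/press": 30, "/history": 15, "/staff": 10, "/archive": 5,
}
print(pages_covering_share(VIEWS))
```

In this made-up sample, two of the eight pages account for more than 80 percent of the traffic; the remaining six form the long tail that users are most likely to reach through search.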

Choosing a site structure

Web sites are built around basic structural themes that both form and reinforce a user’s mental model of how you have organized your content. These fundamental architectures also govern the navigational interface of the web site. Three essential structures can be used to build a web site: sequences, hierarchies, and webs.

Sequences

The simplest and most familiar way to organize information is to place it in a sequence. This is the structure of books, magazines, and all other print matter. Sequential ordering may be chronological, a logical series of topics progressing from the general to the specific, or alphabetical, as in indexes, encyclopedias, and glossaries. Straight sequences are the most appropriate organization for training or education sites, for example, in which the user is expected to progress through a fixed set of material and the only links are those that support the linear navigation path.

More complex web sites may still be organized as a logical sequence, but each page in the sequence may have links to one or more pages of digressions, parenthetical information, or information on other web sites.

Hierarchies

Information hierarchies are the best way to organize most complex bodies of information. Because web sites are usually organized around a single home page, which then links to subtopic menu pages, hierarchical architectures are particularly suited to web site organization. Hierarchical diagrams are familiar in corporate and institutional life, so most users find this structure easy to understand. A hierarchical organization also imposes a useful discipline on your own analytical approach to your content, because hierarchies are practical only with well-organized material.

The simplest form of hierarchical site structure is a star, or hub-and-spoke, set of pages arrayed off a central home page. The site is essentially a single-tier hierarchy. Navigation tends to be a simple list of subpages, plus a link on each back to the home page.

Most web sites adopt some form of multitiered hierarchical or tree architecture. This arrangement of major categories and subcategories has a powerful advantage for complex site organization in that most people are familiar with hierarchical organizations, and can readily form mental models of the site structure.
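A tree architecture is easy to model, and a simple model makes it clear how navigation aids such as breadcrumb trails fall out of the hierarchy. The sketch below is one possible representation, assuming each page records a single parent; the page paths and the function name are hypothetical.

```python
from typing import Optional

# Each page maps to its parent; the home page has no parent.
# These paths are invented for illustration.
PARENT: dict[str, Optional[str]] = {
    "/": None,
    "/products": "/",
    "/products/laptops": "/products",
    "/products/laptops/ultralight": "/products/laptops",
    "/support": "/",
}


def breadcrumb(page: str, parent: dict[str, Optional[str]]) -> list[str]:
    """Walk from a page up to the home page and return the trail in
    top-down order. Assumes the parent map is a true tree (no cycles)."""
    trail: list[str] = []
    current: Optional[str] = page
    while current is not None:
        trail.append(current)
        current = parent[current]
    trail.reverse()  # home page first, current page last
    return trail


print(" > ".join(breadcrumb("/products/laptops/ultralight", PARENT)))
```

The length of the trail minus one is also the depth of the page in the hierarchy, so the same structure can flag pages buried under too many menu levels.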

Note that although hierarchical sites organize their content and pages in a tree of site menus and submenus off the home page, this hierarchy of content subdivisions should not become a navigational straitjacket for the user who wants to jump from one area of the site to another. Most site navigation interfaces provide global navigation links that allow users to jump from one major site area to another without being forced to back up to a central home page or submenu. In Figure 4.13, primary categories in the header allow the user to move from one major content area to another, the left navigation menu provides local subcategories, and a search box allows the user to jump out of categorical navigation and find pages based on a web search engine.

Webs

Weblike organizational structures pose few restrictions on the pattern of information use. In this structure the goal is often to mimic associative thought and the free flow of ideas, allowing users to follow their interests in a unique, heuristic, idiosyncratic pattern. This organizational pattern develops with dense links both to information elsewhere in the site and to information at other sites. Although the goal of this organization is to exploit the web’s power of linkage and association to the fullest, weblike structures can just as easily propagate confusion. Ironically, associative organizational schemes are often the most impractical structure for web sites because they are so hard for the user to understand and predict. Webs work best for sites dominated by lists of links and for sites aimed at highly educated or experienced users looking for further education or enrichment, and not for a basic understanding of a topic.

The academic site Arts & Letters Daily is a great (albeit complex) example of a web organization. This site is designed for a highly educated audience that needs little context or structure provided by the site organization, because users bring a high level of prior personal knowledge to the content. Simple lists based on a few major categories are all this audience needs for pointers to recent interesting content.

Most complex web sites share aspects of all three types of information structures. Site hierarchy is created largely with standard navigational links within the site, but topical links embedded within the content create a weblike mesh of associative links that transcends the usual navigation and site structure. Except in sites that rigorously enforce a sequence of pages, users are likely to traverse your site in a free-form weblike manner, jumping across regions in the information architecture, just as they would skip through chapters in a reference book. Ironically, the clearer and more concrete your site organization is, the easier it is for users to jump freely from place to place without feeling lost.

The nonlinear usage patterns typical of web users do not absolve you of the need to organize your thinking and present it within a clear, consistent structure that complements your overall design goals. Figure 4.17 summarizes the three basic organization patterns against the linearity of the narrative and the complexity of the content.

Architecting the page

What governs how people scan pages of information, in print or on the screen? According to classical art composition theory, the corners and middle of a plane attract early attention from viewers. In a related compositional practice, the “rule of thirds” places centers of interest within a grid that divides both dimensions in thirds. These compositional rules are purely pictorial, however, and are probably most useful for displays or home pages composed almost entirely of graphics or photography. Most page composition is dominated by text, and there our reading habits are the primary forces that shape the way we scan pages. In Western languages we read from top to bottom, scanning left to right down the page in a “Gutenberg Z” pattern. This preference for attention flow down the page—and a reluctance to reverse the downward scanning—is called “reading gravity” and explains why it is rarely a good idea to place the primary headline anywhere except the top of a page. Readers who are scanning your work are unlikely to back up the page to “start again.” Search engines also have a well-known bias toward items near the top of a page.

The Poynter Institute has studied eye-tracking by readers looking at web pages and has found that readers start their scanning with many fixations in the upper left of the page. Their gaze then follows a Gutenberg Z pattern down the page, and only later do typical readers lightly scan the right area of the page. Eye-tracking studies by Jakob Nielsen show that web pages dominated by text information are scanned in an “F” pattern of intense eye fixations across the top header area, and down the left edge of the text.

When readers scan web pages, they are clearly using a combination of classic Gutenberg Z page scanning and what they have learned from the emerging standards and practices of web designers. As the web nears its twenty-fifth anniversary, some common patterns form the basis for “best practice” recommendations in web page composition. Human interface researchers have done studies on where users expect to find standard web page components and have found clear sets of expectations about where some items are located on web pages.

The web is still a young medium with no standards organizations to canonize existing typical page layout practices. Until we have a Chicago Manual of Style for the web, we can at least combine current mainstream web design practice, user interface research, and classic page composition to form recommendations for the location of identity, content, navigation, and other standard elements of pages in text-dominant, information-oriented web sites.

Presenting information architecture

Site planning with a team is often easier if you base your major structural planning and decisions on a shared master site diagram that all members of the group can work with. The site diagram or site map should evolve as the plan evolves and act as the core planning document as changes are proposed and made in the diagram. Site diagrams are excellent for planning both the broad scope of the site and the details of where each piece of content, navigation, or interactive functionality will appear.

For major planning meetings consider printing at least one large diagram of the site organization, so that everyone can see the big picture as it develops from meeting to meeting. The site diagram should dominate the conference table, becoming a tactile, malleable representation of the plan. Everyone should be free to make notes and suggest improvements on the printed plan, and the revised diagram becomes the official result of the meeting.

Site diagrams

As your team works out the information architecture and major categories of content, site diagrams visualize the developing information hierarchy and help communicate the organizational concepts to the team and to stakeholders and project sponsors. This communications role is crucial throughout the project, as the site diagram evolves in iterations from a brainstorming and planning document into a blueprint for the actual site as it will be developed.

Site diagrams can range from a simple hierarchical “org chart” diagram to a more complex and information-rich map that both shows the major divisions of the site as the user experiences them and acts as an overview of the site directory and file structure. The well-known information architect Jesse James Garrett developed a widely used visual vocabulary for site diagrams that has become the de facto standard, and the symbols are broadly useful for portraying site structure and interactive relationships and user decision points.

Major elements of a mature site diagram include:

Content structure and organization: major site content divisions and subdivisions
Logical functional groupings or structural relationships
The “click depth” of each level of the site: how many clicks are required to reach a given page?
Page type or template (menu page, internal page, major section entry point, and so on)
Site directory and file structure
Dynamic data elements like databases, RSS feeds, or applications
Major navigation terms and controlled vocabularies
Link relationships, internal and external to the site
Levels of user access, log-ins required, or other restricted areas
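Click depth in particular can be computed rather than counted by hand once the link relationships are written down. The following sketch assumes the site's internal links are available as a simple mapping from each page to the pages it links to; the sample paths and the function name are hypothetical, and a breadth-first search is just one straightforward way to do this.

```python
from collections import deque


def click_depths(links: dict[str, list[str]], home: str) -> dict[str, int]:
    """Return the minimum number of clicks from the home page to every
    page reachable by following links (breadth-first search)."""
    depths = {home: 0}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:  # first visit is the shortest route
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


# Invented link map for illustration only.
LINKS = {
    "/": ["/about", "/products"],
    "/products": ["/products/a", "/"],
    "/products/a": ["/products/a/specs"],
    "/orphan": ["/"],
}
for page, depth in click_depths(LINKS, "/").items():
    print(depth, page)
```

Pages that never appear in the result, such as the orphan page in this sample, cannot be reached by browsing from the home page at all, which is worth flagging on the site diagram.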

Site diagrams start simply and may evolve into two distinct variations: a conceptual site diagram that communicates at a general level the evolving site structure to clients and stakeholders, and a more complex blueprint diagram that is used by the technical, editorial, and graphic design teams as a guide to the structure of both the user interface and the directories and files.

Figure 4.23 depicts a simple site diagram for use in presentations and general overviews and the same site shown in greater detail for use by the site development team. These site diagrams can be developed with drawing software such as Adobe Illustrator but are usually developed with specialized diagrammatic software such as Microsoft Visio, ConceptDraw, or OmniGraffle.

Wireframes

The information architecture process is fundamentally one of avoiding the particular while insisting on the general. At various points in this conceptual phase, stakeholders, clients, and even members of your design team may find it irresistible to launch into specific proposals for the visual design of pages. In particular, concern about the possible look and feel of the home page is notorious for driving planning processes off the rails and into detailed discussions of what colors, graphics, photos, or general character the home page should have, long before anyone has given serious thought to the strategic goals, functions, and structure of the site. Visually plain page wireframe diagrams force teams to stay focused on the information architecture and navigation vocabulary without getting sidetracked by the distraction of purely visual design.

If site diagrams provide the global overview of the developing web site, then wireframes are the “rough map” that will eventually be used by graphic and interface designers to create preliminary and final page designs for the site. Wireframes are rough two-dimensional guides to where the major navigation and content elements of your site might appear on the page. When carefully designed they bring a consistent modular structure to the various page forms of your site and provide the fundamental layout and navigation structure for the finished templates to come.

Things that might appear as standard elements of a web page wireframe include:

Organizational logo
Site identity or titles
Page title headlines
Breadcrumb trail navigation
Search form
Links to a larger organization of which yours is a part
Global navigation links for the site
Local content navigation
Primary page content
Mailing address and email information
Copyright statements
Contact information

To keep the discussion focused on information architecture and navigation, keep your wireframe diagrams simple and unadorned. Avoid distinctive typography, use a single generic font, and use gray tones if you must to distinguish functional areas, but avoid color or pictures. Usually the only graphic that appears in a mature wireframe will be the organization logo, but even there it may be better simply to indicate the general location of the logo. The page wireframe will acquire more complexity as your thinking about global and local navigation matures and you are more certain about the nature and organization of the primary site content.

The page presentation functions of content management systems like Drupal and WordPress are structured by general site themes and page templates that can be extensively customized by front-end developers who know the necessary HTML, CSS, and PHP methods. Drupal’s Zen 2 HTML5-based responsive “mobile-first” layout theme is particularly flexible: it can be used initially to wireframe a site and then serve as the basis for further visual development of the site. Regardless of which CMS you are using for your site, you might want to consider a plain, visually sparse “wireframe” version of your site templates for your planning and architecture phases, to concentrate on the interactive qualities of site navigation while deferring detailed visual designs until late in the process. Both Drupal and WordPress also allow you to change a site’s theme quickly, and examining and using your preliminary site architecture under various kinds of display themes may help to point up navigation or organizational problems, or give you some good ideas for features you may want to include in the finished site theme.

