Category Archives for Technology
I’m reposting the following (slightly edited) internal blog entry from a few years back. We’ve recently been doing quite a lot of information architecture work, and I felt perhaps the outside world would find it useful as well. While our models have developed a lot since then, the basic concepts are still the same. Recently adapting them to the blogging world has also been interesting.
[…] Most people seem to define Information Architecture as something to do with “arranging information into useful categories”. But this is still very vague when you start considering what “useful” and “category” mean. Few seem to have an answer on this.
Information architecture is actually a generic term used to describe anything even remotely relevant to the “arrangement of information into a structure”. Some groups have shanghaied the term for their own use, but I still prefer this generic definition.
Now let’s get a little more specific, in particular web-based information architectures. There’s slightly more information out there on this, and there are several popular books now available on it. In the web space, there are several types of information architecture, all layered over a single web site.
The first major layer is at the browser level, where we have the visual architecture, or the site map.
Under that, we have the user-perceived information architecture. The web, by its nature, implies a notion of location within a site (thus unfortunately validating the web-page-as-location metaphor), and so this implies a user perception of where information is located. This may map directly to the site map, but through the use of user interface and graphic design it may not, whether intended or otherwise.
Supporting this is what most people see as the information architecture. Again, this may map directly to the other layers, but does not need to. The smaller the site, the more direct the mapping.
Finally, we have several layers of the technical implementation of the architecture. This may be database tables or static files in directories. Like the other layers, it may map directly, or it may not.
This layering of information architectures is intertwined with several other layers of abstraction when delaminating a web site: the site objectives, the user requirements, the actual information content, the functional specifications, and a range of other influential data.
So how do we design information architectures which conform to this layering model?
Here’s a fairly rough step-by-step guide to designing web site information architectures. Good luck finding it on the web, because from what I can tell, nobody’s actually fully documented it. Several have come close, but they tend to focus on their main areas of expertise instead of building the full picture.
1. Catalogue the current content, so you know what’s available. If you don’t yet have any content, then plan what this content will consist of.
2. Define your typical web usage patterns. These are like mini use cases, and describe the standard interactions users will have with the site: their triggers, their goals, and their priority for consideration on the site.
3. From each pattern, derive a range of recommendations for navigation and information architecture. For example, a low priority usage pattern might be users researching domain specific terms used elsewhere in the site. The entry point for this pattern could be anywhere on the site, because of the nature of the content. Hence a recommendation that term definitions be available from every page within the site, while the full list of terms is only available on the terms page. (A rough sketch of how patterns and their recommendations might be recorded follows this list.)
4. Create a skeleton information architecture and site map from the recommendations. For small sites, a site map may be all that is required, but for larger sites, a defined information hierarchy may also be required. Site maps are typically best represented by a data flow diagram; information hierarchies are best represented by some type of tree diagram. A common mistake is to assume that these are different representations of the same thing.
5. Populate the information architecture with the content.
6. Design a wire frame for the user interface. This will come almost directly from the site map for technical purposes, and from the visual design for aesthetic and usability purposes. There will be give and take, because there’s no such thing as a perfect world or an infinitely long project time line. Such influences include financial, political, forward planning, psychological and moral considerations.
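To make steps 2 and 3 a little more concrete, here’s a minimal sketch of how usage patterns and the recommendations derived from them could be recorded. This isn’t tooling we actually use; the class and field names are purely illustrative.

```python
from dataclasses import dataclass, field

# Illustrative only: a minimal way to record usage patterns (step 2)
# and the navigation/IA recommendations derived from them (step 3).

@dataclass
class UsagePattern:
    name: str
    trigger: str                      # what prompts the user to start
    goal: str                         # what the user is trying to achieve
    priority: int                     # 1 = highest priority for the site
    recommendations: list = field(default_factory=list)

patterns = [
    UsagePattern(
        name="Look up a domain-specific term",
        trigger="Unfamiliar term encountered anywhere on the site",
        goal="Understand the term, then return to the original page",
        priority=3,
        recommendations=[
            "Link term definitions from every page",
            "Keep the full glossary only on the terms page",
        ],
    ),
]

# Step 4 then works from the highest priority patterns down.
for pattern in sorted(patterns, key=lambda p: p.priority):
    print(pattern.name)
    for rec in pattern.recommendations:
        print("  -", rec)
```

Even a plain spreadsheet does the same job; the point is only that each pattern carries a trigger, a goal, a priority, and the recommendations it generates.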
The thing I like about this process is the objective steps which lead to the final architecture, particularly steps 2, 3 and 4. […] We often discuss web site usability and navigability with clients, and by having an objective process, we find that web site review/analysis is not only supported by objective criteria, but the process is actually a lot easier to perform, and the results much more detailed and useful. This makes both us and our customers happy.
(Originally posted to Synop weblog)
This site is pretty dependent upon Google, which brings in around 40% of the traffic. Another 40% is from other people linking to it, and about 15% is from blogging search/monitoring engines. Google seems to crawl this site every few days, which means quality and stability, not always in that order, are more important than ever before.
I usually make most changes to the site here on the live version, which means it breaks every now and then, as you’d expect from doing live development. Big refactors I do offline, but it’s still fairly habitual to make most changes on the live site.
Last week, while making some changes, I broke the CGI, which pretty much drives everything. After looking at the log, I found out that it happened while Google was crawling the site, which means I’ve lost much of my Google ranking, including my favourite, “she bangs richard”.
So what do you know: when Google finally gets around to crawling the site again, this time I have a bug in my dynamic robot rules, and it only indexes the home page and the blog index. So much for the actual blog items: noindex,follow all the way. So now I have the best of both worlds: several hundred Google page views in my log, and nothing to show for it.
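For the curious, the “dynamic robot rules” are just per-page robots meta tags emitted by the CGI. The sketch below is hypothetical (the page types and function are invented for illustration), but it shows how a single bad condition leaves only the home page and blog index indexable, with every blog item served as noindex,follow.

```python
# Hypothetical sketch: the page types and logic are invented, not the
# actual site's code.  The CGI decides the robots meta tag per page.

def robots_meta(page_type: str) -> str:
    """Return the robots meta tag for a given type of page."""
    # Intended rule: everything indexable, with only transient archive
    # pages excluded.  With the rule broken as below, only the home
    # page and blog index stay indexable and every blog item is
    # served as noindex,follow.
    if page_type in ("home", "blog-index"):
        content = "index,follow"
    else:
        content = "noindex,follow"    # the bug: items should be index,follow
    return f'<meta name="robots" content="{content}">'

print(robots_meta("blog-item"))       # -> noindex,follow on every post
```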
Recently, one of our customers at work thought they had a similar problem, in that Google wasn’t crawling their site correctly, which could have meant a massive decrease in traffic, and possibly an end to their funding if true (it wasn’t). To live and die by Google. This isn’t such an uncommon situation to be in.
Ultimately, I don’t really care if I’m missing 40% of my old traffic. While I love new visitors, that’s not really why I have a personal web site. But it is scary to think that Google has so much control over so much of the industry’s, or more significantly the Internet VC’s cash.
Throw in the current concerns about the privacy of Google’s GMail, in that they preserve and index every single email you’ll ever send through them, and their power suddenly comes into perspective. If Microsoft were in this position, the industry wouldn’t stand for it. Google’s only saving grace is that it hasn’t yet done anything wrong. But nobody is infallible, especially big companies, and especially those who must answer to investors and shareholders. That may not describe Google yet, but for how long?
Don’t get me wrong, I love Google, but I’m starting to be wary of the idea of them having so much control over my virtual presence.
Now if you’ll excuse me, I’ll get back to fixing my damn code so I can get my high rankings back.
More on tuning. For web servers, particularly those which are Linux-based, we’re always wary of what we call “the magic number” for the response time of a typical web page.
Nathan recently referred to response times and speed of service, quoting Mark Fletcher of Bloglines saying that the speed of their service is inversely proportional to the exponential of the load. Of course there are usability issues relating to slow response times as well, but in a client server/web environment they also affect availability.
This is what we refer to as the magic number: the time it takes to return a page from a web server beyond which the server starts gaining load exponentially.
It can take a simple background job, a bottlenecked database request, request overloading, or some other system process to slow response times to the point where most users decide that their browser has failed to load the page. Their immediate response is to click stop and/or refresh, at which point the load on the box almost immediately doubles. This doubling of load effectively halves the magic number: users will now only tolerate half the time they waited the first time before they hit refresh again. Other spin-off effects include opening another browser window and trying that as well, trying to reach the failing page via another page on the site, or getting workmates to try the same page to see if they have the problem too.
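A toy back-of-the-envelope model shows the shape of that feedback loop. Every number below is made up; the only assumptions are that response time grows with outstanding requests, and that each round of refreshes roughly doubles the load while halving what users will tolerate.

```python
# Toy model of the refresh feedback loop; every number here is made up.
# Assumptions: response time grows linearly with outstanding requests,
# and users whose patience runs out hit refresh, roughly doubling the
# load while halving what they'll tolerate next time.

time_per_request = 0.05    # seconds of response time per outstanding request
patience = 8.0             # seconds users will wait on the first attempt
load = 200                 # outstanding requests

for attempt in range(1, 6):
    response_time = load * time_per_request
    print(f"attempt {attempt}: load={load}, response={response_time:.1f}s, "
          f"patience={patience:.1f}s")
    if response_time <= patience:
        print("pages return before users give up; the system settles")
        break
    load *= 2          # stop/refresh: old requests still in flight, new ones added
    patience /= 2      # users tolerate roughly half the wait the second time
else:
    print("past the magic number: this is recovery territory, not tuning")
```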
Once response time hits the magic number, you are no longer in control trying to tune the system, you are just focussing on recovery. I’ve seen people try to tune in these conditions, but they’re really wasting their time. Sure, you can obtain some useful diagnostics before restarting the box, but you’re still no longer tuning anything.
While being a good indicator of excessive load, the magic number is also a good benchmark for tuning exercises. The highest priority when performance tuning a system is to prevent it from hitting the magic number, and of course that is almost impossible to do if the system has already leapt past it.
(Originally posted to Synop weblog)
Flawed article on OS News concludes that:
All users, after an initial bootstrapping phase, preferred the CLI “discussion” method for interacting. All reported that they felt more in control and better able to find things out. This probably was due to the higher amount of interface consistency and more task-based interface that the CLI tends to encourage
It ignores the bias in the task-based examples being taught, which play to command line strengths but not GUI strengths. Also, no mention was made of which GUI was used, so I assume it was one of those inconsistent and unintuitive Linux GUIs.
I would ideally like to extend my little trial into a full newbie computing course where I teach the command line first before moving up into GUIs. I feel that my experiences here show that the CLI provides a far better environment for first-time computer users to find their feet.
Pity, because the GUI desktop is a flawed metaphor anyway, and it would have been good to see a proposal for something better.
Perhaps these things could be combined into a new shell: one that also had a more unified method of job control, perhaps introducing ‘inbg’ as a built-in. A rough sketch of the idea follows.
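As a thought experiment only (no such shell or built-in exists), an ‘inbg’ built-in in a toy shell might look something like this, with the shell keeping its own job table and ‘inbg <command>’ launching the command without waiting on it:

```python
# Entirely hypothetical: 'inbg' is not a real built-in anywhere.  A toy
# shell where 'inbg <command>' runs the command in the background and
# 'jobs' reports on what's still running; everything else runs in the
# foreground.

import shlex
import subprocess

jobs = []    # background jobs this toy shell is tracking

def run(line: str) -> None:
    args = shlex.split(line)
    if not args:
        return
    if args[0] == "inbg":
        proc = subprocess.Popen(args[1:])     # start it, don't wait
        jobs.append(proc)
        print(f"[{len(jobs)}] {proc.pid}")
    elif args[0] == "jobs":
        for n, proc in enumerate(jobs, 1):
            state = "running" if proc.poll() is None else "done"
            print(f"[{n}] {state}  pid {proc.pid}")
    else:
        subprocess.call(args)                 # ordinary foreground command

run("inbg sleep 10")
run("jobs")
run("echo foreground command finished")
```

A real implementation would obviously need process groups, signal handling and the rest of proper job control; the sketch only illustrates the unified built-in idea.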
Continuing the mainstream media’s ignorant bias against Apple technology, here’s CNet announcing Virgin’s new online music store, using tainted language to report that the service only supports WMA (Windows Media)…
will support Microsoft’s Windows Media Audio, or WMA, format
…and then implies that Apple’s non-support of WMA makes it an exception…
WMA files work on a host of digital music players, except Apple’s, which does not support WMA.
…when in fact they could (and should) have said:
only works on players which support Microsoft’s proprietary WMA format
View the universe at a scale of powers of 10. The universe is a truly amazing place, and from our place between the enormous and the minute, we’re doing our best to fuck it up as much as possible.
I came across an article recently (which I’ve subsequently lost) pointing out that the RIAA and MPAA are actually assisting the whole social networking software industry with real-world testing. We don’t want organisations hacking into and monitoring our communications, and we don’t want everything we say or transfer hashed and compared against a database of possibly illegal data. While a real-world implementation of Freenet is still far away, the P2P and other not so mainstream communities have a wonderful way to validate their models and technology.
Try out Musicplasma, which attempts to match your musical tastes with other artists, based on information ripped from Amazon recommendations. Interesting use of scraping technology, and an interesting test for intellectual property law.
In case you didn’t know, here’s a rough introduction to how to manage smart people.
The IT Industry depending on your perspective.