How we improved our intranet search experience

We use the Google Search Appliance (GSA) across our family of intranets. In 2009 we launched a new search experience to coincide with an upgrade to the GSA.

Analysis of original search experience

I so wish I had some screenshots of the original search results pages. If I manage to find some, I’ll post.

Back in 2008, a typical search from any one of the family of intranets produced results from all the intranets. I know it is considered best practice to include everything in the initial scope of a search and then allow users to winnow down the results. But in this case, because each intranet served a very specific part of the organisation, results from all intranets clouded the experience. Inadequate metadata and file names made the experience worse and the interface was generally busy, as if someone had got into the admin page and turned on every option. Just because you can do something, doesn’t mean you should.

Search interface redesign

For the interface, I decided to localise the results and narrow the scope of the initial search to the intranet that staff search from. I thought that it would improve orientation. So if I search from the HQ intranet then I just get results from the HQ intranet. Same goes for intranet 1, 2, 3…

For the results page I included the Google logo to psychologically increase confidence in the results. I placed a nice wide search box at the top, pre-filled with the original search phrase to allow staff to easily refine their query. I used a drop-down menu so that staff could switch to results from a different intranet. The presentation of the results themselves also followed Google’s public design with a title, snippet, URL and a link to the cached version.

I decided to use icons for downloads instead of the existing [PDF] text that came by default with the GSA template. The template also included the amount of time it took to fetch and display the results, which I dropped in favour of simplicity. I changed the *Cached* link to *Text view* in the hope of encouraging staff to use this as a quick method of viewing PDF and Word files in a text version rather than waiting ages for the application to load.

Advanced search is available, but inconspicuous, since my experience of watching users try advanced searches is that they tend to overly restrict themselves. I also rewrote the help page.

The results display by relevance with the most relevant at the top (I’ve never understood some websites which clutter the search results page with a relevance or significance scale against each result; surely the first result is ALWAYS the most relevant?) There’s also the option to sort by date.

Example search results

GSA backend tweaks

In addition to the interface, I also had a play around with the backend configuration with the aim of improving search quality, relevance and general usability.

Date stamping

For each entry on the search results page there is a date. But what does the date mean? It is not always the date when the page or document was last amended. Consider a date-specific news article that is published and then gets amended several months later. Which date do we show in search results: the date of the article or the date it was last amended? This is just one question that I had to face while tweaking the interface. It’s possible to extract a date for inclusion in search results from the page or document title, filename, URL, from within the content itself, from metadata or the file datestamp. That’s a lot of choices. Our search result dates are contextual so that if you see a news story, the date reflects the initial publishing date. If you see minutes from a board meeting on a certain date, that’s the date we’ll show.
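The idea of a contextual date can be sketched in a few lines of Python. This is only an illustration, not GSA configuration: the function names, the YYYY-MM-DD format and the order of preference (URL, then title, then file datestamp) are all assumptions made for the example.

```python
import re
from datetime import date
from typing import Optional

# Assumed format for the example: dates written as YYYY-MM-DD.
ISO_DATE = re.compile(r"(\d{4})-(\d{2})-(\d{2})")


def find_date(text: str) -> Optional[date]:
    """Return the first valid YYYY-MM-DD date found in text, or None."""
    match = ISO_DATE.search(text)
    if match is None:
        return None
    year, month, day = (int(part) for part in match.groups())
    try:
        return date(year, month, day)
    except ValueError:
        # Looks like a date but is not one, e.g. month 13.
        return None


def display_date(url: str, title: str, last_modified: date) -> date:
    """Prefer a date in the URL, then the title, then the file datestamp."""
    for source in (url, title):
        found = find_date(source)
        if found is not None:
            return found
    return last_modified


# A news story amended months later still shows its publishing date.
print(display_date("/news/2008-03-14-office-move.html",
                   "Office move confirmed",
                   date(2008, 9, 1)))
```

A document with no date in its URL or title, such as a form, simply falls back to its file datestamp.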

Configuring datestamps

Manual biasing

You can manually control the ranking of documents or folders within the GSA. I use this functionality to manually demote date-specific content, such as news stories and meeting minutes, and to promote popular areas.
When it comes to configuring the backend, it really helps if you are working with a well organised folder structure and good file naming conventions. For example, we have a policy of putting date-specific information in correctly labelled folders. So the meeting minutes for 2007 are filed in the /whatever/2007/ folder. The same rule applies no matter whereabouts you are in the intranet structure. In GSA, we can then use regular expressions to specify any folder with /2007/ in the path and automatically apply the same rules across the intranet. In the example below, content in any /2007/ folder gets a medium decrease in result biasing, since we prefer old news stories, old meeting minutes etc. to have less importance in the rankings.
Manual result biasing
Manual biasing also allows us to exert control over specific intranet areas and I use this, for example, when we are HiPPOed into creating sections that we’d rather did not show up and cloud search results 😉
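To show why a consistent folder structure makes this kind of rule so cheap, here is a rough Python sketch of pattern-based biasing. It is not the GSA's configuration syntax or its scoring: the rule table, the numeric adjustments and the /forms/ area are made up for the example, standing in for settings such as a medium decrease.

```python
import re

# Hypothetical rules: (pattern, adjustment). Negative demotes, positive promotes.
BIAS_RULES = [
    (re.compile(r"/2007/"), -2),    # dated content in any /2007/ folder
    (re.compile(r"^/forms/"), 2),   # an assumed popular area
]


def bias_for(path: str) -> int:
    """Add up the adjustment of every rule whose pattern matches the path."""
    return sum(adjust for pattern, adjust in BIAS_RULES if pattern.search(path))


print(bias_for("/board/minutes/2007/march.pdf"))  # -2
print(bias_for("/forms/annual-leave.doc"))        # 2
print(bias_for("/about/index.html"))              # 0
```

One pattern covers every /2007/ folder wherever it sits in the structure, which is the whole point of filing dated content consistently.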

Date biasing

The GSA has functionality that automatically promotes more recent pages, across the board. But I don’t use this. I prefer to use the manual method of specifying folders to promote and demote because some of our pages and documents are old (long-standing), yet current. Example: the annual leave template is a popular search request but it was amended years ago. It’s still the correct template. But because it is old, GSA would demote it in the results page if we used date biasing. Since we have various forms, policies and guidance which are still current but old, I don’t use this built-in functionality.

Crawl frequency and freshness

I also use our folder structure to help Google know how often to crawl areas of content. I know that some areas of content, such as news stories and meeting minutes, once published to the intranet, do not often change. So I help Google by specifying these areas and instructing it not to waste so much time attempting to re-index them.
URL patterns to crawl infrequently

Related queries

This is a manual method of promoting alternative or related queries. I don’t use it much since Google does a great job of serving results. I mainly use it where there is a problem with internal office language, with projects which insist on calling themselves one thing when everyone else knows them as something else. Clicking the alternative suggestion will perform another search.

Related queries

Key matches

Another manual method of promoting results. Key matches differ from related queries because clicking the suggestion will take staff directly to the intranet page rather than performing another search. Again I don’t saturate staff with these but I do use it for staff who come to the HQ intranet expecting to find popular items on their local intranets.

Key matches

Query expansion (synonyms)

The GSA comes with a default set of synonyms, in different languages. Unfortunately, the UK set comes with American spellings so I had to change all the -ize, -ized, -izing to -ise, -ised, -ising. Again, I use this functionality for dealing with problems with internal language. Query expansion differs from key matches and related queries in that the GSA will incorporate the expanded search terms into the initial query. Example: I search for *organogram* and GSA will also include *org chart* and *organisation chart* in my query and return any results containing those terms.
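The organogram example can be sketched in Python to show what "incorporating the expanded terms into the initial query" means. This is a simplification under my own assumptions: the synonym table and function names are hypothetical, and plain substring matching stands in for whatever the GSA really does.

```python
from typing import List

# Hypothetical synonym table, using the example from the text.
SYNONYMS = {
    "organogram": ["org chart", "organisation chart"],
}


def expand_query(query: str) -> List[str]:
    """Return the original term followed by any synonyms configured for it."""
    term = query.lower()
    return [term] + SYNONYMS.get(term, [])


def matches(document_text: str, query: str) -> bool:
    """True if the document contains the query or any of its synonyms."""
    text = document_text.lower()
    return any(term in text for term in expand_query(query))


print(expand_query("organogram"))
print(matches("See the HR org chart for details", "organogram"))  # True
```

The member of staff never sees the extra terms; they just get results that would otherwise have been missed.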

Content metadata

The remaining improvements to the search experience rely on the quality of the content that Google is crawling. I spent weeks going through a good proportion of the HTML, PDF, DOC, XLS and PPT files on the intranet, getting the metadata into shape.

I posted last month on the importance of quality metadata, with examples of what happens if you don’t think about metadata:

Since the launch

As we publish more and more content on the intranet it is a constant battle trying to maintain the quality of search results. However, the importance of metadata is slowly but surely becoming embedded into our processes. We are still using an older version of the GSA (version 5.2) so we don’t have the advantages of any of the newer developments such as type-ahead search results and allowing staff to tag and rate results. But I believe it’s important to get the basics right before adding the frills.

Analytics and statistics; what is the question?

Analytics, evaluation, data and statistics all have the gas turned up in the workplace. Rightly so. We need to make business decisions, evaluate performance, and design usability improvements based on hard evidence.

Facts and figures should generate action. Not sit in a pile of paper on a desk or add to a growing repository of electronic reports, ultimately clogging up our servers and recycle bins.

The problem with generating lists of numbers and handing them to someone is that they don’t mean anything. As Avinash Kaushik says, if you can say ‘so what?’ to a statistic then there was no point in generating it. You got 10,000 visitors this month. So what? Is that good or bad? Better than last month? How many did you expect to get? What will you do now because you know this?

I really believe we should ban statistics reports unless they are supported by a specific question. What do you want to know? It may be the same thing every month. Fine. Great. Then we can start comparing month on month. Then, reports will start to have meaning. Then we can take action. But first we need questions.

Did my email newsletter produce more traffic than the graphically designed intranet advert? Did the recent page redesign have the desired outcome? Did the news story result in more people signing up to the company initiative? Have more people been reading my pages over the last six months?

On the intranet we have devolved analytics. Our publishers and stakeholders are free to peruse the stats and graphs. We also do a quick training session on how to get the most out of the intranet analytics. This approach reduces wasted paper reports because people only check out the analytics when they have a question. If they are really interested in their content and want to monitor ongoing stats then they can create regular automated reports. It cuts down meaningless requests to the central intranet team, allowing them to concentrate on the bigger picture, and it encourages publishers to become more familiar with the life of their content after they have published it.

Do you spew out endless statistics reports or do you answer questions?

Squiz is the shizz

Squiz provide and support the MySource Matrix content management system.

The nice men from Squiz came to visit us today and I have to admit to being excited before the demo of their product suite. I checked out their website a few weeks ago and was very impressed.

I don’t know why I haven’t bumped into them before now; they’ve been going for over 10 years. Probably because the product is Open Source and I haven’t held out any hope of ever being allowed to use it.

The Open Source suite offers:

  • Enterprise CMS
  • Search (Funnelback)
  • Analytics (based on Google Analytics)
  • Integrated social media/networking

The Australian Federal Government already uses the system, as do a handful of UK agencies. The software is obviously free. And Squiz offer a support service including secure hosting.

How could we benefit?

CMS

The system has all the functionality that we would need; simple on-screen editing, user privileges, workflow, scheduled publishing, platform-specific templates (e.g. desktop browser or mobile device), track changes and real-time pages (rather than published HTML). Video support and image catalogue. A customisable frontend means that you can weave search, analytics and social functionality into the site’s pages. Suitable for both intranet and internet.

On the web we would be able to build an engaging site with a rich user experience. On the intranet we would be able to deliver targeted content to specific business areas using one central system. With the ability to deliver mobile content we would have the chance to reach out to front-line staff.

Search

Funnelback search is sexy. Type-ahead functionality. Live results based on user activity and trending. Faceted results. User feedback. Clever. Certainly worth considering defecting from Google. Up to the minute upgrades, instead of having to jump through sulphurous hoops of red tape and third-party suppliers. Somebody pinch me. How much money could we save by cutting out inflated IT procurement and service contracts?

Analytics

Based on the equally shizzling Google Analytics we would own our own data that we could feed back into the sites, improving search results, popular page and related information lists. And being able to provide up to the minute analysis is a bonus. The module also includes A/B and multivariate testing and reporting.

Social

Integrated into the suite is a social module which allows people to use the intranet/website like we do out here on the web. Rating articles, commenting on posts, “liking” pages, sharing stuff. Polls and surveys. Plus RSS feeds and email alerts giving people the power to catch up on their preferred content. And of course there’s blogging and microblogging.

I hypothesise that the success of implementing any such system will hinge on the…

Staff directory

The backbone and starting point of a new platform and way of working is the staff directory. Focusing on the individual, who is responsible for keeping their profile up to date. Social functionality starts from the user profile where an enterprise version of a “social graph” begins to build. Over time we will be able to see which pages people like, which content generates buzz. Staff can connect and interact with each other no matter how far apart geographically.

Which rather brings me back to a post that I wrote back in March:

Whether, in these austere times, we’d be able to afford such a system, I don’t know. But maybe by cutting out our existing CMS system, our existing Google Search Appliance, our hosting and servers, our IT support and service overheads – we would actually save money.

Managing the risks from online social networks

Our public website contains videos from YouTube, photos from Flickr and we also have a Twitter stream. All open to the public so that they can see what fabulous work we do. But if you’re a member of staff, you don’t have access, at least not from the workplace desktop. C’est ridicule!

The same applies to blogs, microblogs and other social networking sites such as Facebook (where a number of government departments already have pages). Inside the enterprise, we don’t have a policy for managing the risk of accessing such sites; we have a blanket policy that prevents staff from accessing the sites.

If I want to research some blogs or do any kind of sentiment analysis about what people are saying about us on microblogs, then I have to write a business case. IT security perceives social networking sites as bad and dangerous. Any benefit to be had from such sites is seemingly outweighed by the risk of using them.

But the blanket policy of blocking sites does not work. If IT security think that blocking access from a desktop computer will stop staff accessing these sites, they are wrong. We have iPhones and Blackberrys. Also within the building there are a number of standalone computers for internet access. So we are going to access the internet one way or another, unmonitored by IT security.

Risks

What are the IT security risks? Malicious software, leaking information, identity and phishing attacks. All valid risks when using the internet and interacting with other people. And similarly, all valid risks of using email, which staff are allowed to access by default.

It is not the technology itself that poses the risk; it is how people use the technology.

Recommendations

Instead of simply blocking access from the workplace desktop, we should educate staff. Point out the risks and dangers of using social networking sites, of interacting with strangers and posting personal information. Highlight privacy issues. Implement an acceptable use policy for online social networks and trust staff to be sensible. Manage the risks instead of ineffectively attempting to block the risks.

We’ve all completed our mandatory information assurance training which covers use of email. Perhaps the training could also be extended to cover how to interact with social networking sites.

Further reading

In June 2010, CESG (The National Technical Authority for information assurance) released a guide with recommendations on how to manage the risks from online social networking.

Plain English for corporate intranet content

At work, I teach a course on writing for the intranet. It takes an afternoon and covers online writing and editing techniques, SEO, how to handle graphics and accessibility. I also touch on writing in plain English. For this part of the course I usually use the latest IT announcement as a demonstration. IT announcements highlight how not to write plain English. Right on the button. Every time. Guaranteed.

While useful for my course, it always annoys me that people from this department (who now call themselves ICT, the C for communication!) often fail to think about the people they are writing for. Their audience is the whole organisation, yet they write from their own point of view, using acronyms, technical jargon and internal project and process language.

So here’s our latest announcement. I’ve colour-coded it to help with my analysis:

Important Windows XP and Office Security Patches will be available to all workstations on your IT network from 4th October. This is part of an ongoing process to ensure all IT equipment is secure and to minimize the risk from computer viruses.

What do you need to do?

You are advised to manually install these updates onto your desktop PC as soon as they are available. This is a simple process that will only take a few minutes of your time. For instructions on how to manually apply these security updates please click here.

If you are unable to apply these patches for whatever reason, they will be automatically applied to your workstation overnight 9th October, 20:00. However, we do advise you to manually apply these updates where possible.

Switching off your workstations

To ensure the security updates are correctly applied, you must restart your desktop PC after installation. This can be done when you shut down your PC at the end of the working day. Users are reminded that it is recommended that they ‘Shut Down’ their PC at the end of every working day – see below.

To shut down your PC, click on the ‘Start’ button and choose the ‘Shut Down’ option. Please remember to switch off your monitor as it is not switched off automatically by ‘Shut Down’. Some monitors have a very subtle ‘on/off’ switch that is flat and touch sensitive and located on the underside of the bottom right corner of the monitor; the blue light will be steady or flash until the monitor is switched-off.

Remote Workers

It is recommended that remote workers are onsite and logged in to their workstations when the Security Patches are being applied. If this is not possible then the patches will be available remotely, however the update may take a little longer.

Analysis

It is clear that the writer has no audience in mind. The piece switches from talking to me directly, to speaking about users, to some invisible person who isn’t specified because of using passive voice (“this can be done…”).

The message is inconsistent, taking ages to tell me how to shut down my PC but making me follow a link for the actual instructions to download the update, which is the point of the announcement. There’s a line that reads *see below*, below meaning the next sentence? And then there’s just plain daft stuff like recommending that I’m logged on to my workstation to download and install the update. No, really?

The message is mixed. Along the lines of “we have an update, you can download it yourself, or it will install automatically if you don’t, but, you know, you can do it yourself if you want.” Why not just install the update on my machine automatically and not bother me with having to read through this tripe? Is it that important that I know that this technical maintenance is going on?

The colour-coded headings below refer to the coloured sections in the original announcement.

Passive voice

I can spot passive voice a mile off. Years of editing online content tend to drum it into you. And there is a very good reason that IT people use it. To appear distant, to avoid the issue, to misdirect attention. Better for speech-writers and spin doctors. On the intranet, we want to get the point across quickly and clearly. Passive voice clouds the issue, causing the brain to fire internal questions to fill in the gaps left by missing information in the text.

Capitalisation

When We Capitalise The First Letter In Each Word It Makes It Really Hard To Read.

While not so bad in this IT announcement, you’ll often find capitalised project names, department names, process names and technical terms. In this announcement, what is so important about security patches or remote workers that they deserve capitalisation?

Bad names for links

Seeing a *click here* makes my blood boil. It is wrong for so many reasons. Staff using screenreaders will often request the software to group and read out all links on the page. Hearing click here, click here, more info, find out more, is not helpful. Similarly when the search engine reads the page and follows a *click here* link it does not help. It just registers a great page in the search index called “click here” that everyone is linking to. Even for staff who can see the page, it’s not clear from the link text what you will get if you click it. Link text should always accurately describe the target destination.
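Vague link text is easy to check for automatically. The sketch below is one way to do it with Python's standard library HTML parser. The list of vague phrases comes from the examples above, and the class and function names are my own invention, not part of any intranet tool.

```python
from html.parser import HTMLParser

# Phrases that say nothing about the link's destination.
VAGUE_LINK_TEXT = {"click here", "more info", "find out more"}


class LinkTextCollector(HTMLParser):
    """Collect the visible text of every <a> element on a page."""

    def __init__(self):
        super().__init__()
        self._in_link = False
        self._current = []
        self.link_texts = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._in_link = True
            self._current = []

    def handle_data(self, data):
        if self._in_link:
            self._current.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._in_link:
            # Join the pieces and collapse runs of whitespace.
            self.link_texts.append(" ".join("".join(self._current).split()))
            self._in_link = False


def vague_links(html: str):
    """Return the link texts on the page that match a known vague phrase."""
    collector = LinkTextCollector()
    collector.feed(html)
    collector.close()
    return [text for text in collector.link_texts
            if text.lower() in VAGUE_LINK_TEXT]


print(vague_links('<p>For instructions please <a href="/x">click here</a>.</p>'))
```

A link whose text describes its destination, such as "Read instructions on how to install the updates", passes the check.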

Fluff

Important corporate announcements should just give me the facts. Not self-promotional fluff. I don’t need to know that what they are asking me to do is part of their ongoing process. And apologies to my non-English-writing readers, but there is a Z where there shouldn’t be.

Content rewrite

Instructions for how to download and install an update, including how to shut down your PC, are already available elsewhere on the intranet. Assuming that the IT department can’t automate the updates, here’s how I would rewrite the announcement:

Important Windows XP and Office updates for your workstation will be available from 4th October
What do you need to do?
You can install the updates yourself. Read instructions on how to install the updates.

If you can’t do this yourself, we’ll update your workstation automatically at 20:00 on 9th October.

Remote workers
You’ll find it faster to install the update if you login while on site. You can still update remotely but it will take longer.

I know what I’m talking about here isn’t so important in the greater scheme of the workings of the organisation. But by taking just a few minutes to stop and think about what we are writing, we can help staff to save time in reading and understanding the finished piece. My rewritten example is shorter and clearer. The simple instructions make it easier (and more likely) for staff to carry out the task. I have cut down the original announcement from 302 words to 78 words. With online corporate content, less is more.