Here we examine a technique for improving usability in complex applications by introducing smarter search and “recent objects” features. As usability becomes an ever more crucial feature of applications, helping users with full-text search and recent object lists may prove insufficient. You may need to go beyond these features by keeping track of the “most used” objects, which will help to:
– guess what you are looking for
– find what you are searching for
The problem
Let’s see an example.
For some weeks you have been working on items A, B and C in your favorite web application. On Friday, just before going home, you briefly worked on X, Y and Z, which had been sitting at the bottom of your to-do list for quite a while. Now you get back to work on Monday, and what do you have in your “recent objects” list? Well, X, Y and Z. Useless. But you have full-text search. You search for the name of A, a term that hundreds of other objects share and that may occur in them far more often than in A itself; even though nobody has used those objects for quite a while, they fill the top of the results, above your A. Useless. There is no easy way to get back to A: something here is not working.
This is a usability problem: to make your application more helpful, you should somehow keep track of what users use most often. How do you do that? A complete answer is not trivial: as often happens with usability problems, what looks simple from the user’s point of view is actually complex to solve and render. In the end, all of that complexity should be hidden from the user.
What is relevant to you is not just stuff that you occasionally visited, but, say, the projects or documents to which you recently returned again and again: you need to keep a window of attention in focus. See it this way: you want the projects or documents to which you frequently link. You need a sort of personal page rank.
Recording hits
Well, the way to go is to record what users are doing. You have to record it somehow in a parallel, probably de-normalized table of “hits”, keeping it very simple, as you will probably accumulate a lot of records there very quickly.
In the picture you see an example “hit”, recorded when a user looks at and/or works on something. Notice that, as you will collect a lot of data, you will also need to filter it according to your security model: that is why we have the “areaId” field there.
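To make the idea of a “hit” record concrete, here is a minimal sketch in Python. Only the area identifier comes from the description above; every other field name, the `weight` field, and the `visible_hits` helper are hypothetical and are not taken from the Teamwork schema.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Hit:
    """One row of a de-normalized "hits" table (field names are hypothetical)."""
    user_id: int
    entity_type: str     # e.g. "task", "issue", "document"
    entity_id: int
    area_id: int         # security area the entity belongs to
    when: datetime
    weight: float = 1.0  # assumption: editing could count more than viewing


def visible_hits(hits, allowed_area_ids):
    """Keep only the hits on entities in areas the current user may read."""
    allowed = set(allowed_area_ids)
    return [h for h in hits if h.area_id in allowed]


hits = [
    Hit(1, "task", 10, area_id=100, when=datetime(2024, 1, 8, 9, 0)),
    Hit(1, "document", 20, area_id=200, when=datetime(2024, 1, 8, 9, 5)),
    Hit(2, "task", 10, area_id=100, when=datetime(2024, 1, 8, 9, 7)),
]
print([h.entity_id for h in visible_hits(hits, allowed_area_ids=[100])])
```

Keeping the area on the hit itself means the security filter needs no join against the entity tables, which is one reason a de-normalized table can pay off here.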
Now, however you decide to collect hits, you will have to face the problem of how to weigh them, that is, you need a hit rank function defined on users, objects and time.
In our implementation, we created a function that for every Teamwork user and every entity (be it task, issue, diary entry, document, worklog action) computes the user hit rank for the entity; if the entity is relevant for the user, the hit rank will be high. Rank gets high by “hitting” i.e. visiting an entity.
As we said before, interest is assumed to fade over time; otherwise you would end up with too many entities with a high rank. So you have to define a sort of window of attention, with a degradation of relevance.
You need a way to compute this degradation of relevance; we modeled it with the right side of a Gaussian curve, with the constants defined in the code.
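A rough sketch of such a decay, and of a hit rank built on top of it, could look like the following. The value of `SIGMA_DAYS` and the choice of summing the decayed hits are assumptions made for illustration; the actual constants and formula used in Teamwork are not given here.

```python
import math
from datetime import datetime, timedelta

SIGMA_DAYS = 14.0  # hypothetical width of the window of attention, in days


def gaussian_decay(age_days, sigma=SIGMA_DAYS):
    """Right half of a Gaussian: 1.0 today, almost flat at first, then falling."""
    age = max(age_days, 0.0)
    return math.exp(-(age * age) / (2.0 * sigma * sigma))


def hit_rank(hit_times, now, sigma=SIGMA_DAYS):
    """Sum of the decayed hits of one user on one entity (an assumed formula)."""
    return sum(
        gaussian_decay((now - t).total_seconds() / 86400.0, sigma)
        for t in hit_times
    )


now = datetime(2024, 1, 15, 9, 0)  # Monday morning
# A was visited on many days in the last two weeks, X only once last Friday.
a_hits = [now - timedelta(days=d) for d in (3, 4, 5, 6, 7, 10, 11)]
x_hits = [now - timedelta(days=3)]
print(hit_rank(a_hits, now) > hit_rank(x_hits, now))
```

With these assumed values a one-day-old hit keeps about 99.7% of its weight, while an exponential decay `exp(-age / sigma)` on the same scale would already be down to about 93%. The Gaussian shape barely penalizes the last few days and then drops off more quickly.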
Hit rank can be extended to a notion of group rank, if your application has a notion of workgroup, so that you can measure the activity of the group. Another benefit of hit rank is that you can efficiently monitor your application’s usage, or “activity”, which could lead to introducing badges and so on.
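One possible reading of group rank, sketched below, is simply the sum of the members’ individual hit ranks per entity. This aggregation rule, the `group_rank` name and the sample data are assumptions made for illustration only.

```python
from collections import defaultdict


def group_rank(user_ranks, members):
    """user_ranks: {user: {entity: rank}}. Returns {entity: summed rank} for members."""
    totals = defaultdict(float)
    for user in members:
        for entity, rank in user_ranks.get(user, {}).items():
            totals[entity] += rank
    return dict(totals)


user_ranks = {
    "anna": {"task-A": 3.0, "doc-B": 1.0},
    "bob": {"task-A": 0.5},
    "carl": {"doc-Z": 9.0},  # not a member of the group below
}
print(group_rank(user_ranks, ["anna", "bob"]))
```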
Example implementation
An example implementation is in Teamwork: as it includes project management, business processes and groupware, there are many objects around. Hit rank has proven useful in a number of ways to improve usability, without impoverishing the model.
“You mostly visited” is a portlet that you can add to your dashboards, and search results are ranked by hit rank as well.
This way you should always have “at hand” what you’re really working on: you should be able to access your most relevant objects with one click.
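As an illustration of how a hit rank could feed both features, here is a small sketch. The function names, the `boost` parameter and the way the text score is combined with the hit rank are all assumptions; they do not describe how Teamwork or Hibernate Search actually score results.

```python
def mostly_visited(ranks, limit=5):
    """ranks: {entity: hit rank}. Returns the highest-ranked entities first."""
    return sorted(ranks, key=ranks.get, reverse=True)[:limit]


def rerank(search_results, ranks, boost=1.0):
    """search_results: list of (entity, text_score) pairs.

    Multiplies the full-text score by a factor that grows with the user's
    hit rank, so recently and frequently used entities move up.
    """
    def combined(item):
        entity, text_score = item
        return text_score * (1.0 + boost * ranks.get(entity, 0.0))

    return sorted(search_results, key=combined, reverse=True)


ranks = {"A": 4.2, "X": 0.9, "Y": 0.9, "Z": 0.8}
results = [("old-1", 0.9), ("old-2", 0.8), ("A", 0.5)]
print(mostly_visited(ranks, limit=2))
print(rerank(results, ranks))
```

In this made-up data, A has the weakest full-text score but the highest hit rank, so it moves to the top of the search results, which is the behavior the Monday-morning example asks for.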
References
Google’s PageRank paper: The Anatomy of a Large-Scale Hypertextual Web Search Engine
A discussion on badges: http://stackoverflow.com/questions/135647/how-do-badges-work-in-stackoverflow
An introduction to full text search: http://www.javaworld.com/javaworld/jw-09-2006/jw-0925-lucene.html
Hibernate full-text search: http://www.hibernate.org/410.html
Our contribution to Hibernate full-text search: http://www.hibernate.org/432.html
See hit rank in action in the demo or by installing the web app.
See also an interesting infographic: Anatomy of a Search Engine, by First Site Guide.
I’m curious what the reasoning is for using “the right side of a Gaussian curve” rather than exponential decay, which would seem like the default approach for modeling fading interest?
Because we don’t want the steep drop that exponential decay gives to days close to today, but we still want some decay.
Other uses of collected hits that I didn’t mention include gathering statistics on the most used features, which can help in improving the usability of your application. You can also detect “paths” in sequences of hits, and that too can be used to improve the user experience.
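As a rough illustration of both ideas, the sketch below counts how often each feature is hit and which consecutive pairs of hits occur most often. The `common_paths` helper and the sample session are hypothetical.

```python
from collections import Counter


def common_paths(sequence, length=2):
    """Count every run of `length` consecutive hits in a user's sequence."""
    return Counter(
        tuple(sequence[i:i + length])
        for i in range(len(sequence) - length + 1)
    )


session = ["task list", "task editor", "worklog",
           "task list", "task editor", "issues"]
print(Counter(session))                       # usage statistics per feature
print(common_paths(session).most_common(1))   # most frequent two-step path
```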