Volume 6 Issue 1
Spring 2010
ISSN 1937-7266

Optimal Support for Information Seeking Strategies
by Adapting the User Interface

Matthias Jordan

University of Duisburg-Essen


To satisfy their information needs, information searchers employ a host of search actions that fall into different classes. Whereas most current IR systems support only one or very few of these actions well, we aim to design user interfaces that support the full range of search actions. The hypothesis of the research outlined in this paper is that different search actions need different user interfaces to support them optimally and, further, that supporting all kinds of search actions optimally in one integrated system increases the performance of the searcher. To test our ideas, we have built a new document collection of book metadata gathered from Amazon and LibraryThing. The next step will be to verify the hypothesis with a prototype based on the HyperGrid user interface framework, which allows for many kinds of adaptation to support a wide range of information-seeking strategy (ISS) classes. Later, multiple different interface paradigms will be examined.

1. Introduction

The central problem in the user-centered view of information retrieval is to bridge the gap between the user’s discovery of an anomalous state of knowledge (ASK, [1]) and the document that resolves the ASK. In this setting, the user’s most important and, at the same time, most difficult task is to make the step from the perception of the information need to a request that can be used to query an IR system. Making this step is not trivial in many cases: the ASK itself means that the user lacks the means to express what he would have to know to resolve it.

To overcome this problem, searchers often use sequences of different search activities, such as paraphrasing the information need in different ways. Current IR systems restrict the user’s degrees of freedom in this respect: they often support only a few search activities well, so other search activities have to be performed using external means, which makes the search less integrated and less pleasant for the user.

As an example, consider the popular Google search engine. The interaction supported by its interface consists of entering a query and then examining the linear result list, possibly followed by a loop of reformulating the query and examining the new result list. This interaction paradigm supports just a few kinds of information needs, such as goal-directed search, and apparently serves the average web user very well. But searches that go beyond looking up single instances of popular things, such as exploratory search, are not well supported.

Even known-item search is in some cases not supported well by this paradigm. Suppose the searcher can describe the look of a web page and roughly what it is about. In Google, that would mean querying by related keywords and then visiting each site to see if it looks like the site the searcher has in mind. The online book store Amazon solves a similar problem by showing thumbnails of the book covers in the result list (see figure 2). This way, searchers can immediately recognize a familiar book by quickly glancing over the list. Amazon, though, is not optimal for other searches. Although it uses a structure-oriented search engine that allows users to search in specific types of bibliographic information (see figure 1), it does not allow searching in book descriptions and user reviews. This kind of search has its place when trying to find books on subjects that are not mentioned in their title or that were missed when the book was manually indexed. Also, apart from the cover thumbnail, only bibliographical information is displayed in the result list, which makes searching for a book by its description the same problem as searching for a site by its layout on Google.

Since Amazon is one of the most popular databases of book metadata, similar issues come up when considering a full digital library or an OPAC system. Here, too, users try to find items based on sometimes vague descriptions and information needs, and sometimes the visual appearance of a document is the best means they have to describe it. A user of the ACM DL might look for a paper by an Asian author that has a graph right on the front page. Or the user might remember that the paper was in single-column layout and recall only a few terms of the title.

Figure 1: The search form on www.amazon.com

Figure 2: The results on www.amazon.com

So, even though searchers perform many different search activities, only a few of them are supported by existing retrieval systems. The work described in this paper examines the potential gain of supporting all (classes of) search activities in a single integrated system.

Section 2 presents known approaches to the problems mentioned before and fundamental work on the concepts involved in this area. Section 3 details the research that is planned.

2. Background and Related Work

There already is a considerable body of work on the issue of activities performed by users during their search. A sequential model developed by Ellis ([2]) is used to explain the stages an information searcher goes through during the search process. The steps in this model can be regarded as a classification with the distinguishing aspect being the stage of the search process.

Belkin, Marchetti and Cool ([3, 4]) developed a classification of search activities that they dubbed “information-seeking strategies”, or ISS. An ISS is a single step during a search process. In the terminology of this concept, multiple information-seeking strategies form an information-seeking episode. The classification groups ISS’s along four binary-valued axes and thus maps each ISS to one of 16 classes. The axes are the method, the mode and the goal of seeking and the resource used for seeking. Table 1 shows the classification scheme.
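The way four binary axes yield 16 classes can be sketched in a few lines. This is only an illustration: the value names for each axis (scan/search, learn/select, recognize/specify, information/meta-information) are assumed from the ISS literature [3, 4] and are not spelled out in this section, and the identifiers `ISS_AXES` and `iss_classes` are hypothetical.

```python
from itertools import product

# Axis values are assumed from the ISS literature [3, 4]; the identifiers
# below are illustrative and not part of any published implementation.
ISS_AXES = {
    "method": ("scan", "search"),
    "goal": ("learn", "select"),
    "mode": ("recognize", "specify"),
    "resource": ("information", "meta-information"),
}


def iss_classes(axes=ISS_AXES):
    """Return one dict per ISS class: every combination of axis values."""
    names = list(axes)
    return [dict(zip(names, combo)) for combo in product(*axes.values())]


if __name__ == "__main__":
    classes = iss_classes()
    print(len(classes))  # 2 ** 4 = 16 classes
```

Each returned dictionary corresponds to one cell of the classification scheme, so a single ISS is described by picking one value on every axis.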

The basic ISS classification has been extended to a classification of general interactions with information that also contains communication acts (see [5]).

Work done by Marcia Bates (e.g. [6]) shows that there are plenty of search methods that information workers employ.

Yuan and Belkin showed in a recent paper ([7]) that an adaptive system that supports multiple ISS is superior to statically configured systems. This study, though, was limited by the number of ISS supported by the improved system.

One aspect that is interesting in the context of this work, but not supported well in the ISS classification, is the use of multiple aspects of a work to describe it. Ingwersen called this idea “Polyrepresentation” (see [8]). In the ISS classification the resource used is either information or meta-information, with information being the content of an item. In the case of books, for example, the layout of the book (its graphical representation, its dimensions, etc.) and structural aspects like author and title are also relevant, as we have seen in the examples in section 1. So we propose to use a three-valued facet “aspect considered” in addition to the basic ISS classification’s facets (see [9]).

3. Proposed Research

3.1 Establishing a Classification

Based on the assumption that searchers employ many information-seeking strategies, the hypothesis of this research is that different ISS’s require different user interfaces to support them optimally and, further, that supporting all ISS’s in a single system with an optimal user interface for each ISS will improve the search performance of the user.

This requires finding a suitable classification of ISS’s and, for each class of ISS, elaborating the requirements for a user interface that supports this class optimally. The ISS classification this research will be based on is the initial ISS classification in [3, 4], extended by our new facet “aspect considered” with the values “layout”, “content” and “structure”. This classification has the right granularity for our intentions and is in most parts well-established.

A potential aspect for optimization is the mapping between classes of ISS’s and their optimal user interfaces with respect to the user of the system. Users with different expertise benefit from different user interfaces. A novice user might be confused by an overly complex interface, while expert users prefer more options to choose from. As a consequence, the optimal mapping between ISS’s and user interfaces might be different for different classes of users.

3.2 Test Collection

To verify our hypothesis, it is necessary to build a prototype system and a collection so that user experiments can be performed. A first step has been taken by building the collection. We built a Java tool chain and crawled Amazon and LibraryThing for information on 2.7 million books in the English ISBN range. This collection now contains bibliographical information such as author, title and publisher, user and other kinds of reviews, information on the physical book object such as dimensions and weight, tags, keywords, places and people from the book, and images of the book covers. Additionally, there are links between similar books, based on the Amazon recommender system. This rich set of book information allows many information needs to be examined, the only exception being those that require the full text of the books. What is possible, though, is, for example, searching for familiar-looking books and exploring a subject area. So the data also support the layout and the structure facets. The content facet will be supported by the publishers’ reviews, which we assume to be representations of a book’s content.
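To make the relation between the collected data and the three values of the “aspect considered” facet concrete, a record of the collection could be modeled roughly as follows. This is a sketch under assumptions: the actual tool chain is written in Java and its schema is not described here, so the class `BookRecord`, all of its field names and the helper `supported_aspects` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class BookRecord:
    """Hypothetical record layout; all field names are assumptions."""
    isbn: str
    title: str = ""
    authors: List[str] = field(default_factory=list)
    publisher: str = ""
    publisher_reviews: List[str] = field(default_factory=list)
    user_reviews: List[str] = field(default_factory=list)
    tags: List[str] = field(default_factory=list)
    dimensions: Optional[str] = None      # physical size of the book
    weight: Optional[str] = None
    cover_image: Optional[str] = None     # location of the cover image
    similar_isbns: List[str] = field(default_factory=list)


def supported_aspects(book: BookRecord) -> Set[str]:
    """Return which values of the facet a record can serve."""
    aspects = set()
    if book.title or book.authors or book.publisher:
        aspects.add("structure")          # bibliographical information
    if book.cover_image or book.dimensions:
        aspects.add("layout")             # appearance of the physical book
    if book.publisher_reviews:
        aspects.add("content")            # reviews as content surrogates
    return aspects
```

A record without a publisher review would, under this reading, support structure and layout searches but not content searches.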

Examples of information needs possible to investigate in this collection are:

  • I recently saw an IR book that has a German co-author (structure search, structure scan)
  • I recently saw an IR book that has a greenish book cover (structure search, layout scan)
  • Which IR books have been published recently? (structure search)
  • I am looking for an IR book that also covers modern topics such as XML IR and PageRank (content search on abstract)
  • I am looking for an IR book that is a good introduction to the field (content search on reviews)
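The annotations in parentheses above can be read as short sequences of steps, each combining a value of the “aspect considered” facet with a search method. The following sketch encodes two of the listed needs in that way; the names `Step`, `make_step` and `aspects_needed` are hypothetical, and the other ISS axes are left out for brevity.

```python
from typing import List, NamedTuple, Set

ASPECTS = ("layout", "content", "structure")
METHODS = ("search", "scan")


class Step(NamedTuple):
    """One step of an information-seeking episode (simplified)."""
    aspect: str
    method: str


def make_step(aspect: str, method: str) -> Step:
    """Build a step, rejecting values outside the classification."""
    if aspect not in ASPECTS or method not in METHODS:
        raise ValueError(f"unknown step: {aspect!r}, {method!r}")
    return Step(aspect, method)


# Two of the information needs listed above, encoded as step sequences.
GERMAN_COAUTHOR: List[Step] = [
    make_step("structure", "search"),
    make_step("structure", "scan"),
]
GREENISH_COVER: List[Step] = [
    make_step("structure", "search"),
    make_step("layout", "scan"),
]


def aspects_needed(episode: List[Step]) -> Set[str]:
    """Aspects an interface must expose to support the whole episode."""
    return {step.aspect for step in episode}
```

Under this encoding, the search for the greenish cover needs both the structure and the layout aspect within a single episode, which is the kind of switch the interface has to accommodate.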

3.3 Research Prototype

One constraint that has to be taken into account during the development of the classification and the related user interfaces is the cognitive load of the user. Searches, as information-seeking episodes, consist of multiple different information-seeking strategies. So, during the course of a search, multiple different ISS have to be supported optimally. But if an IR system had a user interface that changed radically multiple times during the search, its users would likely be confused or even stressed. Because of this, it is important to find interfaces that do not confuse the user when they change.

With this constraint in mind, the second step, building a prototype system, will be made using a Hypergrid-based user interface. Hypergrid (see [10] for more information and Figure 3 for a screenshot) is a table-based interface with a high degree of configurability that allows users to “zoom in” on table cells. Initially containing small cells with little information (predominantly single-line cells), the cells can be enlarged by clicking on them, revealing more information relevant to the cell. The contents of the cells and the configuration of the columns could be selected automatically by the IR system. This way, the interface itself would stay the same but the gradual changes within the interface could lead to better user support. The Hypergrid-based prototype will be used to verify the model and the main hypothesis, because it allows for many ways of interaction while at the same time providing a steady framework that does not confuse the user.
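How a system might select the column configuration automatically can be sketched as a simple lookup from the aspect currently considered to a table configuration. This is not the Hypergrid API: the class `GridConfig`, the column names, the zoom model label and the particular mapping are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Set, Tuple


@dataclass(frozen=True)
class GridConfig:
    """Hypothetical table configuration: which columns, which zoom model."""
    columns: Tuple[str, ...]
    zoom_model: str


# Assumed default: plain bibliographical columns.
DEFAULT_CONFIG = GridConfig(("title", "authors", "publisher"), "inline")

# Assumed mapping from the aspect considered to a configuration.
CONFIG_BY_ASPECT: Dict[str, GridConfig] = {
    "structure": DEFAULT_CONFIG,
    "layout": GridConfig(("cover", "title", "dimensions"), "inline"),
    "content": GridConfig(("title", "publisher_review", "tags"), "inline"),
}


def configure(aspect: str) -> GridConfig:
    """Pick a configuration; fall back to the default for unknown aspects."""
    return CONFIG_BY_ASPECT.get(aspect, DEFAULT_CONFIG)


def changed_columns(old: GridConfig, new: GridConfig) -> Set[str]:
    """Columns that appear or disappear, a crude proxy for visual change."""
    return set(old.columns) ^ set(new.columns)
```

Keeping at least one column, here the title, in every configuration is one possible way to let the table change gradually rather than all at once; whether this actually reduces confusion is an empirical question.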

If this is successful, the hypothesis will also be examined using different interaction models and user interfaces. In both scenarios, the design of the user interface framework has to make a trade-off between the optimality of support for each single ISS and the confusion of the user when the ISS, and thus the interface, changes.

Both the prototype and the collection will take part in the 2009 INEX Interactive Track.

The result will be a classification of ISS, their optimal user interfaces and the mapping between these sets. Our claim that a system that supports all ISS is superior to a traditional system has to be validated by comparing user retrieval performance with such an improved system against that with a traditional system.

Figure 3: The Hypergrid user interface

3.4 Research Questions

Given the setting as described above, the research questions of this paper are the following.

What is a useful and suitable classification of ISS? How many classes are useful? Which classes can be unified without losing information?

The research will start with the basic ISS classification, extended by the new dimension: “aspect considered”. A classification can be anything between arbitrarily fine-grained and coarse, but neither of these extremes will be suitable for the application. Establishing a good classification for this task is an important optimization criterion.

Which properties of the instances of each class are relevant for the selection or design of the UI?

It is to be expected that ISS that fall into the same class share some common properties. These properties might determine aspects of the user interface. For example, if instances in a class all require relationships between works to be shown or manipulated, a graph might be more suitable than a table. So the question here is: which properties influence the user interface design for the given class?

What are the characteristics of the optimal UI for each class of ISS?

Even if the significant properties of a class are known, the user interface paradigm that optimally supports these properties is not necessarily known yet. There might also be similar paradigms that have to be evaluated against each other.

Is an adaptive IR system that is built to support all ISS’s better than a traditional system? If so: how much?

It has been shown that users can compensate for differences in MAP scores between different retrieval systems through their interaction (see [11]). This might also be the case for differences in their user interfaces. Building an IR system that supports all ISS’s might, on the one hand, be very expensive but, on the other hand, yield only a very small benefit, so the cost-benefit ratio might not be reasonable. Maybe the gain from supporting the two most common ISS’s is similar to that from supporting more or even all ISS’s.

Can such a system be built without causing confusion to its user?

Let’s again assume we have an IR system that supports all ISS’s. The number of classes of ISS’s might be very high so that a confusing user interface might be the consequence. In this case, the gain through the increased set of supported ISS might be counteracted by the confusing user interface. Maybe the confusion on the user’s side is worth the gain in retrieval quality and can be decreased by training. Maybe it is possible to make the change of (parameters of) the user interface plausible to the user. This has to be evaluated.

4. Experiments

The ideal experimental design for testing the hypotheses is rather complex: for each of 16 classes of ISS’s, multiple user interfaces would have to be tested, each using a small sample of topics and participants. Conducting such an experiment is prohibitively expensive in both time and money, so some compromises will have to be made.

If the main hypothesis holds, significant performance differences should be observable between different interfaces for single ISS classes. So the first step will be to perform experiments for a few ISS classes with the HyperGrid user interface. HyperGrid can be configured along multiple variables, such as the number of columns, the selection of data in each column and the zooming model. This allows testing multiple configurations against each other to rank them by how well they support each ISS class. Then a system will be built that can configure a HyperGrid interface optimally for each initially examined ISS. This integrated system will be evaluated against a baseline system that uses a statically configured HyperGrid interface. If the evaluation shows a significant difference, this would support the hypothesis, and further experiments with distinct user interfaces will be performed.
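The comparison between the adaptive system and the baseline could be analyzed in several ways, and no particular statistical test has been fixed yet. As one possible choice, the sketch below implements an exact paired sign-flip (randomization) test. It assumes that each participant or topic yields one performance score under each system; the function name and the pairing assumption are ours, and the exhaustive enumeration is only feasible for small samples because it visits 2^n sign patterns.

```python
from itertools import product
from typing import Sequence


def paired_sign_flip_p_value(adaptive: Sequence[float],
                             baseline: Sequence[float]) -> float:
    """Exact two-sided sign-flip test on paired score differences.

    Under the null hypothesis the sign of each paired difference is
    arbitrary, so every sign pattern is equally likely. The p-value is
    the share of patterns whose summed difference is at least as large
    in magnitude as the observed one.
    """
    if len(adaptive) != len(baseline) or not adaptive:
        raise ValueError("need two equally long, non-empty score lists")
    diffs = [a - b for a, b in zip(adaptive, baseline)]
    observed = abs(sum(diffs))
    extreme = 0
    total = 0
    for signs in product((1, -1), repeat=len(diffs)):
        total += 1
        flipped = abs(sum(s * d for s, d in zip(signs, diffs)))
        if flipped >= observed - 1e-12:  # tolerance for float rounding
            extreme += 1
    return extreme / total
```

A small p-value would indicate that the observed difference is unlikely if both systems supported the searcher equally well; it says nothing about which HyperGrid variable caused the difference.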

5. Conclusion and Outlook

It is known that searchers employ many different search actions to solve their problems and that these search actions can be classified. Some classifications of search actions have been introduced and are widely accepted, such as the ISS classification of Belkin, Marchetti and Cool. Furthermore, it is clear that optimizing IR systems only for MAP scores is not a silver bullet for increasing searcher performance. Still, information retrieval systems have only just begun to support different classes of search actions. The research proposed in this paper will evaluate the usefulness of supporting a host of search actions in a single, well-integrated system.

The first steps toward this goal have been completed: there is a suitable document collection and a—very early—research prototype. Future work will encompass the development of a more functional IR system and evaluating the different approaches outlined in this paper in user experiments.

6. Acknowledgements

I’d like to thank my colleagues Thomas Becker and Jens Kapitza who were involved in the effort of crawling Amazon and LibraryThing and building a test collection out of the collected data.


[1] N. J. Belkin, “Anomalous states of knowledge as a basis for information retrieval,” Canadian Journal of Information Science, vol. 5, pp. 133–143, May 1980.
[2] D. Ellis, “A behavioural approach to information retrieval system design,” Journal of Documentation, vol. 45, no. 3, pp. 171–212, 1989.
[3] N. J. Belkin, P. Marchetti, and C. Cool, “BRAQUE: Design of an interface to support user interaction in information retrieval,” Information Processing and Management, vol. 29, no. 3, pp. 325–344, 1993.
[4] N. J. Belkin, C. Cool, A. Stein, and U. Thiel, “Cases, scripts, and information-seeking strategies: On the design of interactive information retrieval systems,” Expert Systems with Applications, vol. 9, no. 3, pp. 379–395, 1995. http://www.scils.rutgers.edu/~belkin/articles/eswa.pdf.
[5] C. Cool and N. J. Belkin, “A classification of interactions with information,” in Emerging frameworks and methods. Proceedings of the Fourth International Conference on Conceptions of Library and Information Science (COLIS4) (H. Bruce, R. Fidel, P. Ingwersen, and P. Vakkari, eds.), (Greenwood Village), pp. 1–15, Libraries Unlimited, 2002.
[6] M. J. Bates, “Where should the person stop and the search interface start?,” Information Processing and Management, vol. 26, no. 5, pp. 575–591, 1990.
[7] X. Yuan and N. J. Belkin, “Supporting multiple information-seeking strategies in a single system framework,” in SIGIR ’07: Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval, New York, NY, USA, ACM, 2007, pp. 247–254.
[8] P. Ingwersen, “Polyrepresentation of information needs and semantic entities, elements of a cognitive theory for information retrieval interaction,” in Proceedings of the Seventeenth Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (B. W. Croft and C. J. van Rijsbergen, eds.), London, Springer-Verlag, 1994, pp. 101–111.
[9] N. Fuhr, M. Jordan, and I. Frommholz, “Combining cognitive and system-oriented approaches for designing IR user interfaces,” in Proceedings of the 2nd International Workshop on Adaptive Information Retrieval (AIR 2008), October 2008.
[10] H.-C. Jetter, J. Gerken, W. König, C. Grün, and H. Reiterer, “Hypergrid — accessing complex information spaces,” in People and Computers XIX — The Bigger Picture, Proceedings of HCI 2005, 2005.
[11] W. Hersh, A. Turpin, S. Price, B. Chan, D. Kraemer, L. Sacherek, and D. Olson, “Do batch and user evaluations give the same results?,” in Proceedings of the 23rd annual international ACM SIGIR conference on Research and development in information retrieval (E. Yannakoudakis, N. J. Belkin, M.-K. Leong, and P. Ingwersen, eds.), New York, ACM, 2000, pp. 17–24.