Context Identification of Sentences in Related Work Sections using Conditional Random Fields: Towards Intelligent Digital Libraries (Full Paper)
Angrosh M.A., Stephen Cranefield and Nigel Stanger
Abstract. Identification of contexts associated with sentences is becoming increasingly necessary for developing intelligent information retrieval systems. This article describes a supervised learning mechanism employing conditional random fields (CRFs) for context identification and sentence classification. Specifically, we focus on sentences in related work sections of research articles. Based on the generic rhetorical pattern, a framework for modeling the sequential flow in these sections is proposed. Adopting a generalization strategy, each of these sentences is transformed into a set of features, which forms our dataset. In particular, we distinguish between two kinds of features for each sentence, viz. citation features and sentence features. While an overall accuracy of 96.51% is achieved by using a combination of both citation and sentence features, the use of sentence features alone yields an accuracy of 93.22%. The results also show F-Scores ranging from 0.90 to 0.99 across the various classes, indicating the robustness of our application.
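As an illustration of the two feature groups mentioned in the abstract above, the following is a minimal sketch of how a sentence might be turned into a feature set before being passed to a sequence labeler such as a CRF. The abstract does not list the actual features, so every feature name here (has_citation, citation_count, the cue words, the position flags) and the citation regular expression are assumptions made for illustration only.

```python
import re

# Hypothetical citation pattern: numeric "[3]" / "[4, 5]" or "(Smith et al., 2009)".
CITATION = re.compile(
    r"\[\d+(?:,\s*\d+)*\]|\([A-Z][A-Za-z]+(?: et al\.)?,? \d{4}\)"
)

# Hypothetical cue words; the paper's real feature set is not given in the abstract.
CUE_WORDS = ("however", "although", "in contrast", "we propose")


def sentence_to_features(sentence, index, total):
    """Map one sentence to a flat dict of citation and sentence features."""
    text = sentence.lower()
    citations = CITATION.findall(sentence)

    # Citation features: derived only from citation markers in the sentence.
    citation_features = {
        "has_citation": bool(citations),
        "citation_count": len(citations),
    }

    # Sentence features: position in the section and presence of cue words.
    sentence_features = {
        "first_in_section": index == 0,
        "last_in_section": index == total - 1,
        **{f"cue={word}": word in text for word in CUE_WORDS},
    }
    return {**citation_features, **sentence_features}


def section_to_sequence(sentences):
    """Turn a related work section into the feature sequence a CRF would label."""
    return [
        sentence_to_features(s, i, len(sentences))
        for i, s in enumerate(sentences)
    ]
```

Keeping the two groups in separate dicts makes it straightforward to drop the citation features and train on sentence features alone, which is the comparison the abstract reports.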
Can an intermediary help users search image databases without annotations? (Full Paper)
Robert Villa, Martin Halvey, Hideo Joho, David Hannah and Joemon Jose
Abstract. Developing methods for searching image databases is a challenging and ongoing area of research. A common approach is to use manual annotations, although generating annotations can be expensive in terms of time and money, and in many situations the cost may not be justifiable. Content-based search techniques, which extract visual features from image data, can be used instead, but users are then typically forced to search using example images or sketching interfaces. This can be difficult if no visual example of the information need is available, or if the need is hard to represent with a drawing.
In this paper, we consider an alternative approach which allows a user to search for images through an intermediate database. In this approach, a user can search using text in the intermediate database as a way of finding visual examples of their information need. The visual examples can then be used to search a database that lacks annotations. An interface is developed to support this process, and a user study is presented which compares the intermediary interface to text search, where text search is treated as an upper bound on performance. Results show that while performance does not match that of manual annotations, users are able to find relevant material without requiring collection annotations.
SNDocRank: Document Ranking Based on Social Networks (Full Paper)
Liang Gou, Hung-Hsuan Chen, Jung-Hyun Kim, Xiaolong (Luke) Zhang and C. Lee Giles
Abstract. Ranking algorithms used by search engines can be user-neutral and measure the importance and relevance of documents mainly based on the contents and relationships of documents. However, users with diverse interests may demand different documents even with the same queries. To improve search results by using user preferences, we propose a ranking framework, Social Network Document Rank (SNDocRank), that considers both document contents and the relationship between a searcher and document owners in a social network. SNDocRank combines the traditional tf-idf ranking with our Multi-level Actor Similarity (MAS) algorithm, which measures the similarity between the social networks of the searcher and document owners. We tested the SNDocRank method on video data and social network data extracted from YouTube. The results show that compared with the tf-idf algorithm, the SNDocRank algorithm returns more relevant documents. By using SNDocRank, a searcher can get more relevant search results by joining larger social networks, having more friends in a social network, and belonging to larger local communities in a social network.
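The idea of combining a content score with a social similarity score can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say how SNDocRank merges the two signals, so the linear weighting and the alpha parameter are hypothetical, and owner_similarity is a stand-in for the output of the MAS algorithm, which is not reproduced here.

```python
import math
from collections import Counter


def tf_idf_scores(query, docs):
    """Score each tokenized document against the query with a plain tf-idf sum."""
    n = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        tf = Counter(doc)
        score = sum(
            (tf[t] / len(doc)) * math.log(n / df[t])
            for t in query
            if t in tf
        )
        scores.append(score)
    return scores


def combined_rank(query, docs, owner_similarity, alpha=0.5):
    """Return document indices ordered by a weighted mix of two scores.

    owner_similarity[i] is assumed to lie in [0, 1] and to express how close
    the searcher is to the owner of document i (a placeholder for MAS).
    alpha is a hypothetical weight: 1.0 means content only, 0.0 social only.
    """
    text = tf_idf_scores(query, docs)
    top = max(text) or 1.0  # avoid dividing by zero when nothing matches
    mixed = [
        alpha * (s / top) + (1 - alpha) * sim
        for s, sim in zip(text, owner_similarity)
    ]
    return sorted(range(len(docs)), key=lambda i: mixed[i], reverse=True)
```

With two documents that match the query equally well, the one whose owner is socially closer to the searcher is ranked first, which is the behavior the abstract describes in general terms.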