Do Wikipedians Follow Domain Experts? : A Domain-specific Study on Wikipedia Knowledge Building (Full Paper)
Yi Zhang, Aixin Sun, Anwitaman Datta, Kuiyu Chang and Ee-Peng Lim
Abstract. Wikipedia is one of the most successful online knowledge bases, attracting millions of visits daily. Not surprisingly, its huge success has attracted much research interest in better understanding the collaborative knowledge building process. In this paper, we perform a domain-specific analysis that compares and contrasts knowledge building in Wikipedia with that of a knowledge base created by domain experts. In particular, we compare Wikipedia knowledge building in the terrorism domain with the Terrorism Knowledge Base (TKB) developed by experts at MIPT. In total, the revision histories of 409 Wikipedia articles, each matching a TKB record, were studied from three aspects: creation, revision and link evolution. We found that knowledge building in Wikipedia in the terrorism domain was unlikely to have followed TKB, despite the online availability of the latter. To find possible reasons, we conducted a detailed analysis of the contribution behavior of Wikipedians. We found that most Wikipedians each contribute to a relatively small set of articles, with their contributions concentrated on one particular article. At the same time, contributions to a given article are often championed by very few active contributors, including the article’s creator. Our interpretation is that contributions in Wikipedia aim more at knowledge coverage at the article level than at the domain level.
Spatiotemporal Mapping of Wikipedia Concepts (Full Paper)
Adrian Popescu and Gregory Grefenstette
Abstract. Space and time are important dimensions in the representation of a large number of concepts. However, no available resource provides spatiotemporal mappings of concepts. Here we present a link-analysis based method for extracting the main locations and periods associated with all Wikipedia concepts. Relevant locations are selected from a set of geotagged articles, while relevant periods are discovered using a list of people with associated life periods. We analyze article versions over multiple languages and consider the strength of a spatial/temporal reference to be proportional to the number of languages in which it appears. To illustrate the utility of the spatiotemporal mapping of Wikipedia concepts, we present an analysis of cultural interactions and a temporal analysis of two domains. The Wikipedia mapping can also be used to perform rich spatiotemporal document indexing by extracting implicit spatial and temporal references from texts.
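The language-count weighting mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `reference_strength`, the input layout (language code mapped to the references linked from that language's version of an article) and the sample data are all assumptions made for illustration.

```python
from collections import Counter


def reference_strength(refs_by_language):
    """Count, for each reference, the number of language versions that mention it.

    refs_by_language: hypothetical mapping of language code -> iterable of
    references (e.g. geotagged article titles) linked from that version.
    """
    counts = Counter()
    for refs in refs_by_language.values():
        # A reference counts once per language, however often it is repeated.
        counts.update(set(refs))
    return dict(counts)


def rank_references(refs_by_language):
    """Order references from strongest to weakest, breaking ties by name."""
    strengths = reference_strength(refs_by_language)
    return sorted(strengths, key=lambda ref: (-strengths[ref], ref))


# Made-up example data for a single concept.
example = {
    "en": ["Vienna", "Salzburg"],
    "fr": ["Vienna"],
    "de": ["Vienna", "Salzburg", "Vienna"],
}
print(rank_references(example))
```

Deduplicating per language keeps the score tied to the number of languages rather than to the number of links within any one version.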
Crowdsourcing the Assembly of Concept Hierarchies (Full Paper)
Kai Eckert, Mathias Niepert, Christof Niemann, Cameron Buckner, Colin Allen and Heiner Stuckenschmidt
Abstract. The “wisdom of crowds” is accomplishing tasks that are cumbersome for single human beings but cannot yet be fully automated by means of specialized computer algorithms. One such task is the construction of thesauri and other types of concept hierarchies. Human expert feedback on the relatedness and relative generality of terms, however, can be aggregated to construct dynamically changing concept hierarchies. The InPhO (Indiana Philosophy Ontology) project bootstraps feedback from volunteer users unskilled in ontology design into a precise representation of a specific domain. The approach combines statistical text processing methods with expert feedback and logic programming to create a dynamic semantic representation of the discipline of philosophy.
In this paper, we show that results of comparable quality can be achieved by leveraging the workforce of crowdsourcing services such as Amazon’s Mechanical Turk (AMT). In an extensive empirical study, we compare the feedback obtained from AMT’s workers with that from the InPhO volunteer users, providing insight into qualitative differences between the two groups. Furthermore, we present a set of strategies for assessing the quality of different users when gold standards are missing. Finally, we use these methods to construct a concept hierarchy based on the feedback acquired from AMT workers.