Modeling how humans judge the semantic similarity between documents (e.g., abstracts from two different psychology articles) is an interesting and challenging topic in cognitive psychology. It also has practical implications for developing artificial intelligence (AI) systems, especially those designed to retrieve relevant information from a large database in response to a given query (e.g., finding new research articles related to a given abstract). Conversely, AI algorithms can provide a useful tool for testing human cognitive models: they can precisely simulate the consequences of specific assumptions about cognition, and these consequences can then be compared against actual human performance. In developing both human cognitive models and AI models, investigating the discrepancy between human and AI performance is essential, although it has rarely been explored with respect to document relatedness judgments. In the current study, I identified a set of document pairs whose relatedness was judged radically differently by humans and by a computational model called latent semantic analysis (LSA). Based on an examination of those misjudged document pairs, I proposed a tentative model of human document relatedness judgment, called the key-features overlap model. According to this model, document relatedness judgments by humans and computational algorithms can be explained, in part, by the degree of word-pair association across documents. Critically, it suggests that, to judge document relatedness, humans focus primarily on the association between the keywords in each document, while computational algorithms, including LSA, typically do not. Modifying target documents to emphasize their keywords, and providing LSA with keyword-relevant background documents, improved LSA's document relatedness judgments. This improvement demonstrated the usefulness of an approach based on the key-features overlap model for improving AI algorithms.
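The contrast the abstract draws, between relatedness computed over all word pairs and relatedness computed over keyword pairs only, can be illustrated with a minimal Python sketch. This is not the thesis's implementation and it does not run LSA. The association table, the example documents, and the names `ASSOCIATION`, `association`, and `mean_pair_association` are hypothetical and chosen only for illustration.

```python
from itertools import product

# Hypothetical word-pair association strengths in [0, 1].
# These are toy values invented for illustration, not measured data.
ASSOCIATION = {
    frozenset(("memory", "vision")): 0.1,
}


def association(w1, w2):
    """Return the toy association strength between two words."""
    if w1 == w2:
        return 1.0  # assumption: identical words are maximally associated
    return ASSOCIATION.get(frozenset((w1, w2)), 0.0)


def mean_pair_association(words_a, words_b):
    """Average association over every cross-document word pair."""
    pairs = list(product(words_a, words_b))
    if not pairs:
        return 0.0
    return sum(association(a, b) for a, b in pairs) / len(pairs)


# Two toy documents that share generic vocabulary but differ in their keywords.
doc_a = {"words": ["memory", "participants", "results", "study"],
         "keywords": ["memory"]}
doc_b = {"words": ["vision", "participants", "results", "study"],
         "keywords": ["vision"]}

# Score over all words, loosely analogous to an algorithm that weighs every term.
all_words_score = mean_pair_association(doc_a["words"], doc_b["words"])

# Score over keywords only, loosely analogous to the human focus the model proposes.
keyword_score = mean_pair_association(doc_a["keywords"], doc_b["keywords"])

print(f"all-words score: {all_words_score:.3f}")
print(f"keyword score:   {keyword_score:.3f}")
```

In this toy case the shared generic words raise the all-words score above the keyword-only score, which is one way a mismatch between the two kinds of judgment could arise under the stated assumptions.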
Cognition, Artificial Intelligence, Latent Semantic Analysis, LSA, Document similarity
Jung, Kyunghun. "Mismatches between Humans and Latent Semantic Analysis in Document Similarity Judgments." (2013). http://digitalrepository.unm.edu/psy_etds/71