Type

Conference Proceedings

Authors

Gareth J. F. Jones
Walid Magdy
Johannes Leveling
Debasis Ganguly

Subjects

Computer Science

Topics
information retrieval; query reduction; retrieval effectiveness; patent search; web search; ad hoc; pseudo relevance feedback; patent prior art search

Patent query reduction using pseudo relevance feedback (2011)

Abstract

Queries in patent prior art search are full patent applications, and are therefore much longer than standard ad hoc search and web search topics. Standard information retrieval (IR) techniques are not entirely effective for patent prior art search because these massive queries contain ambiguous terms. Reducing patent queries by extracting a small number of key terms has been shown to be ineffective, mainly because the focus of the query is unclear. An optimal query reduction algorithm must therefore retain the terms that are useful for retrieval, favouring recall of relevant patents, while removing terms that impair retrieval effectiveness. We propose a new query reduction technique that decomposes a patent application into constituent text segments and computes Language Modeling (LM) similarities by calculating the probability of generating each segment from the top-ranked documents. We reduce a patent query by removing the least similar segments, hypothesizing that removing the segments most dissimilar to the pseudo-relevant documents can increase retrieval precision by discarding non-useful context while retaining the useful context needed for high recall. Experiments on the CLEF-IP 2010 patent prior art search collection show that the proposed method outperforms standard pseudo relevance feedback (PRF) and a naive method of query reduction based on removal of unit frequency terms (UFTs).
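
The segment-removal idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the tokenizer, the smoothing scheme, the length normalization, or how many segments are removed, so the whitespace tokenizer, the Jelinek-Mercer mixing weight `lam`, the add-one collection smoothing, and the `num_remove` parameter are all assumptions made here for illustration. The function names `tokenize`, `segment_score`, and `reduce_query` are likewise hypothetical. The feedback documents are assumed to be the top-ranked results of an initial retrieval run with the full query.

```python
import math
from collections import Counter


def tokenize(text):
    # Deliberately naive whitespace tokenizer (an assumption of this sketch).
    return text.lower().split()


def segment_score(tokens, fb_counts, fb_len, coll_counts, coll_len, lam=0.5):
    """Length-normalized log-probability of generating `tokens` from the
    pooled feedback documents, mixed with a collection model.
    The smoothing choices are assumptions; use lam < 1 to avoid log(0)."""
    if not tokens:
        return float("-inf")
    vocab = len(coll_counts) + 1  # +1 reserves probability mass for unseen terms
    total = 0.0
    for tok in tokens:
        p_fb = fb_counts[tok] / fb_len if fb_len else 0.0
        p_coll = (coll_counts[tok] + 1) / (coll_len + vocab)  # add-one smoothing
        total += math.log(lam * p_fb + (1 - lam) * p_coll)
    return total / len(tokens)


def reduce_query(segments, feedback_docs, collection_docs, num_remove=1, lam=0.5):
    """Drop the `num_remove` segments least similar to the pseudo-relevant
    (feedback) documents, keeping the remaining segments in original order."""
    fb_counts = Counter(t for d in feedback_docs for t in tokenize(d))
    coll_counts = Counter(t for d in collection_docs for t in tokenize(d))
    fb_len = sum(fb_counts.values())
    coll_len = sum(coll_counts.values())
    scores = [
        segment_score(tokenize(s), fb_counts, fb_len, coll_counts, coll_len, lam)
        for s in segments
    ]
    ranked = sorted(range(len(segments)), key=lambda i: scores[i])  # least similar first
    dropped = set(ranked[:num_remove])
    return [s for i, s in enumerate(segments) if i not in dropped]


if __name__ == "__main__":
    # Toy data, invented purely for illustration.
    query_segments = [
        "a method for wireless signal transmission",
        "the weather was pleasant that day",
        "antenna receives wireless signal",
    ]
    feedback = [
        "wireless antenna signal transmission system",
        "signal receiver antenna method",
    ]
    collection = feedback + ["weather report for the day", "pleasant garden design"]
    for segment in reduce_query(query_segments, feedback, collection, num_remove=1):
        print(segment)
```

Normalizing by segment length is a design choice of this sketch, made so that long segments are not penalized merely for containing more tokens; whether the original method normalizes in this way is not stated in the abstract.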
Collections

Ireland -> Dublin City University -> Publication Type = Conference or Workshop Item
Ireland -> Dublin City University -> DCU Faculties and Centres = DCU Faculties and Schools: Faculty of Engineering and Computing: School of Computing
Ireland -> Dublin City University -> Subject = Computer Science
Ireland -> Dublin City University -> DCU Faculties and Centres = Research Initiatives and Centres: Centre for Next Generation Localisation (CNGL)
Ireland -> Dublin City University -> Status = Published
Ireland -> Dublin City University -> Subject = Computer Science: Information retrieval

Full list of authors on original publication

Gareth J. F. Jones, Walid Magdy, Johannes Leveling, Debasis Ganguly
