Type

Conference Proceedings

Authors

Gareth J. F. Jones
Inguna Skadiņa
Liadh Kelly
Andrew Salway

Subjects

Computer Science

Topics
english information extraction portable information technologies web extraction structured data information retrieval

Portable extraction of partially structured facts from the web (2010)

Abstract A novel fact extraction task is defined to fill a gap between current information retrieval and information extraction technologies. It is shown that it is possible to extract useful partially structured facts about different kinds of entities in a broad domain, i.e. all kinds of places depicted in tourist images. Importantly the approach does not rely on existing linguistic resources (gazetteers, taggers, parsers, etc.) and it ported easily and cheaply between two very different languages (English and Latvian). Previous fact extraction from the web has focused on the extraction of structured data, e.g. (Building-LocatedIn-Town). In contrast we extract richer and more interesting facts, such as a fact explaining why a building was built. Enough structure is maintained to facilitate subsequent processing of the information. For example, this partial structure enables straightforward template-based text generation. We report positive results for the correctness and interest of English and Latvian facts and for the utility of the extracted facts in enhancing image captions.
Collections Ireland -> Dublin City University -> Publication Type = Conference or Workshop Item
Ireland -> Dublin City University -> DCU Faculties and Centres = DCU Faculties and Schools: Faculty of Engineering and Computing: School of Computing
Ireland -> Dublin City University -> Subject = Computer Science
Ireland -> Dublin City University -> DCU Faculties and Centres = Research Initiatives and Centres: Centre for Digital Video Processing (CDVP)
Ireland -> Dublin City University -> DCU Faculties and Centres = DCU Faculties and Schools
Ireland -> Dublin City University -> Status = Published
Ireland -> Dublin City University -> Subject = Computer Science: Information retrieval
Ireland -> Dublin City University -> DCU Faculties and Centres = DCU Faculties and Schools: Faculty of Engineering and Computing
Ireland -> Dublin City University -> DCU Faculties and Centres = Research Initiatives and Centres

Full list of authors on original publication

Gareth J. F. Jones, Inguna Skadiņa, Liadh Kelly, Andrew Salway

Experts in our system

1
Gareth J. F. Jones
Dublin City University
Total Publications: 297
 
2
Liadh Kelly
Dublin City University
Total Publications: 51