Please use this identifier to cite or link to this item:
http://hdl.handle.net/1903/1225
|
| Title: | Domain-Specific Term-List Expansion Using Existing Linguistic Resources |
| Authors: | Dorr, Bonnie; Zhao, Tiejun |
| Type: | Technical Report |
| Issue Date: | 3-Oct-2002 |
| Series/Report no.: | UM Computer Science Department; CS-TR-4399 LAMP-TR-092 UMIACS; UMIACS-TR-2002-79 |
| Abstract: | This report describes a series of experiments involving expansion of a
domain-specific human-generated "seed list" using available linguistic
resources. The resources used for the expansion are intended to be general
purpose: two large-scale Chinese-English dictionaries and a Chinese lexical
knowledge base (HowNet). The methodology involves three steps: (1) hand
extraction of head words from each entry in the human-generated seed list;
(2) automatic comparison of these head words against entries in the
linguistic resources, where an entry matches if the head word matches the
entry exactly or is included in its semantic definition; and (3)
collection of any resulting matching entries into a larger term list. The
terms extracted by this process were verified manually to confirm whether
they were relevant to the topic of a specific domain. An important
contribution of this work is the finding that the use of a bilingual term
list for the expansion process does not provide a significant improvement
over the use of a simpler, more easily produced, monolingual term list.
(Also LAMP-TR-092)
(Also UMIACS-TR-2002-79) |
| URI: | http://hdl.handle.net/1903/1225 |
| Appears in Collections: | Technical Reports from UMIACS; Technical Reports of the Computer Science Department
|
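The matching rule in steps (2) and (3) of the abstract can be illustrated with a minimal sketch. This is not the report's implementation: the `Entry` type, the function names, the toy English entries, and the choice to read "included in its semantic definition" as case-insensitive whole-token containment are all assumptions made for illustration. The actual resources (two Chinese-English dictionaries and HowNet) have their own formats, which are not reproduced here.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """Hypothetical resource entry: a headword plus its semantic definition."""
    headword: str
    definition: str


def tokens(text: str) -> set[str]:
    # Assumption: definitions are whitespace/punctuation-delimited text.
    return set(re.findall(r"\w+", text.lower()))


def entry_matches(head_word: str, entry: Entry) -> bool:
    """Step (2): exact headword match, or head word appears in the definition."""
    hw = head_word.lower()
    return entry.headword.lower() == hw or hw in tokens(entry.definition)


def expand_term_list(head_words: list[str], resource: list[Entry]) -> list[Entry]:
    """Step (3): collect every resource entry matched by at least one head word."""
    return [e for e in resource if any(entry_matches(h, e) for h in head_words)]


# Toy data, invented purely for this example.
toy_resource = [
    Entry("rifle", "a weapon fired from the shoulder"),
    Entry("weapon", "an instrument used in fighting"),
    Entry("teapot", "a pot for brewing tea"),
]
expanded = expand_term_list(["weapon"], toy_resource)
print([e.headword for e in expanded])
```

In this toy run, "weapon" matches its own entry exactly and matches "rifle" through the definition, while "teapot" is left out. As the abstract notes, the collected candidates would still need manual verification for domain relevance.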