Topic map conversion of WordNet

From WandoraWiki

Revision as of 20:10, 9 July 2007

WordNet is a large lexical database of English, developed at the Cognitive Science Laboratory of Princeton University. The topic map conversion is based on the W3C's work on an RDF version of WordNet 2.0.

Download WordNet topic map

Two versions of the WordNet topic map are available:

Usage in Wandora

The topic map version of WordNet contains over 100 000 topics and associations and requires at least 2 GB of memory to work properly in Wandora. To give Wandora that much memory, start the application with bin/Wandora-huge.bat or adjust Java's memory settings in bin/Wandora.bat. Below is a screenshot of Wandora with WordNet's meeting topic open. Note the layer structure.

Wordnet example.gif

Conversion details

The topic map conversion of WordNet is based on the W3C's RDF version of WordNet. The conversion consisted of the following (slightly simplified) steps:

  • Import each RDF file of WordNet into Wandora as a separate layer. For each imported layer
    • Manually rework the RDF triples into topic map associations
    • Map the RDF subject and object to topic map roles
    • Manually fix certain subject identifiers of the imported topics
  • Create a lightweight topic hierarchy to connect the WordNet topics to Wandora's topic tree.
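
The role-mapping step above can be sketched in a few lines of Python. This is only an illustration, not the procedure actually used in Wandora: the ROLE_MAP table, the Association type, the function name and the example.org identifiers are all assumptions made for the example. Only the schema and role URL prefixes and the role names come from the tables on this page.

```python
from typing import Dict, NamedTuple

WN_SCHEMA = "http://www.w3.org/2006/03/wn/wn20/schema/"
WANDORA_ROLE = "http://www.wandora.net/wordnet/role/"

# Hypothetical mapping (an assumption, not taken from the actual conversion):
# predicate local name -> (role of the RDF subject, role of the RDF object).
ROLE_MAP = {
    "hyponymOf": ("hyponym", "hypernym"),
    "memberMeronymOf": ("member-meronym", "member-holonym"),
    "partMeronymOf": ("part-meronym", "part-holonym"),
}


class Association(NamedTuple):
    type_si: str              # subject identifier of the association type
    players: Dict[str, str]   # role subject identifier -> player subject identifier


def triple_to_association(subject: str, predicate: str, obj: str) -> Association:
    """Turn one RDF triple into a binary association with explicit roles."""
    if not predicate.startswith(WN_SCHEMA):
        raise ValueError(f"not a WordNet schema predicate: {predicate}")
    local_name = predicate[len(WN_SCHEMA):]
    if local_name not in ROLE_MAP:
        raise KeyError(f"no role mapping defined for: {local_name}")
    subject_role, object_role = ROLE_MAP[local_name]
    return Association(
        type_si=predicate,
        players={
            WANDORA_ROLE + subject_role: subject,
            WANDORA_ROLE + object_role: obj,
        },
    )


# Illustrative identifiers only; real WordNet synset URIs look different.
example = triple_to_association(
    "http://example.org/dog",
    WN_SCHEMA + "hyponymOf",
    "http://example.org/canine",
)
```

An RDF triple only has a subject and an object, so the table of role pairs is exactly the part that has to be decided by hand for each association type.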

I (akivela) was actually a little surprised at how easily the RDF version converted to a topic map. The most demanding step was deciding which roles to use in the associations. The following sections describe the most important base names and subject identifiers of the topic map conversion.

Synsets

Synsets are classes that collect all words under word categories. The categories follow those of the W3C's RDF version and of WordNet itself. Individual words are instances of these class topics.

Base name Subject identifiers
AdjectiveSatelliteSynset (wordnet) http://www.w3.org/2006/03/wn/wn20/schema/AdjectiveSatelliteSynset
AdjectiveSynset (wordnet) http://www.w3.org/2006/03/wn/wn20/schema/AdjectiveSynset
AdverbSynset (wordnet) http://www.w3.org/2006/03/wn/wn20/schema/AdverbSynset
FullSynset (wordnet) http://www.wandora.net/wordnet/synset
NounSynset (wordnet) http://www.w3.org/2006/03/wn/wn20/schema/NounSynset
VerbSynset (wordnet) http://www.w3.org/2006/03/wn/wn20/schema/VerbSynset

Association types

Association types define the different relations between word topics. The association types follow the W3C's WordNet schema. Each association type has been given an extra subject identifier that connects the topic to Wandora.

Base name Subject identifiers
Attribute (wordnet) http://www.wandora.net/wordnet/type/attribute http://www.w3.org/2006/03/wn/wn20/schema/attribute
Causes (wordnet) http://www.wandora.net/wordnet/type/causes http://www.w3.org/2006/03/wn/wn20/schema/causes
ClassifiedByRegion (wordnet) http://www.wandora.net/wordnet/type/classifiedByRegion http://www.w3.org/2006/03/wn/wn20/schema/classifiedByRegion
ClassifiedByTopic (wordnet) http://www.wandora.net/wordnet/type/classifiedByTopic http://www.w3.org/2006/03/wn/wn20/schema/classifiedByTopic
ClassifiedByUsage (wordnet) http://www.wandora.net/wordnet/type/classifiedByUsage http://www.w3.org/2006/03/wn/wn20/schema/classifiedByUsage
Entails (wordnet) http://www.wandora.net/wordnet/type/entails http://www.w3.org/2006/03/wn/wn20/schema/entails
HyponymOf (wordnet) http://www.wandora.net/wordnet/type/hyponymOf http://www.w3.org/2006/03/wn/wn20/schema/hyponymOf
MemberMeronymOf (wordnet) http://www.wandora.net/wordnet/type/memberMeronymOf http://www.w3.org/2006/03/wn/wn20/schema/memberMeronymOf
PartMeronymOf (wordnet) http://www.wandora.net/wordnet/type/partMeronymOf http://www.w3.org/2006/03/wn/wn20/schema/partMeronymOf
SameVerbGroup (wordnet) http://www.wandora.net/wordnet/type/sameVerbGroupAs http://www.w3.org/2006/03/wn/wn20/schema/sameVerbGroupAs
SimilarTo (wordnet) http://www.wandora.net/wordnet/type/similarTo http://www.w3.org/2006/03/wn/wn20/schema/similarTo
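
As a rough sketch of how the two subject identifiers per association type work together: in topic maps, topics that share a subject identifier are treated as the same topic, so a topic carrying both identifiers matches topics imported from the W3C RDF files as well as topics defined on the Wandora side. The function names below are hypothetical, and the sketch assumes both identifiers share the same local name, as in the hyponymOf row.

```python
from typing import Set

WANDORA_TYPE = "http://www.wandora.net/wordnet/type/"
WN_SCHEMA = "http://www.w3.org/2006/03/wn/wn20/schema/"


def association_type_identifiers(local_name: str) -> Set[str]:
    """Return the Wandora and W3C schema subject identifiers for one type."""
    return {WANDORA_TYPE + local_name, WN_SCHEMA + local_name}


def should_merge(identifiers_a: Set[str], identifiers_b: Set[str]) -> bool:
    """Two topics merge when they have at least one subject identifier in common."""
    return not identifiers_a.isdisjoint(identifiers_b)


hyponym_of = association_type_identifiers("hyponymOf")

# A topic known only by its W3C schema identifier still merges with hyponym_of.
imported_topic = {WN_SCHEMA + "hyponymOf"}
merges = should_merge(hyponym_of, imported_topic)
```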

Association roles

The W3C's WordNet does not contain association roles, because RDF has no comparable structure. For this reason the role topics have no corresponding entities in the RDF WordNet.

Base name Subject identifiers
action (wordnet) http://www.wandora.net/wordnet/role/action
adjective (wordnet) http://www.wandora.net/wordnet/role/adjective
attribute (wordnet) http://www.wandora.net/wordnet/role/attribute
cause (wordnet) http://www.wandora.net/wordnet/role/cause
consequence (wordnet) http://www.wandora.net/wordnet/role/consequence
hypernym (wordnet) http://www.wandora.net/wordnet/role/hypernym
hyponym (wordnet) http://www.wandora.net/wordnet/role/hyponym
member-holonym (wordnet) http://www.wandora.net/wordnet/role/member-holonym
member-meronym (wordnet) http://www.wandora.net/wordnet/role/member-meronym
part-holonym (wordnet) http://www.wandora.net/wordnet/role/part-holonym
part-meronym (wordnet) http://www.wandora.net/wordnet/role/part-meronym
region (wordnet) http://www.wandora.net/wordnet/role/region
similar-word (wordnet) http://www.wandora.net/wordnet/role/similar-word
topic (wordnet) http://www.wandora.net/wordnet/role/topic
usage (wordnet) http://www.wandora.net/wordnet/role/usage
verb-1 (wordnet) http://www.wandora.net/wordnet/verb-1
verb-2 (wordnet) http://www.wandora.net/wordnet/verb-2
word (wordnet) http://www.wandora.net/wordnet/role/word

Occurrence types

synsetId (wordnet) http://www.w3.org/2006/03/wn/wn20/schema/synsetId

Limitations of the topic map WordNet

To limit the size of the resulting topic map, some RDF files of WordNet have been left out of the conversion. For example, the current WordNet topic map does not contain the glossary. However, the current version is easy to extend by importing the required RDF files into Wandora.

WordNet license

WordNet was originally created at the Cognitive Science Laboratory of Princeton University. The topic map conversion of WordNet is based on the W3C's work on an RDF version of WordNet. Read more:
