Moby thesaurus extractor

From WandoraWiki

Revision as of 14:50, 9 January 2010

Wandora's Moby thesaurus extractor converts the Moby thesaurus to topic map format. The Moby thesaurus is a specially formatted text file in which each line contains a root word followed by similar words:

rootword similar1 similar2 similar3

The number of similar words varies from line to line. The extractor converts the example line above into three binary associations:

rootword, similar1
rootword, similar2
rootword, similar3
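
The conversion described above can be sketched in a few lines of Python. This is only an illustration, not Wandora's actual code (Wandora itself is a Java application). The function name `line_to_associations` and the `delimiter` parameter are hypothetical. The sketch assumes whitespace-separated words, as in the example line; if your copy of the thesaurus file separates words with commas, pass `","` as the delimiter.

```python
from typing import List, Optional, Tuple


def line_to_associations(line: str, delimiter: Optional[str] = None) -> List[Tuple[str, str]]:
    """Turn one thesaurus line into (rootword, similar) pairs.

    delimiter=None splits on any whitespace (an assumption based on the
    example line); pass "," for a comma-separated file.
    """
    # Split the line, trim each word, and drop empty entries.
    words = [w.strip() for w in line.strip().split(delimiter)]
    words = [w for w in words if w]
    if not words:
        return []
    # The first word is the root word; every other word is a similar word.
    root, similar = words[0], words[1:]
    # One binary association per similar word.
    return [(root, s) for s in similar]


print(line_to_associations("rootword similar1 similar2 similar3"))
# → [('rootword', 'similar1'), ('rootword', 'similar2'), ('rootword', 'similar3')]
```

A line with N similar words yields N pairs, so the number of associations grows with the total word count of the file rather than with the number of lines.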

The association type and roles remain the same in all associations. Because the Moby thesaurus is very large, you need to give the JRE at least 2 GB of memory (for example, with the JVM option -Xmx2g) to process the whole thesaurus successfully. Start the extractor with the menu option File > Extract > Language > Moby thesaurus extractor.

The Moby thesaurus is not included in the Wandora application, but it is in the public domain and easy to find. See, for example, Project Gutenberg (http://www.gutenberg.org/etext/3202).