Topic map conversion of Moby Thesaurus II

Latest revision as of 16:14, 26 July 2012

Moby Thesaurus II is a large English thesaurus containing 30260 root words and over 2.5 million synonyms. It was compiled by Grady Ward and has been in the public domain since 1996. Moby Thesaurus II is part of the larger Moby Project. The topic map conversion of Moby Thesaurus II was created with Wandora's Moby thesaurus extractor.

Download topic map conversion of Moby Thesaurus II

The topic map conversion of Moby Thesaurus II is available as

  • Wandora project file (24M): http://www.wandora.org/download/other/moby/moby_thesaurus.wpr
  • XTM 2.0 topic map file (zipped size 26M, uncompressed size 759M): http://www.wandora.org/download/other/moby/moby_thesaurus.zip

The Wandora project file can be used in the Wandora application, while the XTM file should be usable in any topic map application that supports XTM serialization. Because the topic map conversion is relatively large, you need to give the Wandora application at least 4G of memory to open the topic map successfully. If you import the topic map into another topic map application, consult that application's documentation to ensure it can read large topic map serializations.

History

  • 2011-07-12, Initial version.

Conversion details and metrics

Each word in the Moby thesaurus is converted to a topic. The topic's base name is the word itself. The topic's subject identifier follows the template http://www.wandora.org/moby/word, where the trailing "word" is replaced with a slightly cleaned form of the actual word. The only association type used in the topic map is moby-related-words, with the subject identifier http://www.wandora.org/moby/schema/related-words. The root word plays the role word-1, with the subject identifier http://www.wandora.org/moby/schema/word1. The related word plays the role word-2, with the subject identifier http://www.wandora.org/moby/schema/word2.
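
The mapping described above can be sketched in Python. This is only an illustration, not the code of Wandora's extractor. It assumes that a thesaurus entry is a single comma-separated line whose first item is the root word, and that "slightly cleaned" means trimming whitespace and percent-encoding the word. The actual cleaning rule is not documented here, and the function names and the dictionary layout are hypothetical.

```python
from urllib.parse import quote

# Subject identifiers taken from the description above.
WORD_SI_BASE = "http://www.wandora.org/moby/"
ASSOCIATION_TYPE_SI = "http://www.wandora.org/moby/schema/related-words"
ROOT_ROLE_SI = "http://www.wandora.org/moby/schema/word1"
RELATED_ROLE_SI = "http://www.wandora.org/moby/schema/word2"


def word_subject_identifier(word):
    """Build a subject identifier for a word.

    ASSUMPTION: "slightly cleaned" is approximated by trimming whitespace
    and percent-encoding; the extractor's real rule may differ.
    """
    return WORD_SI_BASE + quote(word.strip(), safe="")


def associations_for_entry(line):
    """Turn one comma-separated entry (root word first) into a list of
    moby-related-words associations, one per related word."""
    words = [w.strip() for w in line.split(",") if w.strip()]
    if not words:
        return []
    root, related = words[0], words[1:]
    return [
        {
            "type": ASSOCIATION_TYPE_SI,
            "players": {
                ROOT_ROLE_SI: word_subject_identifier(root),
                RELATED_ROLE_SI: word_subject_identifier(r),
            },
        }
        for r in related
    ]
```

For example, associations_for_entry("people,folk,ice cream") yields two associations, each with the topic for people in the word-1 role and one related word in the word-2 role.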

The topic map conversion of Moby Thesaurus II has

  • 103308 topics
  • 2520086 associations
  • 103308 base names
  • 103308 identifiers
  • 0 subject locators
  • 0 occurrences
  • 0 topic classes
  • 1 distinct type of association
  • 2 distinct roles in associations
  • 103305 distinct players in associations

The average coefficient for the topic map conversion of Moby Thesaurus II is 0.66491885.

Layer connection statistics are shown below.

Moby example 02.gif


The topic map conversion of Moby Thesaurus II is not connected to Wandora's default ontology. As a consequence, Wandora shows no thesaurus topics when the topic map is imported. Instead, the user has to search for thesaurus topics explicitly. The next screen capture shows Wandora after the user has searched for the word people and opened the topic people by double-clicking the search result row. The association table shows 335 associations in which the topic people is either a root word or a related word.


Moby example 03.gif

License

The topic map conversion of Moby Thesaurus II is released into the public domain.

