Topic map conversion of OpenCyc
OpenCyc is a large general knowledge base and commonsense reasoning engine. It is an open source, limited version of Cyc. The topic map conversion of OpenCyc is based on the RDF conversion of OpenCyc provided by Stephen L. Reed for his Texai project. The topic map conversion was created with Wandora's RDF import feature following light manual processing.
Download
There are two versions of the OpenCyc topic map available:
- The OpenCyc Wandora project file (14.6 MB) is intended for Wandora users. Wandora requires at least 1.4 GB of memory to open the OpenCyc project file successfully.
- The OpenCyc XTM dump (14.8 MB zipped, 250 MB uncompressed) is intended for any topic map application capable of importing the XTM format.
History
- 2008-07-15. First version published.
Metrics
The metrics were measured on the OpenCYC layer of the Wandora project file. The metrics of the XTM dump may differ slightly.
- Number of topics: 120410
- Number of associations: 424064
- Number of topic base names: 120409
- Number of subject identifiers: 120415
- Number of subject locators: 0
- Number of occurrences: 244173
- Number of distinct topic classes: 1
- Number of distinct types of associations: 73
- Number of distinct roles in associations: 4
- Number of distinct players in associations: 116212
- Average clustering coefficient: 0.16878
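Because the XTM dump metrics may differ from the figures above, it can be useful to count topics and associations in the dump directly. The following is a minimal sketch, not part of Wandora: it streams through an XTM file and counts elements by local name, on the assumption that topics and associations are serialized as topic and association elements. The function name count_xtm_elements and the small inline sample document are hypothetical and only illustrate the idea.

```python
import io
import xml.etree.ElementTree as ET
from collections import Counter


def count_xtm_elements(source, names=("topic", "association")):
    """Count elements by local name while streaming through an XTM document.

    `source` is a file name or a binary file object. Namespaces are ignored,
    so the same code works whichever XTM namespace the dump declares.
    """
    counts = Counter()
    for _event, elem in ET.iterparse(source, events=("end",)):
        # Tags look like "{namespace}topic"; keep only the local part.
        local_name = elem.tag.rsplit("}", 1)[-1]
        if local_name in names:
            counts[local_name] += 1
            # Drop the element's children to keep memory use low on large dumps.
            elem.clear()
    return counts


# Hypothetical miniature document standing in for the real 250 MB dump.
SAMPLE = b"""<topicMap xmlns="http://www.topicmaps.org/xtm/1.0/">
  <topic id="t1"/>
  <topic id="t2"/>
  <association/>
</topicMap>"""

sample_counts = count_xtm_elements(io.BytesIO(SAMPLE))
print(sample_counts["topic"], sample_counts["association"])
```

For the real dump, the path of the unzipped XTM file would be passed in place of the in-memory sample. Counts obtained this way are raw element counts and need not match the metrics Wandora reports for the project file.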
Conversion details
The topic map conversion of OpenCyc has the following navigation structure of topics:
- OpenCyc (http://www.wandora.org/opencyc) is a subclass of the Wandora class topic. It collects together the OpenCyc types and the root node of OpenCyc, i.e. the Thing topic.
- OpenCyc Types (http://www.wandora.org/opencyc/types) is a subclass of OpenCyc. It collects all of OpenCyc's association and occurrence types as instances.
- Thing (http://www.w3.org/2002/07/owl#Thing) is the root node of the OpenCyc ontology. It can be used to navigate anywhere in the ontology. However, it appears to have many more subclasses than OpenCyc Upper Ontology diagrams usually suggest. Thing is also a subclass of the OpenCyc topic.
Each OpenCyc topic in the topic map conversion contains:
- A subject identifier of the format http://sw.cyc.com/2006/07/27/cyc/Concept, where Concept is the CycLConstant, i.e. #$Concept. The subject identifier resolves to a web page of the concept. In some cases the subject identifier is equivalent to a concept of the RDFS or OWL vocabulary. Examples of such concepts are domain, with SI http://www.w3.org/2000/01/rdf-schema#domain, and subPropertyOf, with SI http://www.w3.org/2000/01/rdf-schema#subPropertyOf.
- A base name equal to the CycLConstant. For example, the topic for the concept #$DistributedFilesystem has the base name DistributedFilesystem.
- Occurrences for the prettyStrings of the OpenCyc concept. A prettyString is a string representation of the concept. You could think of it as a variant name of the OpenCyc topic. However, variant names are not used to model the prettyString; this design decision was made to keep changes to OpenCyc minimal. The occurrence's type is prettyString (http://sw.cyc.com/2006/07/27/cyc/prettyString) and its scope is Lang.indep. (http://www.wandora.org/core/langindependent).
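The naming scheme described above can be sketched in a few lines of Python. This is only an illustration of how a CycL constant maps to a subject identifier, a base name and a prettyString occurrence. The function describe_topic, the dictionary layout and the example prettyString value are hypothetical and are not part of Wandora or OpenCyc. The sketch also ignores the special cases where a concept uses an RDFS or OWL subject identifier instead.

```python
CYC_NAMESPACE = "http://sw.cyc.com/2006/07/27/cyc/"
PRETTY_STRING_TYPE = CYC_NAMESPACE + "prettyString"
LANG_INDEPENDENT_SCOPE = "http://www.wandora.org/core/langindependent"


def describe_topic(cycl_constant, pretty_string=None):
    """Return a plain dict describing the topic for a CycL constant.

    Hypothetical helper: the dict layout is an assumption made for this
    example and does not mirror any Wandora data structure.
    """
    # Accept both "#$Concept" and "Concept".
    if cycl_constant.startswith("#$"):
        concept = cycl_constant[2:]
    else:
        concept = cycl_constant

    topic = {
        "subject_identifier": CYC_NAMESPACE + concept,
        "base_name": concept,
        "occurrences": [],
    }
    if pretty_string is not None:
        # The prettyString is stored as an occurrence, not as a variant name.
        topic["occurrences"].append(
            {
                "type": PRETTY_STRING_TYPE,
                "scope": LANG_INDEPENDENT_SCOPE,
                "value": pretty_string,
            }
        )
    return topic


# The prettyString value below is made up for illustration.
example = describe_topic("#$DistributedFilesystem", "distributed filesystem")
print(example["subject_identifier"])
print(example["base_name"])
```

Keeping the prettyString as an occurrence in the language-independent scope follows the design described above, where variant names are deliberately left unused.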
Limitations
- The topic map conversion contains only OpenCyc's binary relations.
- Non-atomic terms are not included.
- Topic Maps do not support the semantics of many OpenCyc relations.
- Each Cyc topic contains at most one arbitrarily selected PrettyString and PrettyStringCanonical.
License
GNU General Public License (GPL)