Importing RDF

From WandoraWiki

Latest revision as of 15:02, 29 May 2015

Wandora reads RDF XML, N3, Turtle and JSON-LD files. Start an import from the File > Import menu by choosing Simple RDF XML Import..., Simple RDF N3 Import..., Simple RDF Turtle Import... or Simple RDF JSON-LD Import... according to the file format. Alternatively, drag and drop RDF files onto the layer stack. The layer stack imports each dropped RDF file automatically and creates a new layer for it. Wandora converts the imported RDF triples to topics, associations and occurrences. The conversion schema is very simple and pays no attention to the semantics of the RDF file. The conversion process works as follows.

  • A topic is always created for the RDF subject and for the predicate. Both topics are typed with Wandora's predefined type topics.
  • If the object is an RDF literal, an occurrence (text data) is created for the subject topic. The occurrence's type is the predicate topic and its value is the RDF literal. The occurrence's scope is derived from the lang attribute. If no lang attribute is found, the scope is language independent.
  • If the object is not an RDF literal, a topic is created for the object and associated with the subject topic. The association's type is the predicate topic, and both roles are Wandora's predefined topics. The object topic is typed with Wandora's predefined type topic.

The created topics contain no base names or variant names. Each created topic inherits a single subject identifier, which is the URI of the equivalent RDF resource. Wandora uses the Jena RDF framework to read RDF files. Below is the Java code snippet used to handle RDF statements in Wandora.


   public void handleStatement(Statement stmt, TopicMap map,
                               Topic subjectType,
                               Topic predicateType,
                               Topic objectType) throws TopicMapException {
       
       Resource subject   = stmt.getSubject();     // get the subject
       Property predicate = stmt.getPredicate();   // get the predicate
       RDFNode object     = stmt.getObject();      // get the object
       String lan         = null;                  // language attribute
       
       Topic subjectTopic = getOrCreateTopic(map, subject.toString());
       Topic predicateTopic = getOrCreateTopic(map, predicate.toString());
       
       subjectTopic.addType(subjectType);
       predicateTopic.addType(predicateType);
      
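       if(object.isLiteral()) {
           // Literal object: store it as an occurrence of the subject topic,
           // typed by the predicate and scoped by the literal's language.
           try { lan = stmt.getLanguage(); } catch(Exception e) { /* LANG ATTRIBUTE NOT FOUND! */ }
           if(lan==null || lan.length()==0) {
               // No language tag: use the language independent scope.
               subjectTopic.setData(predicateTopic,
                                    getOrCreateTopic(map, occurrenceScopeSI),
                                    ((Literal) object).getString());
           }
           else {
               subjectTopic.setData(predicateTopic,
                                    getOrCreateTopic(map, XTMPSI.getLang(lan)),
                                    ((Literal) object).getString());
           }
       }
       else if(object.isResource()) {
           // Resource object: create a topic for it and associate it
           // with the subject topic, using the predicate as association type.
           Topic objectTopic = getOrCreateTopic(map, object.toString());
           Association association = map.createAssociation(predicateTopic);
           association.addPlayer(subjectTopic, subjectType);
           association.addPlayer(objectTopic, objectType);
           objectTopic.addType(objectType);
       }
       else if(object.isURIResource()) {
           // Note: in Jena a URI resource is also a resource, so the branch
           // above already handles it and this branch is not expected to run.
           log("URIResource found but not handled!");
       }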
   }
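
The conversion rules described above can be tried out without Jena or Wandora on the classpath. The following is only a minimal sketch under stated assumptions, not Wandora's implementation: the Triple class, the describe method and the textual output format are hypothetical names invented for illustration, and the example.org URIs are placeholders.

```java
import java.util.ArrayList;
import java.util.List;

final class TripleConversionSketch {

    /** Hypothetical stand-in for an RDF statement; this is not a Jena class. */
    static final class Triple {
        final String subject;
        final String predicate;
        final String object;
        final boolean objectIsLiteral;
        final String lang; // null or empty when the literal has no language tag

        Triple(String subject, String predicate, String object,
               boolean objectIsLiteral, String lang) {
            this.subject = subject;
            this.predicate = predicate;
            this.object = object;
            this.objectIsLiteral = objectIsLiteral;
            this.lang = lang;
        }
    }

    /** Describes, as plain text, what the import rules would create for one triple. */
    static String describe(Triple t) {
        if (t.objectIsLiteral) {
            // Literal object: an occurrence on the subject topic, typed by the predicate.
            String scope = (t.lang == null || t.lang.isEmpty())
                    ? "language independent" : t.lang;
            return "occurrence on <" + t.subject + ">, type <" + t.predicate
                    + ">, scope " + scope + ", value \"" + t.object + "\"";
        }
        // Resource object: an association typed by the predicate.
        return "association of type <" + t.predicate + "> between <"
                + t.subject + "> (rdf-subject) and <" + t.object + "> (rdf-object)";
    }

    public static void main(String[] args) {
        List<Triple> triples = new ArrayList<>();
        triples.add(new Triple("http://example.org/cat",
                "http://www.w3.org/2000/01/rdf-schema#label", "Kissa", true, "fi"));
        triples.add(new Triple("http://example.org/cat",
                "http://www.w3.org/2000/01/rdf-schema#subClassOf",
                "http://example.org/animal", false, null));
        for (Triple t : triples) {
            System.out.println(describe(t));
        }
    }
}
```

The first sample triple has a literal object with a language tag and is reported as an occurrence; the second has a resource object and is reported as an association.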


Post-processing the imported RDF

To make the imported RDF more topic-map-like, you may want to modify it after import. This section discusses post-processing techniques that make the RDF-imported topic map more convenient to use.

Constructing base names

Topics imported from RDF contain no base names. The first step is to add base names to the imported topics. You can create a base name from a topic's subject identifier using the Make base name with SI tool, found in the topic table's context menu under Topics > Base names. The base name is constructed automatically from the filename and anchor of the subject identifier URL. If the created topic map contains subject identifiers with identical filenames, take extra care with these topics to prevent them from merging automatically.
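
To illustrate why identical filenames are a risk, here is a minimal sketch of deriving a name from the filename and anchor of a subject identifier. It is not the code of the Make base name with SI tool: the class, both method names and the exact naming rule (last path segment, followed by "#" and the anchor when one exists) are assumptions made for this example, and the example hosts are placeholders.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class BaseNameFromSi {

    /** Hypothetical rule: last path segment, plus "#anchor" when an anchor is present. */
    static String baseNameFor(String subjectIdentifier) {
        URI uri = URI.create(subjectIdentifier);
        String path = uri.getPath() == null ? "" : uri.getPath();
        String fileName = path.substring(path.lastIndexOf('/') + 1);
        String anchor = uri.getFragment();
        if (anchor == null || anchor.isEmpty()) {
            return fileName;
        }
        return fileName.isEmpty() ? anchor : fileName + "#" + anchor;
    }

    /** Groups subject identifiers by derived name and keeps only groups that collide. */
    static Map<String, List<String>> collisions(List<String> subjectIdentifiers) {
        Map<String, List<String>> byName = new LinkedHashMap<>();
        for (String si : subjectIdentifiers) {
            byName.computeIfAbsent(baseNameFor(si), k -> new ArrayList<>()).add(si);
        }
        // A name shared by two or more identifiers would cause those topics to merge.
        byName.values().removeIf(group -> group.size() < 2);
        return byName;
    }

    public static void main(String[] args) {
        List<String> identifiers = Arrays.asList(
                "http://a.example/people/Alice",
                "http://b.example/staff/Alice",
                "http://a.example/people/Bob");
        System.out.println(collisions(identifiers));
    }
}
```

In this sketch the two Alice identifiers yield the same name, so they are the ones to review before generating base names.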

The second step is to clean up the base names. You can use Topics > Base names > Regex replace... to filter out undesired parts of the base names. If you start the tool in the context of a layer, it processes all base names found in the layer's topic map. For example, to remove a leading prefix string from base names, you could use the regular expression

prefix(.+)

and replacement

$1
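
The effect of this expression and replacement can be checked outside Wandora. The sketch below assumes that the tool follows standard Java regular expression semantics, which is an assumption and not something stated here; the class and method names are hypothetical. It also shows an anchored variant, ^prefix(.+), which only removes the string when it really is at the start of the name.

```java
final class BaseNameRegexSketch {

    /** Applies a regular expression replacement to one base name. */
    static String regexReplace(String baseName, String regex, String replacement) {
        return baseName.replaceAll(regex, replacement);
    }

    public static void main(String[] args) {
        // Expression and replacement exactly as given in the text.
        System.out.println(regexReplace("prefixPerson", "prefix(.+)", "$1"));
        // Unanchored, the expression also matches in the middle of a name.
        System.out.println(regexReplace("my_prefixFoo", "prefix(.+)", "$1"));
        // Anchored variant: names that do not start with the prefix are left alone.
        System.out.println(regexReplace("my_prefixFoo", "^prefix(.+)", "$1"));
        // (.+) needs at least one character, so a bare "prefix" is not emptied.
        System.out.println(regexReplace("prefix", "prefix(.+)", "$1"));
    }
}
```

Testing the expression on a few representative base names like this before running the tool on a whole layer helps avoid producing identical base names by accident.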

Constructing variant names

The third step is to generate variant names from RDF label occurrences. An RDF document generally carries labels attached to its RDF concepts, and the labels may be language dependent. If such labels exist, a label occurrence is attached to the RDF topic. To generate variant names from RDF label occurrences, select all RDF topics and use the tool Topics > Variant names > Make display variants with occurrences. The tool copies the occurrence texts to variant names.

If the variant construction was successful, you may want to remove the label occurrences. To remove occurrences of a given type, use the tool Topics > Occurrences > Delete occurrences with type.... The tool finds all occurrence types in use and asks which occurrences to remove. Once again, if you want to process every topic in the topic map, start the tool in the context of a layer.
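
The two operations can be pictured with a small in-memory model. This is a sketch under stated assumptions, not Wandora's data model or tool code: the Topic class, its two maps and both method names are hypothetical, and rdfs:label is used only as a typical label occurrence type.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class LabelToVariantSketch {

    static final String RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label";

    /** Hypothetical, simplified topic: occurrences and display variants only. */
    static final class Topic {
        // occurrence type -> (language scope -> occurrence text)
        final Map<String, Map<String, String>> occurrences = new LinkedHashMap<>();
        // language scope -> display variant name
        final Map<String, String> displayVariants = new LinkedHashMap<>();

        void setOccurrence(String type, String lang, String text) {
            occurrences.computeIfAbsent(type, k -> new LinkedHashMap<>()).put(lang, text);
        }
    }

    /** Copies the texts of one occurrence type to display variants, per language. */
    static void makeDisplayVariants(Topic topic, String occurrenceType) {
        Map<String, String> labels = topic.occurrences.get(occurrenceType);
        if (labels != null) {
            topic.displayVariants.putAll(labels);
        }
    }

    /** Removes all occurrences of the given type from the topic. */
    static void deleteOccurrences(Topic topic, String occurrenceType) {
        topic.occurrences.remove(occurrenceType);
    }

    public static void main(String[] args) {
        Topic cat = new Topic();
        cat.setOccurrence(RDFS_LABEL, "en", "Cat");
        cat.setOccurrence(RDFS_LABEL, "fi", "Kissa");

        makeDisplayVariants(cat, RDFS_LABEL);   // copy first
        deleteOccurrences(cat, RDFS_LABEL);     // then clean up

        System.out.println(cat.displayVariants);
        System.out.println(cat.occurrences.isEmpty());
    }
}
```

The order matters: the occurrences are deleted only after the variant names have been created from them.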

Processing associations

The final step is to change the roles of the associations that originate from RDF. By default these roles are

  • http://wandora.net/si/core/rdf-subject
  • http://wandora.net/si/core/rdf-object

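You cannot simply rename these two role topics, because all players share the same roles and a rename would therefore affect every imported association at once. Instead, modify the associations with the Change association role... and Change association type... tools found in the context menu of the association table. In general this step includes the following subtasks:

  • Create all new role and association type topics
  • For each association type
      • Open the association type topic
      • Select all associations within the association table
      • Use the Change association role... tool to change each role
      • Use the Change association type... tool if necessary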
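
The per-type procedure above can be sketched with a small in-memory model. This is not Wandora's API: the Association class and the changeRole method are hypothetical, the two default role identifiers are the ones assumed for a recent Wandora version, and the new role URI is a placeholder.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class RoleChangeSketch {

    static final String RDF_SUBJECT = "http://wandora.net/si/core/rdf-subject";
    static final String RDF_OBJECT = "http://wandora.net/si/core/rdf-object";

    /** Hypothetical, simplified association: a type plus role -> player pairs. */
    static final class Association {
        final String type;
        final Map<String, String> players = new LinkedHashMap<>();

        Association(String type, String subjectPlayer, String objectPlayer) {
            this.type = type;
            players.put(RDF_SUBJECT, subjectPlayer);
            players.put(RDF_OBJECT, objectPlayer);
        }
    }

    /** Replaces oldRole with newRole in every association of the given type. */
    static int changeRole(List<Association> associations, String type,
                          String oldRole, String newRole) {
        int changed = 0;
        for (Association a : associations) {
            if (a.type.equals(type) && a.players.containsKey(oldRole)) {
                String player = a.players.remove(oldRole);
                a.players.put(newRole, player);
                changed++;
            }
        }
        return changed;
    }

    public static void main(String[] args) {
        String knows = "http://xmlns.com/foaf/0.1/knows";
        List<Association> all = new ArrayList<>();
        all.add(new Association(knows, "http://example.org/alice", "http://example.org/bob"));
        all.add(new Association("http://example.org/worksFor",
                "http://example.org/alice", "http://example.org/acme"));

        // Only associations of type "knows" get the new role; others keep rdf-subject.
        int changed = changeRole(all, knows, RDF_SUBJECT, "http://example.org/role/knower");
        System.out.println(changed);
    }
}
```

Because the change is scoped to one association type at a time, each type can receive roles that fit its meaning, which a global rename of the shared role topics could not achieve.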

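See also

Wandora also contains several RDF extractors that automatically recognize the RDF namespace and create valid base names and association roles for the extracted topics and associations. By default these simple RDF extractors are located in the File > Extract > Simple RDF extract menu. The current RDF extractors are

  • Twine RDF extractor
  • SKOS RDF extractor
  • Dublin Core RDF extractor
  • FOAF RDF extractor
  • IIIF RDF extractor
  • OWL Extractor
  • RDFS Extractor
  • RSS 1.0 RDF Extractor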

