Database extractor

From WandoraWiki
(Difference between revisions)
Jump to: navigation, search
(Using the database import tool)
Line 3: Line 3:
 
== Using the database import tool ==
 
== Using the database import tool ==
  
Database conversion is done with the '''Database Extractor''' tool found in '''File > Extract''' menu. After launching the tool, it will ask for the database to import. Either select the database if it is configured allready or create a new database entry and fill in appropriate connection details. After selecting the database the import process will start. Note that the import process may take anywhere from seconds to hours depending on the size of the database.
+
Database conversion is done with the '''[[GenericDatabaseExtractor|Database Extractor]]''' tool found in '''File > Extract''' menu. After launching the tool, it will ask for the database to import. Either select the database if it is configured allready or create a new database entry and fill in appropriate connection details. After selecting the database the import process will start. Note that the import process may take anywhere from seconds to hours depending on the size of the database.
  
 
== Postprocessing after conversion ==
 
== Postprocessing after conversion ==

Revision as of 19:06, 12 February 2007

Wandora has a feature to import any relational database. Information in the database will be converted into topic map format and then imported.

Using the database import tool

Database conversion is done with the Database Extractor tool found in File > Extract menu. After launching the tool, it will ask for the database to import. Either select the database if it is configured allready or create a new database entry and fill in appropriate connection details. After selecting the database the import process will start. Note that the import process may take anywhere from seconds to hours depending on the size of the database.

Postprocessing after conversion

The conversion works for (nearly) any kind of relational database and thus it is very general. For this reason converted databases usually need post prosessing, either manually or with other tools, before using in a production environment.

Post processing should at least include renaming association, role and topic types and possibly merging some of these topics. Some occurrences or association members may be unneeded and can be deleted with built in Wandora tools.

Traditional databases will also often refer to same concepts in several different ways. For example names of people may have been written last or first name first, may or may not include middle initials and may contain typing errors. All these should be corrected so that a single concept (in this case a person) only has a single topic.

Conversion details

Basic idea is that each row of each table is converted into a topic, there are some exceptions to this rule. Since primary key(s) identify the row inside the table, it is natural to use the pimary keys with the table name as the subject identifier of the topic.

Values of each normal column, that is column that is not a reference to some other (or same) table, is converted into either an occurrence or a new topic which is then associated with the topic representing the row. Creating topics for column values is natural when the value is shared by many rows, for example when the value represents a year. Creating occurrences is better suited for situations where most or all rows have unique values and creating separate topics would only make the topic map more complex without offering any additional linkage between topics. Note: In current version of Wandora all columns are converted into occurrences. The code is in place to make associations instead of occurrences but there is no user interface or any other mechanism to select which columns should be converted to associations.

Columns that reference some other column in another or same table are converted into associations. This is the natural thing to do as the referenced row is also a topic and the reference associates these two topics (rows). Note: Current version of Wandora does not support database schemas where several columns together reference another row.

Tables that only contain references are treated differently. Each row in such a table is converted into an association instead of a topic. This is done because each row links several topics without adding any other information. Thus creating an association for each row is more meaningful than creating a topic.

Tables with only one primary key that is also a foreign key are also treated slightly differently. This kind of table can usually be thought to be extending another row in the table that is referenced. For this reason subject identifier is created with the table name of the referenced table. This will cause topics of both this and the referenced tables to merge. This means that information from both tables will be in the same topic.

Personal tools