Query language

From WandoraWiki
Revision as of 15:04, 19 August 2009

NOTE: This page describes a feature not yet available in the public version of Wandora. It should be available soon in a future release.

Wandora uses a custom query language to select topics in a topic map. Currently the query language is used only in the search tool. In the future it will also be added to Custom topic panel and Query topic map which currently use the old version of the query language.


Introduction

Wandora does not use any standard query language. Instead queries are done by invoking a method of a Java class implementing a certain interface. The class may then perform anything whatsoever as long as in the end it returns query results in the format specified by the Java interface. Wandora does however include a number of classes designed in a way that makes it possible to build complex queries by combining these simple predefined query directive classes. This somewhat resembles a traditional query language.

The queries are defined using a generic scripting language. Wandora uses Java scripting API so it should be possible to use a number of different languages. Examples in this article use Mozilla Rhino 1.6 language that should be found in most installations. This scripting language uses a syntax that is identical to regular Java syntax in nearly every way. Because of this you should be at least somewhat familiar with Java syntax to understand the examples on this page.

The following example demonstrates a query that counts the instances of a topic.

1 importPackage(org.wandora.query2);
2 new Count(
3   new Instances()
4 );

The first line imports the query package. This is one of the few places where the scripting language syntax differs from normal Java syntax. Lines 2 to 4 contain the actual query. The Count directive counts the number of rows in the result of the directive inside it. The Instances directive inside Count on line 3 selects all instances of the input.

All directives may return any number of rows and get as input a single row. Each row may contain any number of values. Values are indexed with column role names which can be any text strings but are generally formed like URIs. Thus each row resembles a topic map association, only without an association type. One of the columns in a row is marked as active. This is usually the column that was last added or modified by a directive and its value is the primary input for other directives. Most directives use the default column name "#DEFAULT".

It was said above that each directive receives as input only a single row. However in the above example the Count directive counts the number of rows of the Instances directive which of course may be any number of rows. The Instances directive or its result are not actually considered input to Count directive. Count is executed first so it must have received input before we even get to Instances. The input to top-most directive is usually the currently open topic in Wandora. To be more specific, it contains a single column with the default name and the value is the currently open topic in Wandora and this single column is the active column of the row.

The Count directive passes its input to the inner Instances directive as is. The Instances directive uses the active column of its input, in this case the currently open topic in Wandora, and gets all instances of that. Generally directives add their results to the input row as new columns with the default name and set the new column as active. In this case the only column in input uses the default name and gets overwritten. Thus the result of the Instances directive is some number of rows, each containing a single column with the default name and the value is some topic which is an instance of the input.

The result of the Instances directive goes back to the Count directive, which counts the rows in it. Count adds this number to its own input row, not to the result rows of the Instances directive. Again the single column gets overwritten because it had the default name. The final result is a single row which contains a single column with the default name and a number indicating the number of instances of the currently open topic.

#DEFAULT
9
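The row flow described above can be sketched as a small toy model in plain JavaScript. This is not Wandora's implementation and does not use the org.wandora.query2 package. The functions withValue, instances and count, the row representation and the instance data are all hypothetical, chosen only to show how Count passes its input to the inner directive and then overwrites the default column.

```javascript
// Toy model of the row flow; NOT Wandora's implementation.
// All names and data below are hypothetical.
var DEFAULT = "#DEFAULT";

// Hypothetical instance data: type name -> names of its instances.
var instanceTable = {
  "Schema type": ["Role", "Association type", "Occurrence type"]
};

// Returns a copy of row with value stored under role and role made active.
function withValue(row, role, value) {
  var values = {};
  for (var key in row.values) {
    values[key] = row.values[key];
  }
  values[role] = value;
  return { values: values, active: role };
}

// Models Instances: one output row per instance of the active column value.
function instances() {
  return function (row) {
    var found = instanceTable[row.values[row.active]] || [];
    var result = [];
    for (var i = 0; i < found.length; i++) {
      result.push(withValue(row, DEFAULT, found[i]));
    }
    return result;
  };
}

// Models Count: runs the inner directive on Count's own input row, then
// writes the number of returned rows into that input row.
function count(inner) {
  return function (row) {
    return [withValue(row, DEFAULT, inner(row).length)];
  };
}

// The initial input: a single default column holding the "open topic".
var input = { values: {}, active: DEFAULT };
input.values[DEFAULT] = "Schema type";

var output = count(instances())(input);
```

In this model the output is a single row whose only column is the default one, because both directives write to the same column name.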

Now let's modify the query slightly.

1 importPackage(org.wandora.query2);
2 new Count(
3   new Instances()
4 ).from(
5   new Instances()
6 )

The from method on line 4 causes the input for the Count directive to come from some other directive. In this case we again use an Instances directive. The execution of this query goes as follows. The initial input, the currently open topic, first goes to the directive inside the from part, that is, the Instances directive on line 5. This returns the instances of the currently open topic. Then each of these result rows is fed one at a time to the Count directive and all the results of Count are combined. The Count directive itself works exactly as in the previous example; it only gets a different input this time. Now it counts the instances of each instance of the currently open topic. We still only get one column in the final result because at each step the default column gets overwritten. The final result might look something like this.

#DEFAULT
7
2
5
0
4
1
5
1
9

In the above table each number tells the number of instances of some topic which is an instance of the currently open topic. But this isn't very useful since we can't know which number corresponds to which topic. So let's modify the query a bit again.

1 importPackage(org.wandora.query2);
2 new Count(
3   new Instances()
4 ).as("#count").from(
5   new Instances().as("#instance")
6 )

The as methods on lines 4 and 5 change the column name to something other than the default. On line 5 we rename the column containing the instance topics to "#instance" and on line 4 the column containing the count to "#count". Now our final result has these two columns and we can actually see which topic each of the instance counts belongs to.

#instance                     #count
Role                          7
Wandora variant name version  2
Association type              5
Wandora class                 0
Wandora language              4
Role class                    1
Content type                  5
Occurrence type               1
Schema type                   9
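The interplay of from and as can also be illustrated with a toy model in plain JavaScript. Again, this is only a sketch and not Wandora's implementation: fromDirective, asDirective, the row representation and the instance data are hypothetical. The model follows the expansion given in the As entry of this page, where A.as("role") is treated as a From that feeds the results of A to an As directive.

```javascript
// Toy model of from and as; NOT Wandora's implementation.
// All names and data below are hypothetical.
var DEFAULT = "#DEFAULT";

var instanceTable = {
  "Wandora class": ["Role", "Schema type"],
  "Role": ["Container", "Member", "Group"],
  "Schema type": ["Role class"]
};

function copyOf(values) {
  var copy = {};
  for (var key in values) {
    copy[key] = values[key];
  }
  return copy;
}

function withValue(row, role, value) {
  var values = copyOf(row.values);
  values[role] = value;
  return { values: values, active: role };
}

// Models Instances: uses the active column value of the input row.
function instances(row) {
  var found = instanceTable[row.values[row.active]] || [];
  var result = [];
  for (var i = 0; i < found.length; i++) {
    result.push(withValue(row, DEFAULT, found[i]));
  }
  return result;
}

// Models Count: counts the rows of the inner directive.
function count(inner) {
  return function (row) {
    return [withValue(row, DEFAULT, inner(row).length)];
  };
}

// Models As: renames the active column of the input row.
function asDirective(newRole) {
  return function (row) {
    var values = copyOf(row.values);
    var value = values[row.active];
    delete values[row.active];
    values[newRole] = value;
    return [{ values: values, active: newRole }];
  };
}

// Models From: feeds each row of source to target and combines the results.
function fromDirective(target, source) {
  return function (row) {
    var fed = source(row);
    var result = [];
    for (var i = 0; i < fed.length; i++) {
      result = result.concat(target(fed[i]));
    }
    return result;
  };
}

// Count(Instances).as("#count").from(Instances.as("#instance"))
var query = fromDirective(
  fromDirective(asDirective("#count"), count(instances)),
  fromDirective(asDirective("#instance"), instances)
);

var start = { values: {}, active: DEFAULT };
start.values[DEFAULT] = "Wandora class";
var output = query(start);
```

Each output row in this model keeps the "#instance" column next to the "#count" column, because renaming a column prevents the next directive from overwriting it.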

As was mentioned earlier, directives usually use the active column value of the input row. This active column is the last column that was added or modified. In most cases this is the right choice for input but not always. Let's say we want to get the base name of the instance topics too and add a BaseName directive as is done on line 5 in the following example.

1 importPackage(org.wandora.query2);
2 new Count(
3   new Instances()
4 ).as("#count").from(
5   new BaseName().as("#basename").from(
6     new Instances().as("#instance")
7   )
8 )

The base name column is the last column added and is therefore the active column. The Instances directive inside Count on line 3 tries to use it as input. This fails because the base names are not topics. To fix this we need to change the active column manually.

1 importPackage(org.wandora.query2);
2 new Count(
3   new Instances().of("#instance")
4 ).as("#count").from(
5   new BaseName().as("#basename").from(
6     new Instances().as("#instance")
7   )
8 )

The of method on line 3 changes the active column before the input gets to the Instances directive and this query works as expected.

Directives

Query Structure

As

Description: Changes the role name of the active column or the specified column.

Constructor:

As(String newRole) - Change the role name of the active column.

As(String original, String newRole) - Change the role name of the specified column.

Notes:

Generally you have to feed rows to the As directive from some other directive through a From directive. All this is best done using the as method present in every directive. Calling A.as("role") will change the active column role name of the results of A. This will resolve to new As("role").from(A) which resolves to new From(new As("role"),A). The as method can also be given two parameters corresponding to the As directive constructor with two parameters.

From

Description: Takes the results from one directive and feeds them one row at a time to another combining all results.

Constructor:

From(Directive to,Directive from)

Notes:

This directive is best used using the from method present in every directive. Calling A.from(B) will resolve to new From(A,B).

The from method, but not the directive constructor, can be given several directives. In this case they will be joined using the Join directive. A.from(B,C) will resolve to new From(A,new Join(B,C)).

You may also give the from method, not the directive constructor, one or more strings. In this case a Literals directive will be implicitly created from the strings. A.from("b","c","d") resolves to new From(A,new Literals("b","c","d")).

If

Description: Returns the results of one of two directives depending on the input. The conditional directive is given the input of this directive. If it returns a non-empty result, the then directive is used. Otherwise the else directive is used if one was provided; if not, an empty result is returned.

Constructor:

If(Directive cond,Directive then,Directive else) - Depending on cond, returns the results of either then or else directive.

If(Directive cond,Directive then) - Depending on cond, returns the results of then directive or an empty result.
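The selection rule of the If directive can be sketched in plain JavaScript as follows. This is a toy model under stated assumptions, not Wandora's code: directives are modelled as functions from one row to an array of rows, and ifDirective, hasBaseName, useBaseName and useFallback are hypothetical names.

```javascript
// Toy model of If; NOT Wandora's implementation. All names are hypothetical.
// A directive is modelled as a function from one row to an array of rows.
function ifDirective(cond, thenDirective, elseDirective) {
  return function (row) {
    // A non-empty result from cond selects the then branch.
    if (cond(row).length > 0) {
      return thenDirective(row);
    }
    // Without an else branch the result is empty.
    return elseDirective ? elseDirective(row) : [];
  };
}

// Hypothetical directives used only for this illustration.
function hasBaseName(row) {
  return row["#basename"] ? [row] : [];
}
function useBaseName(row) {
  return [{ "#label": row["#basename"] }];
}
function useFallback(row) {
  return [{ "#label": "(unnamed)" }];
}

var named = { "#basename": "Role" };
var unnamed = { "#si": "http://example.org/topic" };

var withElse = ifDirective(hasBaseName, useBaseName, useFallback);
var withoutElse = ifDirective(hasBaseName, useBaseName);
```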


Join

Description: Joins the results of inner directives by performing a cartesian product on the results of them.

Constructor:

Join(Directive d1,Directive d2,...)

Notes:

This directive is best used through the join method present in every directive. Calling A.join(B) will resolve to new Join(A,B). In many cases you do not need to refer to join specifically at all. Instead you may provide several directives to the from method, which will implicitly join them.
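As an illustration of what a cartesian product of results means, here is a minimal sketch in plain JavaScript. It is not Wandora's implementation. Rows are modelled as plain objects keyed by role name, and the helper names mergeRows and join are hypothetical. How Wandora treats two columns with the same role name is not covered here; the sketch simply assumes the right-hand value wins.

```javascript
// Toy model of Join; NOT Wandora's implementation. Names are hypothetical.
// Merges two rows into one. On a role name clash the right row wins
// (an assumption made only for this sketch).
function mergeRows(left, right) {
  var merged = {};
  var key;
  for (key in left) {
    merged[key] = left[key];
  }
  for (key in right) {
    merged[key] = right[key];
  }
  return merged;
}

// Cartesian product: every left row is combined with every right row.
function join(leftRows, rightRows) {
  var result = [];
  for (var i = 0; i < leftRows.length; i++) {
    for (var j = 0; j < rightRows.length; j++) {
      result.push(mergeRows(leftRows[i], rightRows[j]));
    }
  }
  return result;
}

var types = [{ "#type": "Role" }, { "#type": "Schema type" }];
var langs = [{ "#lang": "en" }, { "#lang": "fi" }, { "#lang": "sv" }];
var joined = join(types, langs);
```

With two rows on the left and three on the right, the joined result has six rows, one for each pairing.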

Of

Description: Changes the active column of input to the specified column.

Constructor:

Of(String role)

Notes:

This directive is best used using the of method present in every directive. Calling A.of("role") will resolve to A.from(new Of("role")) which resolves to new From(A,new Of("role")). Usually this is followed by a call to from method, for example A.of("role").from(B). This will take results from B, change the active column and then feed them to A. Call to of must be before from to get the expected results.

Union

Description: Joins the results of inner directives by concatenating them. If the results from different directives have different roles, new columns will be added with null values to make roles of each row identical. Duplicate rows are not removed.

Constructor:

Union(Directive d1,Directive d2,...)
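The concatenation with null padding described above can be sketched in plain JavaScript. This is a toy model, not Wandora's implementation. Rows are modelled as plain objects keyed by role name, and the function name union is hypothetical.

```javascript
// Toy model of Union; NOT Wandora's implementation. Names are hypothetical.
function union(resultSets) {
  var roles = [];   // every role name, in first-seen order
  var seen = {};
  var all = [];     // all rows, concatenated in order
  var i, j, role;

  for (i = 0; i < resultSets.length; i++) {
    for (j = 0; j < resultSets[i].length; j++) {
      var row = resultSets[i][j];
      all.push(row);
      for (role in row) {
        if (!seen[role]) {
          seen[role] = true;
          roles.push(role);
        }
      }
    }
  }

  // Pad each row so that all rows share the same set of roles.
  var result = [];
  for (i = 0; i < all.length; i++) {
    var padded = {};
    for (j = 0; j < roles.length; j++) {
      role = roles[j];
      padded[role] = Object.prototype.hasOwnProperty.call(all[i], role)
        ? all[i][role]
        : null;
    }
    result.push(padded);
  }
  return result;
}

var names = [{ "#name": "Role" }, { "#name": "Role" }];
var counts = [{ "#count": 7 }];
var combined = union([names, counts]);
```

The duplicate "Role" row is kept, and each row gains the column it was missing with a null value.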

Primitive

Empty

Description: Returns an empty result.

Constructor:

Empty()

Identity

Description: Returns the input row as is.

Constructor:

Identity()

Literals

Description: Returns the strings provided to constructor. The returned rows have one column with the default role.

Constructor:

Literals(String s1,...)

Notes:

The from method present in every directive can be given strings instead of directives. In this case a Literals directive is implicitly created from the strings. A.from("b","c","d") resolves to new From(A,new Literals("b","c","d")).

Static

Description: Returns the provided result rows. This is similar to Literals but whereas Literals returns rows with single column and string values, this can return rows with any number of columns and any kind of values.

Constructor:

Static(ResultRow row) - Directive will return a single row.

Static(ArrayList<ResultRow> rows) - Directive will return the rows in the provided list.

Filtering

Note. All filtering directives can be used with the where method present in all directives. A.where(B) resolves to new From(B,A).

And

Description: Includes rows which satisfy all inner filtering directives.

Constructor:

And(WhereDirective d1,WhereDirective d2,...)

Notes:

In simple cases you can avoid using the And directive by calling the where method twice. A.where(new And(B,C)) is logically the same as A.where(B).where(C), though structurally it resolves to a slightly different query.

Compare

Description: Compares the values of two roles in the input row. The directive constructor is given a comparison operator as a string. It can be one of "==", "!=", "<", ">", "<=", ">=" or various aliases of these commonly used in programming languages. It can also be one of two topic map related operators, "t=" or "t!=". These compare the equality or inequality, respectively, of topics instead of numeric or string representations of the values.

Constructor:

Compare(String role1,String comp,String role2)

Compare(String role1,String comp,String role2,boolean numeric) - If numeric is true, converts the values to numbers before comparing

Notes:

The where method present in every directive can be given three strings corresponding to the first constructor. A new Compare directive will be implicitly created. A.where("r1","==","r2") will resolve to new From(new Compare("r1","==","r2"),A).
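The effect of the numeric flag can be illustrated with a small sketch in plain JavaScript. This is not Wandora's implementation. The functions compare and where are hypothetical stand-ins, rows are plain objects keyed by role name, and the sketch assumes that non-numeric comparison orders values as strings and that numeric conversion behaves like parseFloat. The topic operators "t=" and "t!=" are left out.

```javascript
// Toy model of Compare; NOT Wandora's implementation. Names are hypothetical.
function compare(role1, op, role2, numeric) {
  return function (row) {
    var a = row[role1];
    var b = row[role2];
    if (numeric) {
      // Assumption: numeric mode converts both values to numbers first.
      a = parseFloat(a);
      b = parseFloat(b);
    } else {
      // Assumption: otherwise values are compared as strings.
      a = String(a);
      b = String(b);
    }
    switch (op) {
      case "==": return a === b;
      case "!=": return a !== b;
      case "<": return a < b;
      case ">": return a > b;
      case "<=": return a <= b;
      case ">=": return a >= b;
      default: throw new Error("Unsupported operator: " + op);
    }
  };
}

// Keeps the rows that satisfy the filter.
function where(rows, filter) {
  var kept = [];
  for (var i = 0; i < rows.length; i++) {
    if (filter(rows[i])) {
      kept.push(rows[i]);
    }
  }
  return kept;
}

var rows = [
  { "#a": "10", "#b": "9" },
  { "#a": "2", "#b": "9" }
];
var asStrings = where(rows, compare("#a", "<", "#b", false));
var asNumbers = where(rows, compare("#a", "<", "#b", true));
```

In this model the string comparison keeps both rows, since "10" sorts before "9" as text, while the numeric comparison keeps only the row where 2 is less than 9.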

Exists

Description: Includes rows where the inner directive returns a non-empty result using the row itself as input.

Constructor:

Exists(Directive directive)

IsOfType

Description: Includes rows where the active column value is an instance of the specified type.

Constructor:

IsOfType(String si) - Checks against a topic with the specified subject identifier

Not

Description: Returns rows which do not satisfy the inner filtering directive.

Constructor:

Not(Directive directive)

Or

Description: Includes rows which satisfy at least one of the inner filtering directives.

Constructor:

Or(Directive d1,Directive d2,...)

Regex

Description: Includes rows where the active column matches the specified regular expression. Directive can optionally be given a mode parameter which is a bit-wise or of several different options. Available options are Regex.MODE_GLOBAL and Regex.MODE_ICASE. The GLOBAL option causes the regular expression to be matched against the whole input. Otherwise it is sufficient that part of the input matches. The ICASE option causes matching to be performed ignoring case. If no mode is provided, global is assumed, ignore case is not.

Constructor:

Regex(String regex)

Regex(String regex,int mode)

Notes:

Regex directive can also be used to replace matches with other strings but replacement is not usable when using Regex directive as a filtering directive. See Regex entry under Others for more details about this.
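The difference between whole-input and partial matching, and the effect of ignoring case, can be sketched with JavaScript's own RegExp. This is only an illustration, not Wandora's implementation: the numeric values of MODE_GLOBAL and MODE_ICASE below and the function regexFilter are assumptions made for this sketch, and JavaScript regular expression syntax is not identical to Java's. Note that the GLOBAL option described above is unrelated to the JavaScript "g" flag.

```javascript
// Toy model of the Regex filter modes; NOT Wandora's implementation.
// The constant values and the function name are hypothetical.
var MODE_GLOBAL = 1;
var MODE_ICASE = 2;

function regexFilter(pattern, mode) {
  var flags = (mode & MODE_ICASE) ? "i" : "";
  // Whole-input matching is modelled by anchoring the pattern at both ends.
  var source = (mode & MODE_GLOBAL) ? "^(?:" + pattern + ")$" : pattern;
  var re = new RegExp(source, flags);
  return function (value) {
    return re.test(value);
  };
}

var wholeMatch = regexFilter("wandora", MODE_GLOBAL);
var partialMatch = regexFilter("wandora", 0);
var wholeIgnoreCase = regexFilter("wandora", MODE_GLOBAL | MODE_ICASE);
```

Here wholeMatch rejects "wandora class" because only part of the input matches, partialMatch accepts it, and wholeIgnoreCase accepts "Wandora" even though the case differs.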

Topic maps directives

Result and input rows may contain values of any type. Most directives other than the topic map related ones use only string values. Most topic map related directives assume that input values are either topics or subject identifiers of topics. Thus it is usually not necessary to convert string values to topics first. The most common case where you would want to do so explicitly is to ensure that the final result of the query contains topics instead of strings. This allows you to use all the Wandora topic related tools on the result table. To convert subject identifier strings to topics use the Topics directive.

AllTopics

Description: Returns all topics of the topic map.

Constructor:

AllTopics()

Notes:

Note that retrieving all topics of a large topic map can be a very costly operation in terms of time. Thus, whenever possible, try to avoid using this directive. For example, if you want all topics that are instances of some other topic, instead of using something like new AllTopics().where(new IsOfType(a)) use new Instances().from(a).


BaseName

Description: Gets the base name of the active column value of input.

Constructor:

BaseName()

Instances

Description: Gets all instances of the active column value of input.

Constructor:

Instances()

IsOfType

See IsOfType under filter directives.

Players

Topics

Description: Gets the topic with subject identifier equal to the string value of the active column value of input.

Constructor:

Topics()

Others

Count

Description: Counts the number of rows returned by the inner directive, using the same input that the Count directive itself received.

Constructor:

Count(Directive directive)

Regex

Description:

Filters and/or performs search and replace operations using regular expressions.

Directive can optionally be given a mode parameter which is a bit-wise or of several different options. Available options are Regex.MODE_MATCH, Regex.MODE_GLOBAL and Regex.MODE_ICASE. The MATCH option causes the directive to return an empty result if no match is found. Otherwise rows are returned as is if no match is found. The GLOBAL option causes the regular expression to be matched against the whole input. Otherwise it is sufficient that part of the input matches. The ICASE option causes matching to be performed ignoring case.

If a replacement string is given, any matches are replaced with that. Otherwise no replacement is performed and the directive is used only to filter rows. Replacement cannot be used when directive is used as a filtering directive.

Constructor:

Regex(String regex,String replace,int mode)

Regex(String regex,String replace) - Uses mode global

Regex(String regex) - Uses mode match and global

Regex(String regex,int mode) - Uses the specified mode but forcing match option
