IMDB extractor
Revision as of 14:45, 2 December 2009
The IMDB extractor transforms Internet Movie Database data files into a topic map browsable with Wandora. The extractor has been created for demonstration purposes only. Wandora does not contain any IMDB data files. Be aware that neither Wandora nor its authors have any right to grant you permission to use IMDB data. Wandora provides nothing but the technology to create IMDB topic maps. If you plan to exploit IMDB topic maps, you should contact the IMDB Licensing department. Read more here.
You may download IMDB datafiles from
Because the data files are extremely large, you can't extract the data to in-memory topic maps but have to use database topic maps. Moreover, Wandora does not transfer all IMDB files. The current extractor transfers only
- actors
- actresses
- keywords
- countries
- language
- locations
- genres
- movies
- biographies
- producers
- directors
- plot summaries
- running times
- release dates
To prepare the extraction, download all required data files and unpack them to your local file system. Then create a database topic map and start the extractor with File > Extract > Media > IMDB Extractor. Wandora requests a folder containing IMDB data files or a single data file and starts the extraction once it has identified the data file or folder. IMDB data files are very large, so be patient: the extraction may take a while.
Below is a screenshot of Wandora viewing associations of movie Dr. Strangelove.... Notice the layer structure. Each IMDB datafile has been extracted to a separate database topic map.
Step by step example
This chapter is a step-by-step tutorial showing how to use the IMDB extractor and database topic maps. The tutorial extractions were made in Ubuntu Linux 8.10 running on top of Sun's VirtualBox (running on top of Windows XP). The next screenshot shows the system properties of the Ubuntu Linux used for the IMDB extractions. Notice the amount of memory given to Linux: we gave Ubuntu 1500 MB. Our experience suggests you should give Linux as much memory as possible. With too little memory the IMDB extraction fails after heavy swapping.
Now start Ubuntu Linux and log in.
Downloading IMDB datafiles
- Download IMDB data files:
- Unzip all data files in shell with gunzip or right click each data file icon and select option Extract Here.
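The unpacking step can also be scripted. The following is only a minimal sketch: the function name unpack_imdb_files is hypothetical, and it assumes the downloaded files end in .gz and sit together in one folder.

```shell
#!/bin/sh
# Hypothetical helper: gunzip every .gz file in a folder.
# Assumption: the downloaded IMDB data files end in .gz.
unpack_imdb_files() {
    dir="${1:-.}"
    for f in "$dir"/*.gz; do
        # Skip the unexpanded pattern when the folder has no .gz files.
        [ -e "$f" ] || continue
        # gunzip replaces each archive with its unpacked file.
        gunzip "$f" || return 1
    done
}
```

Call it with the download folder as the only argument, for example unpack_imdb_files ~/Desktop/imdb (the folder name is just an example).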
Now you should have all required IMDB data files ready for extraction as shown below.
Setting up Wandora
- Download Wandora application.
- Install Wandora
- Start Linux shell with menu option Applications > Accessories > Terminal
- Open Wandora's bin directory.
- Change execution rights of Wandora-huge.sh to allow execution.
- Finally add Java's bin directory to the PATH environment variable.
Here is how I did the previous steps:
akivela@virtual-ubuntu:~/Desktop$ cd wandora/bin
akivela@virtual-ubuntu:~/Desktop/wandora/bin$ dir
SetClasspath.bat Wandora.bat Wandora-large.bat Wandora-mini.sh
SetClasspath.sh Wandora-huge.bat Wandora-large.sh Wandora.sh
Wandora-4g.sh Wandora-huge.sh Wandora-mini.bat
akivela@virtual-ubuntu:~/Desktop/wandora/bin$ chmod a+x Wandora-huge.sh
akivela@virtual-ubuntu:~/Desktop/wandora/bin$ PATH=$PATH:/home/akivela/jre1.6.0_13/bin
akivela@virtual-ubuntu:~/Desktop/wandora/bin$
Now you are ready to start the Wandora application in Linux. Type ./Wandora-huge.sh in the terminal and press Enter. The Wandora application should start.
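If Wandora does not start, the two usual suspects from the steps above are a missing java on the PATH and a start script without execution rights. A small check like the following may help; it is a sketch only, and the function name check_wandora_ready is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: verify the prerequisites before launching Wandora.
# Returns 1 if java is not on the PATH, 2 if the start script is not
# executable, and 0 if both checks pass.
check_wandora_ready() {
    script="${1:-./Wandora-huge.sh}"
    if ! command -v java >/dev/null 2>&1; then
        echo "java not found on PATH" >&2
        return 1
    fi
    if [ ! -x "$script" ]; then
        echo "$script is not executable; run chmod a+x on it" >&2
        return 2
    fi
    return 0
}
```

Run in Wandora's bin directory, check_wandora_ready && ./Wandora-huge.sh starts Wandora only when both checks pass. Note that a PATH set in the terminal, as in the capture above, lasts only for that terminal session.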
Setting up databases for IMDB topic maps
Start another terminal window in Ubuntu with option Applications > Accessories > Terminal. In the terminal:
- Install MySQL server with command sudo apt-get install mysql-server unless you already have it installed.
- Log into the MySQL server with command mysql --user=<your-username> --password=<your-password>
- Create empty databases with MySQL command create database <database-name>; (notice ending semicolon) for the following database names:
- imdb_actors
- imdb_actresses
- imdb_countries
- imdb_genres
- imdb_movies
- Prepare each created database with Wandora specific database table structures in wandora/build/resources/conf/database/db_mysql.sql. In detail:
- Select database with MySQL command use <database-name>;, for example use imdb_actors; (notice ending semicolon).
- Read database table creation clauses from an external file with MySQL command source wandora/build/resources/conf/database/db_mysql.sql; (notice ending semicolon). Notice that you may have to change the path of db_mysql.sql depending on your Wandora installation directory and your current directory.
Here is my terminal capture of previous steps:
akivela@virtual-ubuntu:~$ sudo apt-get install mysql-server
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following extra packages will be installed:
  mysql-server-5.0
Suggested packages:
  tinyca mailx
The following NEW packages will be installed:
  mysql-server mysql-server-5.0
0 upgraded, 2 newly installed, 0 to remove and 349 not upgraded.
Need to get 26.9MB of archives.
After this operation, 87.7MB of additional disk space will be used.
Do you want to continue [Y/n]? y
Get:1 http://fi.archive.ubuntu.com intrepid/main mysql-server-5.0 5.0.67-0ubuntu6 [26.8MB]
Get:2 http://fi.archive.ubuntu.com intrepid/main mysql-server 5.0.67-0ubuntu6 [54.9kB]
Fetched 26.9MB in 25s (1073kB/s)
Preconfiguring packages ...
Selecting previously deselected package mysql-server-5.0.
(Reading database ... 100052 files and directories currently installed.)
Unpacking mysql-server-5.0 (from .../mysql-server-5.0_5.0.67-0ubuntu6_i386.deb) ...
Selecting previously deselected package mysql-server.
Unpacking mysql-server (from .../mysql-server_5.0.67-0ubuntu6_all.deb) ...
Processing triggers for man-db ...
Setting up mysql-server-5.0 (5.0.67-0ubuntu6) ...
 * Stopping MySQL database server mysqld [ OK ]
Reloading AppArmor profiles : done.
 * Starting MySQL database server mysqld [ OK ]
 * Checking for corrupt, not cleanly closed and upgrade needing tables.
Setting up mysql-server (5.0.67-0ubuntu6) ...
akivela@virtual-ubuntu:~$ mysql --user=root --password=mypass
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 2
Server version: 5.0.67-0ubuntu6 (Ubuntu)
Type 'help;' or '\h' for help. Type '\c' to clear the buffer.
mysql> create database imdb_actors;
Query OK, 1 row affected (0.01 sec)
mysql> create database imdb_actresses;
Query OK, 1 row affected (0.00 sec)
mysql> create database imdb_countries;
Query OK, 1 row affected (0.00 sec)
mysql> create database imdb_directors;
Query OK, 1 row affected (0.01 sec)
mysql> create database imdb_genres;
Query OK, 1 row affected (0.00 sec)
mysql> create database imdb_keywords;
Query OK, 1 row affected (0.00 sec)
mysql> create database imdb_language;
Query OK, 1 row affected (0.00 sec)
mysql> create database imdb_movies;
Query OK, 1 row affected (0.00 sec)
mysql> use imdb_actors;
Database changed
mysql> source /home/akivela/Desktop/wandora/build/resources/conf/database/db_mysql.sql
Query OK, 0 rows affected (0.03 sec)
Query OK, 0 rows affected (0.00 sec)
Query OK, 0 rows affected (0.00 sec)
Query OK, 0 rows affected (0.01 sec)
Query OK, 0 rows affected (0.01 sec)
Query OK, 0 rows affected (0.00 sec)
Query OK, 0 rows affected (0.00 sec)
Query OK, 0 rows affected (0.01 sec)
Query OK, 0 rows affected (0.04 sec)
Records: 0 Duplicates: 0 Warnings: 0
Query OK, 0 rows affected (0.00 sec)
Records: 0 Duplicates: 0 Warnings: 0
Query OK, 0 rows affected (0.01 sec)
Records: 0 Duplicates: 0 Warnings: 0
Query OK, 0 rows affected (0.01 sec)
Records: 0 Duplicates: 0 Warnings: 0
Query OK, 0 rows affected (0.01 sec)
Records: 0 Duplicates: 0 Warnings: 0
Query OK, 0 rows affected (0.00 sec)
Records: 0 Duplicates: 0 Warnings: 0
Query OK, 0 rows affected (0.00 sec)
Records: 0 Duplicates: 0 Warnings: 0
Query OK, 0 rows affected (0.02 sec)
Records: 0 Duplicates: 0 Warnings: 0
Query OK, 0 rows affected (0.02 sec)
Records: 0 Duplicates: 0 Warnings: 0
Query OK, 0 rows affected (0.00 sec)
Records: 0 Duplicates: 0 Warnings: 0
Query OK, 0 rows affected (0.02 sec)
Records: 0 Duplicates: 0 Warnings: 0
Query OK, 0 rows affected (0.02 sec)
Records: 0 Duplicates: 0 Warnings: 0
mysql> use imdb_actresses;
Database changed
mysql> source /home/akivela/Desktop/wandora/build/resources/conf/database/db_mysql.sql
Query OK, 0 rows affected (0.00 sec)
Query OK, 0 rows affected (0.00 sec)
Query OK, 0 rows affected (0.01 sec)
Query OK, 0 rows affected (0.00 sec)
Query OK, 0 rows affected (0.01 sec)
Query OK, 0 rows affected (0.00 sec)
Query OK, 0 rows affected (0.00 sec)
Query OK, 0 rows affected (0.01 sec)
Query OK, 0 rows affected (0.01 sec)
Records: 0 Duplicates: 0 Warnings: 0
Query OK, 0 rows affected (0.00 sec)
Records: 0 Duplicates: 0 Warnings: 0
Query OK, 0 rows affected (0.00 sec)
Records: 0 Duplicates: 0 Warnings: 0
Query OK, 0 rows affected (0.01 sec)
Records: 0 Duplicates: 0 Warnings: 0
Query OK, 0 rows affected (0.00 sec)
Records: 0 Duplicates: 0 Warnings: 0
Query OK, 0 rows affected (0.01 sec)
Records: 0 Duplicates: 0 Warnings: 0
Query OK, 0 rows affected (0.00 sec)
Records: 0 Duplicates: 0 Warnings: 0
Query OK, 0 rows affected (0.00 sec)
Records: 0 Duplicates: 0 Warnings: 0
Query OK, 0 rows affected (0.02 sec)
Records: 0 Duplicates: 0 Warnings: 0
Query OK, 0 rows affected (0.00 sec)
Records: 0 Duplicates: 0 Warnings: 0
Query OK, 0 rows affected (0.00 sec)
Records: 0 Duplicates: 0 Warnings: 0
Query OK, 0 rows affected (0.01 sec)
Records: 0 Duplicates: 0 Warnings: 0
mysql> use imdb_countries;
Database changed
mysql> source /home/akivela/Desktop/wandora/build/resources/conf/database/db_mysql.sql
Query OK, 0 rows affected (0.00 sec)
Query OK, 0 rows affected (0.01 sec)
Query OK, 0 rows affected (0.00 sec)
Query OK, 0 rows affected (0.01 sec)
Query OK, 0 rows affected (0.00 sec)
Query OK, 0 rows affected (0.01 sec)
Query OK, 0 rows affected (0.00 sec)
Query OK, 0 rows affected (0.01 sec)
Query OK, 0 rows affected (0.00 sec)
Records: 0 Duplicates: 0 Warnings: 0
Query OK, 0 rows affected (0.01 sec)
Records: 0 Duplicates: 0 Warnings: 0
Query OK, 0 rows affected (0.00 sec)
Records: 0 Duplicates: 0 Warnings: 0
Query OK, 0 rows affected (0.00 sec)
Records: 0 Duplicates: 0 Warnings: 0
Query OK, 0 rows affected (0.01 sec)
Records: 0 Duplicates: 0 Warnings: 0
Query OK, 0 rows affected (0.00 sec)
Records: 0 Duplicates: 0 Warnings: 0
Query OK, 0 rows affected (0.01 sec)
Records: 0 Duplicates: 0 Warnings: 0
Query OK, 0 rows affected (0.00 sec)
Records: 0 Duplicates: 0 Warnings: 0
Query OK, 0 rows affected (0.01 sec)
Records: 0 Duplicates: 0 Warnings: 0
Query OK, 0 rows affected (0.01 sec)
Records: 0 Duplicates: 0 Warnings: 0
Query OK, 0 rows affected (0.03 sec)
Records: 0 Duplicates: 0 Warnings: 0
Query OK, 0 rows affected (0.00 sec)
Records: 0 Duplicates: 0 Warnings: 0
mysql> use imdb_directors;
Database changed
mysql> source /home/akivela/Desktop/wandora/build/resources/conf/database/db_mysql.sql
Query OK, 0 rows affected (0.00 sec)
Query OK, 0 rows affected (0.01 sec)
Query OK, 0 rows affected (0.00 sec)
Query OK, 0 rows affected (0.01 sec)
Query OK, 0 rows affected (0.00 sec)
Query OK, 0 rows affected (0.00 sec)
Query OK, 0 rows affected (0.01 sec)
Query OK, 0 rows affected (0.00 sec)
Query OK, 0 rows affected (0.01 sec)
Records: 0 Duplicates: 0 Warnings: 0
Query OK, 0 rows affected (0.00 sec)
Records: 0 Duplicates: 0 Warnings: 0
Query OK, 0 rows affected (0.01 sec)
Records: 0 Duplicates: 0 Warnings: 0
Query OK, 0 rows affected (0.01 sec)
Records: 0 Duplicates: 0 Warnings: 0
Query OK, 0 rows affected (0.00 sec)
Records: 0 Duplicates: 0 Warnings: 0
Query OK, 0 rows affected (0.00 sec)
Records: 0 Duplicates: 0 Warnings: 0
Query OK, 0 rows affected (0.03 sec)
Records: 0 Duplicates: 0 Warnings: 0
Query OK, 0 rows affected (0.00 sec)
Records: 0 Duplicates: 0 Warnings: 0
Query OK, 0 rows affected (0.01 sec)
Records: 0 Duplicates: 0 Warnings: 0
Query OK, 0 rows affected (0.00 sec)
Records: 0 Duplicates: 0 Warnings: 0
Query OK, 0 rows affected (0.01 sec)
Records: 0 Duplicates: 0 Warnings: 0
Query OK, 0 rows affected (0.00 sec)
Records: 0 Duplicates: 0 Warnings: 0
mysql> use imdb_genres;
Database changed
mysql> source /home/akivela/Desktop/wandora/build/resources/conf/database/db_mysql.sql
Query OK, 0 rows affected (0.00 sec)
Query OK, 0 rows affected (0.00 sec)
Query OK, 0 rows affected (0.00 sec)
Query OK, 0 rows affected (0.01 sec)
Query OK, 0 rows affected (0.00 sec)
Query OK, 0 rows affected (0.00 sec)
Query OK, 0 rows affected (0.03 sec)
Query OK, 0 rows affected (0.00 sec)
Query OK, 0 rows affected (0.01 sec)
Records: 0 Duplicates: 0 Warnings: 0
Query OK, 0 rows affected (0.00 sec)
Records: 0 Duplicates: 0 Warnings: 0
Query OK, 0 rows affected (0.00 sec)
Records: 0 Duplicates: 0 Warnings: 0
Query OK, 0 rows affected (0.01 sec)
Records: 0 Duplicates: 0 Warnings: 0
Query OK, 0 rows affected (0.00 sec)
Records: 0 Duplicates: 0 Warnings: 0
Query OK, 0 rows affected (0.01 sec)
Records: 0 Duplicates: 0 Warnings: 0
Query OK, 0 rows affected (0.00 sec)
Records: 0 Duplicates: 0 Warnings: 0
Query OK, 0 rows affected (0.03 sec)
Records: 0 Duplicates: 0 Warnings: 0
Query OK, 0 rows affected (0.02 sec)
Records: 0 Duplicates: 0 Warnings: 0
Query OK, 0 rows affected (0.01 sec)
Records: 0 Duplicates: 0 Warnings: 0
Query OK, 0 rows affected (0.00 sec)
Records: 0 Duplicates: 0 Warnings: 0
Query OK, 0 rows affected (0.01 sec)
Records: 0 Duplicates: 0 Warnings: 0
mysql> use imdb_movies;
Database changed
mysql> source /home/akivela/Desktop/wandora/build/resources/conf/database/db_mysql.sql
Query OK, 0 rows affected (0.01 sec)
Query OK, 0 rows affected (0.00 sec)
Query OK, 0 rows affected (0.00 sec)
Query OK, 0 rows affected (0.01 sec)
Query OK, 0 rows affected (0.00 sec)
Query OK, 0 rows affected (0.01 sec)
Query OK, 0 rows affected (0.00 sec)
Query OK, 0 rows affected (0.01 sec)
Query OK, 0 rows affected (0.02 sec)
Records: 0 Duplicates: 0 Warnings: 0
Query OK, 0 rows affected (0.00 sec)
Records: 0 Duplicates: 0 Warnings: 0
Query OK, 0 rows affected (0.01 sec)
Records: 0 Duplicates: 0 Warnings: 0
Query OK, 0 rows affected (0.01 sec)
Records: 0 Duplicates: 0 Warnings: 0
Query OK, 0 rows affected (0.00 sec)
Records: 0 Duplicates: 0 Warnings: 0
Query OK, 0 rows affected (0.01 sec)
Records: 0 Duplicates: 0 Warnings: 0
Query OK, 0 rows affected (0.00 sec)
Records: 0 Duplicates: 0 Warnings: 0
Query OK, 0 rows affected (0.02 sec)
Records: 0 Duplicates: 0 Warnings: 0
Query OK, 0 rows affected (0.00 sec)
Records: 0 Duplicates: 0 Warnings: 0
Query OK, 0 rows affected (0.00 sec)
Records: 0 Duplicates: 0 Warnings: 0
Query OK, 0 rows affected (0.01 sec)
Records: 0 Duplicates: 0 Warnings: 0
Query OK, 0 rows affected (0.00 sec)
Records: 0 Duplicates: 0 Warnings: 0
mysql>
mysql>
mysql>
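Typing the same create, use, and source commands for every database is repetitive, so the statements can be generated instead. The sketch below only prints the MySQL statements; the function name imdb_setup_sql is hypothetical, and the schema path you pass in is an assumption that depends on your Wandora installation directory.

```shell
#!/bin/sh
# Hypothetical helper: print the MySQL statements that create each database
# and load Wandora's table structures into it.
# $1 is the path to db_mysql.sql; the remaining arguments are database names.
imdb_setup_sql() {
    schema="$1"
    shift
    for db in "$@"; do
        printf 'create database %s;\n' "$db"
        printf 'use %s;\n' "$db"
        printf 'source %s;\n' "$schema"
    done
}
```

Review the printed statements first, then pipe them to the client with the same login options as above, for example imdb_setup_sql /home/akivela/Desktop/wandora/build/resources/conf/database/db_mysql.sql imdb_actors imdb_actresses | mysql --user=<your-username> --password=<your-password>. Whether your MySQL client version accepts source in piped input is worth checking on one database before running the whole list.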