Setting up OpenLink Virtuoso
by Harshvardhan J. Pandit
is part of: Semantic Web
database semantic-web triple-store
OpenLink Virtuoso is a powerful triple-store (and also a traditional RDBMS) with many different features. Setting up Virtuoso is easy, as packages are available in most distributions. Its documentation is a bizarre collection that is scattered, unorganised, and sometimes missing. Despite this, it is a solid tool which is easy to set up and use, and comes configured ready for production use.
Installation
The package virtuoso-opensource is available on Debian-based systems, and can be installed with -
sudo apt-get install virtuoso-opensource
which will install Virtuoso and set it up as a system service with the name virtuoso-opensource-X.x, with X.x being the version number, which for me was 6.1.
The service can be managed as:
# start, stop, restart, status
sudo service virtuoso-opensource-X.x start
sudo service virtuoso-opensource-X.x stop
sudo service virtuoso-opensource-X.x restart
sudo service virtuoso-opensource-X.x status
During installation, Virtuoso will ask to set a password for two users, DBA and DAV, which are like admins for the web interface and management actions. It is essential to remember the password, as it is required to make changes to Virtuoso and also to add other users.
Configuration
The config file is located at /etc/virtuoso-opensource-X.x/virtuoso.ini and contains settings for the storage location and the server. Virtuoso has the option of serving the management interface over SSL, using certificate settings located in the Parameters section, which are commented out by default. The configuration for the web interface is in the HTTPServer section. There, ServerPort refers to the port the Virtuoso interface runs at, which is 8890 by default and can be changed through this option. A description of the various options is available at link.
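To check which port the web interface is configured on without opening the file by hand, something like the following could work. It is a minimal sketch that assumes virtuoso.ini is plain INI syntax that Python's configparser can read; the SAMPLE text is an illustrative excerpt, not a complete config file, and http_server_port is a hypothetical helper name.

```python
import configparser


def http_server_port(ini_text):
    """Return ServerPort from the [HTTPServer] section of virtuoso.ini text."""
    # interpolation=None keeps any '%' in values literal;
    # strict=False tolerates repeated keys or sections instead of raising.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(ini_text)
    return parser.getint("HTTPServer", "ServerPort")


# Illustrative excerpt only. Note that ServerPort appears in two sections:
# the one under [HTTPServer] is the port of the web interface.
SAMPLE = """
[Parameters]
ServerPort = 1111
;SSLServerPort = 2111

[HTTPServer]
ServerPort = 8890
"""

print(http_server_port(SAMPLE))
```

For a real installation, the text would come from reading /etc/virtuoso-opensource-X.x/virtuoso.ini (with X.x replaced by the installed version) instead of SAMPLE.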
Conductor
The Virtuoso web interface is called Conductor, and offers management capabilities for all its features. By default it is served at the /conductor URL, relative to wherever Virtuoso is being served.
Linked Data
The Linked Data section in Conductor offers a SPARQL endpoint, query interface, and management capabilities for graphs and datasets. The default tab for SPARQL is a query interface which queries the (default) graph specified and displays the results in the page itself. Graphs shows all available graphs in the triple store; Virtuoso comes with a lot of RDF data and some graphs by default, which one can assume are required for its configurations and data settings. The Namespaces tab shows the stored namespaces for RDF graphs, and one can add custom namespaces here. Quad Store Upload provides a simple way to upload an RDF file as a dataset or import it from a URL. It requires the named graph IRI under which the dataset is stored in the triple store. There is no default graph, so the graph IRI always has to be provided.
iSQL
Virtuoso provides a utility called Interactive SQL, or iSQL, which is accessed using isql-vt or can be symlinked from /usr/bin/isql-vt. This utility provides SQL-like access to the datasets, and can be used to perform SPARQL queries or upload data into the triple store.
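As a rough illustration of running a SPARQL query through iSQL from a script, the sketch below builds the isql-vt command line. It assumes the SQL port is the default 1111, that the user is dba, and that isql-vt accepts a statement through an exec= argument. The helper names and the example graph IRI are hypothetical.

```python
import subprocess


def sample_query(graph_iri):
    """Return an iSQL statement that lists a few triples from one graph."""
    # Prefixing a statement with SPARQL makes iSQL treat it as a SPARQL query.
    return f"SPARQL SELECT * FROM <{graph_iri}> WHERE {{ ?s ?p ?o }} LIMIT 5;"


def isql_command(statement, password, user="dba", port=1111):
    """Build the argument list for isql-vt to execute one statement and exit."""
    return ["isql-vt", str(port), user, password, f"exec={statement}"]


def run_statement(statement, password):
    """Run the statement against a local Virtuoso and return isql-vt's output."""
    result = subprocess.run(
        isql_command(statement, password),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


# Example (needs a running Virtuoso and the DBA password set at install time):
# print(run_statement(sample_query("http://example.org/graph"), "secret"))
```

Passing the arguments as a list avoids shell quoting problems with the angle brackets and braces in the query.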
SPARQL Endpoint
By default, /sparql is the SPARQL endpoint provided by Virtuoso, and it requires no access control to set up or access. So once you have used Conductor or iSQL to upload the dataset, the SPARQL endpoint is ready to serve the data for the given graph IRI. The only thing to configure is the IRI under which the datasets are served.
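To check that the endpoint really serves an uploaded graph, a query can be sent over plain HTTP. The following is a minimal sketch using only the Python standard library. It assumes the endpoint is at http://localhost:8890/sparql and uses the standard SPARQL protocol parameters query and default-graph-uri. The graph IRI and function names are made up for illustration.

```python
import json
import urllib.parse
import urllib.request


def build_sparql_url(endpoint, query, graph_iri=None):
    """Return a GET URL for the query, optionally scoped to one graph IRI."""
    params = {"query": query}
    if graph_iri:
        params["default-graph-uri"] = graph_iri
    return endpoint + "?" + urllib.parse.urlencode(params)


def run_select(endpoint, query, graph_iri=None):
    """Send a SELECT query and return the list of result bindings."""
    request = urllib.request.Request(
        build_sparql_url(endpoint, query, graph_iri),
        # Ask for the standard SPARQL JSON results format.
        headers={"Accept": "application/sparql-results+json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["results"]["bindings"]


# Example (needs a running Virtuoso with data in the hypothetical graph):
# rows = run_select(
#     "http://localhost:8890/sparql",
#     "SELECT * WHERE { ?s ?p ?o } LIMIT 5",
#     "http://example.org/graph",
# )
# for row in rows:
#     print(row["s"]["value"], row["p"]["value"], row["o"]["value"])
```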
Exposing Virtuoso interfaces using Nginx
By default, Virtuoso runs at localhost:8890, and Nginx can be configured as a proxy to pass traffic to it. However, for some reason, a single reverse proxy or URL mapping to localhost does not work as required. A hack around this is to configure every location Virtuoso requires as a URL, and proxy pass each of them to the Virtuoso server. A list of them is -
/virtuoso
/conductor
/about
/category
/class
/data
/describe
/delta.vsp
/fct
/issparql
/ontology
/page
/property
/rdfdesc
/resource
/services
/snorql
/sparql-auth
/sparql
/statics
/void
/wikicompany
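Writing one location block per path by hand gets repetitive, so the blocks can also be generated. The sketch below prints a block for each path. It assumes Virtuoso listens on http://localhost:8890 and uses a reduced set of proxy headers. The VIRTUOSO_PATHS list is shortened here and the function names are hypothetical.

```python
# Shortened for illustration; extend with the remaining paths from the list above.
VIRTUOSO_PATHS = ["/virtuoso", "/conductor", "/sparql-auth", "/sparql"]


def location_block(path, upstream="http://localhost:8890"):
    """Return an Nginx location block that proxies one path to Virtuoso."""
    # proxy_pass has no URI part after the port, so Nginx forwards the
    # request path unchanged (a request for /sparql stays /sparql).
    return (
        f"location {path} {{\n"
        f"    proxy_set_header Host $host;\n"
        f"    proxy_set_header X-Real-IP $remote_addr;\n"
        f"    proxy_set_header X-Forwarded-For $remote_addr;\n"
        f"    proxy_pass {upstream};\n"
        f"    proxy_redirect off;\n"
        f"}}\n"
    )


def render(paths, upstream="http://localhost:8890"):
    """Return all location blocks, separated by blank lines."""
    return "\n".join(location_block(path, upstream) for path in paths)


print(render(VIRTUOSO_PATHS))
```

The printed output is meant to be pasted inside the relevant server block of the Nginx configuration. Leaving a path out of the list leaves that service unexposed.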
If a particular service is to be restricted or not provided, then simply remove its URL from the Nginx configurations. An example of a proxy configuration for a URL is -
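location /sparql {
    # Forward the original client address and host name to Virtuoso
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $remote_addr;
    proxy_set_header Host $host;
    proxy_set_header X-NginX-Proxy true;
    # Strip a leading /virtuoso prefix, if present, before proxying
    rewrite ^/virtuoso/?(.*) /$1 break;
    # No trailing slash here, so the request path (/sparql) is passed on unchanged
    proxy_pass http://localhost:8890;
    proxy_redirect off;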