Setting up Pubby
by Harshvardhan J. Pandit
is part of: Semantic Web
ontologies semantic-web web-dev
Pubby is a nifty little tool that is great for exposing RDF datasets accessed through SPARQL endpoints as browsable HTML pages. This makes it possible to create a populated web page for each resource available in a SPARQL endpoint. Pubby uses DESCRIBE queries to populate the HTML page. To see it in action, visit the [OPMW](http://opmw.org) example pages, with the Similar Words example showing all RDF links in HTML.
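To get a feel for what such a request looks like, a DESCRIBE query can also be sent to an endpoint by hand. This is only a rough sketch and not Pubby's own code: the function names, the endpoint address and the resource URI are made up for illustration, and it assumes the endpoint accepts queries over HTTP GET and can return Turtle.

```shell
# Build the DESCRIBE query for a single resource URI.
describe_query() {
    printf 'DESCRIBE <%s>' "$1"
}

# Send the query to a SPARQL endpoint and ask for Turtle in return.
# $1 = endpoint URL, $2 = resource URI (both are placeholders for your own values).
describe_resource() {
    endpoint="$1"
    resource="$2"
    curl -s -G "$endpoint" \
        -H 'Accept: text/turtle' \
        --data-urlencode "query=$(describe_query "$resource")"
}

# Example call (hypothetical endpoint and resource):
# describe_resource http://localhost:3030/ds/sparql http://example.org/resource/1
```

The triples returned by a query like this are roughly what ends up as rows on the HTML page for that resource.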
Installation
Pubby can be downloaded from the download page, or the source can be accessed through the GitHub project. Usually it is advisable to use the latest version, but in this case I found an unresolved issue with showing RDF prefixes in the generated documents. There was a proposed solution on StackOverflow, with two answers that suggest adding prefixes to the config file and setting the prefixes as URIs, neither of which worked in my case. Therefore, I downgraded from version 0.3.3 to version 0.3.2. An important change between these two versions is that Pubby changed the configuration file format from N3 to Turtle. However, the two formats look fairly similar, so there is not much of a difference in terms of reading and writing the configuration.
To get Pubby, download the archive with curl (or wget) and unzip the contents (shown here for version 0.3.3):
# wget -O pubby.zip http://wifo5-03.informatik.uni-mannheim.de/pubby/download/pubby-0.3.3.zip
curl -o pubby.zip http://wifo5-03.informatik.uni-mannheim.de/pubby/download/pubby-0.3.3.zip
# use unzip or jar -xf
jar -xf pubby.zip
Serving with Jetty
Pubby can be served using Tomcat, Jetty, or any other servlet container. It does not come with a WAR file, but contains a WEB-INF folder which is ready to be served. If Pubby is to be served as the root, meaning it is directly accessible at the address where Jetty is running, such as localhost:8080, then the webapps folder must contain the Pubby contents in a folder named root. Otherwise, Jetty can be configured to run Pubby as a servlet at the desired URL.
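Placing Pubby as the root web application could look something like the following. This is a sketch under assumptions: the helper name install_pubby_root is made up, and the two paths are placeholders for wherever the Pubby folder containing WEB-INF was unpacked and wherever Jetty (the folder with start.jar and webapps) lives.

```shell
# Hypothetical helper: copy the unpacked Pubby web application into Jetty's
# webapps folder under the name "root", so that it is served at "/".
# $1 = folder that contains WEB-INF, $2 = folder that contains Jetty
install_pubby_root() {
    src="$1"
    jetty_home="$2"
    if [ ! -d "$src/WEB-INF" ]; then
        echo "no WEB-INF folder found in $src" >&2
        return 1
    fi
    if [ -e "$jetty_home/webapps/root" ]; then
        echo "$jetty_home/webapps/root already exists, not overwriting" >&2
        return 1
    fi
    mkdir -p "$jetty_home/webapps"
    # The target does not exist yet, so cp creates it as a copy of $src
    cp -R "$src" "$jetty_home/webapps/root"
}

# Example call (placeholder paths):
# install_pubby_root ./pubby-0.3.3/webapp /opt/jetty
```

The helper refuses to overwrite an existing root folder, so an older deployment has to be moved away by hand first.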
Jetty is available for download as a package, in which case it is installed as a service, or one can download the portable application and set it up manually. In the latter case, Jetty can be set up as a service using the file /etc/systemd/system/pubby.service:
[Unit]
Description=Pubby server using Jetty
After=network.target
[Service]
User=< user >
Group=< dev >
WorkingDirectory=< folder containing jetty >
ExecStart=/usr/bin/java -jar start.jar
[Install]
WantedBy=multi-user.target
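Once the unit file is in place, systemd has to be told about it before the service can be started. A minimal sketch, assuming the file was saved as pubby.service as above and that you have sudo rights; the function name is made up, and the commands are wrapped in a function so that nothing runs until it is called:

```shell
# Register, start, and inspect the Pubby service.
enable_pubby_service() {
    sudo systemctl daemon-reload                # pick up the new unit file
    sudo systemctl enable --now pubby.service   # start now and on every boot
    systemctl status pubby.service --no-pager   # check that Jetty came up
}

# Run it with:
# enable_pubby_service
```

If Jetty fails to start, journalctl -u pubby.service shows the Java output of the service.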
Serving using Nginx
Once Jetty is running the Pubby servlet, Nginx can be configured to serve it through a reverse proxy:
location /<DESIRED URL>/ {
proxy_set_header X-Real-IP $remote_addr ;
proxy_set_header X-Forwarded-For $remote_addr ;
proxy_set_header Host $host ;
proxy_set_header X-NginX-Proxy true;
rewrite ^/<DESIRED NAMESPACE SET IN PUBBY CONFIG>/?(.*) /$1 break;
proxy_pass http://<JETTY ADDRESS>/;
proxy_redirect off;
}
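The rewrite line is the part that is easiest to get wrong, so it helps to see what it does to a request path. The snippet below mimics the same regular expression with sed. It is only an illustration of the path mapping, not something Nginx runs; the function name and the namespace pubby are made up, and the namespace is assumed to contain no regex special characters.

```shell
# Mimic: rewrite ^/<NAMESPACE>/?(.*) /$1 break;
# $1 = namespace set in the Pubby config, $2 = incoming request path
strip_namespace() {
    ns="$1"
    path="$2"
    printf '%s\n' "$path" | sed -E "s#^/${ns}/?(.*)#/\\1#"
}

strip_namespace pubby /pubby/resource/1   # prints /resource/1
strip_namespace pubby /pubby              # prints /

# After editing the Nginx config, test and reload it:
# sudo nginx -t && sudo nginx -s reload
```

In other words, the namespace is stripped from the front of the path before the request is passed on to Jetty.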
Configuration
The Pubby config file is located in the WEB-INF folder and is named either config.n3 for N3 or config.ttl for Turtle, depending on the version of Pubby being used.
Prefixes
The prefixes declared at the start of the config file define the prefixes seen in the HTML page output, along with those used on the page.
Server configuration section
This is the section marked as an instance of conf:Configuration.
- projectName - this is the name of project displayed on the page
- projectHomepage - this is the URI for the project homepage
- usePrefixesFrom - this defines the location from which the prefixes are loaded; a value of <> indicates the config file itself, or it can contain a URI from which the prefixes will be loaded
- indexResource - this is the URI of the resource that will be displayed on the 'homepage' of Pubby; to put it another way, this is the resource shown on the landing page
Dataset configuration section
This is a section nested inside the server configuration section, defined as the value of the conf:dataset property.
- sparqlEndpoint - this is the SPARQL endpoint URL from which resources will be loaded
- datasetBase - this is the common URI prefix of the resources in the dataset, similar to a PREFIX declaration in SPARQL queries (or @prefix in Turtle)
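Putting the sections together, a small config file might look like the one written by the snippet below. Treat it as a sketch rather than a tested setup: the function name, the project name, and every URI are placeholders. The conf:webBase property is not covered in the lists above; it is included on the assumption that Pubby needs to know the URL it is served from.

```shell
# Write a minimal Pubby config to the path given as $1.
# All names and URIs in the file are placeholders, not values from a real setup.
write_pubby_config() {
    cat > "$1" <<'EOF'
# Prefixes
@prefix conf: <http://richard.cyganiak.de/2007/pubby/config.rdf#> .
@prefix ex:   <http://example.org/resource/> .

# Server configuration section
<> a conf:Configuration ;
    conf:projectName "Example Dataset" ;
    conf:projectHomepage <http://example.org/> ;
    # webBase: assumption, see the note above the snippet
    conf:webBase <http://localhost:8080/> ;
    conf:usePrefixesFrom <> ;
    conf:indexResource <http://example.org/resource/index> ;
    # Dataset configuration section
    conf:dataset [
        conf:sparqlEndpoint <http://localhost:3030/ds/sparql> ;
        conf:datasetBase <http://example.org/resource/>
    ] .
EOF
}

# Example call (placeholder path; use config.n3 for the older N3-based version):
# write_pubby_config /opt/jetty/webapps/root/WEB-INF/config.ttl
```

Only syntax common to N3 and Turtle is used here, so the same content should be readable by either version of Pubby.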