DPV as a SKOS vocabulary: Analysis
by Harshvardhan J. Pandit
is part of: Data Privacy Vocabulary (DPV)
is about: Data Privacy Vocabulary (DPV)
DPV DPVCG semantic-web
Table of Contents
- 1. SKOS Basics
- 2. DPV as SKOS vocabulary
- 3. Summary
- skos reference
- skos primer
1 SKOS Basics
- skos:Concept is the equivalent of both class and instance within SKOS.
- All concepts within DPV which are now provided as classes would become instances of skos:Concept.
- The skos:prefLabel provides a way to state this is the preferred method of referring to a concept
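- It expects prefLabel to be unique, i.e. no two concepts should have the same preferred label. SKOS allows at most one prefLabel per language for a concept, and recommends (rather than enforces) that concepts within the same scheme do not share one.
- In DPV, prefLabel is what we use to state the recommended way of referring to the concept, and also what we use in the IRI.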
- For other labels, there is skos:altLabel and skos:hiddenLabel
- skos:altLabel provides alternate ways of referring to the same concept.
- In DPV, altLabel provides a way to incorporate other ways of referring to the same concept, including labels arising from other standards or common uses. For example, instead of creating an entirely separate vocabulary for ISO, the equivalent concepts can be indicated using a customised property for ISO labels.
dpv:labelISO rdfs:subPropertyOf skos:altLabel .
dpv:DataController skos:prefLabel "Data Controller" ;
    dpv:labelISO "PII Controller" .
- skos:hiddenLabel is a way to have labels for convenience and other uses, but which usually would be hidden from general use of that concept.
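The expectation that preferred labels do not clash can be checked mechanically. Below is a minimal sketch in plain Python, assuming the labels are available as (concept, property, label) tuples. The data, the `ex:Controller` concept, and the helper name `duplicate_pref_labels` are made up for illustration and are not part of DPV or SKOS.

```python
from collections import defaultdict

# Hypothetical label data as (concept, property, label) tuples.
LABELS = [
    ("dpv:DataController", "skos:prefLabel", "Data Controller"),
    ("dpv:DataController", "dpv:labelISO", "PII Controller"),
    ("dpv:DataProcessor", "skos:prefLabel", "Data Processor"),
    ("ex:Controller", "skos:prefLabel", "Data Controller"),  # deliberate clash
]


def duplicate_pref_labels(triples):
    """Return {label: sorted concepts} for preferred labels used by more than one concept."""
    by_label = defaultdict(set)
    for concept, prop, label in triples:
        if prop == "skos:prefLabel":
            # Compare case-insensitively so "data controller" also clashes.
            by_label[label.lower()].add(concept)
    return {label: sorted(found) for label, found in by_label.items() if len(found) > 1}


clashes = duplicate_pref_labels(LABELS)
print(clashes)
```

Only skos:prefLabel is checked here; alternate labels such as the ISO one are free to repeat across concepts.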
1.3 Hierarchy Relationships
- In RDFS and OWL, hierarchy is specified using the unidirectional property rdfs:subClassOf.
- In SKOS, the properties skos:broader and skos:narrower are used to indicate the relationship between two concepts bidirectionally.
This is used as follows:
A skos:broader B means A has B as a broader concept of itself. For example:
cat skos:broader mammal
I always find it confusing to remember how broader and narrower are to be used in terms of their direction. As a mnemonic, I use has as the prefix of these properties to make sense of what they are supposed to mean. Therefore, saying A <has>broader B helps understand A has B as its broader concept.
- In SKOS, skos:broaderTransitive and skos:narrowerTransitive are the properties used for transitive inferences, and are super-properties of skos:broader and skos:narrower respectively.
In DPV, the hierarchies are expected to be transitive, which necessitates the use of SKOS transitive properties. Otherwise we lose inferences. See example:
# given this data (example concepts)
dpv:PersonalData skos:narrower dpv:SpecialCategoryPersonalData .
dpv:SpecialCategoryPersonalData skos:narrower dpv:GeneticData .

# this inference would be wrong because skos:narrower and skos:broader are not transitive
dpv:GeneticData skos:broader dpv:PersonalData .

# however, using transitive versions like this would work as expected
dpv:PersonalData skos:narrowerTransitive dpv:SpecialCategoryPersonalData .
dpv:SpecialCategoryPersonalData skos:narrowerTransitive dpv:GeneticData .

# this inference is now possible and correct
# (skos:broader itself is still not inferred, as it is only a sub-property of skos:broaderTransitive)
dpv:GeneticData skos:broaderTransitive dpv:PersonalData .
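To see what a reasoner adds here, the transitive properties can be sketched as a closure over the asserted links. This is a rough illustration in plain Python, not a SKOS implementation: the helper `transitive_closure` is a hypothetical name, and the pairs simply mirror the example concepts above.

```python
def transitive_closure(pairs):
    """Return every (a, b) reachable by following one or more links in `pairs`."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


# Asserted skos:narrower links (direct links only).
narrower = {
    ("dpv:PersonalData", "dpv:SpecialCategoryPersonalData"),
    ("dpv:SpecialCategoryPersonalData", "dpv:GeneticData"),
}

# Every skos:narrower link is also a skos:narrowerTransitive link,
# and the transitive property additionally chains across links.
narrower_transitive = transitive_closure(narrower)

# skos:broaderTransitive is the inverse of skos:narrowerTransitive.
broader_transitive = {(b, a) for a, b in narrower_transitive}

print(("dpv:GeneticData", "dpv:PersonalData") in broader_transitive)
```

The direct `narrower` set never gains the PersonalData to GeneticData link; only the transitive sets do, which is the distinction the example relies on.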
- The property skos:related is used to indicate another concept is related without establishing any specific relationship between the two.
- SKOS thus provides a way to indicate a hierarchy using broader and narrower, and related to associate another concept outside that hierarchy. Note that skos:related is symmetric rather than unidirectional.
- SKOS specifies that concepts related via a hierarchy should not also be associated through the related property. This means related is a strictly specified relationship between concepts not present in the same hierarchy.
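The rule that related should not connect concepts already linked through a hierarchy can be turned into a simple consistency check. The sketch below assumes a dictionary of direct broader links and a list of related pairs; the `ex:` concepts and the helper names are invented for illustration.

```python
def ancestors(concept, broader):
    """Collect every concept reachable from `concept` by following broader links."""
    seen = set()
    stack = [concept]
    while stack:
        current = stack.pop()
        for parent in broader.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def related_conflicts(related, broader):
    """Return related pairs whose members are also connected through the hierarchy."""
    return [
        (a, b)
        for a, b in related
        if b in ancestors(a, broader) or a in ancestors(b, broader)
    ]


# Hypothetical data: direct skos:broader links and skos:related pairs.
BROADER = {"ex:Cat": ["ex:Mammal"], "ex:Mammal": ["ex:Animal"]}
RELATED = [("ex:Cat", "ex:Animal"), ("ex:Cat", "ex:CatFood")]

conflicts = related_conflicts(RELATED, BROADER)
print(conflicts)
```

Here the first pair is flagged because ex:Animal is already a (transitive) broader concept of ex:Cat, while ex:CatFood sits outside the hierarchy and is a legitimate use of related.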
1.4 Definitions, and notes for Examples, Changes, and Scope
- SKOS has the general property skos:note to associate a note with a concept. A note can be anything: a literal, a string, another node.
- SKOS provides subproperty skos:definition to indicate the definition of a concept.
- In DPV, we currently use dct:description to indicate definitions. Instead, the skos:definition property is better suited for explicitly indicating a definition.
- SKOS provides subproperty skos:scopeNote to provide information about the scope of a concept.
- In DPV, the scopeNote property can be used to indicate how the concept is to be interpreted, whether there are any specific considerations regarding the use or interpretation of that concept, or additional information not provided within a definition. This information could be what we usually put in notes alongside a concept in the definition.
- skos:historyNote, skos:editorialNote, and skos:changeNote are subproperties used to describe the historical, editorial, and provenance related information for a concept.
- skos:example can be used to provide an example of a concept
- In DPV, skos:example can be used in two ways: first, to provide textual examples of a concept as it occurs in the real world; or second, to provide examples of how that concept is used in code and use-cases. The first is better for documentation, while the second is better for adoption and use. However, other vocabularies (e.g. dct, vann) exist that could be used to indicate code examples. So this property can be used to help humans understand what the concept is about through example(s).
1.5 Structuring concept hierarchies using SKOS
1.5.1 Replicating OWL/RDFS hierarchy with SKOS broader/narrower
- Currently, the hierarchy in DPV is expressed using the rdfs:subClassOf property in the following manner:
A -subClassOf-> B
- An intuitive conversion of this would be like this (A has B as its broader concept):
A -broader-> B and B -narrower-> A
- While this will work in that it will provide a taxonomy of concepts structured in a hierarchy, it is not the best method nor the only one SKOS provides.
- If this is used, there is no good reason to migrate DPV from OWL or RDFS to a similar structure within SKOS.
- For one, there is no way to indicate a relationship between a concept and a top-concept. For example, EmailAddress is a subclass of PersonalData with some 4 or 5 levels of abstraction between them. To indicate EmailAddress is a category of personal data, one would have to either travel up the chain of subclass relationships or use a reasoner to add statements that directly state EmailAddress is a subclass of PersonalData. Neither is intuitive for someone simply using the vocabulary.
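The direct conversion and its drawback can be sketched as follows, reading A skos:broader B as "A has B as its broader concept". This is an illustrative sketch only: the intermediate concepts `ex:Email` and `ex:Contact` are hypothetical stand-ins for the levels between EmailAddress and PersonalData, and the helper names are my own.

```python
# Hypothetical subclass chain as (child, parent) pairs.
SUBCLASS_OF = [
    ("ex:EmailAddress", "ex:Email"),
    ("ex:Email", "ex:Contact"),
    ("ex:Contact", "ex:PersonalData"),
]


def to_skos(subclass_pairs):
    """Translate each subClassOf pair into a broader triple and its narrower inverse."""
    triples = []
    for child, parent in subclass_pairs:
        triples.append((child, "skos:broader", parent))
        triples.append((parent, "skos:narrower", child))
    return triples


def path_to_top(concept, triples):
    """Walk skos:broader links upwards; assumes at most one parent per concept and no cycles."""
    parent_of = {s: o for s, p, o in triples if p == "skos:broader"}
    path = [concept]
    while path[-1] in parent_of:
        path.append(parent_of[path[-1]])
    return path


skos_triples = to_skos(SUBCLASS_OF)
path = path_to_top("ex:EmailAddress", skos_triples)
print(path)
```

Finding the top concept still requires walking every intermediate level, which is exactly the chain traversal the converted hierarchy does not remove.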
1.5.2 Concept Schemes in SKOS
- SKOS provides the skos:ConceptScheme class to group related concepts together in a concept scheme or a thesaurus.
- A ConceptScheme can have annotations.
- Concepts can be indicated to be a part of a scheme using skos:inScheme.
- To indicate hierarchies and the top-concept within that hierarchy, the property skos:hasTopConcept is used
- The same concept can be part of different concept schemes
- The entirety of DPV can be a skos:ConceptScheme with each of its core concepts and modules providing the top concept. This results in a single collection of concepts with multiple hierarchies defined by the top concepts.
- Another alternative is to define each module or concept collection as a skos:ConceptScheme and to define the concepts within it as top concepts. However, there is no way to collect concept schemes within a package to create DPV.
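The first option, one scheme with several top concepts, can be sketched by grouping the members of a scheme under the top concept they lead up to. This is a plain Python illustration under assumptions: the scheme identifier `ex:DPVScheme`, the choice of concepts, and the use of skos:broader to link them are all made up for the example.

```python
SCHEME = "ex:DPVScheme"  # hypothetical identifier for the single concept scheme

TRIPLES = [
    (SCHEME, "skos:hasTopConcept", "dpv:PersonalData"),
    (SCHEME, "skos:hasTopConcept", "dpv:Purpose"),
    ("dpv:PersonalData", "skos:inScheme", SCHEME),
    ("dpv:Purpose", "skos:inScheme", SCHEME),
    ("dpv:Email", "skos:inScheme", SCHEME),
    ("dpv:Email", "skos:broader", "dpv:PersonalData"),
    ("dpv:Marketing", "skos:inScheme", SCHEME),
    ("dpv:Marketing", "skos:broader", "dpv:Purpose"),
]


def hierarchies(triples, scheme):
    """Group the scheme's members under the top concept their broader chain ends at."""
    tops = [o for s, p, o in triples if s == scheme and p == "skos:hasTopConcept"]
    broader = {s: o for s, p, o in triples if p == "skos:broader"}
    members = [s for s, p, o in triples if p == "skos:inScheme" and o == scheme]
    result = {top: [] for top in tops}
    for concept in members:
        current = concept
        while current in broader:  # assumes one parent per concept and no cycles
            current = broader[current]
        if current in result and current != concept:
            result[current].append(concept)
    return result


grouped = hierarchies(TRIPLES, SCHEME)
print(grouped)
```

A single scheme therefore still yields separate hierarchies, one per top concept, without needing one scheme per module.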
1.5.3 Collections in SKOS
- skos:Collection is a way to group related concepts together under an arbitrary label which is not itself a concept. The example given in the primer refers to milk and types of milk (cow, goat, buffalo) and a collection for milk by source animal that includes only the concepts for cow, goat, and buffalo milk.
- Collections specify inclusion of a concept or another collection using the property skos:member
- The primer describes where collections may be necessary, and that the same pattern could be replicated by declaring the collection label as a Concept and using broader and narrower properties to construct a hierarchy.
- It concludes with the decision being based on whether the collection should be a concept or not. If yes, then ConceptScheme may be more suitable. If not, then Collection would be more suitable.
- For DPV, using skos:Collection seems to incur additional complexities without any apparent benefits. So far, we do not have any specific hierarchy or collection that cannot be represented using ConceptScheme.
2 DPV as SKOS vocabulary
- We have Concepts that have a hierarchy; this can be specified using skos:broader and skos:narrower.
- We have properties that relate concepts, e.g. dpv:hasPersonalData, whose range we want to have as an instance of dpv:PersonalData.
- It should be possible to use multiple concepts as types, e.g. to declare something is an instance of two purposes as:
ex:MyPurpose a dpv:Marketing, dpv:Personalisation .
which is an issue as SKOS concepts cannot be 'combined' in a similar manner to what we assume RDFS/OWL2 classes can be.
- If possible, we would like to keep meta-modelling and OWL-DL compatibility. This would mean having the T-box and A-box be disjoint sets. While not affecting the SKOS usage in any major manner, this has implications for the use of DPV in OWL2, and more specifically for reasoner-oriented tasks and use-cases.
- We want a way to package all concepts and hierarchies within DPV. While currently we don't explicitly declare this in the RDFS/OWL2 vocabulary, if there is a way to express this formally, we could do it.
2.2 Proposal for providing DPV using both SKOS and RDFS/OWL
- The top-level classes are declared as an instance of both owl:Class and skos:Concept. This permits creating instances of that class that are compatible with both OWL (as an instance) and SKOS (as members of a concept scheme). This also keeps the T-box and A-box separate by not mixing them together.
In the example below, we have PersonalData as the top-level concept which is also declared as a class. This permits the following:
<dpv> a skos:ConceptScheme ;
    skos:hasTopConcept dpv:PersonalData .

dpv:PersonalData a owl:Class, skos:Concept ;
    dct:title "Personal Data"@en ;
    skos:inScheme <dpv> .

dpv:Email a dpv:PersonalData, skos:Concept ;
    skos:prefLabel "Email"@en ;
    skos:narrower dpv:EmailAddress .

dpv:EmailAddress a dpv:PersonalData, skos:Concept ;
    skos:prefLabel "Email Address"@en ;
    skos:broader dpv:Email .

dpv:hasPersonalData a owl:ObjectProperty ;
    rdfs:range dpv:PersonalData .
- This has the following implications:
- The property hasPersonalData can be defined with range PersonalData and can correctly refer to both Email and EmailAddress.
- This use of PersonalData is okay, because we never expect the following:
<Something> hasPersonalData PersonalData
- Email and EmailAddress are related using the SKOS hierarchy instead of OWL.
- The relationship between Email and EmailAddress cannot be resolved using the subclass mechanism anymore. For this a separate OWL equivalence ontology would have to be created which specifies subClassOf relationships instead of broader. As the semantic implications of this OWL iteration are different from those of DPV, it would be better to provide it using a separate IRI.
Note that mixing SKOS and OWL for both classes and instances would turn this into OWL-Full and cause issues when using a reasoner, like this:
dpv:PersonalData a owl:Class, skos:ConceptScheme .

# issue1: instances of concept scheme are incorrect
# issue2: a class as instance of another class
dpv:Email a owl:Class, dpv:PersonalData .

# issue3: property assertions are complex
# issue4: skos:Concept and skos:ConceptScheme as disjoint
dpv:Email a skos:Concept ;
    skos:inScheme dpv:PersonalData .
ex:MyEmail a skos:Concept ;
    skos:inScheme dpv:PersonalData ;
    skos:broader dpv:Email .

# the range of dpv:hasPersonalData cannot be stated
# unless we use [ skos:inScheme dpv:PersonalData ] as path
ex:PDH dpv:hasPersonalData ex:Email .
ex:PDH dpv:hasPersonalData ex:MyEmail .
To create further instances of a concept provided in DPV, such as EmailAddress and a specific email address, SKOS could still be used.
dpv:EmailAddress a dpv:PersonalData, skos:Concept .
ex:MyEmail a dpv:PersonalData, skos:Concept ;
    skos:broader dpv:EmailAddress ;
    skos:prefLabel "[email protected]"^^xsd:string .

# okay to use property like this
ex:PDH dpv:hasPersonalData ex:EmailAddress .
ex:PDH dpv:hasPersonalData ex:MyEmail .
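The effect of typing every taxonomy member as dpv:PersonalData can be illustrated with a small closed-world check on property usage. Note the assumption: in RDFS/OWL, rdfs:range drives inference rather than validation, so this sketch is closer to what a validation step might do than to what a reasoner does. The helper name and the data layout are hypothetical.

```python
# Types of each resource, mirroring the proposal: taxonomy members are
# instances of the top-level class, while the class itself is an owl:Class.
TYPES = {
    "dpv:PersonalData": {"owl:Class", "skos:Concept"},
    "dpv:EmailAddress": {"dpv:PersonalData", "skos:Concept"},
    "ex:MyEmail": {"dpv:PersonalData", "skos:Concept"},
}

# Declared ranges of properties.
RANGES = {"dpv:hasPersonalData": "dpv:PersonalData"}

ASSERTIONS = [
    ("ex:PDH", "dpv:hasPersonalData", "dpv:EmailAddress"),
    ("ex:PDH", "dpv:hasPersonalData", "ex:MyEmail"),
    ("ex:PDH", "dpv:hasPersonalData", "dpv:PersonalData"),  # the case we never expect
]


def range_violations(assertions, types, ranges):
    """Return assertions whose object is not typed as the property's declared range."""
    return [
        (s, p, o)
        for s, p, o in assertions
        if p in ranges and ranges[p] not in types.get(o, set())
    ]


violations = range_violations(ASSERTIONS, TYPES, RANGES)
print(violations)
```

Both the DPV concept and the specific email pass because each is an instance of dpv:PersonalData; only the top-level class used as an object is flagged.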
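3 Summary
- DPV as an ontology also becomes a skos:ConceptScheme.
- Core and other top-level classes become its top concepts.
- Core and other top-level classes are instances of both owl:Class and skos:Concept.
- Taxonomies are created using instances of these top-level classes and skos:Concept, related through skos:broader and skos:narrower.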
- Properties are declared with domain or range as the appropriate top-level
class, for example
dpv:hasPersonalData rdfs:range dpv:PersonalData
- What used to be instances of specific concepts are now represented as instances of skos:Concept and whatever top-level concept they represent. For example:
ex:MyEmail a dpv:PersonalData, skos:Concept .
To declare what is their closest concept within the DPV taxonomy, SKOS properties are used thus:
ex:MyEmail skos:broader dpv:EmailAddress, dpv:Identifier .
- T-box and A-box are kept strictly separate, thus making this OWL-DL compatible. However, SPECIAL and TRAPEZE's reasoners won't work any longer because there are no sub-class relationships. To remedy this, a separate serialisation using OWL and using a separate IRI is provided.
- For other general uses, mixing SKOS and OWL like this offers more flexibility, whether for requiring property domains and ranges, or for further extending concepts and creating instances at arbitrary levels of abstraction.
- SKOS provides a lot of useful organisational tools, like ConceptScheme, which can be further used to group concepts without declaring hierarchies. For LegalEntity, concept schemes can be created to separate what is essentially a legal role, such as a controller, from what is a type of organisation, such as an SME. Through this, the actual legal entity taxonomy would be clean and not include these categorisations, since ConceptScheme is disjoint from Concept within SKOS.
Example RDF for dpv-skos consistency checking
I created the following minimal set of information to test whether such usage of SKOS and OWL is okay or if a reasoner might throw errors and inconsistencies for using it.
Protégé with the FaCT++ and Pellet reasoners produces no errors or inconsistencies. The OWL profile checker indicates issues for the OWL2 QL and EL profiles based on SKOS's use of transitive properties and property domain/range assertions. Other than that, this use has no issues for the OWL2 DL, RL, and Full profiles.