Data Privacy Vocabulary (DPV) — Version 2.0

Wed Nov 27 2024 Conference
International Semantic Web Conference (ISWC)
✍ Harshvardhan J. Pandit* , Beatriz Esteves , Georg Philip Krog , Paul Ryan , Delaram Golpayegani , Julian Flake
Description: This article describes the version 2 iteration of the DPV in terms of its contents, methodology, current adoptions and uses, and future potential. It also describes the relevance and role of DPV in acting as a common vocabulary to support various regulatory (e.g., EU's DGA and AI Act) and community initiatives (e.g., Solid) emerging across the globe.


Abstract

The Data Privacy Vocabulary (DPV), developed by the W3C Data Privacy Vocabularies and Controls Community Group (DPVCG), enables the creation of machine-readable, interoperable, and standards-based representations for describing the processing of personal data. The group has also published extensions to the DPV to describe specific applications to support legislative requirements such as the EU's GDPR. The DPV fills a crucial niche in the state of the art by providing a vocabulary that can be embedded and used alongside other existing standards such as W3C ODRL, and which can be customised and extended for adapting to specifics of use-cases or domains. This article describes the version 2 iteration of the DPV in terms of its contents, methodology, current adoptions and uses, and future potential. It also describes the relevance and role of DPV in acting as a common vocabulary to support various regulatory (e.g., EU's DGA and AI Act) and community initiatives (e.g., Solid) emerging across the globe.

Introduction

The modern technological landscape consists of ubiquitous digital devices and services that generate vast amounts of data, including sensitive information that raises privacy concerns and must be protected from misuse and cybersecurity threats. Regulations across the globe have been developed or updated to meet this challenge, most notably the European Union’s (EU) General Data Protection Regulation (GDPR) [1] in 2016, which requires specific activities to be carried out based on defined norms and requirements, and requires governance processes to be documented for compliance.

‘Regulatory Technology’ (RegTech) has also evolved to provide information management capabilities and automation of tasks to support evolving regulations. However, a key barrier to the effective use of RegTech tools is their proprietary nature, non-interoperable information, and lack of standards. As a result, the RegTech landscape for privacy and data protection is fragmented and siloed, and lacks any meaningful ‘metadata’ and ‘semantics’ through which information can be reused, e.g., for processes not envisioned by a tool provider.

Based on this background, the SPECIAL H2020 project established the W3C Data Privacy Vocabularies and Controls Community Group¹ (DPVCG) in 2018, which developed the Data Privacy Vocabulary² (DPV) [2] as a machine-readable interoperable vocabulary for the exchange of legally relevant ‘metadata’. Since then, DPV has remained a state of the art resource that is iteratively updated to match the evolving landscape of regulations and compliance requirements. It has seen high adoption in academic, industrial, and mixed settings, and has been referenced in standards.

The DPV 1.0 focused on providing a vocabulary for describing personal data activities based on the GDPR. Since then, the world has evolved at a rapid pace with new innovations and regulations – in particular those related to ‘increasing sharing of data’ such as the EU’s Data Governance Act (DGA) [3], as well as for ‘regulating AI’, such as the EU’s Artificial Intelligence Act (AI Act) [4]. To reflect these developments as well as others in domains such as cybersecurity, the DPVCG has updated the DPV and made it capable of representing a wide variety of legally relevant information for ‘AI and digital regulations’. In this article, we present the design and development of DPV – version 2.

The rest of the article is structured as follows: Section 2 describes the requirements which guided the development of DPV, Section 3 compares DPV with related work, Section 4 describes the known adoptions, Section 5 provides an overview of DPV, Section 6 describes the methodology and design principles used, and Section 7 concludes the article with a discussion on future work.

Requirements for a Legal Vocabulary

Information and Knowledge Modelling

DPV 1.0 was concerned with modelling the purposes, processing operations, legal bases, personal data categories, technical and organisational measures, and the roles of entities based on the GDPR, and with filling specific gaps [2]: (1) there were no ontologies to describe and exchange information about activities involving personal data; (2) there were no agreed upon vocabularies or taxonomies for representing practical applications and uses of personal data, e.g., different purposes such as Service Provision, or Legal Bases such as Consent; and (3) there were no vocabularies that align the terminology and requirements of regulations for developing and using machine-readable information.

Based on these, DPV 1.0 was developed to provide:

a formal ontology that defines concepts and relationships;
a (hierarchical) taxonomy that provides instances or specialisations of the ontological concepts to reflect practical uses and applications; and
a legally relevant modelling of information to support documentation needs for compliance and exchange of information between stakeholders.

For DPV 2.0, the evolving landscape of law and technologies and the initial adoption of DPV led the DPVCG to reconsider the scope of DPV. This resulted in the following additional requirements:

supporting multiple jurisdictions and laws (e.g., EU — GDPR, US — CCPA);
supporting risk management (e.g., based on ISO 31000 series);
representing use of services and technologies, e.g., cloud services;
describing how AI (as a technology) is used;
providing guidance on use of DPV to meet specific legal requirements; and
providing documentation, examples, and guidance for increasing adoption.

Legal and Semantic Extensibility

Due to the nature of legal frameworks, concepts in DPV must constantly be assessed and possibly modified to support changes in laws and case law. Unlike conventional ontologies, the meaning and semantics of concepts in DPV rely on legal norms and interpretations which differ across jurisdictions and change with laws and rulings. The design of DPV therefore requires the possibility for it to be used in a ‘jurisdiction-independent’ manner while also having a mechanism to explicitly assert a specific law, or how a concept applies under it. For example, ‘consent’ is a generic concept, whereas ‘consent according to GDPR’ is a jurisdiction and regulation-specific concept. Such legal customisability is an important part of DPV’s interoperability as it enables different stakeholders to express common requirements (e.g., consent) which can be explicitly asserted or interpreted in relevant contexts (e.g., regulation based on location).
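The distinction between a generic and a regulation-specific concept can be sketched in Turtle as follows. This is a minimal illustration rather than an excerpt from the DPV documentation: the DPV terms are those used in the consent record example later in this article, while the ex: namespace and the two process names are hypothetical.

```turtle
@prefix dpv: <https://w3id.org/dpv#> .
@prefix loc: <https://w3id.org/dpv/loc#> .
@prefix eu-gdpr: <https://w3id.org/dpv/legal/eu/gdpr#> .
@prefix ex: <https://example.com/> .   # hypothetical adopter namespace

# Jurisdiction-independent description: only the generic legal basis is stated.
ex:NewsletterGeneric a dpv:Process ;
    dpv:hasPurpose dpv:Marketing ;
    dpv:hasLegalBasis dpv:Consent .

# Jurisdiction-specific description of the same activity: the GDPR legal
# basis and the applicable jurisdiction are asserted explicitly.
ex:NewsletterEU a dpv:Process ;
    dpv:hasPurpose dpv:Marketing ;
    dpv:hasLegalBasis eu-gdpr:A6-1-a ;
    dpv:hasJurisdiction loc:EU .
```

A recipient that only understands the generic vocabulary can still interpret the first description, while a recipient that needs to assess GDPR obligations can rely on the second.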

Semantic interoperability is a cornerstone of DPV given its emphasis on representation and exchange of information between stakeholders. Compared to conventional ontologies, concepts in DPV must be extensible to support practical requirements which do not align neatly with design considerations of only providing classes and properties. For example, a concept ‘Email’ that represents personal data associated with emails cannot be simply modelled as a class as it may be needed as an instance to state emails are collected. At the same time, it may also need to be extended to specifically refer to aspects such as Email Address, Email Contents, or Attachments which also can be used as instances or be further specialised. This fits the SKOS modelling style better and also matches how information is used by people outside the semantic web community (e.g., to fill in forms), but does not fit well with semantic reasoning processes which largely use OWL (with ABox and TBox assertions). DPV thus needs to satisfy both styles: using concepts as extensible instances and supporting OWL reasoning.
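The dual use of the ‘Email’ concept described above can be sketched in Turtle. This is an illustrative sketch under assumptions: the pd: namespace IRI is assumed to follow the same pattern as the dpv: and loc: prefixes used elsewhere in this article, and ex:ContactForm and ex:EmailAttachments are hypothetical names introduced only for this example.

```turtle
@prefix dpv: <https://w3id.org/dpv#> .
@prefix pd: <https://w3id.org/dpv/pd#> .   # assumed namespace for the PD extension
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix ex: <https://example.com/> .       # hypothetical adopter namespace

# Use as an instance: state that emails are collected.
ex:ContactForm a dpv:Process ;
    dpv:hasProcessing dpv:Collect ;
    dpv:hasPersonalData pd:Email .

# Use as an extensible concept: a hypothetical adopter-defined
# specialisation placed under the same concept in SKOS style.
ex:EmailAttachments a skos:Concept, dpv:PersonalData ;
    skos:prefLabel "Email Attachments"@en ;
    skos:broader pd:Email .
```

In a class-only OWL design, the first statement would require creating an individual of an Email class before anything could be said about it, which is the friction the paragraph above refers to.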

Stakeholder Interoperability

It is not possible for DPV to provide all relevant concepts in its ontology/taxonomy given the almost infinite potential concepts (e.g., for personal data). Instead, DPV’s community opted to provide the ‘most important’ and ‘most commonly required’ concepts, while requiring that DPV support stakeholders in extending it internally to reflect the peculiarities of their own use-cases. For example, DPV can provide the purpose ‘Marketing’ which a company can extend to describe ‘Summer Sale Offers’.

Following from this, DPV’s mission to provide interoperability across stakeholders relies on the ‘common’ concepts present in the DPV taxonomy as the basis for establishing shared understanding, even if each stakeholder ends up creating their own unique or individual ontological representation. In the above example, Summer Sale Offers may be incomprehensible to another stakeholder, but the use of DPV enables both entities to correctly interpret that this is a form of Marketing, and thus to identify the requirements and obligations for legal compliance associated with this concept.
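The ‘Summer Sale Offers’ example can be sketched as a stakeholder-side extension. The ex: namespace and the names ex:SummerSaleOffers and ex:SummerCampaign are hypothetical; only dpv:Marketing, dpv:Purpose, dpv:Process, and dpv:hasPurpose are taken from DPV, and the use of skos:broader reflects the SKOS modelling style discussed earlier rather than a prescribed procedure.

```turtle
@prefix dpv: <https://w3id.org/dpv#> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix ex: <https://example.com/> .   # hypothetical company namespace

# Company-specific purpose, anchored to the common DPV concept.
ex:SummerSaleOffers a skos:Concept, dpv:Purpose ;
    skos:prefLabel "Summer Sale Offers"@en ;
    skos:broader dpv:Marketing .

# The company then uses its own purpose when describing an activity.
ex:SummerCampaign a dpv:Process ;
    dpv:hasPurpose ex:SummerSaleOffers .
```

Another stakeholder that has never seen ex:SummerSaleOffers can follow the skos:broader link to dpv:Marketing and interpret the activity at that level.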

Comparison with Related Work

DPV fills a unique and necessary niche within the state of the art by providing concepts to represent legally relevant information related to the processing of personal data and use of technologies. Here we only describe related works that have a comparable adoption, or are standards, or are outputs of larger projects.

The Open Digital Rights Language (ODRL) [5] is a W3C standard for modelling policies and agreements which can dictate the permissions, prohibitions, and duties associated with use of data or resources. The ODRL vocabulary also provides specific concepts such as ‘obtain consent’ to support commonly used agreements. While there is some overlap between DPV and ODRL, they are complementary in their objectives and uses. ODRL focuses solely on providing a language for representation of policies and does not focus on specific legal requirements or jurisdictions. DPV’s concepts can thus provide the necessary legally and jurisdictionally relevant information within ODRL policies. This has been explored and demonstrated by existing work [6]. The DPVCG intends to establish close collaboration with the ODRL CG through its common members by aligning concepts between DPV and ODRL, and providing guidance for using DPV as an ODRL profile. DPV has also been mentioned by the International Data Spaces Association (IDSA) as a vocabulary of interest to support the implementation of GDPR and regulations in their policies based on ODRL [7].

LegalRuleML³ is an OASIS standard providing a ‘rule interchange language’ that enables modelling and reasoning tasks based on legal arguments. Similar to ODRL, it focuses on providing a language for expression of ‘rules’ and does not provide any taxonomies or model regulations. As with ODRL, DPV is complementary to LegalRuleML as a vocabulary that can be used to support representation of legal information. There is existing work that has explored the use of LegalRuleML for modelling requirements from the text of GDPR [8], though it has not been maintained or extended to support practical requirements related to implementing GDPR.

Gist⁴ is a ‘minimalist upper ontology for the enterprise’ which provides business concepts with a focus on minimising ambiguity. The Financial Industry Business Ontology⁵ (FIBO) models concepts relevant in financial business applications, such as contracts and financial transactions. Both Gist and FIBO represent ‘industrial ontologies’ and have been developed over a significant period of time with the involvement of corporate stakeholders. While neither supports specific legal requirements or jurisdictions, Gist provides concepts relevant for modelling details about an organisation and FIBO provides modelling of contracts and contract-related processes, which DPV does not contain. Work is underway in DPVCG to study the FIBO contract concepts and identify how DPV can support contracts related to personal data processing.

Adoption of DPV 1.0

Analysis of Citations

In this section, the adoption of DPVCG’s outputs is evaluated through a citation analysis performed over the DPV 1.0 publication [2]. In this context, 81 publications that cite DPV were found through the Google Scholar service. The gathered results were reviewed and included in this analysis if deemed pertinent. Duplicated publications, publications without an open-access version, and publications in languages other than English were excluded, resulting in 76 publications to be reviewed. Table 1 presents the results of this evaluation. The publications were evaluated in relation to their use of DPV: publications that reference DPV as a state of the art resource are marked in the Mention column, those that use DPV towards an application or use case in the Use column, and those that extend DPV in the Ext. column. Publications that contributed their extensions back to DPV (Contrib. column) are also marked in Table 1, as is the domain or sector to which the work is applied (Domain column). The Effort column denotes the estimated amount of work needed to update the implementation to DPV 2.0, where ++ denotes more work as implementations use a pre-1.0 version and will need to update the IRIs, + denotes minor efforts to check changed concepts in DPV 2.0, and - denotes no changes. DPV 2.0 is largely compatible with DPV 1.0, which means most implementations using DPV 1.0 can update with minimal changes.

Table 1: Citation analysis of academic publications that reference DPV 1.0 [2].
Work	Year	Mention	Use	Ext.	Contrib.	Domain	Effort
[9]	2020	X				Health	N/A
[10]	2020	X				Media	N/A
[11]–[13]	2020	X					N/A
[14]–[17]	2020		X				++
[18]	2020		X			Health	++
[19]–[21]	2020			X	X		++
[22]–[24]	2021	X				Health	N/A
[25]–[29]	2021	X					N/A
[6], [30]–[33]	2021		X				++
[34]	2021		X			Smart products	+
[35]	2021			X	X		+
[36]–[42]	2022	X					N/A
[43]	2022	X			X		N/A
[44]–[47]	2022			X	X		+
[48]	2022	X				Health	N/A
[49]–[53]	2022		X				+
[54]	2022		X			IoT	+
[55]	2022		X			Health	+
[56]–[60]	2023	X					N/A
[61]–[67]	2023		X				+
[68]–[71]	2023		X			Health	+
[72]	2023		X			Smart cities	+
[73], [74]	2023		X			Media	+
[75], [76]	2023			X	X		+
[77]	2023			X		Health	+
[78]	2024		X			Health	-
[79]	2024		X				+
[80]–[82]	2024		X	X	WIP	EU AI Act	++
[83]	2024		X				-

In this context, DPV’s specifications were compared against other state of the art vocabularies in the data protection domain regarding their ability to represent information related to GDPR rights and duties [43], their machine-readability, maintenance, accessibility, GDPR support and existence of compliance tools [60], and their capacity to aid with data interoperability and adhere to the FAIR principles [13]. In all mentioned surveys, DPV obtained a higher score compared with other existing solutions. When it comes to extensions of DPV, most were contributed back to DPV to be integrated into DPVCG’s outputs. Concerning work on GDPR requirements, there were proposed extensions focusing on consent [19], [20], in particular related to the processing of electronic health record data [77], as well as on building semantic models to represent records of processing activities [21], [32], [44], [45], data protection impact assessments [46], data breach reports [76], and international data transfer notices [35]. Moreover, extensions focusing on GDPR’s data subject rights and exemptions to these rights [47] and on DGA requirements [65], [75] were also contributed back to DPVCG’s outputs.

In terms of applications to specific use cases, there is a body of work focusing on providing tools for auditing and GDPR compliance evaluation [17], [30], [67], on data minimisation [34], as well as on the documentation and annotation of privacy policies [15], [16], [33], [73] and privacy preferences [54]. Moreover, ML models were trained with DPV’s taxonomies to identify personal data processing activities in code repositories [64], [79] and textual datasets [52], [61]. DPV’s outputs were also used to model access and usage control policies [14], [18], [31], [74], and in particular applied to Solid [6], [49]–[51], [63], [66], [70] and health data-sharing use cases [68], [78], as well as to describe consent records and contracts for sensor data [53], [72]. In the context of data spaces, DPV was used to provide descriptions of health data handling activities [55] and to create user-centric privacy interfaces [62], [69], [71].

Projects and Industrial use of DPV

DPV and DPVCG are outputs of the SPECIAL H2020 project and were actively developed and used within the project. DPV was also used in the TRAPEZE, MOSAICrOWN, smashHit, FAIRVASC, and PROTECT ITN projects funded under the EU’s H2020 programme, which involved both academic and commercial partners. SPECIAL and TRAPEZE also included a Data Protection Authority that provided legal expertise in the implementation and design of ontologies. TRAPEZE actively contributed back to the DPV and was instrumental in identifying the design structure where both RDFS+SKOS and OWL serialisations were developed, and supported development of a multilingual documentation framework to be implemented in future versions.

In addition to the use of DPV in industrial contexts in the above projects, companies that actively utilise DPV include Signatu⁶, which develops legal compliance solutions; JLINC⁷, which develops digital data agreements; and Inrupt⁸, which develops Solid specifications and implementations. The DPVCG has received contributions from these companies, with Signatu being an active contributor by providing legal expertise and requirements for industrial applications.

Use of DPV in Standards

ISO/IEC TS 27560:2023 is a Technical Specification (TS) describing a ‘consent record information structure’, which defines the specific information to be maintained in consent records. DPV’s consent modelling played a significant part in the development of this standard based on sharing the knowledge of legal and semantic requirements regarding consent records and receipts. The annex of 27560:2023 provides examples of consent records and receipts using JSON-LD, where DPV is explicitly referenced and its concepts are used to define the schema (e.g., hasPurpose) and instances (e.g., Service Provision).

The IEEE P7012 Working Group is developing a specification to define how “personal privacy terms are proffered and how they can be read and agreed to by machines”. It explicitly references DPV as the vocabulary to describe activities regarding processing of personal data in an interoperable manner.

Data Privacy Vocabulary 2.0

[Figure: Overview of concepts in DPV and their extensions]

The motivation of the Data Privacy Vocabulary (DPV) is to provide a ‘data model’ or a ‘taxonomy’ of concepts that act as a vocabulary for the interoperable representation and exchange of information about personal data and its processing. For this, the DPV specification represents an abstract model of concepts and relationships that can be implemented and applied using technologies appropriate to the use-case’s requirements. The core concepts in DPV, as shown in the overview figure above, are broadly as follows:

Purpose: end-goal for why personal data is processed, e.g., Service Provision
Processing: representing operations over personal data, e.g., Collect, Store
Personal Data: categories of personal data involved
Legal Basis: justification in law for performing this activity
Legal Roles: Data Controller, Data Subject, etc.
Technical and Organisational Measures: safeguarding activities
Processing Context: storage conditions, automation, scale and scope
Context: concepts other than the above, e.g., necessity, duration
Rights: legally recognised rights associated with activities
Risk and Risk Mitigation: managing risks, consequences, and impacts

Each of these ‘core concepts’ is expanded into taxonomies to reflect their application in use-cases. The taxonomies also provide ‘knowledge’ by asserting categorisation based on the core concepts. For example, Personal Data is a core concept which is specialised into Sensitive Personal Data, and the taxonomy expanding upon personal data contains some instances asserted as being sensitive. An adopter therefore not only gets a taxonomy of personal data, but is also able to utilise the categorisation of it into sensitive personal data.
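How a taxonomy can carry such categorisation knowledge may be sketched as follows. This is an assumption-laden illustration, not an excerpt from the published taxonomy: ex:HealthInformation and ex:AllergyInformation are hypothetical concepts, and dpv:SensitivePersonalData is assumed to be the IRI of the Sensitive Personal Data concept named above.

```turtle
@prefix dpv: <https://w3id.org/dpv#> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix ex: <https://example.com/> .   # hypothetical namespace

# A hypothetical category asserted as sensitive within the taxonomy.
ex:HealthInformation a skos:Concept, dpv:SensitivePersonalData ;
    skos:prefLabel "Health Information"@en ;
    skos:broader dpv:SensitivePersonalData .

# A narrower hypothetical category: it says nothing about sensitivity
# itself, but its position in the hierarchy carries that knowledge.
ex:AllergyInformation a skos:Concept ;
    skos:prefLabel "Allergy Information"@en ;
    skos:broader ex:HealthInformation .
```

An adopter who annotates data with ex:AllergyInformation can follow the skos:broader chain to discover that the category falls under sensitive personal data without asserting this separately.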

To assist newcomers in understanding the structure of DPV and how its concepts are organised, a Primer⁹ document has been developed. In addition, the documentation is continually refined to provide illustrative guidance and examples¹⁰, and a searchable index¹¹ of concepts is also provided. The example below shows the use of DPV in implementing consent records as used in ISO/IEC TS 27560:2023 Privacy technologies — Consent record information structure [83]; this work was recently also presented to the EU Commission:

@prefix dct: <http://purl.org/dc/terms/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix dpv: <https://w3id.org/dpv#> .
@prefix pd: <https://w3id.org/dpv/pd#> .
@prefix loc: <https://w3id.org/dpv/loc#> .
@prefix eu-gdpr: <https://w3id.org/dpv/legal/eu/gdpr#> .
@prefix : <https://example.com/> .

# The consent record, identified by the same UUID as its dpv:hasIdentifier value
:63ded36f-4acd-4f3c-991e-6cb636698523 a dpv:ConsentRecord ;
    dct:hasVersion "ISO-27560" ;
    dpv:hasIdentifier "63ded36f-4acd-4f3c-991e-6cb636698523" ;
    dpv:hasDataSubject "96121fde-199f-4848-8942-4436e270513a" ;
    dpv:hasNotice <https://example.com/notice> ;
    # The processing activity that the consent covers
    dpv:hasProcess [
        a dpv:Process ;
        dct:title "Send Newsletters with Seasonal Offers"@en ;
        dpv:hasPurpose dpv:Marketing ;
        dpv:hasLegalBasis dpv:Consent, eu-gdpr:A6-1-a ;
        dpv:hasPersonalData pd:Email ;
        dpv:hasDataController :Acme ;
        dpv:hasProcessing dpv:Collect, dpv:Store ;
        dpv:hasStorageCondition [
            dpv:hasLocation loc:IE ;
            dpv:hasDuration "P1Y"^^xsd:duration ; ] ;
        dpv:hasJurisdiction loc:EU ;
        dpv:hasRecipient :Beta, :Epsilon ; ] ;
    dpv:hasConsentStatus dpv:ConsentGiven ;
    # The consent event: status, time, validity period, and who gave it
    dct:hasPart [
        a dpv:ConsentGiven, dpv:ExplicitlyExpressedConsent ;
        dpv:isIndicatedAtTime "2021-05-28T12:24:00"^^xsd:dateTime ;
        dpv:hasDuration "P1Y"^^xsd:duration ;
        dpv:hasEntity "96121fde-199f-4848-8942-4436e270513a" ] .

Extensions are collections of concepts provided in a separate namespace. They are used to represent specific concepts in jurisdictions and regulations. For example, Consent is present in the ‘main’ or ‘core’ DPV, and is expanded upon in the EU-GDPR extension to represent the specific requirements for consent under GDPR. Extensions are also used to provide a large group of concepts for a specific topic where their inclusion in the main vocabulary would not be practical or would introduce ambiguities between concepts. For example, the Personal Data extension provides a taxonomy of personal data categories, which were taken out of the main vocabulary due to ambiguity and confusion in concepts such as Location being used for both personal data and data storage location.

A list of ongoing work in DPVCG is as follows:

Data Privacy Vocabulary (DPV) specification – https://w3id.org/dpv
Personal Data Concepts extension (PD) – https://w3id.org/dpv/pd
Location concepts extension (LOC) – https://w3id.org/dpv/loc
Legal concepts extension (LEGAL) – https://w3id.org/dpv/legal
EU GDPR concepts (EU-GDPR) – https://w3id.org/dpv/legal/eu/gdpr
EU DGA concepts (EU-DGA) – https://w3id.org/dpv/legal/eu/dga
EU AI Act concepts (EU-AIAct) – https://w3id.org/dpv/legal/eu/aiact
Legal Concepts for Ireland, Germany, United Kingdom, USA – https://w3id.org/dpv/legal/IE (Replace IE with ISO 3166-1 code, e.g., IE/DE/GB/US)
Risk Management Concepts (RISK) – https://w3id.org/dpv/risk
Technology Concepts (TECH) – https://w3id.org/dpv/tech
AI Technology Concepts (AI) – https://w3id.org/dpv/ai
Justifications – https://w3id.org/dpv/justifications

Major Changes in 2.0

DPV 2.0 and all its extensions contain 2394 concepts (with 2198 classes and 196 properties), with 1017 concepts added and 805 concepts removed as compared to DPV 1.0. In DPV 1.0, only DPV, Personal Data (PD), and the EU-GDPR extension were provided as ‘complete’, with the others specified to be in draft mode. In DPV 2.0, DPV along with all of its extensions of PD, LOC (locations), LEGAL (including jurisdictional laws such as EU GDPR), RISK, TECH, AI, and Justifications have been provided as finalised resources. A detailed changelog¹² has been provided expanding on information in this section, including added/removed concepts in DPV and extensions.

Change in scope: The scope of concepts in DPV 1.0 was limited to ‘processing of personal data’. In DPV 2.0, the scope was expanded to include ‘any data or technology’ to have the same semantic structure for management of both personal and non-personal data and technologies (including AI). This enables DPV to support regulations such as the Data Governance Act (DGA) which motivates ‘reuse of personal and non-personal data’ and the AI Act where existing DPV concepts such as Purpose, Rights, and Risk can be reused. While the scope of the DPVCG is still limited to personal data (and associated technologies), the expansion of scope for concepts enables the DPV to be utilised for a much broader range of use-cases and regulations. More importantly, it provides a common mechanism for representing information about activities in the so called ‘AI and Data’ regulations, and makes alignments with existing standards such as ODRL easier to manage. The expansion of scope is backwards compatible with DPV 1.0 as it does not change the application and interpretation of concepts.

Change in semantics: DPV 1.0 was provided with three different semantics: a custom extension of SKOS as the ‘default’, along with RDFS+SKOS and OWL2 variants, each with a distinct namespace. In DPV 2.0, the custom SKOS extension has been removed and RDFS+SKOS is the default, with an OWL2 variant, each with a distinct namespace. This change is largely backwards-compatible because both DPV 1.0 and 2.0 use skos:Concept: the IRIs for the DPV 1.0 default and RDFS+SKOS variants redirect to the DPV 2.0 RDFS+SKOS namespace, and the IRI for the DPV 1.0 OWL2 variant redirects to the DPV 2.0 OWL2 namespace.
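The difference between the two serialisations can be sketched by declaring the same concept in both styles. This is an illustrative sketch and not copied from the published files: the exact triples DPV publishes may differ, and the dpv-owl: namespace IRI shown here is an assumption, since the article only states that the OWL2 variant has a distinct namespace.

```turtle
@prefix dpv: <https://w3id.org/dpv#> .
@prefix dpv-owl: <https://w3id.org/dpv/owl#> .   # assumed OWL2 variant namespace
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .

# RDFS+SKOS style (default): the concept sits in a SKOS hierarchy and
# can be used directly as a value, e.g. of dpv:hasPurpose.
dpv:Marketing a rdfs:Class, skos:Concept ;
    skos:prefLabel "Marketing"@en ;
    skos:broader dpv:Purpose .

# OWL2 style: the same concept as a class in a subclass hierarchy,
# suited to reasoners that work with class axioms.
dpv-owl:Marketing a owl:Class ;
    rdfs:label "Marketing"@en ;
    rdfs:subClassOf dpv-owl:Purpose .
```

Keeping the two styles in separate namespaces lets an adopter pick one consistently instead of mixing SKOS and OWL semantics on the same IRI.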

Versioned IRIs: DPV 1.0 utilised unversioned IRIs (e.g., w3id.org/dpv), which is not considered best practice. DPV 2.0 introduces versioned IRIs to enable distinguishing between versions and choosing a specific version to use regardless of future changes. The versioned IRI for DPV 1.0 is w3id.org/dpv/1.0 and that for DPV 2.0 is w3id.org/dpv/2.0. Extension namespaces are constructed by suffixing the versioned DPV namespace, e.g., w3id.org/dpv/2.0/pd. The unversioned IRIs redirect to the latest DPV version.

Change in extensions: In addition to introduction of new extensions, DPV 2.0 also changes the namespaces and management of extensions. In DPV 1.0, the dpv-pd, dpv-legal, dpv-gdpr, dpv-nace, and dpv-tech extensions used the prefix dpv- in their folder structure and namespaces whereas risk and rights did not. In DPV 2.0, extensions are defined without the prefix for consistency.

The dpv-legal extension in DPV 1.0 was a draft providing legal concepts (laws, authorities) and locations based on ISO 3166-1 codes. In DPV 2.0 it has been split into legal and loc (locations) extensions for separation of concerns. Both legal and location extensions are provided as completed in DPV 2.0. Further, the location concepts have been aligned with EU Vocabularies¹³ as an example of connecting DPV locations to external vocabularies.

The namespaces and organisation of legal concepts in DPV 2.0 have been redesigned to distinguish between jurisdictions and laws by using the ISO 3166-1 codes to create a structured path. For example, the namespace for the EU-GDPR extension is w3id.org/dpv/legal/eu/gdpr, which reflects that it is a legal extension associated with the EU jurisdiction and models the GDPR regulation. This mechanism also enables laws with the same name in other jurisdictions to be declared without conflicts, e.g., the UK’s GDPR would be under the namespace /legal/gb/gdpr. It also keeps all laws associated with a specific jurisdiction within the same path, e.g., DGA, NIS2, and AI Act are represented within the /legal/eu namespace as EU laws. The draft EU Rights extension in DPV 1.0 providing concepts from the EU Charter of Fundamental Rights has been moved to the /legal/eu/rights namespace in DPV 2.0 following this reorganisation.
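The structured legal namespaces can be sketched as Turtle prefix declarations. The EU paths are those listed earlier in this article; the gb-gdpr: prefix and the term gb-gdpr:A6-1-a are hypothetical, included only to show how a law with the same name in another jurisdiction would avoid a conflict, and the ex: names are likewise invented for illustration.

```turtle
@prefix dpv: <https://w3id.org/dpv#> .
@prefix ex: <https://example.com/> .   # hypothetical adopter namespace

# Laws of the same jurisdiction share the /legal/eu path.
@prefix eu-gdpr: <https://w3id.org/dpv/legal/eu/gdpr#> .
@prefix eu-dga: <https://w3id.org/dpv/legal/eu/dga#> .
@prefix eu-aiact: <https://w3id.org/dpv/legal/eu/aiact#> .

# A law with the same name in another jurisdiction gets its own path
# (hypothetical prefix following the /legal/gb/gdpr pattern).
@prefix gb-gdpr: <https://w3id.org/dpv/legal/gb/gdpr#> .

ex:EUNewsletter a dpv:Process ;
    dpv:hasLegalBasis eu-gdpr:A6-1-a .

ex:UKNewsletter a dpv:Process ;
    dpv:hasLegalBasis gb-gdpr:A6-1-a .   # hypothetical term, for illustration only
```

Because the jurisdiction code is part of the IRI path, the two legal bases remain distinct identifiers even though both refer to a regulation named GDPR.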

The extension dpv-nace modelled the NACE¹⁴ 2.0 taxonomy of economic activities provided by the EU as RDFS concepts for use with OWL vocabularies, because the EU uses SKOS to declare NACE concepts. The extension has been removed in DPV 2.0 as NACE has recently been updated to version 2.1, and the extension did not provide any meaningful benefit while increasing maintenance cost. To the best of our knowledge, the extension was not being used, and the recommendation is to use the authoritative NACE taxonomy going forward.

The expected impact of these changes for DPV-GDPR and DPV-PD in DPV 1.0 is minimal as their unversioned IRIs are redirected to DPV 2.0 which contains the same concepts. For draft extensions in DPV 1.0, there are breaking changes — most severely in the case of legal/location concepts due to their separation into two extensions. In case an adopter has been using these draft extensions without being aware of impending changes, we estimate a minimal effort to use the new (and improved) DPV 2.0 extensions instead. In any case, with the versioned IRIs the adopters can continue use of DPV 1.0 if desired.

Changes in DPV Concepts: Of the 911 concepts in the main DPV 2.0 vocabulary, 311 are new additions, and 56 concepts were removed as compared to DPV 1.0. The removed concepts reflect refinements or moves to an extension, e.g., harms and other impacts were moved to the RISK extension.

Methodology & Design Principles

Management by DPVCG

The DPVCG used the W3C infrastructure to manage development of DPV, which consisted of the mailing list, task management, and namespace management (w3.org). After the migration of W3C to GitHub, DPVCG utilised the provided repository¹⁵ for version control, task management, discussions, and contributions. In its meetings, the group utilised spreadsheets (using Google Sheets for collaboration) to accommodate members without technical knowledge and to ease discussion, commenting, and sharing. Formal discussions and approvals were undertaken via the mailing list and meetings.

Ontology Engineering Process

The DPVCG consists of experts from multiple disciplines — computer science, law, sociology, authorities, and others. The primary role of its members is to discuss and reach consensus on the scope and information to be represented in DPV. Ontology engineers are then responsible for providing the appropriate modelling of concepts and organisation of DPV as a semantic web resource. While the DPVCG did not formally establish an ontology engineering methodology, practices that were adopted and evolved in the community reflect commonly utilised engineering methodologies of NeOn [84] and LOT [85].

The development process generally began with a member of the group proposing the addition or modification of a concept, with information shared via the mailing list and/or the GitHub repo, followed by discussions and decisions within the meetings. Domain experts offered their advice on the information being modelled and which aspects of it should be considered; this advice was then formalised and shared as proposals. The group then discussed and voted to resolve each proposal, with the minutes of the meeting reflecting the discussions and resolutions.

Compared to formal ontology engineering methods, this ad-hoc approach lacked the explicit documentation of requirements, competency questions, and use-cases — which would have been expensive to maintain given the limited (regular) participation of volunteers. However, given the stability of the group and continued iteration and adoption of DPV, the group has identified this as an important step to undertake in the future.

The data in spreadsheets was structured to support ontology/taxonomy creation while still being comprehensible to the non-semantic web members. For generating the RDF serialisations, a custom documentation generator was developed which downloaded the spreadsheets, serialised them into multiple RDF formats, and generated the corresponding HTML documentation. Existing tools such as WIDOCO [86] were not used because they offer limited control over outputs. For example, DPV has multiple taxonomies, which WIDOCO would group together in one giant list, whereas DPV outputs group them separately so that adopters see related concepts together. WIDOCO also did not support the dual RDFS+SKOS and OWL modelling outputs of DPV, or the ReSpec template¹⁶ common in W3C outputs.
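The spreadsheet-to-RDF step described above can be sketched roughly as follows. This is a minimal illustration and not the DPVCG's actual generator: the column names (`term`, `parent`, `top_concept`, `definition`), the placeholder definitions, and the function names are all assumptions, and the output consists of Turtle fragments with prefix declarations omitted.

```python
import csv
import io

# Hypothetical spreadsheet export; column names and definitions are
# placeholders invented for this sketch, not the DPVCG's real sheets.
SHEET = """term,parent,top_concept,definition
Marketing,,Purpose,Placeholder definition for marketing
DirectMarketing,Marketing,Purpose,Placeholder definition for direct marketing
"""


def read_rows(text):
    """Parse the CSV export into a list of dictionaries, one per concept."""
    return list(csv.DictReader(io.StringIO(text)))


def to_rdfs_skos(row):
    """Emit a concept as an instance of rdfs:Class, skos:Concept and its top concept."""
    definition = row["definition"]
    lines = [f"dpv:{row['term']} a rdfs:Class, skos:Concept, dpv:{row['top_concept']} ;"]
    if row["parent"]:
        lines.append(f"    skos:broader dpv:{row['parent']} ;")
    lines.append(f'    skos:definition "{definition}"@en .')
    return "\n".join(lines)


def to_owl(row):
    """Emit the same concept as an owl:Class placed in an rdfs:subClassOf hierarchy."""
    definition = row["definition"]
    parent = row["parent"] or row["top_concept"]
    return "\n".join([
        f"dpv:{row['term']} a owl:Class ;",
        f"    rdfs:subClassOf dpv:{parent} ;",
        f'    skos:definition "{definition}"@en .',
    ])


rows = read_rows(SHEET)
skos_ttl = "\n\n".join(to_rdfs_skos(r) for r in rows)
owl_ttl = "\n\n".join(to_owl(r) for r in rows)
print(skos_ttl)
print()
print(owl_ttl)
```

Keeping one row per concept and deriving both serialisations from the same row is what lets a single spreadsheet remain the source of truth for the two modelling styles.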

DPV follows best practices and guidelines established within the semantic web community, namely the W3C Best Practices for Publishing Linked Data (2014), the WIDOCO best practices [86] with OOPS!¹⁷ and FOOPS!¹⁸ for evaluation, W3ID¹⁹ for permanent IRIs, GitHub for version control and collaboration, and Zenodo for archival.

Implications of Using SKOS and OWL

As described in the requirements in Section 2, use-cases utilising DPV involve cases where its concepts are used as instances (taxonomy) or as a schema that is instantiated (ontology). Initially, DPV was only provided as an OWL2 ontology. This was expanded upon in DPV 1.0 which used custom SKOS extensions to define the ‘base’ vocabulary with serialisations in RDFS+SKOS and OWL2 with the goal of supporting both categories of use-cases. In DPV 2.0, the custom SKOS extension was removed in favour of using RDFS+SKOS as standards for the default serialisation and providing an alternative serialisation for OWL2.

The RDFS+SKOS serialisation defines concepts as instances of rdfs:Class and skos:Concept. To create a hierarchical taxonomy, the concepts are represented as instances of a top-concept (e.g., dpv:Marketing as an instance of dpv:Purpose) and skos:broader/narrower is used to define the relation between instances. In OWL2, concepts are defined as owl:Class and the hierarchy is defined using rdfs:subClassOf. The justification for why DPV is provided with two different semantics is illustrated in the figure on semantic implications.

Figure: Semantic implications of using DPV with RDFS+SKOS and OWL semantics

The figure compares the implementation of RDFS+SKOS and OWL2 for a use-case that uses the purpose taxonomy from DPV. The use-case involves three steps of documentation (not shown in the figure): the organisation first records that its planned purpose is ‘Direct Marketing’, later creates ‘CampaignA’ as a specific form of direct marketing, and even later creates ‘CampaignB’ as a specific part of the ‘CampaignA’ direct marketing.

In RDFS+SKOS, dpv:DirectMarketing and ex:CampaignA are both defined as instances of dpv:Purpose, and associated with each other using skos:broader. Using these as the object with property dpv:hasPurpose is correct as its range is (instances of) dpv:Purpose. Later, when ex:CampaignB is introduced, it does not require changes to DPV or the use-case graph (e.g., to convert instances into classes) as the new concept is also defined as an instance of dpv:Purpose and can be used with the property.
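The RDFS+SKOS behaviour described above can be made concrete with a small sketch. It models the graph as a plain Python set of triples rather than using an RDF library, and the helper names (`in_range_of_has_purpose`, `broader_chain`) are hypothetical; the range check is a simplification for illustration, not a full RDFS entailment.

```python
RDF_TYPE = "rdf:type"
BROADER = "skos:broader"

# Use-case graph after the first two documentation steps.
skos_graph = {
    ("dpv:DirectMarketing", RDF_TYPE, "dpv:Purpose"),
    ("ex:CampaignA", RDF_TYPE, "dpv:Purpose"),
    ("ex:CampaignA", BROADER, "dpv:DirectMarketing"),
}


def in_range_of_has_purpose(graph, node):
    """Simplified range check: the node must be typed as dpv:Purpose."""
    return (node, RDF_TYPE, "dpv:Purpose") in graph


def broader_chain(graph, node):
    """Follow skos:broader upwards from node and return the concepts visited."""
    chain = []
    current = node
    while True:
        parents = sorted(o for s, p, o in graph if s == current and p == BROADER)
        if not parents or parents[0] in chain:
            return chain
        current = parents[0]
        chain.append(current)


# Third step: CampaignB is added purely by appending triples.
extended_graph = skos_graph | {
    ("ex:CampaignB", RDF_TYPE, "dpv:Purpose"),
    ("ex:CampaignB", BROADER, "ex:CampaignA"),
}

for concept in ("dpv:DirectMarketing", "ex:CampaignA", "ex:CampaignB"):
    print(concept, in_range_of_has_purpose(extended_graph, concept))
print(broader_chain(extended_graph, "ex:CampaignB"))
```

Every earlier triple survives unchanged in `extended_graph`, which is the property the paragraph relies on: extending the taxonomy is additive.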

In OWL2, all DPV concepts are defined as instances of owl:Class and associated with each other using rdfs:subClassOf. The use-case concepts can now be declared as either subclasses of DPV concepts or as instances. In either case, DPV concepts cannot be directly used with the property dpv:hasPurpose as they are not instances of dpv:Purpose (shown in red). If the use-case creates instances (as shown where ex:CampaignA is created as an instance of dpv:DirectMarketing), then the use-case concept can be correctly used with dpv:hasPurpose. However, when ex:CampaignB is later introduced, it cannot be a subclass of ex:CampaignA as ‘subclass of instance’ is undefined²⁰ in OWL2. The relation between the two use-case concepts thus must be defined using either SKOS (thereby mixing SKOS and OWL2) or using another property, e.g., dct:hasPart. At this point, the use-case should reengineer its ontology by creating a class representing ex:CampaignA and creating the necessary subclasses and instances to represent the relationship with ex:CampaignB. This, however, does not solve the issue with directly using DPV concepts as instances.
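The OWL2 situation described above can be sketched in the same style. The checks below are a deliberately simplified stand-in for an OWL reasoner, with hypothetical helper names; placing dpv:DirectMarketing under dpv:Marketing is an assumption made for illustration, and the rule that only classes may be subclass targets mirrors the ‘subclass of instance’ restriction stated in the text.

```python
TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"

owl_graph = {
    ("dpv:Purpose", TYPE, "owl:Class"),
    ("dpv:Marketing", TYPE, "owl:Class"),
    ("dpv:Marketing", SUBCLASS, "dpv:Purpose"),
    ("dpv:DirectMarketing", TYPE, "owl:Class"),
    ("dpv:DirectMarketing", SUBCLASS, "dpv:Marketing"),  # assumed placement
    ("ex:CampaignA", TYPE, "dpv:DirectMarketing"),
}


def superclasses(graph, cls):
    """Return cls together with every class reachable via rdfs:subClassOf."""
    seen, stack = set(), [cls]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(o for s, p, o in graph if s == current and p == SUBCLASS)
    return seen


def is_instance_of(graph, node, cls):
    """True if node has a declared type that is cls or a subclass of cls."""
    return any(
        cls in superclasses(graph, o)
        for s, p, o in graph
        if s == node and p == TYPE
    )


def can_be_superclass(graph, node):
    """Simplified rule from the text: only declared classes can have subclasses."""
    return (node, TYPE, "owl:Class") in graph


# The DPV concept itself is a class, so it is not an instance of dpv:Purpose.
print(is_instance_of(owl_graph, "dpv:DirectMarketing", "dpv:Purpose"))
# The use-case instance is usable with dpv:hasPurpose.
print(is_instance_of(owl_graph, "ex:CampaignA", "dpv:Purpose"))
# ex:CampaignB cannot be declared a subclass of the instance ex:CampaignA.
print(can_be_superclass(owl_graph, "ex:CampaignA"))
```

Under these assumptions the only way to relate ex:CampaignB to ex:CampaignA without reengineering is a non-hierarchical property, which is the trade-off the paragraph describes.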

Thus, the RDFS+SKOS model is suitable when DPV is to be used as a controlled taxonomy with a ‘lightweight ontology’ that supports extending the taxonomy in use-cases. Conversely, the OWL2 model is better suited for cases where formal reasoning is needed, and where sufficient ontology engineering capabilities exist to address changes in use-cases. By providing both serialisations, DPV enables adopters to choose the serialisation that best supports their use-case and/or existing implementations, and retains semantic interoperability based on converting between SKOS and OWL2²¹.

Conclusion & Future Work

The Data Privacy Vocabulary (DPV), in its second iteration, thus provides a significantly richer, extended, and state-of-the-art resource which fills an important niche in the current landscape regarding the expression of information associated with personal data processing and the use of technologies. This article highlighted the motivation for its development, described its methodology and design processes, and showcased its value as evidenced through adoption across academia, industry, and standards.

DPV version 2 represents a significant milestone in the development of a vocabulary to support legally relevant processes across multiple regulations and jurisdictions, as well as recent advances in AI and data sharing regulations such as through EU’s DGA and AI Act, and in architectures such as Solid and IDSA. The DPVCG, in continuing to develop the DPV, welcomes more participation and contributions to support its vision of providing an interoperable vocabulary that provides value and supports making legal compliance processes more efficient and aligned for all stakeholders.

The DPVCG plans to refine its TECH and AI extensions based on existing works [80], [82] providing taxonomies for AI techniques, capabilities, lifecycle stages, risks and risk sources, and to enable stakeholders to express specific use-cases (e.g., involving generative AI) in a manner that supports requirements for the EU AI Act and ISO standards [82]. The DPVCG is also continuing its efforts to develop vocabularies to represent key ‘data and AI regulations’, notably, in the EU, the Digital Services Act (DSA), Digital Markets Act (DMA), Data Act, and Data Spaces, and to model laws in other jurisdictions, e.g., Ireland, USA, and UK.

To support the application of DPV in regulatory environments, the DPVCG is developing guides based on existing work that utilises DPV to support GDPR implementations. These include Records of Processing Activities (ROPA) [45], Data Protection Impact Assessment (DPIA) [46], Data Breaches [76], and Rights management. To support the implementation of decentralised and user-centric applications of DPV, such as those envisioned in the IDSA, IEEE P7012 and Solid, the DPVCG is developing vocabularies and guides, for example on the implementation of ISO standards for (semantic) privacy notices. In addition, the DPVCG is also looking to incorporate the Standard Data Protection Model²² provided by the German data protection authority regarding implementation of concrete technical and organisational measures.

Resource Availability Statement:

The source and releases for the DPV are available via GitHub: https://github.com/w3c/dpv and have been deposited in Zenodo for long-term archival: https://doi.org/10.5281/zenodo.12505840.

Acknowledgements of Funding

The DPVCG was established as part of the SPECIAL H2020 Project funded by the European Union’s Horizon 2020 programme under Grant#731601 (2017-2019). Harshvardhan J. Pandit was funded (2020-2022) by the Irish Research Council’s Government of Ireland Postdoctoral Fellowship Grant#GOIPD/2020/790. The ADAPT SFI Centre for Digital Media Technology is funded by Science Foundation Ireland through the SFI Research Centres Programme and is co-funded under the European Regional Development Fund (ERDF) through Grant#13/RC/2106 (2018-2020) and Grant#13/RC/2106_P2 (2021 onwards). Piero Bonatti and Luigi Sauro were funded by the European Union’s Horizon 2020 programme under Grant#731601 (2017-2019), and under TRAPEZE Grant#883464 (2020-2023). Beatriz Esteves and Delaram Golpayegani were funded by the European Union’s Horizon 2020 programme’s Marie Skłodowska-Curie Grant#813497 PROTECT ITN Project. Beatriz Esteves is also funded by SolidLab Vlaanderen (Flemish Government, EWI and RRF project VV023/10). Julian Flake received funding from the German Federal Ministry of Education and Research (BMBF) grant#16KIS1298 (AI-NET PROTECT), from the European Union’s Horizon Europe Framework Programme grant#101129822 (TITAN) and from the European Union’s Digital Europe Programme grant#101123471 (EDGE-Skills). For the purpose of Open Access the authors have applied a CC-BY public copyright licence to any Author Accepted Manuscript arising from this submission.

References

[2]

H. J. Pandit et al., “Creating a Vocabulary for Data Privacy: The First-Year Report of Data Privacy Vocabularies and Controls Community Group (DPVCG),” in On the Move to Meaningful Internet Systems: OTM 2019 Conferences, 2019, vol. 11877, pp. 714–730, doi: 10.1007/978-3-030-33246-4_44 [Online]. Available: http://link.springer.com/10.1007/978-3-030-33246-4_44. [Accessed: 26-Jun-2020]

[3]

European Commission, “Regulation (EU) 2022/868 of the European Parliament and of the Council of 30 May 2022 on European data governance and amending Regulation (EU) 2018/1724 (Data Governance Act) (Text with EEA relevance).” 2022 [Online]. Available: https://eur-lex.europa.eu/legal-content/EN/TXT/?uri=CELEX:32022R0868. [Accessed: 17-Apr-2024]

[4]

European Commission, “Proposal for a Regulation of the European Parliament and of the Council laying down harmonised rules on artificial intelligence (Artificial Intelligence Act) and amending certain Union legislative acts.” 2021 [Online]. Available: https://eur-lex.europa.eu/legal-content/EN/ALL/?uri=celex:52021PC0206. [Accessed: 17-Apr-2024]

[5]

R. Iannella, “Open digital rights language (ODRL),” Open Content Licensing: Cultivating the Creative Commons, 2007.

[6]

B. Esteves, H. J. Pandit, and V. Rodríguez-Doncel, “ODRL Profile for Expressing Consent through Granular Access Control Policies in Solid,” in 2021 IEEE European Symposium on Security and Privacy Workshops (EuroS&PW), 2021, pp. 298–306, doi: 10.1109/EuroSPW54576.2021.00038.

[7]

S. Bader et al., “The international data spaces information model–an ontology for sovereign exchange of digital content,” in International semantic web conference, 2020, pp. 176–192.

[8]

L. Robaldo, C. Bartolini, and G. Lenzini, “The DAPRECO knowledge base: Representing the GDPR in LegalRuleML,” in Proceedings of the twelfth language resources and evaluation conference, 2020, pp. 5688–5697.

[9]

D. Calvaresi, M. Schumacher, and J.-P. Calbimonte, “Agent-based Modeling for Ontology-driven Analysis of Patient Trajectories,” Journal of Medical Systems, vol. 44, no. 9, p. 158, 2020, doi: 10.1007/s10916-020-01620-8. [Online]. Available: https://doi.org/10.1007/s10916-020-01620-8. [Accessed: 17-Apr-2024]

[10]

S. Lieber, B. De Meester, R. Verborgh, and A. Dimou, “EcoDaLo: Federating Advertisement Targeting with Linked Data,” in Semantic Systems. In the Era of Knowledge Graphs, 2020, pp. 87–103, doi: 10.1007/978-3-030-59833-4_6.

[11]

P. A. Bonatti, S. Kirrane, I. M. Petrova, and L. Sauro, “Machine Understandable Policies and GDPR Compliance Checking,” KI - Künstliche Intelligenz, vol. 34, no. 3, pp. 303–315, 2020, doi: 10.1007/s13218-020-00677-4. [Online]. Available: https://doi.org/10.1007/s13218-020-00677-4. [Accessed: 17-Apr-2024]

[12]

R. Matulevičius, J. Tom, K. Kala, and E. Sing, “A Method for Managing GDPR Compliance in Business Processes,” in Advanced Information Systems Engineering, 2020, pp. 100–112, doi: 10.1007/978-3-030-58135-0_9.

[13]

N. Thalhath, M. Nagamori, and T. Sakaguchi, “MetaProfiles - A Mechanism to Express Metadata Schema, Privacy, Rights and Provenance for Data Interoperability,” in Digital Libraries at Times of Massive Societal Transition, 2020, pp. 364–370, doi: 10.1007/978-3-030-64452-9_34.

[14]

D. Calvaresi, M. Schumacher, and J.-P. Calbimonte, “Personal Data Privacy Semantics in Multi-Agent Systems Interactions,” in Advances in Practical Applications of Agents, Multi-Agent Systems, and Trustworthiness. The PAAMS Collection, 2020, pp. 55–67, doi: 10.1007/978-3-030-49778-1_5.

[15]

K. Krasnashchok, M. Mustapha, A. Al Bassit, and S. Skhiri, “Towards Privacy Policy Conceptual Modeling,” in Conceptual Modeling, 2020, pp. 429–438, doi: 10.1007/978-3-030-62522-1_32.

[16]

V. Leone and L. Di Caro, “The Role of Vocabulary Mediation to Discover and Represent Relevant Information in Privacy Policies,” in Legal Knowledge and Information Systems, IOS Press, 2020, pp. 73–82 [Online]. Available: https://ebooks.iospress.nl/doi/10.3233/FAIA200851. [Accessed: 17-Apr-2024]

[17]

P. Ryan, M. Crane, and R. Brennan, “Design Challenges for GDPR RegTech,” 2020, pp. 787–795 [Online]. Available: https://www.scitepress.org/Link.aspx?doi=10.5220/0009464507870795. [Accessed: 17-Apr-2024]

[18]

J.-P. Calbimonte, D. Calvaresi, and M. Schumacher, “Decentralized Management of Patient Profiles and Trajectories through Semantic Web Agents,” in Proceedings of the Third International Workshop on Semantic Web Meets Health Data Management (SWH 2020) co-located with the 19th International Semantic Web Conference (ISWC 2020), 2020 [Online]. Available: https://ceur-ws.org/Vol-2759/paper2.pdf

[19]

C. Debruyne, H. J. Pandit, D. Lewis, and D. O’Sullivan, “‘Just-in-time’ generation of datasets by considering structured representations of given consent for GDPR compliance,” Knowledge and Information Systems, vol. 62, no. 9, pp. 3615–3640, 2020, doi: 10.1007/s10115-020-01468-x. [Online]. Available: https://doi.org/10.1007/s10115-020-01468-x. [Accessed: 17-Apr-2024]

[20]

H. J. Pandit, “Representing Activities associated with Processing of Personal Data and Consent using Semantic Web for GDPR Compliance,” PhD thesis, Trinity College Dublin, 2020 [Online]. Available: http://hdl.handle.net/2262/92446

[21]

P. Ryan, H. J. Pandit, and R. Brennan, “A Common Semantic Model of the GDPR Register of Processing Activities,” Legal Knowledge and Information Systems, pp. 251–254, 2020, doi: 10.3233/FAIA200876. [Online]. Available: https://ebooks.iospress.nl/doi/10.3233/FAIA200876. [Accessed: 12-Jul-2022]

[22]

B. Esteves, “Challenges in the Digital Representation of Privacy Terms,” in AI Approaches to the Complexity of Legal Systems XI-XII, vol. 13048, V. Rodríguez-Doncel, M. Palmirani, M. Araszkiewicz, P. Casanovas, U. Pagallo, and G. Sartor, Eds. Cham: Springer International Publishing, 2021, pp. 313–327 [Online]. Available: https://doi.org/10.1007/978-3-030-89811-3_22

[23]

N. McDonald et al., “Evaluation of an Access-Risk-Knowledge (ARK) Platform for Governance of Risk and Change in Complex Socio-Technical Systems,” International Journal of Environmental Research and Public Health, vol. 18, no. 23, p. 12572, 2021, doi: 10.3390/ijerph182312572. [Online]. Available: https://www.mdpi.com/1660-4601/18/23/12572. [Accessed: 17-Apr-2024]

[24]

B. Flesch, “Investigating the suitability of blockchain for managing patients consent in clinical trials,” Master’s thesis, Trinity College Dublin, 2021 [Online]. Available: https://publications.scss.tcd.ie/theses/diss/2021/TCD-SCSS-DISSERTATION-2021-046.pdf

[25]

E. Grünewald, P. Wille, F. Pallas, M. C. Borges, and M.-R. Ulbricht, “TIRA: An OpenAPI Extension and Toolbox for GDPR Transparency in RESTful Architectures,” in 2021 IEEE European Symposium on Security and Privacy Workshops (EuroS&PW), 2021, pp. 312–319, doi: 10.1109/EuroSPW54576.2021.00039 [Online]. Available: https://ieeexplore.ieee.org/document/9583685. [Accessed: 17-Apr-2024]

[26]

P. A. Bonatti, L. Sauro, and J. Langens, “Representing Consent and Policies for Compliance,” in 2021 IEEE European Symposium on Security and Privacy Workshops (EuroS&PW), 2021, pp. 283–291, doi: 10.1109/EuroSPW54576.2021.00036 [Online]. Available: https://ieeexplore.ieee.org/abstract/document/9583722. [Accessed: 17-Apr-2024]

[27]

L. Sion, D. V. Landuyt, and W. Joosen, “An Overview of Runtime Data Protection Enforcement Approaches,” in 2021 IEEE European Symposium on Security and Privacy Workshops (EuroS&PW), 2021, pp. 351–358, doi: 10.1109/EuroSPW54576.2021.00044 [Online]. Available: https://ieeexplore.ieee.org/document/9583679. [Accessed: 17-Apr-2024]

[28]

H. J. Pandit, D. O’Sullivan, and D. Lewis, “A Design Pattern Describing Use of Personal Data in Privacy Policies,” in Advances in Pattern-Based Ontology Engineering, IOS Press, 2021, pp. 107–119 [Online]. Available: https://ebooks.iospress.nl/doi/10.3233/SSW210009. [Accessed: 17-Apr-2024]

[29]

R. G. Hamed, “Enhancing the Transparency of Personal Data Access through Semantic Web Technologies,” PhD thesis, Trinity College Dublin, 2021 [Online]. Available: http://www.tara.tcd.ie/bitstream/handle/2262/96722/PhDThesis-RoghaiyeGachpazHamed-15338853-final%20version.pdf?sequence=1

[30]

P. Ryan, M. Crane, and R. Brennan, “GDPR Compliance Tools: Best Practice from RegTech,” in Enterprise Information Systems, 2021, pp. 905–929, doi: 10.1007/978-3-030-75418-1_41.

[31]

F. J. Ekaputra et al., “Semantic-enabled architecture for auditable privacy-preserving data analysis,” Semantic Web, vol. Preprint, no. Preprint, pp. 1–34, 2021, doi: 10.3233/SW-212883. [Online]. Available: https://content.iospress.com/articles/semantic-web/sw212883. [Accessed: 17-Apr-2024]

[32]

P. Ryan, H. Pandit, and R. Brennan, “Building a Data Processing Activities Catalog: Representing Heterogeneous Compliance-Related Information for GDPR Using DCAT-AP and DPV,” in Further with Knowledge Graphs, IOS Press, 2021, pp. 169–182 [Online]. Available: https://ebooks.iospress.nl/doi/10.3233/SSW210043. [Accessed: 17-Apr-2024]

[33]

V. Leone, “Legal knowledge extraction in the data protection domain based on Ontology Design Patterns,” Doctoral Thesis, Alma Mater Studiorum - Università di Bologna, 2021 [Online]. Available: http://amsdottorato.unibo.it/9747/. [Accessed: 17-Apr-2024]

[34]

K. García, Z. Zihlmann, S. Mayer, A. Tamò-Larrieux, and J. Hooss, “Towards Privacy-Friendly Smart Products,” in 2021 18th International Conference on Privacy, Security and Trust (PST), 2021, pp. 1–7, doi: 10.1109/PST52912.2021.9647826 [Online]. Available: https://ieeexplore.ieee.org/document/9647826. [Accessed: 17-Apr-2024]

[35]

D. Hickey and R. Brennan, “A GDPR International Transfer Compliance Framework Based on an Extended Data Privacy Vocabulary (DPV),” in Legal Knowledge and Information Systems, IOS Press, 2021, pp. 161–170 [Online]. Available: https://ebooks.iospress.nl/doi/10.3233/FAIA210332. [Accessed: 17-Apr-2024]

[36]

S. Human et al., “Data Protection and Consenting Communication Mechanisms: Current Open Proposals and Challenges,” in 2022 IEEE European Symposium on Security and Privacy Workshops (EuroS&PW), 2022, pp. 231–239, doi: 10.1109/EuroSPW55150.2022.00029 [Online]. Available: https://ieeexplore.ieee.org/abstract/document/9799369. [Accessed: 17-Apr-2024]

[37]

V. Jesus and H. J. Pandit, “Consent Receipts for a Usable and Auditable Web of Personal Data,” IEEE Access, vol. 10, pp. 28545–28563, 2022, doi: 10.1109/ACCESS.2022.3157850. [Online]. Available: https://ieeexplore.ieee.org/abstract/document/9730898. [Accessed: 17-Apr-2024]

[38]

S. Human, “Advanced Data Protection Control (ADPC): An Interdisciplinary Overview.” arXiv, 2022 [Online]. Available: http://arxiv.org/abs/2209.09724. [Accessed: 17-Apr-2024]

[39]

P. Sangaroonsilp, H. K. Dam, M. Choetkiertikul, C. Ragkhitwetsagul, and A. Ghose, “Mining and Classifying Privacy and Data Protection Requirements in Issue Reports.” arXiv, 2022 [Online]. Available: http://arxiv.org/abs/2112.13994. [Accessed: 17-Apr-2024]

[40]

H. J. Pandit, “Proposals for Resolving Consenting Issues with Signals and User-side Dialogues.” arXiv, 2022 [Online]. Available: http://arxiv.org/abs/2208.05786. [Accessed: 17-Apr-2024]

[41]

S. C. Rasmusen, “Increasing Trust and Engagement in the Age of GDPR: A Digital Contracting Tool Supported by Knowledge Graphs,” Master’s thesis, University of Innsbruck, 2022.

[42]

B. Esteves and V. Rodríguez-Doncel, “Semantifying the Governance of Data in Europe,” in 18th International Conference on Semantic Systems - CEUR Workshop Proceedings, 2022, vol. 3235 [Online]. Available: https://ceur-ws.org/Vol-3235/paper17.pdf

[43]

B. Esteves and V. Rodríguez-Doncel, “Analysis of Ontologies and Policy Languages to Represent Information Flows in GDPR,” Semantic Web Journal, 2022, doi: 10.3233/SW-223009.

[44]

P. Ryan and R. Brennan, “Support for Enhanced GDPR Accountability with the Common Semantic Model for ROPA (CSM-ROPA),” SN Computer Science, vol. 3, no. 3, p. 224, 2022, doi: 10.1007/s42979-022-01099-9. [Online]. Available: https://doi.org/10.1007/s42979-022-01099-9. [Accessed: 17-Apr-2024]

[45]

P. Ryan, R. Brennan, and H. J. Pandit, “DPCat: Specification for an Interoperable and Machine-Readable Data Processing Catalogue Based on GDPR,” Information, vol. 13, no. 5, p. 244, 2022, doi: 10.3390/info13050244. [Online]. Available: https://www.mdpi.com/2078-2489/13/5/244. [Accessed: 05-Jul-2023]

[46]

H. Pandit, “A Semantic Specification for Data Protection Impact Assessments (DPIA),” 2022, doi: 10.5281/zenodo.6783203 [Online]. Available: http://www.tara.tcd.ie/handle/2262/100126. [Accessed: 17-Apr-2024]

[47]

B. Esteves, H. Asgarinia, A. C. Penedo, B. Mutiro, and D. Lewis, “Fostering trust with transparency in the data economy era: An integrated ethical, legal, and knowledge engineering approach,” in Proceedings of the 1st International Workshop on Data Economy, 2022, pp. 57–63, doi: 10.1145/3565011.3569061 [Online]. Available: https://doi.org/10.1145/3565011.3569061. [Accessed: 05-Jul-2023]

[48]

R. Becker et al., “Secondary Use of Personal Health Data: When Is It ‘Further Processing’ Under the GDPR, and What Are the Implications for Data Controllers?” European Journal of Health Law, vol. 30, no. 2, pp. 129–157, 2022, doi: 10.1163/15718093-bja10094. [Online]. Available: https://brill.com/view/journals/ejhl/30/2/article-p129_1.xml. [Accessed: 17-Apr-2024]

[49]

B. Esteves, V. Rodríguez-Doncel, H. J. Pandit, N. Mondada, and P. McBennett, “Using the ODRL Profile for Access Control for Solid Pod Resource Governance,” in The Semantic Web: ESWC 2022 Satellite Events, 2022, pp. 16–20, doi: 10.1007/978-3-031-11609-4_3.

[50]

L. Debackere, P. Colpaert, R. Taelman, and R. Verborgh, “A Policy-Oriented Architecture for Enforcing Consent in Solid,” in Companion Proceedings of the Web Conference 2022, 2022, pp. 516–524, doi: 10.1145/3487553.3524630 [Online]. Available: https://dl.acm.org/doi/10.1145/3487553.3524630. [Accessed: 09-May-2023]

[51]

L. Debackere, “Enforcing Data Protection in Solid: A Policy-Oriented Framework,” Master’s thesis, Ghent University, 2022.

[52]

G. Gambarelli and A. Gangemi, “PRIVAFRAME: A Frame-Based Knowledge Graph for Sensitive Personal Data,” Big Data and Cognitive Computing, vol. 6, no. 3, p. 90, 2022, doi: 10.3390/bdcc6030090. [Online]. Available: https://www.mdpi.com/2504-2289/6/3/90. [Accessed: 17-Apr-2024]

[53]

A. M. Kurteva, “Making Sense of Consent with Knowledge Graphs,” Master’s thesis, 2022.

[54]

S. Becher and A. Gerl, “ConTra Preference Language: Privacy Preference Unification via Privacy Interfaces,” Sensors, vol. 22, no. 14, p. 5428, 2022, doi: 10.3390/s22145428. [Online]. Available: https://www.mdpi.com/1424-8220/22/14/5428. [Accessed: 17-Apr-2024]

[55]

J. Hernandez, L. McKenna, and R. Brennan, “TIKD: A Trusted Integrated Knowledge Dataspace for Sensitive Data Sharing and Collaboration,” in Data Spaces : Design, Deployment and Future Directions, E. Curry, S. Scerri, and T. Tuikka, Eds. Cham: Springer International Publishing, 2022, pp. 265–291 [Online]. Available: https://doi.org/10.1007/978-3-030-98636-0_13. [Accessed: 17-Apr-2024]

[56]

H. J. Pandit, “Making Sense of Solid for Data Governance and GDPR,” Information, vol. 14, no. 2, 2023, doi: 10.3390/info14020114. [Online]. Available: https://www.mdpi.com/2078-2489/14/2/114. [Accessed: 07-Jun-2023]

[57]

G. Bushati, S. C. Rasmusen, A. Kurteva, A. Vats, P. Nako, and A. Fensel, “What is in your cookie box? Explaining ingredients of web cookies with knowledge graphs,” Semantic Web, vol. Preprint, no. Preprint, pp. 1–17, 2023, doi: 10.3233/SW-233435. [Online]. Available: https://content.iospress.com/articles/semantic-web/sw233435. [Accessed: 17-Apr-2024]

[58]

J. Breteler, T. van Gessel, G. Biagioni, and R. van Doesburg, “The FLINT Ontology: An Actor-Based Model of Legal Relations,” in Knowledge Graphs: Semantics, Machine Learning, and Languages, IOS Press, 2023, pp. 227–234 [Online]. Available: https://ebooks.iospress.nl/doi/10.3233/SSW230016. [Accessed: 17-Apr-2024]

[59]

A. Kurteva and H. J. Pandit, “Relevant Research Questions For Decentralised (Personal) Data Governance,” in Trusting Decentralised Knowledge Graphs and Web Data (TrusDeKW) Workshop at ESWC 2023, 2023, vol. 3443 [Online]. Available: https://ceur-ws.org/Vol-3443/ESWC_2023_TrusDeKW_paper_7584.pdf

[60]

H. Asgarinia, A. Chomczyk Penedo, B. Esteves, and D. Lewis, “‘Who Should I Trust with My Data?’ Ethical and Legal Challenges for Innovation in New Decentralized Data Management Technologies,” Information, vol. 14, no. 7, p. 351, 2023, doi: 10.3390/info14070351. [Online]. Available: https://www.mdpi.com/2078-2489/14/7/351. [Accessed: 06-Jul-2023]

[61]

G. Gambarelli, A. Gangemi, and R. Tripodi, “Is Your Model Sensitive? SPEDAC: A New Resource for the Automatic Classification of Sensitive Personal Data,” IEEE Access, vol. 11, pp. 10864–10880, 2023, doi: 10.1109/ACCESS.2023.3240089. [Online]. Available: https://ieeexplore.ieee.org/abstract/document/10031607. [Accessed: 17-Apr-2024]

[62]

E. Grünewald, J. M. Halkenhäußer, N. Leschke, J. Washington, C. Paupini, and F. Pallas, “Enabling Versatile Privacy Interfaces Using Machine-Readable Transparency Information,” in Privacy Symposium 2023, 2023, pp. 119–137, doi: 10.1007/978-3-031-44939-0_7.

[63]

H. Bailly, A. Papanna, and R. Brennan, “Prototyping an End-User User Interface for the Solid Application Interoperability Specification Under GDPR,” in The Semantic Web, 2023, pp. 557–573, doi: 10.1007/978-3-031-33455-9_33.

[64]

F. Tang, B. M. Østvold, and M. Bruntink, “Helping Code Reviewer Prioritize: Pinpointing Personal Data and Its Processing,” in New Trends in Intelligent Software Methodologies, Tools and Techniques, IOS Press, 2023, pp. 109–124 [Online]. Available: https://ebooks.iospress.nl/doi/10.3233/FAIA230228. [Accessed: 17-Apr-2024]

[65]

B. Esteves, “Towards an Architecture for Data Altruism in Solid,” in ISWC 2023 Posters and Demos: 22nd International Semantic Web Conference, 2023 [Online]. Available: https://ceur-ws.org/Vol-3632/ISWC2023_paper_491.pdf

[66]

B. Esteves and H. J. Pandit, “Using Patterns to Manage Governance of Solid Apps,” in 14th Workshop on Ontology Design and Patterns (WOP 2023@ISWC 2023), 2023 [Online]. Available: https://ceur-ws.org/Vol-3636/paper5.pdf

[67]

Y. Taheri, G. Bourgne, and J.-G. Ganascia, “A Compliance Mechanism for Planning in Privacy Domain Using Policies,” in New Frontiers in Artificial Intelligence, 2023, pp. 77–92, doi: 10.1007/978-3-031-36190-6_6.

[68]

C. Sun, M. Gallofré Ocaña, J. van Soest, and M. Dumontier, “ciTIzen-centric DAta pLatform (TIDAL): Sharing distributed personal data in a privacy-preserving manner for health research,” Semantic Web, vol. 14, no. 5, pp. 977–996, 2023, doi: 10.3233/SW-223220. [Online]. Available: https://content.iospress.com/articles/semantic-web/sw223220. [Accessed: 07-Jun-2023]

[69]

A. Navarro-Gallinad, F. Orlandi, J. Scott, M. Little, and D. O’Sullivan, “Evaluating the usability of a semantic environmental health data framework: Approach and study,” Semantic Web, vol. 14, no. 5, pp. 787–810, 2023, doi: 10.3233/SW-223212. [Online]. Available: https://content.iospress.com/articles/semantic-web/sw223212. [Accessed: 17-Apr-2024]

[70]

M. Florea and B. Esteves, “Is Automated Consent in Solid GDPR-Compliant? An Approach for Obtaining Valid Consent with the Solid Protocol,” Information, vol. 14, no. 12, p. 631, 2023, doi: 10.3390/info14120631. [Online]. Available: https://www.mdpi.com/2078-2489/14/12/631. [Accessed: 30-Nov-2023]

[71]

A. N. Gallinad, “A Usable Knowledge Graph Framework for Linking Health Events with Environmental Data,” PhD thesis, Trinity College Dublin, 2023.

[72]

A. Kurteva et al., “The smashHitCore Ontology for GDPR-Compliant Sensor Data Sharing in Smart Cities,” Sensors, vol. 23, no. 13, p. 6188, 2023, doi: 10.3390/s23136188. [Online]. Available: https://www.mdpi.com/1424-8220/23/13/6188. [Accessed: 17-Apr-2024]

[73]

S. D. Gupta and T. Hahmann, “OPPO: An Ontology for Describing Fine-Grained Data Practices in Privacy Policies of Online Social Networks.” arXiv, 2023 [Online]. Available: http://arxiv.org/abs/2309.15971. [Accessed: 17-Apr-2024]

[74]

M. Zichichi, “Decentralized systems for the protection and portability of personal data,” Doctoral Thesis, Alma Mater Studiorum - Università di Bologna, 2023 [Online]. Available: http://amsdottorato.unibo.it/10662/. [Accessed: 17-Apr-2024]

[75]

B. Esteves, V. Rodríguez-Doncel, H. J. Pandit, and D. Lewis, “Semantics for Implementing Data Reuse and Altruism Under EU’s Data Governance Act,” in Knowledge Graphs: Semantics, Machine Learning, and Languages, IOS Press, 2023, pp. 210–226 [Online]. Available: https://ebooks.iospress.nl/doi/10.3233/SSW230015. [Accessed: 01-Oct-2023]

[76]

H. J. Pandit, P. Ryan, G. P. Krog, M. Crane, and R. Brennan, “Towards a Semantic Specification for GDPR Data Breach Reporting,” in Legal Knowledge and Information Systems, IOS Press, 2023, pp. 131–136 [Online]. Available: https://ebooks.iospress.nl/doi/10.3233/FAIA230956. [Accessed: 17-Apr-2024]

[77]

H. Raza and M. Ahmed, “The Semantic data sharing platform using Blockchain: A GDPR perspective.” 2023.

[78]

H. J. Pandit and B. Esteves, “Enhancing Data Use Ontology (DUO) for Health-Data Sharing by Extending it with ODRL and DPV,” Semantic Web Journal, 2024, doi: 10.3233/SW-243583.

[79]

G. B. Herwanto, G. Quirchmayr, and A. M. Tjoa, “Leveraging NLP Techniques for Privacy Requirements Engineering in User Stories,” IEEE Access, vol. 12, pp. 22167–22189, 2024, doi: 10.1109/ACCESS.2024.3364533. [Online]. Available: https://ieeexplore.ieee.org/document/10430095. [Accessed: 17-Apr-2024]

[80]

D. Golpayegani, H. J. Pandit, and D. Lewis, “AIRO: An Ontology for Representing AI Risks Based on the Proposed EU AI Act and ISO Risk Management Standards,” in Towards a Knowledge-Aware AI, 2022, pp. 51–65, doi: 10.3233/SSW220008.

[81]

D. Golpayegani, H. J. Pandit, and D. Lewis, “To Be High-Risk, or Not To Be—Semantic Specifications and Implications of the AI Act’s High-Risk AI Applications and Harmonised Standards,” in Proceedings of the 2023 ACM Conference on Fairness, Accountability, and Transparency, 2023, pp. 905–915, doi: 10.1145/3593013.3594050.

[82]

D. Golpayegani et al., “AI Cards: Towards an Applied Framework for Machine-Readable AI and Risk Documentation Inspired by the EU AI Act,” in Annual Privacy Forum, 2024, doi: 10.48550/arXiv.2406.18211 [Online]. Available: https://arxiv.org/abs/2406.18211. [Accessed: 08-Jul-2024]

[83]

H. J. Pandit, J. Lindquist, and G. P. Krog, “Implementing ISO/IEC TS 27560:2023 Consent Records and Receipts for GDPR and DGA,” in Annual Privacy Forum, 2024, doi: 10.48550/arXiv.2405.04528 [Online]. Available: https://arxiv.org/abs/2405.04528. [Accessed: 08-Jul-2024]

[84]

M. C. Suárez-Figueroa, A. Gómez-Pérez, and M. Fernández-López, “The NeOn methodology for ontology engineering,” in Ontology engineering in a networked world, Springer, 2011, pp. 9–34.

[85]

M. Poveda-Villalón, A. Fernández-Izquierdo, M. Fernández-López, and R. García-Castro, “LOT: An industrial oriented ontology engineering framework,” Engineering Applications of Artificial Intelligence, vol. 111, p. 104755, 2022.

[86]

D. Garijo, “WIDOCO: A wizard for documenting ontologies,” in The Semantic Web – ISWC 2017: 16th International Semantic Web Conference, Vienna, Austria, October 21-25, 2017, Proceedings, Part II, 2017, pp. 94–102.