DPCat is a specification for an interoperable and machine-readable data processing catalogue based on [[[GDPR]]] requirements and EU DPA guidelines. It extends the [[[DCAT]]] and [[[DCAT-AP]]] standards and reuses [[[DPV]]] to enable data governance of ROPA and related information across a wide variety of use-cases.
Links:
DPCat Specification - the formal specification of DPCat concepts and requirements on their use
A Common Semantic Model for ROPA (CSM-ROPA) - the semantic model of ROPA related information generated from an analysis and consolidation of GDPR requirements and 31 EU DPA guidelines and templates
DPCat demo application and data - an application demonstrating usefulness of DPCat over ROPA documents published by the EDPS
For a complete discussion on the research aspects of this work, relation to state of the art, and discussion of its practicality and merit, please see our research article: Ryan, Paul, Brennan, Rob, & Pandit, Harshvardhan J. (2022). DPCat: Specification for an Interoperable and Machine-Readable Data Processing Catalogue based on GDPR. https://doi.org/10.5281/zenodo.6448788
Introduction
[[GDPR]] requires Data Controllers and Data Protection Officers (DPO) to maintain a Register of Processing Activities (ROPA) as part of overseeing the organisation's compliance processes. The ROPA must include information from heterogeneous sources such as (internal) departments with varying IT systems and (external) Data Processors. Current practices use spreadsheets or proprietary systems that lack machine-readability and interoperability, presenting barriers to automation.
Gathering the information necessary to create a ROPA is not a one-off activity, as there may be several data sources both internal (e.g. departments) and external (e.g. Data Processors). ROPA creation therefore requires communication between these distinct units to collate information from heterogeneous sources into a single location. This necessitates some form of information management process for the tasks associated with documents, such as reading or viewing them, writing all or parts of them, exchanging them between relevant stakeholders, and ensuring their correctness and availability (e.g. backups or version control).
However, existing RegTech solutions are primarily centralised and proprietary, and emphasise custom processes that cannot be utilised outside of vendor-defined use-cases. In particular, the information exchanged between internal and external stakeholders has been poorly addressed in both academic research and commercial offerings, despite the need for shared business and regulatory taxonomies that facilitate semantic interoperability between stakeholders and help identify feasible and compliant software solutions for data protection and privacy regulations.
We propose an approach to solving these challenges in which we identify what data is required to complete a ROPA, who the ROPA stakeholders are, how they utilise the ROPA, and which information flows require interoperability and machine-readability of the ROPA. To address the identified challenges, we present our work based on the following research objectives:
Identify information and information flows relevant for a ROPA in terms of stakeholders based on GDPR and EU DPAs guidelines and templates
Develop a machine-readable specification for representing and exchanging ROPA relevant information in an interoperable manner
Specify a mechanism for using developed machine-readable formats for aggregation, querying, validation, and exporting of information based on identified ROPA-related information flows.
The principal contributions of this paper are summarised as follows:
Use-cases exploring ROPA data governance and stakeholders (RO1)
A Common Semantic Model for ROPA (CSM-ROPA) representing information requirements from EU DPAs (RO2)
Data Processing Catalogue (DPCat) specification for representing and exchanging ROPA related information and provenance (RO2)
Demonstration of representation, querying, validation, and exchange of ROPA related information using DPCat and semantic web technologies (RO3)
Discussion on the practicality and application of DPCat as a ‘common mechanism’ for exchanging compliance information
Common Semantic Model of ROPA
DPA ROPA Template Analysis
Based on EDPB membership, there are 31 DPAs under the [[GDPR]]: those of the 27 EU Member States, the EDPS, and the 3 EFTA EEA states, with the German regional DPAs counted as part of the central DPA. Each DPA provides guidance regarding ROPA based on GDPR Art.30, and some DPAs also provide templates to assist organisations with maintaining their ROPA documents. The DPA ROPA templates go beyond the GDPR Art.30 requirements, are not consistent with one another, and therefore make it challenging to produce a collective understanding of what information is required for maintaining a ROPA.
In this work, we analysed these 31 DPAs and found that 17 provided ROPA templates, varying in language and content. Of the 17 DPA templates, 5 were in English; we used Google Translate to convert the rest to English and manually ensured that the terms used were translated consistently across templates. On these, we then performed term extraction, semantic analysis, term frequency enumeration, de-duplication, and antonym/homonym identification. We found that templates with minimal information restricted their contents to what is needed for conforming with GDPR Art.30. Some templates, such as those provided by the Belgian and Greek DPAs, were extensive and included fields beyond what the GDPR or other DPAs suggested. The exercise, carried out over 2020-2022, yielded 47 unique concepts representing information to be recorded in a ROPA. Of these, 18 concepts were related to the requirements defined in GDPR Art.30, and the rest (29 concepts) were either supplementary to these or added by DPAs (we could not discern the source or basis in law, EU or national, for concepts added by DPAs). An overview of the exercise is presented in the table below, which shows the identified concepts and their relevance to each DPA template analysed.
GDPR
Field
A.30
BE
GR
GB
PL
CY
FR
PT
DE
DK
LU
FI
CZ
HR
IT
LT
SI
SK
5
Personal Data Location
⨯
✓
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
5.1
Data Sources
⨯
✓
✓
✓
✓
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
6.1
Legal basis
⨯
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
6.1
Record of consent
⨯
✓
✓
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
9.1
Special Personal Data Category
⨯
✓
✓
✓
✓
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
9.1
Vulnerable Data Subject Category
⨯
✓
✓
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
22.1
Automated decision making, profiling
⨯
✓
✓
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
26.1
Joint Controller agreement
⨯
✓
✓
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
28
Data Processors
⨯
✓
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
28.3
Data Processing Contract
⨯
✓
✓
✓
✓
✓
✓
✓
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
30.1
Processing Status
⨯
✓
✓
✓
✓
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
32
Tech/Org measures implementation
⨯
✓
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
32
Security measures
⨯
✓
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
32
Technologies used
⨯
✓
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
33.5
Data Breach
⨯
✓
✓
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
35
Risk assessment and mitigation
⨯
✓
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
35
DPIA Results
⨯
✓
✓
✓
✓
✓
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
35
Relevant DPIA
⨯
✓
✓
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
36.1
Impact Assessment, Prior Consultation
⨯
✓
✓
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
37.6
External DPO organisation
⨯
✓
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
Business Process
⨯
✓
✓
✓
✓
✓
✓
✓
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
Owner of Process
⨯
✓
✓
✓
✓
✓
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
Type of Processing
⨯
✓
✓
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
13, 14, 15
Data Subject Rights
⨯
✓
✓
✓
✓
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
28, 30.1(c)
Third Party Data Transfer
⨯
✓
✓
✓
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
30.1(a)
Data Protection Officer Contact
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
30.1(a)
Representative
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
30.1(a)
Representative Contact
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
30.1(a)
Joint Controller Name
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
30.1(a)
Joint Controller contact
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
30.1(b)
Purposes of processing
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
30.1(b)
Main/Auxiliary Processing
⨯
✓
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
30.1(c)
Personal Data Categories
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
30.1(c)
Data Subject Categories
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
30.1(d)
Recipient Categories
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
30.1(e)
Third Countries in Data Transfer
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
30.1(e)
Appropriate Safeguards
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
30.1(f)
Retention/Deletion Periods
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
⨯
30.1(g)
Tech/Org measures
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
30(1)(a)
Data Controller Contact
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
30(1)(a)
Data Protection Officer
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
30(1)(a)
Data Protection Officer Contact
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
44-47
Nature of Transfer
⨯
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
✓
6.1(f)
Legitimate interests
⨯
✓
✓
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
6.1(f)
Legitimate interests assessment
⨯
✓
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
6, 14, 30.1(b)
Data Combination
⨯
✓
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
⨯
Nos. Fields
16
32
31
29
25
23
23
23
22
21
21
19
18
18
18
18
18
17
Mapping ROPA concepts with DPV
The DPV provides a semantic vocabulary consisting of hierarchical taxonomies of concepts relevant to GDPR, such as personal data, purposes, processing operations, technical and organisational measures, legal bases, and entities. We chose DPV as it provides the most comprehensive vocabulary for our purposes, is open and accessible, has ongoing development and mechanisms to submit contributions, and is familiar to the authors. The process of representing identified concepts using DPV was as follows. For each term, we identified whether the DPV contained the (semantically) exact concept, which we call an 'exact match'. Failing that, we looked for the closest relevant term(s) which could be used as a substitute, called a 'partial match'. If no existing term could represent the concept, we considered it a 'new term' to be proposed to the DPVCG for inclusion in the DPV.
Of the 47 unique concepts found through the ROPA template analysis, we found 44 exact matches and one partial match, and proposed two new terms, which were added to the DPV. The output of this was the CSM-ROPA, consisting of 47 concepts covering information requirements from GDPR and DPA templates for representing a ROPA. CSM-ROPA, through the use of DPV concepts, provides the ability to express a ROPA as a machine-readable and interoperable 'graph' that can be utilised in technological solutions for automating processes associated with ROPA and GDPR compliance. The CSM-ROPA data and analysis are available online at https://w3id.org/dpcat/csm-ropa. The table below provides an overview of this mapping.
| GDPR | Field | DPV | Mapping | Controllers | Processors |
|---|---|---|---|---|---|
| 5 | Location of personal data | dpv:StorageLocation | E | R | R |
| 5.1 | Data Sources | dpv:DataSource | E | R | O |
| 6.1 | Legal basis | dpv:LegalBasis | E | M | O |
| 6.1 | Link to record of consent | dpv:Consent | E | R | R |
| 9.1 | Special Personal Data | dpv:SpecialCategoryPersonalData | E | R | O |
| 9.1 | Vulnerable Data Subjects | dpv:VulnerableDataSubject | E | R | O |
| 22.1 | Automated decision-making or profiling | dpv:AutomatedDecisionMaking | E | R | R |
| 26.1 | Joint Controller agreement | dpv:JointDataControllersAgreement | E | R | N/A |
| 28 | Data Processors | dpv:DataProcessor | E | R | M |
| 28.3 | Data Processing Contract | dpv:DataProcessingAgreement | E | R | R |
| 28.3 | Data processor contract | dpv:ControllerProcessorAgreement | E | R | R |
| 30.1 | Status of processing | dpv:Status | S | M | M |
| 32 | Tech/Org measures implementation | dpv:Technology | E | R | R |
| 32 | Security measures | dpv:TechnicalOrganisationalMeasure | E | R | R |
| 32 | Technologies used | dpv:Technology | E | R | R |
| 33.5 | Data Breach | dpcat:DataBreachRecord | S | R | R |
| 35 | Risk management | dpv:RiskMitigationMeasure | E | R | O |
| 35 | DPIA Results | dpv:DPIA | E | R | O |
| 35 | Relevant DPIA | dpv:DPIA | E | R | R |
| 36.1 | Impact Assessments | dpv:ImpactAssessment | E | R | R |
| 36.1 | Prior Consultations | dpv:Consultation | E | R | R |
| 37.6 | External DPO organisation | dpv:DataProtectionOfficer | E | R | R |
| ⨯ | Name of Business Process | dpv:PersonalDataHandling | Pt | O | O |
| ⨯ | Owner of Process | dcat:contactPoint | E | M | M |
| ⨯ | Type of Processing | dpv:Processing | E | O | O |
| 13, 14, 15 | Data Subject Rights | dpv:DataSubjectRight | E | R | O |
| 28, 30.1(c) | Data Categories Transfer to Third Parties | dpv:Transfer, dpv:PersonalData | E | R | R |
| 30.1(a) | DPO contact | dpv:hasName, dpv:hasContact | E | MC | MC |
| 30.1(a) | Representative | dpv:Representative | E | MC | N/A |
| 30.1(a) | Representative contact | dpv:hasName, dpv:hasContact | E | MC | N/A |
| 30.1(a) | Name of joint controller | dpv:JointDataController | E | MC | N/A |
| 30.1(a) | Contact details of joint controller | dpv:hasName, dpv:hasContact | E | MC | N/A |
| 30.1(b) | Purposes of processing | dpv:Purpose | E | M | O |
| 30.1(b) | Main/Auxiliary Processing | dpv:Importance (Primary, Secondary) | E | O | O |
| 30.1(c) | Personal Data Categories | dpv:PersonalDataCategory | E | M | M |
| 30.1(c) | Categories of data subjects | dpv:DataSubject | E | M | M |
| 30.1(d) | Categories of Recipients | dpv:Recipient | E | MC | MC |
| 30.1(e) | Third Countries Data Transfer | dpv:ThirdCountry | E | MC | MC |
| 30.1(e) | Appropriate Safeguards | dpv:Safeguard | E | MC | MC |
| 30.1(f) | Retention/Deletion Periods | dpv:StorageDuration | E | M | O |
| 30.1(g) | Technical and organisational measures | dpv:TechnicalOrganisationalMeasure | E | M | M |
| 30(1)(a) | Data Controller contact | dpv:hasName, dpv:hasContact | E | M | M |
| 30(1)(a) | Data Protection Officer | dpv:DataProtectionOfficer | E | MC | MC |
| 44-47 | Nature of Transfer | dpv:DataTransferLegalBasis | E | MC | MC |
| 6.1(f) | Legitimate interests | dpv:LegitimateInterest | E | R | R |
| 6.1(f) | Legitimate interests assessment | dpv:LegitimateInterestAssessment | E | R | R |
| 6, 14, 30.1(b) | Data Combination | dpv:Combine | E | R | O |
DPCat
Overview
DPCat extends the DCAT Application profile for data portals in Europe (DCAT-AP) with concepts identified in CSM-ROPA using DPV to enable representation of ROPA and associated information as ‘catalogues’ and ‘datasets’ respectively, that can be recorded and exchanged between stakeholders. DCAT-AP is a profile of the Data Catalog Vocabulary (DCAT v2) - a W3C standard for facilitating interoperability between data catalogues. DPCat maintains compatibility with DCAT-AP, and through it with DCAT, thereby enabling it to be used in all catalogue-based information management tools and data portals that support DCAT. In particular, the choice of DCAT-AP was made to present a mechanism for sharing ROPA related information using an EU-advocated standard and to promote the possibility of reusing existing data portal infrastructures for compliance-related purposes - such as requirements for ROPA between controllers, processors, and DPAs.
DPCat, summarised in the figure above, distinguishes between ROPA (as a document or an artefact) and ‘entries’ within a ROPA where each entry represents a specific context - such as a business process or data processing purpose. To represent these, we semantically extend the DCAT(-AP) concepts ‘catalog’ and ‘dataset’ as ‘ROPA’ and ‘ROPARecord’, respectively. We also extend ‘catalog’ as ‘ROPACatalog’ to represent a collection of ROPA catalogues (i.e. a catalogue of catalogues) for when an organisation has multiple ROPA documents e.g. representing different temporal periods or activities or organisational units (e.g. departments).
ROPARecord is a dcat:Dataset that catalogues information to be documented in a ROPA, is akin to a ’single row’ in a ROPA spreadsheet and represents a single record of processing. It is used as an instance of dpv:PersonalDataHandling to associate concepts such as purposes of processing or legal bases using the relevant DPV concepts identified from the CSM-ROPA analysis. To ensure compatibility with DCAT and DCAT-AP requirements and recommendations, such as a publisher being a foaf:Agent, DPCat declares the relevant DPV concepts as a subclass of DCAT(-AP) specified concepts. In a ROPARecord instance, the concepts are coherent i.e. all purposes apply to all personal data and are shared with all recipients and so on. To indicate separation, separate instances should be created.
A (dpcat:)ROPA represents a dcat:Catalog consisting of one or more ROPARecord datasets and reflects the conventional perspective of ‘ROPA as a single document’ with each entry being a ROPARecord within the catalogue. In both ROPA and ROPARecord, the DCAT properties provide an association with relevant information such as the publisher indicating who had produced or provided that record, temporal annotations such as when the record was produced, or the time period represented, and annotations such as titles and descriptions. A ROPACatalog is the same as a ROPA in terms of being extended from dcat:Catalog and is used to bundle related ROPA catalogues together using dcat:catalog relation.
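The structure described above can be sketched in Turtle as a ROPA catalogue containing a single ROPARecord. This is a minimal illustration rather than a normative example: the `dpcat:` namespace IRI, the `ex:` IRIs, the organisation name, and the specific purpose, personal data, and recipient instances are all assumptions made for the example.

```turtle
@prefix dpcat: <https://w3id.org/dpcat#> .      # assumed DPCat namespace
@prefix dpv:   <https://w3id.org/dpv#> .
@prefix dcat:  <http://www.w3.org/ns/dcat#> .
@prefix dct:   <http://purl.org/dc/terms/> .
@prefix xsd:   <http://www.w3.org/2001/XMLSchema#> .
@prefix ex:    <https://example.com/ropa/> .    # hypothetical organisation namespace

# The ROPA as a single document (a dcat:Catalog)
ex:ropa-2022 a dpcat:ROPA ;
    dct:title "ROPA for Acme Ltd. (2022)"@en ;
    dct:publisher ex:acme ;
    dpv:hasDataController ex:acme ;
    dct:issued "2022-03-01"^^xsd:date ;
    dcat:dataset ex:record-newsletter .

# One entry, akin to a single row in a ROPA spreadsheet
ex:record-newsletter a dpcat:ROPARecord ;
    dct:title "Newsletter distribution"@en ;
    dct:publisher ex:marketing ;
    dpv:hasPurpose ex:SendNewsletter ;
    dpv:hasPersonalData ex:EmailAddress ;
    dpv:hasRecipient ex:MailingProvider .

# Hypothetical entities and concepts used by the record
ex:acme a dpv:DataController ; dpv:hasName "Acme Ltd." .
ex:marketing a dpv:OrganisationalUnit ; dpv:hasName "Marketing Department" .
ex:SendNewsletter a dpv:Purpose .
ex:EmailAddress a dpv:PersonalData .
ex:MailingProvider a dpv:Recipient .
```

Because the concepts within a ROPARecord are coherent, a second purpose that involves different personal data or recipients would be modelled as a second `dpcat:ROPARecord` in the same catalogue rather than added to this one.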
For common ROPA-related communication between stakeholders, such as associating a ‘point of contact’ (e.g. department or manager) for that information, DPCat uses DCAT relation dcat:contactPoint. Additionally, to adhere to GDPR terminology, it uses the DPV properties to indicate controller (dpv:hasDataController), DPO (dpv:hasDataProtectionOfficer), and ‘responsible entity’ (dpv:hasResponsibleEntity). In this, the overlap between DCAT and DPV terms, such as the controller being the publisher or the DPO being the point of contact, may not always occur - such as when representing activities limited to a department where the point of contact is a member of that department who liaises with the DPO.
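As an illustration of the case where the DCAT and DPV roles do not coincide, the following sketch shows a record produced by a department whose point of contact is a department member rather than the DPO. All `ex:` IRIs, names, and contact values are hypothetical, and the `dpcat:` namespace IRI is an assumption.

```turtle
@prefix dpcat: <https://w3id.org/dpcat#> .      # assumed DPCat namespace
@prefix dpv:   <https://w3id.org/dpv#> .
@prefix dcat:  <http://www.w3.org/ns/dcat#> .
@prefix dct:   <http://purl.org/dc/terms/> .
@prefix ex:    <https://example.com/ropa/> .    # hypothetical organisation namespace

ex:record-payroll a dpcat:ROPARecord ;
    dct:title "Payroll processing"@en ;
    dct:publisher ex:hr ;                       # who produced the record
    dcat:contactPoint ex:hr-manager ;           # who to ask about the record
    dpv:hasDataController ex:acme ;             # controller in the GDPR sense
    dpv:hasDataProtectionOfficer ex:dpo ;       # DPO, distinct from the contact point
    dpv:hasResponsibleEntity ex:hr .            # unit responsible for the processing

ex:acme a dpv:DataController ; dpv:hasName "Acme Ltd." .
ex:hr a dpv:OrganisationalUnit ; dpv:hasName "Human Resources" .
ex:hr-manager a dpv:Entity ;
    dpv:hasName "HR Manager" ;
    dpv:hasContact "mailto:hr-manager@example.com" .
ex:dpo a dpv:DataProtectionOfficer ;
    dpv:hasName "Data Protection Officer" ;
    dpv:hasContact "mailto:dpo@example.com" .
```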
Using DPCat Internally
We envision DPCat being integrated into a typical workflow (i.e. U2-U4) for recording ROPA as follows. The source (e.g. department representative) generates a ROPARecord containing relevant information with provenance as the department. They use mechanisms available to them - e.g. a series of forms or a script that converts spreadsheets. This information is collated into a ROPA collection representing contextual grouping as determined by the organisational structure (e.g. maintained per department). For sources external to the organisation (e.g. a processor), the provided information is similarly stored in dedicated ROPA and ROPARecord entries and optionally integrated directly into relevant datasets (e.g. controller listing processor’s technical measures in its ROPA). This can use technological solutions such as a database or a portal. To facilitate the structuring of ROPA records in an organised manner, ROPACatalog entries are used to collect and group ROPA entries according to some criteria, e.g. temporal period, legal counsel, or responsible managers.
DPCat supports and enables a wide assortment of queries and validation approaches that utilise its metadata-based structure to perform information retrieval and verification tasks. Through these, DPCat can be a vital tool in technological solutions used for compliance-related processes. This section presents a few examples of queries and validation tasks that motivate the use of DPCat in an organisation’s ROPA related processes.
A common query associated with ROPA is retrieving GDPR Art.30 information for a specific context, such as data transfers or a particular time period. DPCat supports such queries through DCAT and DPV metadata, e.g. indicating data transfers using dpv:Transfer and dpv:hasLocation, and using dct:temporal to perform time-based filtering. An example of this expressed as a SPARQL query is provided below.
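```sparql
PREFIX dpcat: <https://w3id.org/dpcat#>   # assumed DPCat namespace
PREFIX dpv:   <https://w3id.org/dpv#>
PREFIX dct:   <http://purl.org/dc/terms/>
PREFIX dcat:  <http://www.w3.org/ns/dcat#>
PREFIX time:  <http://www.w3.org/2006/time#>
PREFIX xsd:   <http://www.w3.org/2001/XMLSchema#>

SELECT ?title ?publisher ?contact ?created ?start ?end ?importer ?exporter
WHERE {
    ?Entry a dpcat:ROPARecord .
    ?Entry dct:title ?title .
    ?Entry dct:publisher/dpv:hasName ?publisher .
    OPTIONAL { ?Entry dcat:contactPoint/dpv:hasName ?contact } .
    ?Entry dct:created ?created .
    ?Entry dpv:hasProcessing ?transfer .
    ?transfer a dpv:Transfer .
    # minimum date within which data transfer occurs
    ?Entry dct:temporal/time:hasBeginning/time:inXSDDate ?start .
    FILTER (?start < "2021-01-01"^^xsd:date) .
    OPTIONAL {
        # maximum date, if available
        ?Entry dct:temporal/time:hasEnd/time:inXSDDate ?end .
        FILTER (?end > "2022-01-01"^^xsd:date) .
    }
    OPTIONAL { ?transfer dpv:hasDataImporter/dpv:hasName ?importer . }
    OPTIONAL { ?transfer dpv:hasDataExporter/dpv:hasName ?exporter . }
}
```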
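To make the graph pattern in the query above concrete, the following Turtle sketch shows a record that is intended to match it: the record's time period begins before 2021-01-01, ends after 2022-01-01, and involves a processing instance typed as `dpv:Transfer`. The `ex:` IRIs, names, and dates are hypothetical, and the `dpcat:` namespace IRI is an assumption.

```turtle
@prefix dpcat: <https://w3id.org/dpcat#> .      # assumed DPCat namespace
@prefix dpv:   <https://w3id.org/dpv#> .
@prefix dct:   <http://purl.org/dc/terms/> .
@prefix time:  <http://www.w3.org/2006/time#> .
@prefix xsd:   <http://www.w3.org/2001/XMLSchema#> .
@prefix ex:    <https://example.com/ropa/> .    # hypothetical organisation namespace

ex:record-analytics a dpcat:ROPARecord ;
    dct:title "Web analytics export"@en ;
    dct:publisher ex:acme ;
    dct:created "2020-05-20"^^xsd:date ;
    dpv:hasProcessing ex:analytics-transfer ;
    # time period covered by the record, expressed with OWL-Time instants
    dct:temporal [
        a dct:PeriodOfTime ;
        time:hasBeginning [ a time:Instant ; time:inXSDDate "2020-06-01"^^xsd:date ] ;
        time:hasEnd       [ a time:Instant ; time:inXSDDate "2022-06-30"^^xsd:date ]
    ] .

ex:analytics-transfer a dpv:Transfer ;
    dpv:hasDataImporter ex:analytics-vendor ;
    dpv:hasDataExporter ex:acme .

ex:acme a dpv:DataController ; dpv:hasName "Acme Ltd." .
ex:analytics-vendor a dpv:DataProcessor ; dpv:hasName "Analytics Vendor Inc." .
```

Since the record has no `dcat:contactPoint`, the corresponding optional variable would simply remain unbound for it.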
Similar to querying, DPCat also supports verification and validation of information, typically for ensuring or assessing compliance with the GDPR. Validation refers to checking whether the information is available, is in the correct form and format, and is sufficient according to some requirements. Verification refers to the evaluation of the information based on some norms, such as specific obligations of the GDPR.
Constraints based on mandatory fields as prescribed by DCAT and DCAT-AP specifications also apply to DPCat since it extends them. Therefore, data represented using DPCat can utilise existing validation and verification mechanisms for conformance to these standards. In addition, DPCat promotes the expression of GDPR-specific constraints that are typically expressed as guidelines by DPAs and have been the subject of research by academic and commercial offerings. However, DPCat has an advantage over these existing solutions in that it also promotes interoperability between such verification mechanisms by virtue of being an interoperable specification for information to be verified.
As an example of the information validation typically involved for GDPR, the listing below presents a SHACL constraint that ensures every ROPARecord instance has an associated purpose. In addition to ensuring information is present and in the correct form, SHACL constraints are also useful towards GDPR compliance, such as for ensuring an appropriate legal basis (though GDPR Art.30 does not require a legal basis in a ROPA, DPA guidelines strongly recommend it) as follows: (i) It must have a corresponding legal basis from GDPR Art.6; (ii) If processing involves special categories of personal data, it must additionally have a corresponding legal basis from GDPR Art.9; (iii) If processing involves data transfers to non-EU locations, it must additionally have a corresponding legal basis from GDPR Art.45, Art.46, or Art.49. We plan to provide such SHACL shapes for both information validation and GDPR-based requirements verification in the future.
```turtle
@prefix sh:    <http://www.w3.org/ns/shacl#> .
@prefix dpv:   <https://w3id.org/dpv#> .
@prefix dpcat: <https://w3id.org/dpcat#> .  # assumed DPCat namespace

dpcat:Shape_EnsurePurpose
    a sh:NodeShape ;
    sh:name "Ensure every processing has a denoted Purpose"@en ;
    sh:description "Ensure the dpv:hasPurpose property is defined, and has a value that is an instance of dpv:Purpose"@en ;
    sh:targetClass dpcat:ROPARecord ;
    sh:property [
        sh:path dpv:hasPurpose ;
        sh:class dpv:Purpose ;
        sh:minCount 1 ;
    ] .
```
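Following the same pattern, the first part of requirement (i) could be approximated by a shape that requires at least one legal basis on every record. This is only a sketch and not one of the shapes the authors plan to publish: the shape name `dpcat:Shape_EnsureLegalBasis` and the `dpcat:` namespace IRI are assumptions, and the shape checks only that some instance of `dpv:LegalBasis` is present, not that it is specifically a legal basis from GDPR Art.6.

```turtle
@prefix sh:    <http://www.w3.org/ns/shacl#> .
@prefix dpv:   <https://w3id.org/dpv#> .
@prefix dpcat: <https://w3id.org/dpcat#> .  # assumed DPCat namespace

# Hypothetical shape name, mirroring the purpose shape
dpcat:Shape_EnsureLegalBasis
    a sh:NodeShape ;
    sh:name "Ensure every processing has a denoted legal basis"@en ;
    sh:description "Ensure the dpv:hasLegalBasis property is defined, and has a value that is an instance of dpv:LegalBasis"@en ;
    sh:targetClass dpcat:ROPARecord ;
    sh:property [
        sh:path dpv:hasLegalBasis ;
        sh:class dpv:LegalBasis ;
        sh:minCount 1 ;
        sh:message "A ROPARecord should indicate at least one legal basis"@en ;
    ] .
```

Requirements (ii) and (iii) are conditional on other values in the record and would need additional constructs, which are not attempted here.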
Using DPCat Externally
When a Data Controller engages Data Processors, its obligations include maintaining information about the processing activities outsourced to each processor and some specifics regarding how they are carried out. For example, controllers may ask processors to provide the technical and organisational measures they implement to ensure sufficient safety and security in processing. Similarly, controllers may require information on data storage locations for cross-border data transfers. In cases where a processor contracts another (sub-)processor to carry out the processing, it has to maintain similar records of the sub-processor’s operations and provide them to the controller as requested. In all these cases, information has to be maintained independently by each entity, communicated periodically to other entities as contextually necessary, and maintained independently by the receiving entities as well. Such information flows and requirements also apply to joint controller relationships.
If two entities communicating information for ROPA related tasks use DPCat for their internal information representation, they can directly exchange ROPA information using DPCat specified records. This is an ideal scenario. However, even if only one or neither of the entities uses DPCat internally, DPCat can be utilised as a common specification for exchanging ROPA information between them. In this case, the sender converts whatever internal representation it has into DPCat and sends it to the receiver, ensuring that the receiver can understand and interpret the information. The receiver then converts the DPCat based information to whatever internal representation it utilises. Thus, DPCat offers advantages for ROPA information exchanges even if organisations do not wish to adopt it completely for internal processes. DPCat is also useful for DPAs and auditors in the same manner, as they can utilise it as an interoperable format for requesting information from organisations. The consistency and machine-readability of DPCat provide investigators with the potential for using automation and tools to reduce workload and repetition.
A ROPA is typically a document that contains several entries. In DPCat, this is represented by subclassing `dcat:Catalog` as `dpcat:ROPA` which can be associated with one or more `dpcat:ROPARecord` instances.
DCAT and DCAT-AP require publishers to be an instance of `foaf:Agent`. DPCat considers `dpv:Entity` to be equivalent to `foaf:Agent`. Therefore, specifying the publisher to be an entity, e.g. `dpv:DataController`, satisfies this condition, provided the constraint validation mechanism has this knowledge or it has been inferred before validation.
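One way to make this knowledge available to a validator is to supply the equivalence as an axiom and materialise the inferred types before validation. The sketch below illustrates the idea; the exact axioms used in the DPCat serialisation are not shown in this document, so the `owl:equivalentClass` statement, the directly stated subclass relation, and the `ex:` IRIs are assumptions.

```turtle
@prefix dpv:  <https://w3id.org/dpv#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix ex:   <https://example.com/ropa/> .     # hypothetical organisation namespace

# Assumed axioms: an entity in DPV terms is an agent in FOAF terms.
dpv:Entity owl:equivalentClass foaf:Agent .
# Stated directly here so that the example is self-contained.
dpv:DataController rdfs:subClassOf dpv:Entity .

# As asserted by the publisher of the ROPA
ex:acme a dpv:DataController .

# As materialised by a reasoner before validation, so that a check
# expecting the publisher to be a foaf:Agent succeeds
ex:acme a dpv:Entity , foaf:Agent .
```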
Recommended: Every ROPA SHOULD have a dcat:contactPoint
dct:source
This property is present in DCAT and DCAT-AP
DCAT and DCAT-AP require contact points to be an instance of `vcard:Kind`. In this, the contact point for a ROPA could be an agent e.g. a person, department; or a specific contact e.g. email address, telephone number. Therefore, it is difficult to indicate exactly the kind of value to be expected here as it can be either an agent or a contact. Given that ROPA documents are used for accountability purposes, DPCat recommends specifying an agent as the contact point, and additionally providing their contact information as annotations. For example, if the point of contact for a ROPA is a department, the value of this property would be an instance of `dpv:OrganisationalUnit` with information on how to reach them using the annotation `dpv:hasContact`.
Indicates date/time the ROPA was 'issued' or 'published'
rdfs:range
rdfs:Literal encoded using the relevant ISO 8601 Date and Time compliant string [DATETIME] and typed using the appropriate XML Schema datatype [XMLSCHEMA11-2] (xsd:gYear, xsd:gYearMonth, xsd:date, or xsd:dateTime).
Optional: Every ROPA MAY have a dct:temporal. DCAT and DCAT-AP recommend: the start and end of the interval SHOULD be given by using properties `dcat:startDate` or `time:hasBeginning`, and `dcat:endDate` or `time:hasEnd`, respectively. The interval can also be open - i.e., it can have just a start or just an end.
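The date and interval conventions above can be sketched as follows, showing one ROPA with a closed interval and one with an open-ended interval that has only a start date. The `ex:` IRIs and dates are hypothetical, the `dpcat:` namespace IRI is an assumption, and other properties of the ROPA are omitted for brevity.

```turtle
@prefix dpcat: <https://w3id.org/dpcat#> .      # assumed DPCat namespace
@prefix dcat:  <http://www.w3.org/ns/dcat#> .
@prefix dct:   <http://purl.org/dc/terms/> .
@prefix xsd:   <http://www.w3.org/2001/XMLSchema#> .
@prefix ex:    <https://example.com/ropa/> .    # hypothetical organisation namespace

# Closed interval: the ROPA covers the calendar year 2022
ex:ropa-2022 a dpcat:ROPA ;
    dct:issued "2022-03-01"^^xsd:date ;
    dct:temporal [
        a dct:PeriodOfTime ;
        dcat:startDate "2022-01-01"^^xsd:date ;
        dcat:endDate   "2022-12-31"^^xsd:date
    ] .

# Open interval: only the start is known, and the issue date is given as a year
ex:ropa-ongoing a dpcat:ROPA ;
    dct:issued "2023"^^xsd:gYear ;
    dct:temporal [
        a dct:PeriodOfTime ;
        dcat:startDate "2023-01-01"^^xsd:date
    ] .
```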
Mandatory: Every ROPA MUST have one or more dcat:dataset
dct:source
This property is present in DCAT and DCAT-AP
DCAT and DCAT-AP require the value of `dcat:dataset` to be instances of `dcat:Dataset`. In DPCat, `dpcat:ROPARecord` is a subclass of `dcat:Dataset`, thereby permitting the usage of this property to link `ROPA` with `ROPARecord`. However, for backward and continued compatibility with DCAT and DCAT-AP, the DPCat serialisation still mentions the range as `dcat:Dataset`, with the use of `dpcat:ROPARecord` provided here as a guideline and for validations.
Mandatory: Every ROPA MUST have a dpv:hasDataController
dct:source
This property is NOT present in DCAT and DCAT-AP. It is present in DPV.
This property assumes the ROPA being represented is that of a Data Controller. In the event that the entity is a Data Processor, the property `dpv:hasDataProcessor` with range `dpv:DataProcessor` should be used.
Conditionally Mandatory: Where a Representative has been nominated, the ROPA MUST have a dpv:hasRepresentative
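For the processor case, a ROPA maintained by a Data Processor might be sketched as below, with one record describing processing carried out on behalf of a controller together with a technical measure that the controller may request. The `ex:` IRIs, names, and the measure instance are hypothetical, and the `dpcat:` namespace IRI is an assumption.

```turtle
@prefix dpcat: <https://w3id.org/dpcat#> .      # assumed DPCat namespace
@prefix dpv:   <https://w3id.org/dpv#> .
@prefix dcat:  <http://www.w3.org/ns/dcat#> .
@prefix dct:   <http://purl.org/dc/terms/> .
@prefix ex:    <https://example.com/ropa/> .    # hypothetical organisation namespace

# ROPA maintained by a processor rather than a controller
ex:processor-ropa a dpcat:ROPA ;
    dct:title "ROPA of CloudHost Ltd."@en ;
    dct:publisher ex:cloudhost ;
    dpv:hasDataProcessor ex:cloudhost ;
    dcat:dataset ex:record-hosting .

# Record of processing carried out on behalf of a controller
ex:record-hosting a dpcat:ROPARecord ;
    dct:title "Hosting of customer database for Acme Ltd."@en ;
    dct:publisher ex:cloudhost ;
    dpv:hasDataController ex:acme ;
    dpv:hasTechnicalOrganisationalMeasure ex:EncryptionAtRest .

ex:cloudhost a dpv:DataProcessor ; dpv:hasName "CloudHost Ltd." .
ex:acme a dpv:DataController ; dpv:hasName "Acme Ltd." .
ex:EncryptionAtRest a dpv:TechnicalOrganisationalMeasure .
```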
dct:source
This property is NOT present in DCAT and DCAT-AP. It is present in DPV.
A 'Representative' can be used to indicate, for example, a Controller's representative nominated for a specific operation, unit, or jurisdiction. It can also be used to indicate the entity that is 'representing' the Controller (or a Processor) in terms of accountability or legal matters.
Optional: A ROPA MAY indicate a dpv:hasResponsibleEntity
dct:source
This property is NOT present in DCAT and DCAT-AP. It is present in DPV.
A ROPA can utilise `dpv:hasResponsibleEntity` to indicate the specific organisational unit (e.g. department) or processor or (joint-)controller that is 'responsible' for the specified processing activities.
A ROPA may be a collection of documents that contains several individual documents and records. In DPCat, this is represented by subclassing `dcat:Catalog` as `dpcat:ROPACatalog` which can be associated with one or more `dpcat:ROPA` instances.
Mandatory: Every ROPACatalog MUST have a dct:publisher
dct:source
This property is present in DCAT and DCAT-AP
DCAT and DCAT-AP require publishers to be an instance of `foaf:Agent`. DPCat considers `dpv:Entity` to be equivalent to `foaf:Agent`. Therefore, specifying the publisher to be an entity, e.g. `dpv:DataController`, satisfies this condition, provided the constraint validation mechanism has this knowledge or it has been inferred before validation.
Recommended: Every ROPACatalog SHOULD have a dcat:contactPoint
dct:source
This property is present in DCAT and DCAT-AP
DCAT and DCAT-AP require contact points to be an instance of `vcard:Kind`. In this, the contact point for a ROPACatalog could be an agent e.g. a person, department; or a specific contact e.g. email address, telephone number. Therefore, it is difficult to indicate exactly the kind of value to be expected here as it can be either an agent or a contact. Given that ROPACatalog documents are used for accountability purposes, DPCat recommends specifying an agent as the contact point, and additionally providing their contact information as annotations. For example, if the point of contact for a ROPACatalog is a department, the value of this property would be an instance of `dpv:OrganisationalUnit` with information on how to reach them using the annotation `dpv:hasContact`.
Indicates date/time the ROPACatalog was 'issued' or 'published'
rdfs:range
rdfs:Literal encoded using the relevant ISO 8601 Date and Time compliant string [DATETIME] and typed using the appropriate XML Schema datatype [XMLSCHEMA11-2] (xsd:gYear, xsd:gYearMonth, xsd:date, or xsd:dateTime).
vann:usageNote
Recommended: Every ROPACatalog SHOULD have a dct:issued
Optional: Every ROPACatalog MAY have a dct:temporal. DCAT and DCAT-AP recommend: the start and end of the interval SHOULD be given by using properties `dcat:startDate` or `time:hasBeginning`, and `dcat:endDate` or `time:hasEnd`, respectively. The interval can also be open - i.e., it can have just a start or just an end.
Mandatory: Every ROPACatalog MUST have a dpv:hasDataController
dct:source
This property is NOT present in DCAT and DCAT-AP. It is present in DPV.
This property assumes the ROPACatalog being represented is that of a Data Controller. In the event that the entity is a Data Processor, the property `dpv:hasDataProcessor` with range `dpv:DataProcessor` should be used.
Optional: A ROPACatalog MAY indicate a dpv:hasResponsibleEntity
dct:source
This property is NOT present in DCAT and DCAT-AP. It is present in DPV.
A ROPACatalog can utilise `dpv:hasResponsibleEntity` to indicate the specific organisational unit (e.g. department) or processor or (joint-)controller that is 'responsible' for the specified processing activities.
Mandatory: A ROPACatalog MUST associate ROPA instances using dcat:catalog
dct:source
This property is present in DCAT and DCAT-AP.
DCAT and DCAT-AP require the domain and range of `dcat:catalog` to be of value `dcat:Catalog`. In DPCat, `ROPACatalog` and `ROPA` are both subclasses of `dcat:Catalog`, thereby permitting this reuse. However, to preserve backwards and future compatibility with DCAT and DCAT-AP, the serialisation of DPCat defines the range of `dcat:catalog` as `dcat:Catalog`, with the use of range as `dpcat:ROPA` provided as a guideline and for validations.
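A ROPACatalog bundling per-department ROPA catalogues could look like the following sketch. The `ex:` IRIs, titles, and dates are hypothetical, the `dpcat:` namespace IRI is an assumption, and the records referenced via `dcat:dataset` are not expanded here.

```turtle
@prefix dpcat: <https://w3id.org/dpcat#> .      # assumed DPCat namespace
@prefix dpv:   <https://w3id.org/dpv#> .
@prefix dcat:  <http://www.w3.org/ns/dcat#> .
@prefix dct:   <http://purl.org/dc/terms/> .
@prefix xsd:   <http://www.w3.org/2001/XMLSchema#> .
@prefix ex:    <https://example.com/ropa/> .    # hypothetical organisation namespace

# Catalogue of catalogues: groups the ROPAs of individual departments
ex:ropa-collection a dpcat:ROPACatalog ;
    dct:title "All ROPA documents of Acme Ltd."@en ;
    dct:publisher ex:acme ;
    dpv:hasDataController ex:acme ;
    dct:issued "2022-03-01"^^xsd:date ;
    dcat:catalog ex:ropa-hr , ex:ropa-marketing .

ex:ropa-hr a dpcat:ROPA ;
    dct:title "ROPA of the HR department"@en ;
    dpv:hasDataController ex:acme ;
    dpv:hasResponsibleEntity ex:hr ;
    dcat:dataset ex:record-payroll .

ex:ropa-marketing a dpcat:ROPA ;
    dct:title "ROPA of the Marketing department"@en ;
    dpv:hasDataController ex:acme ;
    dpv:hasResponsibleEntity ex:marketing ;
    dcat:dataset ex:record-newsletter .

ex:acme a dpv:DataController ; dpv:hasName "Acme Ltd." .
ex:hr a dpv:OrganisationalUnit ; dpv:hasName "Human Resources" .
ex:marketing a dpv:OrganisationalUnit ; dpv:hasName "Marketing Department" .
ex:record-payroll a dpcat:ROPARecord .
ex:record-newsletter a dpcat:ROPARecord .
```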
A ROPA document can consist of several entries or records. In DPCat, each such record or entry is represented by subclassing `dcat:Dataset` as `dpcat:ROPARecord`, which can be associated with one or more `dpcat:ROPA` instances.
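The catalogue, ROPA, and record layering described above can be sketched as a set of triples. The example below uses plain Python tuples with prefixed names rather than an RDF library. It assumes that records are attached to a ROPA with DCAT's `dcat:dataset` property, which this section does not state explicitly, and the helper name `ropa_structure_triples` and the `ex:` identifiers are hypothetical.

```python
def ropa_structure_triples(catalog, ropas):
    """Build (subject, predicate, object) triples for one ROPACatalog.

    `ropas` maps each ROPA identifier to the list of its record identifiers.
    """
    triples = [(catalog, "rdf:type", "dpcat:ROPACatalog")]
    for ropa, records in ropas.items():
        triples.append((ropa, "rdf:type", "dpcat:ROPA"))
        # A ROPACatalog associates ROPA instances using dcat:catalog.
        triples.append((catalog, "dcat:catalog", ropa))
        for record in records:
            triples.append((record, "rdf:type", "dpcat:ROPARecord"))
            # Assumption: dcat:dataset links a ROPA to its records.
            triples.append((ropa, "dcat:dataset", record))
    # A record shared by several ROPAs is typed once; order is preserved.
    return list(dict.fromkeys(triples))


triples = ropa_structure_triples(
    "ex:catalog",
    {"ex:ropa-hr": ["ex:rec-payroll"], "ex:ropa-it": ["ex:rec-payroll"]},
)
```

Here the single record `ex:rec-payroll` is associated with two ROPA instances, matching the statement that a record can belong to more than one ROPA.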
Mandatory: Every ROPARecord MUST have a dct:publisher
dct:source
This property is present in DCAT and DCAT-AP
DCAT and DCAT-AP require publishers to be an instance of `foaf:Agent`. DPCat considers `dpv:Entity` to be equivalent to `foaf:Agent`. Therefore, specifying the publisher to be an entity, e.g. `dpv:DataController`, satisfies this condition, provided the constraint validation mechanism has this knowledge or it has been inferred before validation.
Recommended: Every ROPARecord SHOULD have a dcat:contactPoint
dct:source
This property is present in DCAT and DCAT-AP
DCAT and DCAT-AP require contact points to be an instance of `vcard:Kind`. The contact point for a ROPARecord could therefore be an agent (e.g. a person or department) or a specific contact (e.g. an email address or telephone number), which makes it difficult to state exactly what kind of value to expect. Given that ROPARecord documents are used for accountability purposes, DPCat recommends specifying an agent as the contact point, and additionally providing their contact information as annotations. For example, if the point of contact for a ROPARecord is a department, the value of this property would be an instance of `dpv:OrganisationalUnit` with information on how to reach them using the annotation `dpv:hasContact`.
Indicates date/time the ROPARecord was 'issued' or 'published'
rdfs:range
rdfs:Literal encoded using the relevant ISO 8601 Date and Time compliant string [DATETIME] and typed using the appropriate XML Schema datatype [XMLSCHEMA11-2] (xsd:gYear, xsd:gYearMonth, xsd:date, or xsd:dateTime).
vann:usageNote
Recommended: Every ROPARecord SHOULD have a dct:issued
Optional: Every ROPARecord MAY have a dct:temporal. DCAT and DCAT-AP recommend: the start and end of the interval SHOULD be given by using properties `dcat:startDate` or `time:hasBeginning`, and `dcat:endDate` or `time:hasEnd`, respectively. The interval can also be open - i.e., it can have just a start or just an end.
Mandatory: Every ROPARecord MUST have a dpv:hasDataController
dct:source
This property is NOT present in DCAT and DCAT-AP. It is present in DPV.
This property assumes the ROPARecord being represented is that of a Data Controller. In the event that the entity is a Data Processor, the property `dpv:hasDataProcessor` with range `dpv:Processor` should be used.
Conditionally Mandatory: Where a Representative has been nominated, a ROPARecord MUST have a dpv:hasRepresentative
dct:source
This property is NOT present in DCAT and DCAT-AP. It is present in DPV.
A 'Representative' can be used to indicate, for example, a Controller's representative nominated for a specific operation, unit, or jurisdiction. It can also be used to indicate the entity that is 'representing' the Controller (or a Processor) in terms of accountability or legal matters.
Optional: A ROPARecord MAY indicate a dpv:hasResponsibleEntity
dct:source
This property is NOT present in DCAT and DCAT-AP. It is present in DPV.
A ROPARecord can utilise `dpv:hasResponsibleEntity` to indicate the specific organisational unit (e.g. department) or processor or (joint-)controller that is 'responsible' for the specified processing activities.
Mandatory: A ROPARecord MUST indicate Storage Conditions using dpv:hasStorage. DPV provides the following categories of storage conditions: `dpv:StorageDeletion` to indicate deletion conditions, `dpv:StorageDuration` to indicate data retention or deletion periods, `dpv:StorageLocation` to indicate location of data storage, and `dpv:StorageRestoration` to indicate data restoration conditions.
dct:source
This property is NOT present in DCAT and DCAT-AP. It is present in DPV.
Conditionally Mandatory: Where Joint Controller relationships exist, a ROPARecord MUST indicate them using dpv:hasJointControllers. The Data Controllers involved in Joint Controllers are indicated using `dpv:hasDataController`, which may optionally omit the Data Controller whose ROPA this is i.e. only indicate the other controllers.
dct:source
This property is NOT present in DCAT and DCAT-AP. It is present in DPV.
Conditionally Mandatory: Where a Data Processor is involved, a ROPARecord MUST indicate this using dpv:hasDataProcessor. For ROPA of a Data Processor, use of this property indicates further association with other Data Processors i.e. Sub-Processors.
dct:source
This property is NOT present in DCAT and DCAT-AP. It is present in DPV.
Conditionally Recommended: Where another entity is engaged to carry out the processing, the ROPARecord SHOULD indicate this using dpv:hasOrganisationalMeasure. DPV provides the following categories of data processing agreements: `dpv:ControllerProcessorAgreement`, `dpv:JointDataControllersAgreement`, `dpv:SubProcessorAgreement`, `dpv:ThirdPartyAgreement`.
Conditionally Mandatory: Where an entity or processing activities are associated with a Third Country, the ROPARecord MUST indicate this using dpv:hasLocation.
Conditionally Mandatory: Where data transfers involve other jurisdictions, the ROPARecord MUST indicate the applicable legal basis using dpv:hasLegalBasis. DPV-GDPR provides the list of data transfer legal bases provided for within GDPR.
Conditionally Mandatory: Where personal data processing uses or requires safeguards, the ROPARecord MUST indicate this using dpv:hasOrganisationalMeasure. DPV provides categorisation of safeguards specific to data transfers using `dpv:SafeguardForDataTransfer`.
Recommended: The ROPARecord SHOULD indicate Risks using dpv:hasRisk. DPV provides the following risk management concepts: `dpv:RiskMitigationMeasure` associated with `dpv:Risk` using `dpv:mitigatesRisk` and `dpv:isMitigatedByMeasure`; `dpv:RiskManagementProcedure` as a subclass of `dpv:OrganisationalMeasure` and used with `dpv:hasOrganisationalMeasure`, and additional concepts for indicating the consequences and impacts.
Mandatory: A ROPARecord MUST indicate Technical and Organisational measures using dpv:hasTechnicalOrganisationalMeasure. DPV provides specific properties for distinguishing between technical (`dpv:hasTechnicalMeasure`) and organisational (`dpv:hasOrganisationalMeasure`) measures, which are subproperties of `dpv:hasTechnicalOrganisationalMeasure`.
Recommended: A ROPARecord SHOULD indicate Impact Assessments using dpv:hasOrganisationalMeasure. DPV provides the following categories of impact assessments: `dpv:DPIA`, `dpv:PIA`, `dpv:DataTransferImpactAssessment`.
Recommended: A ROPARecord SHOULD indicate a DPIA for processing activities using dpv:hasOrganisationalMeasure. Where a DPIA is not required to be carried out, this fact should also be recorded.
The outcome of a DPIA is being discussed within DPVCG, and can be associated using, e.g. `dpv:hasStatus` with `dpv:AuditApproved` or `dpv:AuditRejected` to indicate whether processing activities can be carried out or not.
Recommended: A ROPARecord SHOULD indicate a Consultation for processing activities using dpv:hasOrganisationalMeasure. DPV provides `dpv:ConsultationWithAuthority` to indicate a prior consultation with an authority such as the DPA.
Recommended: A ROPARecord SHOULD indicate Rights for processing activities using dpv:hasRight. DPV provides `dpv:DataSubjectRight` to specifically indicate rights for data subjects.
Recommended: A ROPARecord SHOULD indicate the assessment of Legitimate Interests where they are the legal basis for processing activities using dpv:hasOrganisationalMeasure.
Recommended: A ROPARecord SHOULD indicate the consent or consent record used as the legal basis for processing activities using dpv:hasLegalBasis.
The DPVCG is discussing proposals for more comprehensive consent record specifications, for example the concept `dpv:ConsentRecord` to explicitly indicate a record of consent.
Recommended: A ROPARecord SHOULD indicate the Importance for processing activities using dpv:hasOrganisationalMeasure. DPV provides the following categories of importance: `dpv:PrimaryImportance` to indicate processing is the primary or main activity, and `dpv:SecondaryImportance` to indicate processing is secondary or auxiliary i.e. it is not the main activity.
Recommended: A ROPARecord SHOULD indicate associated business processes for processing activities using dpv:hasPersonalDataHandling. Note that 'Business Process' is a vague and abstract term that is not clearly defined in legislations. The concept `dpv:PersonalDataHandling` offers a similar concept that can be used to indicate processes along with relevant information such as purposes, personal data, etc.
Recommended: A ROPARecord SHOULD indicate the entity responsible for processing activities using dpv:hasResponsibleEntity. For example, a Processor or a Department. Note that 'Process Owner' is a similar concept used to denote the entity associated with processing, but is a vague and abstract term that is not clearly defined in legislations. The concept `dpv:hasResponsibleEntity` offers a similar concept that can be used to indicate responsibility or accountability.
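The unconditional requirements on a ROPARecord listed above lend themselves to a simple presence check. The following is a minimal sketch in plain Python, assuming a record is held as a dictionary from prefixed property names to lists of values. It is not the SHACL validation used with DPCat, the function name `check_ropa_record` is hypothetical, and the conditionally mandatory properties are left out because their triggering conditions cannot be decided from property presence alone.

```python
# Each mandatory property maps to the properties accepted as satisfying it.
MANDATORY = {
    "dct:publisher": ("dct:publisher",),
    # For a Data Processor's ROPA, dpv:hasDataProcessor is used instead.
    "dpv:hasDataController": ("dpv:hasDataController", "dpv:hasDataProcessor"),
    "dpv:hasStorage": ("dpv:hasStorage",),
    # The technical and organisational subproperties are accepted directly,
    # so the check also works on graphs without RDFS expansion.
    "dpv:hasTechnicalOrganisationalMeasure": (
        "dpv:hasTechnicalOrganisationalMeasure",
        "dpv:hasTechnicalMeasure",
        "dpv:hasOrganisationalMeasure",
    ),
}

# A subset of the recommended properties, chosen for illustration.
RECOMMENDED = ("dct:issued", "dpv:hasRisk", "dpv:hasRight")


def check_ropa_record(record: dict) -> dict:
    """Report which mandatory and recommended properties are absent."""
    missing_mandatory = [
        name for name, accepted in MANDATORY.items()
        if not any(record.get(prop) for prop in accepted)
    ]
    missing_recommended = [p for p in RECOMMENDED if not record.get(p)]
    return {"mandatory": missing_mandatory, "recommended": missing_recommended}
```

A record with only `dct:publisher` and `dpv:hasDataController` would be reported as missing `dpv:hasStorage` and `dpv:hasTechnicalOrganisationalMeasure`, along with all three recommended properties.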
Example Application
An example of how DPCat could be used is presented by applying it to [[[EDPS-ROPA]]]. See more at https://w3id.org/dpcat/demo/edps-ropa. The below description summarises the process, and provides a snippet of how DPCat ROPA records and catalogs are represented.
EDPS is the DPA responsible for overseeing compliance by EU institutions, which comprise many employees across the various EU bodies and their associated personal data processing activities. The EDPS has published detailed ROPA documents based on GDPR Art.30 requirements that provide transparency and accountability. As of March 2022, the EDPS has made available 58 ROPA document collections - with each consisting of one or more PDF documents providing information in English regarding the processing operations. Collections are structured based on ‘topics’ - which can be a department (e.g. Administrative and Human Resources, or IT), processes (e.g. Communication, or Public Events), or specific measures (e.g. Access to documents, or Physical Security).
We analysed EDPS ROPA documents and selected four (ids: 01, 05, 13, 55) that covered the U1-U4 use-cases for departments, processors, joint controllers, and data transfers. We did not include the other documents despite their relevance due to the large labour and analysis efforts required, and because the selected documents sufficed in demonstrating DPCat’s application. The documents were PDFs, intended for human comprehension, and lacked consistent semantics - e.g. purpose field also contained legal basis.
We interpreted these documents and their structure as follows: each document (i.e. PDF) represented a single ROPA instance, and the information contained within it was structured using ROPARecord instances. We applied the criterion that each ROPARecord would correspond to a single ‘contextual entry’, judged qualitatively on the complexity of information and separation of concerns. For example, document X specified two processors, which we interpreted as separate ROPARecord instances for each processor to indicate the separation of concern in the controller’s communication and data governance. The entire collection of documents and RDF graphs was then expressed as part of a single ROPACatalog instance reflecting the published set of records on EDPS’ website.
The manually created RDF graphs were enhanced using the Apache Jena RDFS reasoner to create a ‘complete graph’ for simplifying querying and validation. The limited RDFS reasoning was sufficient here to obtain the expansion of subclasses and subproperties within the graph rather than generating inferences using an OWL reasoner. For storing the information and offering a querying interface, we utilised GraphDB Free Edition triple store, as it is a freely available triple-store compliant with relevant standards (e.g. SPARQL) and has several features for convenience, e.g. friendly interface, integrated reasoners, SHACL validation.
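The kind of expansion obtained from RDFS reasoning can be illustrated with a small sketch. The code below is not Apache Jena and is not the pipeline used for the demo. It is a plain Python approximation of the subclass and subproperty entailments over tuples of prefixed names, with the function name `rdfs_expand` and the `ex:` identifiers being hypothetical.

```python
def rdfs_expand(triples, sub_class_of, sub_property_of):
    """Add triples entailed by subclass and subproperty hierarchies.

    `sub_class_of` and `sub_property_of` map a term to its direct parents.
    """
    def ancestors(term, parents):
        # Collect all direct and indirect parents of a term.
        seen, stack = set(), [term]
        while stack:
            for parent in parents.get(stack.pop(), ()):
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        return seen

    base = set(triples)
    expanded = set(base)
    for s, p, o in base:
        # A statement using a property also holds for its superproperties.
        for super_p in ancestors(p, sub_property_of):
            expanded.add((s, super_p, o))
        # An instance of a class is also an instance of its superclasses.
        if p == "rdf:type":
            for super_c in ancestors(o, sub_class_of):
                expanded.add((s, "rdf:type", super_c))
    return expanded


graph = rdfs_expand(
    [("ex:rec1", "rdf:type", "dpcat:ROPARecord"),
     ("ex:rec1", "dpv:hasTechnicalMeasure", "ex:Encryption")],
    sub_class_of={"dpcat:ROPARecord": ["dcat:Dataset"]},
    sub_property_of={
        "dpv:hasTechnicalMeasure": ["dpv:hasTechnicalOrganisationalMeasure"]},
)
```

After expansion, a query for `dcat:Dataset` instances or for `dpv:hasTechnicalOrganisationalMeasure` values finds `ex:rec1` without needing to know the more specific terms, which is the simplification the 'complete graph' provides.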
To simulate typical tasks performed by a DPO or a DPA, we utilised SPARQL queries for two use-cases: (i) retrieval of information required by GDPR Art. 30; and (ii) overview of practices within an organisation in terms of various organisational units, purposes, legal bases, recipients, data transfers, etc. Here, query (i) relates to common compliance documentation procedures, and query (ii) shows the potential for DPCat to help create internal reports or dashboards based on ROPA information, e.g. for a DPO.
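Use-case (ii) amounts to grouping and counting values across records. As a rough stand-in for such a SPARQL aggregation (it is not one of the queries used in the demo), the sketch below counts legal bases per responsible entity over plain tuples. The function name `overview` and all `ex:` identifiers are hypothetical sample data.

```python
from collections import Counter, defaultdict


def overview(triples, group_by, count):
    """For each value of `group_by`, count the values of `count` per record."""
    # Index the graph as record -> property -> set of values.
    by_record = defaultdict(lambda: defaultdict(set))
    for s, p, o in triples:
        by_record[s][p].add(o)

    summary = defaultdict(Counter)
    for props in by_record.values():
        for group in props.get(group_by, ()):
            for value in props.get(count, ()):
                summary[group][value] += 1
    return {group: dict(counts) for group, counts in summary.items()}


sample = [
    ("ex:rec1", "dpv:hasResponsibleEntity", "ex:HR"),
    ("ex:rec1", "dpv:hasLegalBasis", "ex:LegalObligation"),
    ("ex:rec2", "dpv:hasResponsibleEntity", "ex:HR"),
    ("ex:rec2", "dpv:hasLegalBasis", "ex:Consent"),
    ("ex:rec3", "dpv:hasResponsibleEntity", "ex:IT"),
    ("ex:rec3", "dpv:hasLegalBasis", "ex:Consent"),
]
report = overview(sample, "dpv:hasResponsibleEntity", "dpv:hasLegalBasis")
```

With this sample data, `report` maps `ex:HR` to one record each for `ex:LegalObligation` and `ex:Consent`, and `ex:IT` to one record for `ex:Consent`. The same function can be pointed at other property pairs to feed a DPO dashboard.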
Limitations
While the approaches motivated by CSM-ROPA and DPCat provide promising solutions to the challenges in data governance associated with maintenance and use of ROPA towards GDPR compliance requirements, they also have certain limitations that need to be addressed to ensure they are effective in practice. In this section, we discuss identified limitations and propose future efforts toward addressing them.
Limitations of Scope: The DPCat specification reflects the information requirements derived from CSM-ROPA, which was constructed based on the GDPR requirements, and DPA guidelines and templates regarding ROPA. While this makes DPCat sufficient to carry out tasks associated with ROPA, it does not consider the relevance and overlap of information between a ROPA and other compliance documents - such as DPIA (Data Protection Impact Assessment), TIA (Transfer Impact Assessment), Data Breach records, and Controller-Processor or Controller-Controller agreements.
In each of these, there is an obvious overlap with some of the information stored in a ROPA and the necessity to link these to the ROPA itself. For example, a DPIA may concern several processing activities that are spread across distinct ROPARecord instances. While the ROPA can trivially link to the DPIA document through a single reference, it is advantageous for the DPIA information to be expressed in the same manner as the ROPA information so as to enable better information interoperability and governance and motivate the creation of tools that can work on all compliance-based activities using the same information. This can be achieved by further developing DCAT-based solutions for all of the information necessary in legal compliance tasks - for the above mentioned requirements.
Limitations of Vocabulary: The DPV forms an important aspect of DPCat in that it provides the vocabulary for representing GDPR-associated terms in a machine-readable and interoperable manner. Therefore, any limitations of DPV will also be reflected within the capabilities of DPCat. Given that the DPV is a community-managed resource (through the DPVCG), there is a forum for proposing additions and enrichments as needed for DPCat’s applications. However, better alignment between DPCat and DPV versions will have to be established so as to provide the reliability of DPCat’s usage and interpretation - for example by pinning DPCat’s use of DPV to a specific version.
Limitations of Jurisdiction: DPCat as a solution is EU-centric in that it directly addresses (only) GDPR requirements. However, there may be a wider need for organisations to document their processing activities in a different jurisdiction or a jurisdiction-agnostic manner. For addressing such cases, DPCat may be supplemented with extensive modifications, such as an adopter’s own jurisdiction-specific vocabulary, which may bring about incompatibility between implementations. A solution would be to develop DPCat into a domain and jurisdiction agnostic specification and then provide the GDPR specific concepts as an extension of the profile. This reflects current work regarding extending DCAT to DCAT-AP, and the provision of GDPR-specific concepts separate from the ‘main’ DPV.
Limitations of ‘Data Shapes’: As mentioned earlier in Section 6, the querying and validation of information require consistency or foreknowledge regarding how the data is structured or ‘shaped’. Without this, the resulting SPARQL queries and SHACL shapes can be difficult to express or become complex. To ensure the consistency of DPCat implementation, especially for information exchange, it is vital to enforce the consistency of the underlying information. While DCAT (and DCAT-AP) provide this consistency to the expression of catalogues and datasets as resources, the lack of such consistency in the expression of DPV specified information needs to be addressed. For this, we propose the development of use-cases that define expectations and requirements for information, e.g. "a data transfer must specify location", to create corresponding ‘SHACL shapes’ that harmonise how different implementers should utilise DPCat specified information. This activity can be undertaken by the DPVCG for the larger benefit of all DPV adopters or, failing that, within DPCat to ensure its consistency in application.
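To make the idea of a shape concrete, the expectation "a data transfer must specify location" can be written as a conditional check. The sketch below expresses it in plain Python rather than SHACL. It assumes, purely for illustration, that a record counts as a data transfer when it lists `dpv:SafeguardForDataTransfer` among its organisational measures. That trigger condition, the function name `check_transfer_has_location`, and the `ex:` identifiers are hypothetical and would need to be agreed upon by implementers.

```python
def check_transfer_has_location(records):
    """Return ids of records that look like transfers but give no location.

    `records` maps a record id to a dict of property name -> list of values.
    """
    violations = []
    for record_id, props in records.items():
        measures = props.get("dpv:hasOrganisationalMeasure", [])
        # Assumption: a transfer safeguard marks the record as a transfer.
        is_transfer = "dpv:SafeguardForDataTransfer" in measures
        if is_transfer and not props.get("dpv:hasLocation"):
            violations.append(record_id)
    return sorted(violations)


records = {
    "ex:rec1": {
        "dpv:hasOrganisationalMeasure": ["dpv:SafeguardForDataTransfer"],
        "dpv:hasLocation": ["ex:ThirdCountryA"],
    },
    "ex:rec2": {
        "dpv:hasOrganisationalMeasure": ["dpv:SafeguardForDataTransfer"],
    },
    "ex:rec3": {"dpv:hasOrganisationalMeasure": ["dpv:DPIA"]},
}
violations = check_transfer_has_location(records)
```

With this sample data, only `ex:rec2` is reported, since it declares a transfer safeguard without a location. Agreeing on such trigger conditions is the harmonisation work proposed above.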
Acknowledgements
Funding: This research has received funding from Uniphar PLC, and the ADAPT Centre for Digital Content Technology which is funded under the SFI Research Centres Programme (Grant 13/RC/2106_P2) and co-funded by the European Regional Development Fund. Harshvardhan J. Pandit has received funding under the Irish Research Council’s Government of Ireland Postdoctoral Fellowship Grant#GOIPD/2020/790.