Add ISO standards to tech-org measures

Proposal for linking Standards to DPV concepts and using them as TOMs
by Harshvardhan J. Pandit
is part of: Data Privacy Vocabulary (DPV)
is about: Data Privacy Vocabulary (DPV)
DPV DPVCG ISO semantic-web standard

Sent to DPVCG mailing list:

Updated 23-AUG-2022 - see additions

Relevant GitHub Issue:

1. Summary

  • GuidelinesPrinciples (rename)
    • Guideline (new)
      • DesignGuidelines (rename)
      • CodeOfConduct (change parent)
    • Principle (new)
      • PrivacyByDesign (change parent)
      • PrivacyByDefault
  • Standard (new, parent: OrganisationalMeasure)
    • ManagementStandard (new)
    • TechnicalStandard (new, parent+: TechnicalMeasure)
  • DPV-Standards as separate extension for providing standards and guidelines associated with DPV concepts
  • Reusing DCterms concepts for annotation of standard and its topics

2. Reorganising DPV concepts for Guidelines / Standards

  • dpv:GuidelinesPrinciple represents abstract guidelines and principles
    • [ ] Harmonise by renaming it to GuidelinesPrinciples (both plural)
    • Its subtypes include PrivacyByDefault but not PrivacyByDesign
    • [ ] Move PrivacyByDesign to be subtype of GuidelinesPrinciple
    • Subtype also includes DesignStandard which is defined as guidelines for design rather than being a standard
    • [ ] Rename this to DesignGuidelines to not confuse this with actual standards and standardisation outputs
    • [ ] Create two new classes for Guidelines and Principles.
    • The distinction is necessary as principles (e.g. GDPR Art-5) relate to abstract aims whereas guidelines refer to a suggested method of achieving or implementing something. The other subtypes should be moved into these as appropriate, e.g. PrivacyByDefault is a Principle
    • dpv:CodeOfConduct already exists
    • [ ] Move CodeOfConduct under Guideline
  • [ ] dpv:Standard is the class for standards
    • Standard will be under OrganisationalMeasure since it is something that is inherently expected to be interpreted and applied by humans regardless of whether it is machine- or people-oriented in nature. For example, management standards are organisational measures, but encryption security is both a technical and organisational measure since the 'controls' or specifics of that standard need to be interpreted for the use-case and applied by whoever is developing/managing the infrastructures.
    • [ ] ManagementStandard and TechnicalStandard as two subtypes of Standard
    • Management standards relate to organisational processes in the purest sense. Technical standards relate to details of technical implementations, and are a subtype of TechnicalMeasure in addition to the organisational one to reflect this.
  • Applying a standard is thus possible through the same means as technical and organisational measures, i.e. hasTechnicalOrganisationalMeasure
  • Note that DCT has the class Standard which can be applied as is, but DPV defines its own concept to fit it within the organisational measures hierarchy. When alignments with other vocabularies are defined, dpv:Standard will be expressed as skos:exactMatch dct:Standard and rdfs:subClassOf dct:Standard (in different semantics).
  • Implementing a standard vs Conforming to a standard
    • implementing a standard could mean interpreting it and applying its guidelines and principles
    • Conforming to a standard means following the specified requirements (whether they be concrete or abstract) in a manner that the satisfaction of those requirements can be checked or audited. As the WCAG puts it, "Conformance to a standard means that you meet or satisfy the 'requirements' of the standard".
    • To distinguish between these: stating that something has an organisational measure only means that it follows the specifics of that measure. It does not say anything about conformance.
    • For conformance, the concept of Certification is provided, which can be through an external audit or self-certification, depending on the specifics of the domain and application. For example, a Controller who wishes to implement a Standard can use it as an organisational measure. The same Controller when dealing with a Processor may be similarly satisfied that the Processor implements a Standard (as a tech/org measure) OR it can specifically see that the Processor has been certified for that standard. In this case, the processor would state that it is implementing a tech/org measure that is both a Standard and a Certification or express them in a different manner as applicable (e.g. details of auditing process or body).
    • Note that attempts to simplify this further can lead to surprising avenues of complexities, such as requirements to specify Certifications against the entire entity (as opposed to a process), including the auditing body as part of information, temporal coverage, jurisdictional coverage, etc. Pending further exploration of these, we should limit the initial concepts to a simpler design and later expand them as necessary for these.
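
The reorganised hierarchy in this section can be sketched as a small parent map, which makes the multiple-parent design of TechnicalStandard easy to check. This is only an illustration under stated assumptions: the `PARENTS` dictionary and the `ancestors` helper are hypothetical names, the parent of GuidelinesPrinciples is deliberately left out because the proposal does not restate it, and the real vocabulary would express these links in RDF rather than Python.

```python
# Parent links as proposed in Sections 1 and 2 (a concept may have several parents).
# Hypothetical illustration only; the actual DPV definitions are RDF, not Python.
PARENTS = {
    "Guideline": ["GuidelinesPrinciples"],
    "Principle": ["GuidelinesPrinciples"],
    "DesignGuidelines": ["Guideline"],
    "CodeOfConduct": ["Guideline"],
    "PrivacyByDesign": ["Principle"],
    "PrivacyByDefault": ["Principle"],
    "Standard": ["OrganisationalMeasure"],
    "ManagementStandard": ["Standard"],
    "TechnicalStandard": ["Standard", "TechnicalMeasure"],
}


def ancestors(concept, parents=PARENTS):
    """Return every concept reachable by following parent links upwards."""
    seen = set()
    stack = list(parents.get(concept, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, []))
    return seen


if __name__ == "__main__":
    # A technical standard is both a technical and an organisational measure.
    print(sorted(ancestors("TechnicalStandard")))
    # A management standard is only an organisational measure.
    print(sorted(ancestors("ManagementStandard")))
```

Because TechnicalStandard lists two parents, anything declared with it can be found through either the technical or the organisational branch, which is the behaviour the proposal asks for.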

3. Declaring ISO standards using DCT and DPV

  • To provide an easy way to express that some ISO standard is used, what it is about, who published it, when, etc., a separate extension is proposed.
  • [ ] dpv/standards as an extension providing a list of standards and guidelines
  • The intention of this is to provide an easy way to specify some standard, its topics, and link them to DPV concepts.
  • dpv:Standard or its appropriate subtype (Management or Technical) as the type
  • title and description
    • dct:title for name of the standard
    • dct:alternative for an alternative name (e.g. common reference)
    • dct:identifier for a unique reference to the standard
    • dct:description for a short summary describing the standard
  • publication
    • dct:publisher for specifying the body who publishes the standard
    • dct:issued for when the standard was published or formally issued
  • relations
    • dct:isReplacedBy for some other standard replacing this
    • dct:replaces for indicating this standard replaces another
    • dct:hasVersion to indicate another version of the standard (note that a second version does not necessarily replace the first)
    • dct:isVersionOf to specify this standard is a version of another
    • dct:requires to specify this standard requires another
    • dct:isRequiredBy to specify this standard is required by another
  • topics / subjects
    • dct:subject to specify the topics or subjects of that standard i.e. what the standard is about (note: this only refers to the primary topics). This can be a DPV Tech/Org concept, or a Processing operation, or something else.
    • dct:coverage to specify what topics the standard 'covers' or includes, i.e. what things does the standard talk about other than primary topics. This can be a DPV Tech/Org concept, or a Processing operation, or something else.

4. Examples

standard:ISO-IEC-27018-2019 a dpv:Standard ;
    dct:title "ISO/IEC 27018:2019 Information technology ...
	      — Security techniques — Code of practice for protection of
	      personally identifiable information (PII) in public clouds
	      acting as PII processors"@en ;
    dct:alternative "ISO/IEC 27018:2019" ;
    dct:identifier "27018:2019" ;
    dct:description "This document establishes ..."@en ;
    dct:publisher standard:ISO, standard:IEC, standard:ISO-IEC-JTC1-SC27 ;
    dct:issued "2019-01" ;
    dct:replaces standard:ISO-IEC-27018-2014 ;
    dct:isVersionOf standard:ISO-IEC-27018-2014 ;
    dct:requires standard:ISO-IEC-29100, standard:ISO-IEC-27002 ;
    dct:subject dpv:PersonalData, dpv:DataProcessor, dpv-tech:CloudInfrastructure ;
    dct:coverage dpv:Policy, dpv:EnforceSecurity, dpv:EnforceAccessControl,
		 dpv:CryptographicSecurity, dpv:IncidentManagement,
		 dpv:ComplianceManagement .
    # note: some of these concepts are not in DPV, we should add them!

In the DPV documentation, the page for a concept can link to such entries to indicate that there are standards related to that concept. For example:

# HTML documentation of dpv:DataProcessor
Topic of standards: "ISO/IEC 27018:2019" with link to standard:ISO-IEC-27018-2019
Topic of guidelines: ... (e.g. EDPB guideline)

# HTML documentation of dpv:EnforceSecurity
Mentioned in standards: "ISO/IEC 27018:2019" with link to standard:ISO-IEC-27018-2019
Mentioned in guidelines: ... (e.g. EDPB guideline)

Update: 23-AUG-2022

How to extract data from ISO webpage to spreadsheets

  • The data for the ISO spreadsheet is based on the following (ordered) fields:
  • Term, dct:alternative, dct:title, dct:identifier, dct:description, dct:publisher, dct:issued, dct:replaces, dct:isVersionOf, dct:requires, dct:subject, dct:coverage, source, status, contributor, resolution
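
As a rough sketch of how one spreadsheet row in this field order could be turned into a declaration like the one in the Examples section, the snippet below renders a row as Turtle text. Several things here are assumptions rather than part of the proposal: the function name `row_to_turtle`, the choice to type every row as dpv:Standard (a subtype could be used instead), the convention that relation cells already hold comma-separated prefixed names, and the decision to skip the four bookkeeping columns. Prefix declarations are omitted.

```python
# Column order as listed above.
FIELDS = [
    "Term", "dct:alternative", "dct:title", "dct:identifier",
    "dct:description", "dct:publisher", "dct:issued", "dct:replaces",
    "dct:isVersionOf", "dct:requires", "dct:subject", "dct:coverage",
    "source", "status", "contributor", "resolution",
]

# Assumption: these columns hold plain text and are written as string literals;
# the remaining dct: columns hold prefixed names such as standard:ISO.
LITERALS = {"dct:alternative", "dct:title", "dct:identifier",
            "dct:description", "dct:issued"}


def row_to_turtle(row):
    """Render one spreadsheet row (a dict keyed by FIELDS) as Turtle text."""
    lines = [f"standard:{row['Term']} a dpv:Standard"]
    for field in FIELDS[1:12]:  # only the dct: columns
        value = row.get(field, "").strip()
        if not value:
            continue  # empty cells produce no statement
        if field in LITERALS:
            escaped = value.replace("\\", "\\\\").replace('"', '\\"')
            lines.append(f'    {field} "{escaped}"')
        else:
            lines.append(f"    {field} {value}")
    return " ;\n".join(lines) + " ."


# Values taken from the example in Section 4 (title and description elided there).
example_row = dict(zip(FIELDS, [
    "ISO-IEC-27018-2019", "ISO/IEC 27018:2019",
    "ISO/IEC 27018:2019 Information technology ...", "27018:2019",
    "This document establishes ...", "standard:ISO, standard:IEC", "2019-01",
    "standard:ISO-IEC-27018-2014", "",
    "standard:ISO-IEC-29100, standard:ISO-IEC-27002",
    "dpv:PersonalData, dpv:DataProcessor", "dpv:Policy",
    "", "", "", "",
]))

if __name__ == "__main__":
    print(row_to_turtle(example_row))
```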

Copy Data

  • The ISO Catalogue provides links to specific groups of standards based on sub-committees.
  • The link for a sub-committee provides a page listing the standards under the management of that committee. For example, SC27, which relates to information security, has the page
  • On this page, there are various options to selectively display standards. Ensure that only "Published Standards" is selected. Alternatively, check the coding in the link and ensure that /p/1 is the only attribute set and that the others are unset, i.e. /u/0
  • Copy the data from the table displayed i.e. click on text of first standard, drag to end of line, then end of page at the very bottom to cover all displayed standards. Ensure you have copied the data.
  • Paste this data into Excel, which will also recognise the hyperlinks and will insert them along with the text.

Extract Standard Details

  • Copy the table as before, and use regex to extract parts of it. I like to use as it provides ample options and is convenient. Otherwise, grep or sublime-text are other options.
  • The pattern ^(ISO\/IEC)( [A-Z]+)? ([\d\:\-]+)\n(.+) with the substitution filter $1$2 $3 $4\n will extract only the necessary data, and discard the additional information such as stages.
  • Each line in this pattern is the dct:title value
  • On titles, the pattern $1$2 $3\n will provide dct:alternative
  • On alternatives, substituting [\/ \:] with - will provide the term IRI
  • On alternatives, substituting ^(ISO\/IEC)( [A-Z]+)? (\d+)(.+) with $3 will provide the dct:identifier
  • On alternatives, substituting (.+)(\d{4}) with $2 will provide the publication year for dct:issued
  • For dct:publisher, the value depends on whether "ISO" or "IEC" occurs within the title. Additionally, the specific subcommittee can be manually added to the list as needed.
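
The substitutions above can be sketched in Python as a single pass over the copied table text. This is a minimal illustration, not the workflow actually used: the function name `extract` and the sample input layout are assumptions, the titles in the sample are elided, and the "(stage and other columns)" lines only stand in for whatever extra text the copied table contains. The patterns themselves are the ones listed above, with `$n` written as `\n` group references.

```python
import re

# Reference line followed by the title line, as in the first pattern above.
ROW = re.compile(r"^(ISO\/IEC)( [A-Z]+)? ([\d\:\-]+)\n(.+)", re.MULTILINE)


def extract(raw):
    """Derive the spreadsheet fields for every standard found in raw text."""
    records = []
    for m in ROW.finditer(raw):
        prefix, kind = m.group(1), m.group(2) or ""
        alternative = f"{prefix}{kind} {m.group(3)}"      # $1$2 $3
        title = f"{alternative} {m.group(4)}"             # $1$2 $3 $4
        records.append({
            "term": re.sub(r"[\/ \:]", "-", alternative),
            "dct:alternative": alternative,
            "dct:title": title,
            # Keeps the number part only, e.g. "27018".
            "dct:identifier": re.sub(
                r"^(ISO\/IEC)( [A-Z]+)? (\d+)(.+)", r"\3", alternative),
            # Last four digits, i.e. the publication year.
            "dct:issued": re.sub(r"(.+)(\d{4})", r"\2", alternative),
            "dct:publisher": [f"standard:{body}" for body in ("ISO", "IEC")
                              if body in title],
        })
    return records


# Illustrative input only: titles are elided and the extra lines are placeholders.
SAMPLE = """ISO/IEC 27018:2019
Information technology ...
(stage and other columns)
ISO/IEC TS 27100:2020
Information technology ...
(stage and other columns)
"""

if __name__ == "__main__":
    for record in extract(SAMPLE):
        print(record)
```

Using `finditer` instead of a plain substitution means lines that do not match the pattern are dropped automatically, which mirrors the "discard the additional information" step.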

Extract Hyperlinks

  • This requires copying the table data from the webpage, and using a spreadsheet with macros or formulas. I prefer MS-Excel since it is simpler and more reliable than LibreOffice Calc (in this instance), and the Google Sheets macro system is too complicated for what should be a 2-min task.
  • To copy the hyperlinks only, or rather to extract them, a Macro is needed. Depending on OS and Office version, this will be through different settings or methods. Look for Tools -> Macro -> Visual Basic Editor
  • Then right-click the project area, and select Insert -> Module
  • Create a new macro with the following snippet:

    Function GetURL(rng As Range) As String
        On Error Resume Next
        GetURL = rng.Hyperlinks(1).Address 
    End Function
  • In a new column, insert the formula to call this macro, as =GetURL(COL+ROW), e.g. =GetURL(A2) for the cell in column A, row 2
  • Then select the cell, and drag it down to replicate the macro across the sheet, copying all the hyperlinks.
  • This will create a column with several blank cells. To remove them, copy the entire column's contents and use regex substitutions to get rid of consecutive newlines (i.e. \n{2,}), or use matching to select only the desired contents without the suffix (i.e. ^(.+)\?browse=tc). If only the blank lines are removed, the suffix should also be removed using the same regex.
  • Copy the results back into the spreadsheet so that the cell contents for those standards match up with the hyperlinks.
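
The clean-up of the extracted hyperlink column can also be sketched in Python, as an alternative to doing the regex substitutions by hand. The function name `clean_links` is hypothetical, and the URLs in the usage example are placeholders on example.org rather than real ISO links; the only detail taken from the steps above is the ?browse=tc suffix and the presence of blank entries.

```python
import re


def clean_links(column_text):
    """Drop blank entries and strip the ?browse=tc suffix from each link."""
    links = []
    for line in column_text.splitlines():
        line = line.strip()
        if not line:
            continue  # cells without a hyperlink come through as blank lines
        links.append(re.sub(r"\?browse=tc$", "", line))
    return links


if __name__ == "__main__":
    # Placeholder URLs for illustration; paste the copied column here instead.
    pasted = (
        "https://example.org/standard/1.html?browse=tc\n"
        "\n"
        "\n"
        "https://example.org/standard/2.html?browse=tc\n"
    )
    for link in clean_links(pasted):
        print(link)
```

The cleaned list keeps the original order, so it can be pasted back next to the rows for the standards it was extracted from.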