Modelling Operating Factors for Technology
published:
by Harshvardhan J. Pandit
is part of: Data Privacy Vocabulary (DPV)
DPV DPVCG semantic-web Working Note
Operating Factor as a concept
The Model Cards paper uses Factor to describe things that are relevant to the functioning of the AI model. These include Group (which are dpv:Data categories), Instrumentation (which are dpv:Technology categories including hardware and software), and Environment (which is missing in DPV). Based on these, the paper later has a section on metrics which describes how the model performs for combinations of factors, and which combinations are ideal and non-ideal (or problematic) in relation to the model's (expected and intended) performance.
Similarly, the AI Act uses phrasing such as operating in/logic/with and operational constraints/data/needs to describe the environment and conditions which affect the operation of AI technologies -- such as in Annex IV regarding the hardware and software on which the AI system will operate. In addition, the AI Act describes the performance of the AI system in relation to specific criteria such as groups of people (Art.13-3b-v), as well as any capabilities and characteristics of the AI system which are relevant to explain its output/functioning -- which by definition include how the technology operates and which factors affect the operation.
It should be clear from the above that there is a need to define information regarding things that affect the performance in association with AI technologies. I propose we call this concept tech:OperatingFactor based on existing uses of the term to refer to hardware and software in technical documents. This concept is expected to contain more information -- such as the specifics involved regarding data, technology/instruments, environmental conditions, and people. Therefore, we define it as a subclass of dpv:Process so that we are effectively defining units of operation (a process) which are relevant factors for the operation of the technology. This consistency in using dpv:Process as much as possible also allows the same modelling design patterns to be used for modular/granular expressions, such as to differentiate combinations from one another (e.g. where a complex operating factor needs to specify multiple processes occurring together), and gives a uniform structure in the DPV taxonomy where anything that looks and works like a process is defined by the abstract concept dpv:Process.
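As a minimal sketch, the proposed concept could be declared as follows. The subclass relation to dpv:Process comes from the text above; the namespace IRIs, the use of rdfs:Class, and the label and definition wording are assumptions made for illustration and are not agreed text.

```turtle
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix dpv:  <https://w3id.org/dpv#> .
@prefix tech: <https://w3id.org/dpv/tech#> .

# Proposed concept: an operating factor is modelled as a kind of process.
# The label and definition below are illustrative wording only.
tech:OperatingFactor a rdfs:Class ;
    rdfs:subClassOf dpv:Process ;
    skos:prefLabel "Operating Factor"@en ;
    skos:definition "Factor that is relevant to the operation of a technology"@en .
```

Because the concept is a dpv:Process, anything that can already be attached to a process (data, entities, technologies) can be attached to an operating factor without new properties.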
To associate tech:OperatingFactor in context, the property tech:hasOperatingFactor should be provided. This will allow specifying the operating factors directly where relevant, instead of declaring specific processes/contexts as operating factors, which would make discovery and querying complicated. Here's an example that shows how this is convenient:
ex:Document a tech:Documentation ;
    dct:subject ex:MedicalDevice ;
    tech:hasOperatingFactor [
        a tech:OperatingFactor ;
        dpv:hasData pd:BloodType ;
        dpv:hasHumanSubject dpv:Patient ;
        tech:hasHardware ex:SomeMedicalSystem ;
    ] .
Categories of Operating Factors
In the above, while we could specify the relevant factors, we did not specify whether a factor describes conditions under which the technology will operate correctly, ideally, or something else. To define these, operating factors are categorised to describe different kinds of scenarios and conditions as:
- tech:NecessaryOperatingFactor: Necessary (e.g. minimum) conditions for operating the technology and without which the technology cannot operate -- e.g. specific hardware and software requirements for running a video game without which the game cannot run;
- tech:SuitableOperatingFactor: Suitable conditions within which the technology will operate within acceptable parameters -- e.g. a range of conditions that represent minimum and maximum criteria for tolerance where the technology produces outputs in a known and acceptable range;
- tech:IdealOperatingFactor: Ideal conditions for operating the technology where the ideal conditions are within the suitable conditions by definition -- e.g. specific hardware setup and environmental conditions;
- tech:UnsuitableOperatingFactor: Unsuitable conditions which are not recommended for the use of the technology with the implication being that they can lead to problems/issues/errors -- however the phrasing here is softer than outright saying these conditions will definitely cause problems;
- tech:HazardousOperatingFactor: Hazardous conditions within which the technology is known to malfunction or cause errors or produce faulty outputs -- where such conditions may not be derivable from the sufficient and necessary/minimum conditions or where they should be explicitly documented to warn the user/operator, and which by definition should be within the specified unsuitable operating factors;
- tech:UntestedOperatingFactor: Untested conditions where it is not known whether the technology will operate correctly or there will be problems/issues/errors -- and the intent of documenting these could be to inform that it is not known how the technology will function or to indicate to the user/operator that they should evaluate these conditions or that the developer/provider makes no claims about the use of technology in such conditions.
Based on the above, and to ensure the taxonomy of operating factors used in documentation forms a consistent hierarchy, we introduce some additional concepts with tech:OperatingFactor as the parent:
- tech:OperatingFactor
    - tech:NecessaryOperatingFactor
    - tech:SuitableOperatingFactor
        - tech:IdealOperatingFactor
    - tech:UnsuitableOperatingFactor
        - tech:HazardousOperatingFactor
    - tech:UntestedOperatingFactor
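The hierarchy can be sketched as subclass statements. This assumes the nesting implied by the category definitions (ideal within suitable, hazardous within unsuitable) and uses rdfs:subClassOf as the relation; the namespace IRI is also an assumption.

```turtle
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix tech: <https://w3id.org/dpv/tech#> .

# Top-level categories directly under the parent concept
tech:NecessaryOperatingFactor  rdfs:subClassOf tech:OperatingFactor .
tech:SuitableOperatingFactor   rdfs:subClassOf tech:OperatingFactor .
tech:UnsuitableOperatingFactor rdfs:subClassOf tech:OperatingFactor .
tech:UntestedOperatingFactor   rdfs:subClassOf tech:OperatingFactor .

# Nested categories: ideal is a subset of suitable, hazardous of unsuitable
tech:IdealOperatingFactor      rdfs:subClassOf tech:SuitableOperatingFactor .
tech:HazardousOperatingFactor  rdfs:subClassOf tech:UnsuitableOperatingFactor .
```

Under RDFS entailment, anything typed as tech:IdealOperatingFactor is also a tech:SuitableOperatingFactor and a tech:OperatingFactor, but it is not inferred to be a tech:NecessaryOperatingFactor.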
To sanity check this hierarchy, it is important to phrase the statements about its implications correctly. For example, all ideal factors are by definition suitable for the operation -- TRUE. However, if we had expressed necessary as the parent concept, the statement would have been: all ideal factors are by definition also necessary for the operation -- which is FALSE. What we actually mean is that all ideal factors by definition also contain the necessary factors -- which is TRUE, but it is not what the hierarchy represents. Therefore, the phrasing should be: all suitable factors will involve necessary factors (an intersection and not a subset, i.e. not that all suitable factors are necessary), and of all suitable factors, some are ideal (i.e. a subset).
Whether to define these concepts as properties depends on whether we want them to be directly associated in context, e.g. whether a document should directly state that these are necessary or suitable factors, or whether the document should state operating factors and, within each factor, categorise it as necessary or suitable. To mirror how documents are typically structured, it is best to provide a property for each category of operating factor for convenience, clarity, and explicitness. Therefore, for each of the above factor categories, we define a property based on the DPV convention as: tech:has[XYZ]OperatingFactor. An example to illustrate their use is below, which also shows the limitations of describing factors using current concepts, as many are merely statements intended for human readers:
ex:Document a tech:Documentation ;
    dct:subject ex:MedicalDevice ;
    tech:hasNecessaryOperatingFactor [
        a tech:NecessaryOperatingFactor ;
        dpv:hasData pd:BloodType ;
        dpv:hasHumanSubject dpv:Patient ;
        tech:hasHardware ex:SomeMedicalSystem ;
    ] ;
    tech:hasIdealOperatingFactor [
        a tech:IdealOperatingFactor ;
        dct:description "well-lit area with temperature 20--25C"@en ;
    ] ;
    tech:hasUnsuitableOperatingFactor [
        a tech:UnsuitableOperatingFactor ;
        dct:description "dark area or in direct sunlight"@en ;
    ] ;
    tech:hasHazardousOperatingFactor [
        a tech:HazardousOperatingFactor ;
        dct:description "fluids other than blood may break the device"@en ;
    ] .
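A possible declaration of the category-specific properties, following the tech:has[XYZ]OperatingFactor convention, is sketched here. The subproperty relations (mirroring the class hierarchy), the rdfs:range statements, and the namespace IRI are assumptions for illustration and are not stated in the proposal.

```turtle
@prefix rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix tech: <https://w3id.org/dpv/tech#> .

# Generic property; the range is an assumption
tech:hasOperatingFactor a rdf:Property ;
    rdfs:range tech:OperatingFactor .

# Category-specific properties, assumed to specialise the generic one
tech:hasNecessaryOperatingFactor a rdf:Property ;
    rdfs:subPropertyOf tech:hasOperatingFactor ;
    rdfs:range tech:NecessaryOperatingFactor .

tech:hasSuitableOperatingFactor a rdf:Property ;
    rdfs:subPropertyOf tech:hasOperatingFactor ;
    rdfs:range tech:SuitableOperatingFactor .

tech:hasIdealOperatingFactor a rdf:Property ;
    rdfs:subPropertyOf tech:hasSuitableOperatingFactor ;
    rdfs:range tech:IdealOperatingFactor .

tech:hasUnsuitableOperatingFactor a rdf:Property ;
    rdfs:subPropertyOf tech:hasOperatingFactor ;
    rdfs:range tech:UnsuitableOperatingFactor .

tech:hasHazardousOperatingFactor a rdf:Property ;
    rdfs:subPropertyOf tech:hasUnsuitableOperatingFactor ;
    rdfs:range tech:HazardousOperatingFactor .

tech:hasUntestedOperatingFactor a rdf:Property ;
    rdfs:subPropertyOf tech:hasOperatingFactor ;
    rdfs:range tech:UntestedOperatingFactor .
```

With the subproperty relations in place, a query over tech:hasOperatingFactor under RDFS entailment would also return factors attached through the more specific properties, which keeps discovery simple while letting documents stay explicit.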
Components of Operating Factors
The above examples used existing DPV properties such as dpv:hasData and tech:hasHardware to describe the components of an operating factor. As highlighted in the first section, DPV contains several existing concepts which can be used to describe the operating factor, with the notable exception of environment. More precisely, what is missing is the environmental condition, as environment is broader and also involves what we have already covered under operating factor, such as the hardware and software. Therefore, we define tech:OperatingEnvironment as the environment consisting of surrounding conditions, including the operating factors, within which the technology operates. Based on this, we then define tech:EnvironmentalCondition as the environmental conditions relevant to the operation of the technology. To associate these, we provide the properties tech:hasOperatingEnvironment and tech:hasEnvironmentalCondition. An example illustrating how they could be used:
ex:Document a tech:Documentation ;
    dct:subject ex:MedicalDevice ;
    tech:hasNecessaryOperatingFactor [
        a tech:NecessaryOperatingFactor ;
        dpv:hasData pd:BloodType ;
        dpv:hasHumanSubject dpv:Patient ;
        tech:hasHardware ex:SomeMedicalSystem ;
    ] ;
    tech:hasSuitableOperatingFactor [
        a tech:SuitableOperatingFactor ;
        tech:hasOperatingEnvironment loc:MedicalFacility ; ## Concept DOES NOT EXIST
        tech:hasEnvironmentalCondition "well-lit area with temperature 20--25C"@en ;
    ] .
The example states that a medical device requires blood type data from a patient (note that PD doesn't have the capability to state that actual physical blood is taken, a limitation IMHO) and some specific medical system which is not specified in the example. It then states that the suitable conditions for operation involve a medical facility where the temperature is between 20 and 25 degrees Celsius. For now, I am presuming that environmental conditions such as temperature, humidity, and other things can be represented through other existing ontologies, as these are extremely common industrial concepts. Our goal here is thus limited to connecting them to the legal/policy type documentation and concepts.
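To illustrate how an external vocabulary could carry the measurable part of a condition, here is a sketch that expresses the temperature range with schema.org's QuantitativeValue instead of a free-text literal. The choice of schema.org, the use of a node rather than a literal as the value of tech:hasEnvironmentalCondition, the ex: names, and the namespace IRIs are all assumptions and not part of the proposal.

```turtle
@prefix dct:    <http://purl.org/dc/terms/> .
@prefix schema: <https://schema.org/> .
@prefix tech:   <https://w3id.org/dpv/tech#> .
@prefix ex:     <https://example.com/> .

ex:Document a tech:Documentation ;
    dct:subject ex:MedicalDevice ;
    tech:hasSuitableOperatingFactor [
        a tech:SuitableOperatingFactor ;
        # Points to a structured condition instead of a text description
        tech:hasEnvironmentalCondition ex:AmbientTemperature ;
    ] .

# Hypothetical condition: ambient temperature between 20 and 25 degrees Celsius
ex:AmbientTemperature a tech:EnvironmentalCondition, schema:QuantitativeValue ;
    schema:name "ambient temperature"@en ;
    schema:minValue 20 ;
    schema:maxValue 25 ;
    schema:unitCode "CEL" .  # UN/CEFACT common code for degree Celsius
```

A structured value like this can be compared against sensor readings or other documentation, which a description such as "well-lit area" cannot; conditions without a measurable form would still need dct:description.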
With the addition of the (two) missing concepts, it is a good idea to reiterate the other existing DPV concepts and how they should be interpreted when used within an operating factor. To this end, the list below enumerates existing DPV properties (where available) and what they should mean:
- dpv:hasData, dpv:hasPersonalData: involvement of data and personal data respectively;
- dpv:hasDataSource: to specify the source of data;
- dpv:hasDuration, dpv:hasFrequency: to specify duration and frequency respectively;
- dpv:hasScale: to specify the scale e.g. data subject scale, data volume, geographic coverage using the dedicated properties;
- dpv:hasEntity: to specify an entity involved – there are specific roles within this such as dpv:hasHumanSubject and dpv:hasActiveEntity which provide more explicit roles;
- dpv:hasEntityInvolvement: to specify what the entity may or may not be involved to perform – such as ideally having the ability to override a system;
- dpv:hasHumanInvolvement: to specify what humans can and cannot do in regards to automated systems;
- dpv:hasLocation: to describe location concepts – both geophysical and virtual (note: this is currently under discussion);
- tech:hasActor: to specify technology actors in specific roles such as developers, providers, operators, and users (note: there are specific subproperties for each role);
- dpv:isImplementedByEntity: to specify the entity that implements a factor;
- dpv:isImplementedByTechnology: to specify the technology/instrument for implementing the factor (note: the subproperty ai:hasAi and its subproperties like ai:hasCapability expand this for use of AI technologies);
- tech:hasSystem, tech:hasSoftware, tech:hasHardware: to specify specific systems, software, and hardware involved in the factor (note: there are additional subproperties to specify devices, APIs, etc. in TECH);
- tech:hasDeploymentLocation: to specify the location of the deployment for the factor if it is a technology;
- tech:hasDocumentation: to specify documentation relevant to the factor e.g. to describe conditions in more detail or to outline hazards;
- tech:hasInput, tech:hasOutput: to specify inputs and outputs relevant to the factor;
- tech:isComponentOf: to specify if the technology is a component of another technology – relevant to describing the operating environment where it is essential to specify the role the technology plays such as being a component or merely operating within the virtual environment;
- dpv:hasRisk: to describe risks e.g. in unsuitable or hazardous factors;
- dpv:hasConsequence, dpv:hasConsequenceOn, dpv:hasImpact, dpv:hasImpactOn: to describe the consequences of the factor on a system, process, or on humans (as impacts);
- dpv:isBefore, dpv:isAfter, dpv:isDuring: to state relations between factors e.g. to state a factor comes into play after another factor or that the unsuitable factor is relevant to another factor being in operation;
- dpv:hasContext: to cover any dpv:Context concept not associated by above properties, as well as to provide an open-ended method to associate more contextual information;
- tech:hasEnvironmentalCondition: (PROPOSED) to specify the environmental conditions relevant to the factor (where environment refers to aspects such as air quality, temperature, etc.);
- tech:hasOperatingEnvironment: (PROPOSED) to specify the operating environment for the factor – which could be further specified using the above properties.
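As a sketch of how several of the listed properties might combine within one factor, the hazardous condition from the earlier medical device example could be expanded as below. The property tech:hasHazardousOperatingFactor follows the tech:has[XYZ]OperatingFactor convention; all ex: resources and the namespace IRIs are hypothetical placeholders.

```turtle
@prefix dct:  <http://purl.org/dc/terms/> .
@prefix dpv:  <https://w3id.org/dpv#> .
@prefix tech: <https://w3id.org/dpv/tech#> .
@prefix ex:   <https://example.com/> .

ex:Document a tech:Documentation ;
    dct:subject ex:MedicalDevice ;
    tech:hasHazardousOperatingFactor [
        a tech:HazardousOperatingFactor ;
        dct:description "fluids other than blood may break the device"@en ;
        # What enters the device in this scenario (hypothetical concept)
        tech:hasInput ex:NonBloodFluid ;
        # The risk and what it affects (hypothetical concepts)
        dpv:hasRisk ex:DeviceDamage ;
        dpv:hasConsequenceOn ex:MedicalDevice ;
        # Where the hazard is explained in more detail (hypothetical document)
        tech:hasDocumentation ex:SafetyManual ;
    ] .
```

Compared with a bare dct:description, this makes the hazard queryable by input, risk, and affected system, while the description remains available for human readers.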