Modelling Subjective Locations in DPV
published:
by Harshvardhan J. Pandit
is part of: Data Privacy Vocabulary (DPV)
DPV DPVCG semantic-web Working Note
These are the working notes for adding location concepts to DPV. See issue #209 for tracking this work.
Motivation
DPV’s core goal at the onset of the DPVCG was representing information about the processing of personal data. This required the creation of StorageLocation as a concept to state where data was being stored. Since then, the DPV has expanded significantly, and the real-world necessity of expressing and using locations has led to the creation of several location-related concepts. These include concepts such as Country and City, which represent geo-physical locations that act as jurisdictions. It also includes a new concept called LocationLocality, which is further expanded into more concepts such as WithinDevice and RemoteLocation. In DPV v2.1, the extension LOC was added that provides a large taxonomy of countries and regions based on the ISO 3166 standard.
These concepts reflect the need to represent real/physical locations (specific countries and cities) as well as non-real/non-physical locations such as within a device. However, a third category of locations exists between the two: subjective labels such as home or office indicate physical locations that differ for each person, and virtual locations such as device and app differ for each context. Going forward, DPV should represent such concepts, as they are needed to express information in a manner that fits the context, e.g. that a particular device is used at home, or that a CCTV is monitoring employees in an office.
What exists? A Literature Review
Google Places has a taxonomy of location concepts for use with Google Maps. It includes concepts such as cafe and park, but does not specify them as public or private, or categorise them in any other way. The New Places Type taxonomy has categorisations such as Education and Health and Wellness which have further granular concepts (1 level deep).
OpenStreetMap also has a taxonomy of location types which are similarly structured, though with a larger taxonomy. However, there is no indication of places as being public or private.
schema.org has some location types, e.g. Hotel, which are extensions of a common top concept called Place. The taxonomy of places is a hierarchy with multiple levels, though the focus is on commercial activities and information such as addresses.
Sector-based financial classification schemes such as NAICS (e.g. search for places) and ISIC (e.g. search for places) also exist. However, these classify economic activities, and only some of them, such as restaurants, can be correlated to locations.
I could not find any taxonomy of virtual locations other than a few concepts that relate to technology-specific concepts such as browser storage (MDN), iOS File Manager, and Android data storage.
Regulations such as the ePrivacy Directive define location data, and mention data stored in terminal equipment, which can be simplified to data stored on/in a device.
Proposal
We create and provide a taxonomy of such subjective locations which helps represent information as it is used and understood in the real world, especially that related to privacy. This also means representing locations in the sense of whether they are personal, private, or public (as suggested in the group), and categorising the subjective locations based on these where possible. The challenge here is to have criteria for what subjective locations are, and which ones should be included, as a potentially infinite number of locations can be created.
The location concepts should be based on the intended use of DPV’s concepts, namely in a legal context to represent information that is necessary to represent legal requirements, interpretation, or compliance, and to support information that is relevant to the use of data and technologies at a personal as well as organisational level. Based on this, the above can be expressed as the following three categories:
- Locations that are legally relevant categorisations related to privacy e.g. private and public
- Locations that are physical and real but are subjectively described for the person or organisation e.g. home and office
- Locations that describe virtual places in context of technologies and how they operate with data e.g. device storage, browser storage, within app
To decide which concepts to represent, the first criterion should be the most important: if the legal interpretation requires expression of that category, then it should be defined as a concept in DPV. For example, since the law distinguishes between private and public spaces, these must be modelled as locations. Similarly, the ePrivacy Directive and some other laws single out storage on the user's device, so this must also be represented as a concept. Finally, specific laws are framed around specific locations such as workplaces and hospitals, and therefore these should also be modelled. For the other concepts, their inclusion must be based on whether they help express common use-cases, such as storage on a device.
Once a sufficient list of concepts has been identified, they should be organised in a hierarchy such that the top concepts represent broad locations and the lower parts of the hierarchy represent more specific concepts. For example, Local could be a concept representing storage that is 'local' in a real or virtual sense, with OnDevice and ImmediatePhysicalArea as more specific locations. By themselves, these concepts cannot be said to be private or public, as they can be used as either. In contrast, Home is Private and Personal by definition.
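The hierarchy described above can be sketched as a small parent map. This is only an illustrative sketch: the concept names come from this note, while the `BROADER` dictionary and the helpers `ancestors` and `is_a` are hypothetical and are not part of DPV.

```python
# Hypothetical parent map: each concept lists its direct broader concepts.
# Only the concept names come from the note; the structure is an assumption.
BROADER = {
    "Local": [],
    "OnDevice": ["Local"],
    "ImmediatePhysicalArea": ["Local"],
    "Private": [],
    # Assumption: personal is treated as narrower than private,
    # as the note suggests in the confidentiality section.
    "Personal": ["Private"],
    "Home": ["Private", "Personal"],
}


def ancestors(concept):
    """Return every broader concept reachable from `concept`."""
    found = set()
    stack = list(BROADER.get(concept, []))
    while stack:
        current = stack.pop()
        if current not in found:
            found.add(current)
            stack.extend(BROADER.get(current, []))
    return found


def is_a(concept, broader):
    """True if `concept` equals `broader` or sits below it in the hierarchy."""
    return concept == broader or broader in ancestors(concept)
```

With this map, `is_a("Home", "Private")` holds, whereas `is_a("OnDevice", "Private")` does not, reflecting that Local and its children are neither private nor public by themselves.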
Locations by Confidentiality
The concept LocationsByConfidentiality represents categorisations based on the confidentiality of the location, with three categories as:
- PersonalPlace - the space around or defined by an individual which defines their comfort, autonomy, privacy, and other attributes which relate to their 'person'.
- PrivatePlace - a place that is controlled privately, or that is not a public space.
- PublicPlace - a place which members of the public can access or are permitted to access, whether by right or by express or implied permission, or whether on payment or otherwise.
In these, the definition of public place must be aligned with legal definitions, which are broad. The concept also needs to be distinguished from PublicSpace, which is an open and accessible area for members of the public and is a subset of PublicPlace. PersonalPlace is a subset of PrivatePlace by definition, with the distinction being that private can be for an individual person, a group, or even an organisation, whereas personal is always restricted to the individual.
To avoid misinterpretation between places marked as public or private, it would be beneficial to prefix the names with Public/Private. For example, PublicPark is unambiguous as a public location as compared to just Park. Even though we may not model Park as a concept (and as a parent to public park), the prefix helps communicate and ensure that the concept is used to indicate public parks.
In some legal contexts, it is also essential to distinguish between areas within a private place which are publicly accessible - but are governed by the private entity. For example, a hotel lobby is a public place based on the above definition as any member of the public can access it. However, as a building, the hotel is a private place. Therefore, this represents a case where the same place has to be indicated as public and private - making both concepts intersecting rather than disjoint. For data protection laws like GDPR, a CCTV placed within a hotel lobby would count towards monitoring of a publicly accessible area (or public area). To indicate such cases more explicitly, the concept PubliclyAccessiblePrivatePlace should be defined by extending both private and public place concepts.
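One way to picture the intersecting public and private concepts is with multiple inheritance. The sketch below is an assumption made for illustration: the class names mirror the concepts proposed in this note, but `HotelLobby` and the helper `is_publicly_accessible` are hypothetical, and DPV itself would express these relations in its own vocabulary rather than as Python classes.

```python
class LocationsByConfidentiality:
    """Top concept for confidentiality-based location categories."""


class PrivatePlace(LocationsByConfidentiality):
    """A place that is controlled privately."""


class PublicPlace(LocationsByConfidentiality):
    """A place that members of the public can access."""


class PersonalPlace(PrivatePlace):
    """Always restricted to an individual; a subset of PrivatePlace."""


class PublicSpace(PublicPlace):
    """An open and accessible area; a subset of PublicPlace."""


class PubliclyAccessiblePrivatePlace(PrivatePlace, PublicPlace):
    """Extends both concepts, so public and private are not disjoint."""


class HotelLobby(PubliclyAccessiblePrivatePlace):
    """Hypothetical example concept, not proposed for DPV in this note."""


def is_publicly_accessible(location):
    """True if the location concept falls under PublicPlace."""
    return issubclass(location, PublicPlace)
```

Here `is_publicly_accessible(HotelLobby)` is true while `issubclass(HotelLobby, PrivatePlace)` is also true, which matches the CCTV example: the lobby counts as a publicly accessible area even though it belongs to a private place.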
Locations by Existence
The concept LocationsByExistence categorises locations based on whether they physically exist (PhysicalLocation) or not (VirtualLocation). Existing concepts such as city and country are kinds of physical locations, whereas within device is a kind of virtual location.
The concept VirtualLocation is expanded to express virtual locations associated with physical equipment and devices (WithinDevice) and those that are associated with applications and software (WithinApplication).
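The existence-based categorisation can be sketched as a lookup that walks up from a concept to its physical or virtual category. This is a minimal sketch under assumptions: the `PARENT` map and the function `existence_category` are hypothetical, and placing Country and City directly under PhysicalLocation is a simplification of how DPV organises them.

```python
# Hypothetical single-parent map for the existence-based taxonomy.
PARENT = {
    "PhysicalLocation": "LocationsByExistence",
    "VirtualLocation": "LocationsByExistence",
    "Country": "PhysicalLocation",
    "City": "PhysicalLocation",
    "WithinDevice": "VirtualLocation",
    "WithinApplication": "VirtualLocation",
}


def existence_category(concept):
    """Return 'PhysicalLocation' or 'VirtualLocation' for a known concept.

    Returns None when the concept is not in the existence taxonomy.
    """
    current = concept
    while current in PARENT:
        # The categories are the direct children of the top concept.
        if PARENT[current] == "LocationsByExistence":
            return current
        current = PARENT[current]
    return None
```

A concept that has not been categorised by existence, such as a hypothetical `"Home"` entry, simply yields `None` rather than being forced into one of the two categories.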
Locations not considered
The below concepts were identified in 'brainstorming' with ChatGPT, but were not considered in the above.
- Private locations such as personal office.
- Personal locations such as personal vehicle (as a location?) and personal locker.
- Public virtual locations such as public website and community forum.
- Whether the location is indoor or outdoor.
- Whether the location is commercial, residential, institutional, or recreational.
- Whether the location is used as short-term, long-term, or temporary.
- Public locations such as transportation (train station, bus station), parks (community park, national park).
- Shared locations such as co-operatives and timeshares.
- Public domain locations such as government owned and community owned.
- Locations controlled regarding access such as exclusive access, restricted access, and open access.
- Locations that are monitored (e.g. using CCTV, but could also be by humans).
- Locations that are regulated e.g. military bases and healthcare facilities.
- The question of how to associate organisation taxonomy with location taxonomy e.g. hospital as an organisation category, and hospital as a location category.
- As above, how to associate personal data category for location with the abstract concept of the location itself. For example, Office as a location concept is different from Office as the personal data for a person.