FINAL - September 2017
Definition of key terms
This is a glossary defining the terms describing various aspects of the Open Data Charter. It was developed by members of the Charter’s network and can be used by governments and others to help understand the Charter’s principles.
Term
Definition
Accountability
Ensuring the public (including civil society and private sector organizations, academic and media representatives, and citizens) has the data and information needed to hold the government to account for its policy and service delivery performance.1
Accurate
Data that is accurate is correct, and reflects the most current information available at the time of publication.
Analytical limitations
Conditions or qualities of data that may require additional attention from users prior to using that data or drawing conclusions from it.
Anonymize
Processing data that includes personal information so that individuals can no longer be identified in the resulting data.2 This is related to the concept of de-identification, the process of conducting an analysis of the risk of personal identification based on available data, and either encrypting or removing such personal data from data sets, so that the people whom the data describe remain anonymous.
Anticorruption
Laws, policies, and practices designed to prevent, detect, investigate, or eradicate the abuse of entrusted power for private gain. Common forms of corruption include bribery, collusion, and embezzlement.
1 Source: Open Government Guide
2 Source: Open Data Handbook
opendatacharter.net
Applications
Self-contained programs or pieces of software, each designed to fulfill a particular purpose.
Build capacity
Supporting or developing the skills, knowledge, tools, and experience necessary for individuals and organizations to meet particular goals, especially in the context of developing countries.
Civic participation
Also known as civic engagement, civic participation is the process of citizens and organizations actively participating in the public sphere, including, for example, social participation (e.g. volunteering or donating funds) and political participation (e.g. voting or communicating with representatives).
Civil society
An organization, group, initiative, or network may qualify as being a member of civil society if it meets any of the following criteria:
● it works on a charitable or not-for-profit basis
● it is a non-government organization, academic institution, or expert network
● it is a corporation that engages in philanthropic investment in support of open data and sustainable development
The term “civil society” is the aggregate of all organizations meeting any of the criteria above.
Co-creation
The collaborative development of datasets, or collaborative reuse of existing open datasets to develop applications, programs, and other tools, as well as graphs, infographics, and other visualizations. Co-creation is usually the result of collaboration between governments and citizens, private sector, and/or civil society organizations.
Comparable
Data that is comparable should be easy to compare over time and across organizations. For example, contracting data for multiple government ministries should be generated using the same data standards to ensure the data can be compared across ministries (e.g. all ministries use a standardized date format to indicate the time period of all contracts, or record contract awardees using standardized names or identifiers).
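The standardized date format mentioned above can be illustrated with a minimal sketch. The ministry names, the sample dates, and the three input formats are assumptions made up for illustration; a real dataset would need its own list of formats.

```python
from datetime import datetime

# Hypothetical input formats that different ministries might use (assumption).
KNOWN_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%B %d, %Y")


def to_iso_date(raw: str) -> str:
    """Return the date as ISO 8601 (YYYY-MM-DD), trying each known format in turn."""
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(raw.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    raise ValueError(f"Unrecognized date format: {raw!r}")


# Illustrative contract start dates as three ministries might record them.
contracts = [
    {"ministry": "Health", "start": "2017-03-01"},
    {"ministry": "Transport", "start": "01/03/2017"},
    {"ministry": "Education", "start": "March 1, 2017"},
]

# Rewrite every start date in the same format so the records can be compared.
normalized = [{**c, "start": to_iso_date(c["start"])} for c in contracts]
```

Once every record uses the same format, contract periods can be sorted and compared across ministries without per-ministry special cases.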
Comprehensive
Data that is comprehensive is both complete and detailed, without significant gaps or missing data elements. Likewise, datasets should include all data relevant to their description. For example, a dataset recording all contracts awarded by a particular ministry would be comprehensive if it is not missing any data points (e.g. no dates or amounts are missing) and it is reflective of all relevant contracts (e.g. all contracts under $25,000 awarded by that ministry, not just a sampling of contracts).
Core metadata
Metadata is the data providing information about one or more aspects of data within a dataset. It is used to summarize basic information about data, which can make it easier to track and work with specific data. Core metadata is a limited set of metadata which provides important, fundamental information about data, and should be defined by a consistent vocabulary across all datasets. Core metadata elements may include the dataset title, source, publication date, and format, as well as other relevant information that describes the dataset and supports discoverability (that is, makes it easier to search for and find the dataset). For further information on core metadata, see the Dublin Core Metadata Initiative (DCMI) Metadata Terms or the W3C Data Catalog Vocabulary (DCAT).
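As a rough illustration of a consistent core metadata vocabulary, the sketch below checks a dataset record for the four elements named above. The key spellings and the sample record are assumptions for illustration only and are not taken from DCMI or DCAT.

```python
# Core elements named in the definition above; the key spellings are assumptions.
CORE_FIELDS = ("title", "source", "publication_date", "format")


def missing_core_fields(metadata: dict) -> list:
    """Return the core fields that are absent or empty in a metadata record."""
    return [field for field in CORE_FIELDS if not metadata.get(field)]


# Hypothetical metadata record for a contracts dataset.
record = {
    "title": "Ministry contracts awarded in 2017",
    "source": "Example Ministry of Finance",
    "publication_date": "2017-09-01",
    "format": "",  # left empty on purpose to show a gap
}

missing = missing_core_fields(record)
```

Here `missing` is `["format"]`, flagging the one core element the record does not yet describe.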
Data Ecosystem
The complex system of relationships between individuals, organizations, datasets, standards, resources, platforms, and other elements that define the environment in which each particular data resource exists. A data ecosystem may include “multiple data communities, types of data, institutions, laws and policy frameworks, and innovative technologies and tools.”3
Data literacy
The skills and knowledge required to access, read, understand, and manipulate data. This may include knowledge of data usage software and visualization techniques.
Data users
Any individual or organization that accesses, downloads, or republishes data, or who uses data to develop apps, visualizations, reports, and other information products or services.
3 Source: Africa Data Consensus
Digital divide
“The gap between individuals, households, businesses and geographic areas at different socio-economic levels with regard to both their opportunities to access information and communication technologies (ICTs) and to their use of the Internet for a wide variety of activities.”4
Disaggregated
Disaggregated data is data that is broken down or separated into component parts. Data can, for example, be disaggregated by age, allowing users to view relevant data broken down by ages or age categories. Statistical data may be disaggregated prior to publication to allow users to easily group data based on categories like age, gender, or region. When data is presented in the most disaggregated way, as it was directly collected from the source without any further processing, it is usually referred to as raw data or primary data.
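The following sketch shows why disaggregated publication is useful: users can regroup the same rows by whichever category they need. The regions, age groups, and counts are invented for illustration.

```python
from collections import defaultdict

# Hypothetical disaggregated records: one row per region and age group.
rows = [
    {"region": "North", "age_group": "0-17", "count": 120},
    {"region": "North", "age_group": "18-64", "count": 340},
    {"region": "South", "age_group": "0-17", "count": 95},
    {"region": "South", "age_group": "18-64", "count": 410},
]


def total_by(records, key):
    """Sum the 'count' field of the records, grouped by the given category."""
    totals = defaultdict(int)
    for record in records:
        totals[record[key]] += record["count"]
    return dict(totals)


by_region = total_by(rows, "region")
by_age = total_by(rows, "age_group")
```

Had only a single published total (965) been available, neither the regional nor the age breakdown could have been recovered from it.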
Discoverable
Data that is discoverable can be easily found and accessed by users, including online and through search engines.
Domestic and international standards bodies
Groups, networks, or organizations that focus on the creation, development, revision, and/or implementation of data standards at a local, state, national, regional, or international level. These standards bodies may include, but are not limited to, the ISO, the W3C, and the IETF.
Equitable resource
A resource that is, by its nature, available to anyone, regardless of their social or economic status.
Evidence-based policy making
The set of processes or methods that advocate a more rational, rigorous, and systematic approach to the creation of policy. Evidence-based policy making seeks to inform the policy process, rather than aiming to directly affect the eventual goals of the policy. The pursuit of evidence-based policy making is based on the premise that policy development and decision-making should be better informed by available evidence and should include rational analysis.5
4 Source: OECD
5 Source: Overseas Development Institute report on Evidence-Based Policymaking
Freedom of expression
The right to express one's ideas and opinions freely through speech, writing, and other forms of communication, but without deliberately causing harm to others' character and/or reputation by false or misleading statements. Freedom of the press is part of freedom of expression.6 Freedom of expression includes the right to criticize government policies, practices, laws, and programs without fear of retribution, unlawful detention, or violence.
Freedom of Information / Access to Information / Right to Information community
The community of organizations, groups, networks, and individuals working to support, study, or implement laws and policies requiring governments to release certain high-value data or information either proactively or on request.
Fully described data
Datasets that are associated with clearly-defined core metadata categories, and accompanied by any relevant explanatory documentation.
Global data revolution
The ongoing, global movement that has resulted from “an explosion in the volume of data, the speed with which data are produced, the number of producers of data, the dissemination of data, and the range of things on which there is data, coming from new technologies such as mobile phones and the ‘internet of things’, and from other sources, such as qualitative data, citizen-generated data and perceptions data” coupled with “a growing demand for data from all parts of society”. This movement is the sum of dozens of national, regional and global formal initiatives to foster the use of data.7
Globally agreed standards
Data standards which have been adopted or endorsed by a large number of governments or organizations, and which are recognized as contributing significantly to the improvement or standardization of high-value data.
Governance
Processes of management, oversight, or decision-making which impact a particular project or program.
6 Source: Business Dictionary
7 Source: A World That Counts Report
Human-readable formats
As defined in the Open Data Handbook, “Data in a format that can be conveniently read by a human. Some human-readable formats, such as PDF, are not machine-readable as they are not structured data.”8
Information lifecycle management practices
Information lifecycle management practices are any practices or policies related to the creation, retention, archiving, or disposition of data or information. These practices may include the length of time data and information resources are retained, and how and when they are archived to ensure future access to them.
International governmental bodies
An organization, group, or network that acts as a quasi-governmental body at the international level. An example is the United Nations.
Interoperable
Interoperability is the ability to work with other products or systems, present or future. To be interoperable, data should follow established international data standards so that it can be used across a number of different systems or analytic products. Interoperable data can be easily compared over time, across locations, and within and between organizations, as well as being easily manipulated to produce visualizations and identify trends.
Machine-readable formats
As defined by the Open Definition, machine-readable formats are those data formats which are readily processable by a computer where the individual elements of the [data] can be easily accessed and modified.9 The Open Data Handbook defines machine-readable data as “Data in a data format that can be automatically read and processed by a computer.”10
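To make "automatically read and processed by a computer" concrete, here is a minimal sketch that reads a small CSV file with Python's standard library. The column names and values are invented for illustration.

```python
import csv
import io

# Hypothetical contents of a published CSV file (invented for illustration).
CSV_TEXT = """contract_id,ministry,amount
C-001,Health,12000
C-002,Transport,8500
"""

# DictReader exposes each row as a mapping from column name to value,
# so individual elements can be accessed without any manual parsing.
contracts = list(csv.DictReader(io.StringIO(CSV_TEXT)))
total = sum(int(row["amount"]) for row in contracts)
```

The same figures locked inside a scanned PDF table would first have to be transcribed or extracted before any such calculation could run.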
Mapping standards
A comparison between standards at a domestic (local, state, national) level and an international (regional, global) level, used to identify similarities and gaps between different data standards.
8 Source: Open Data Handbook
9 Source: The Open Definition
10 Source: Open Data Handbook
Multilateral institution
An organization, group, or network made up of governments or government representatives. These institutions may be regional or global. Examples include the OECD and the G20.
Open and unrestrictive licence
Open means anyone can freely access, use, modify, and share for any purpose, subject, at most, to requirements that identify the data’s provenance and preserve openness.11 Licenses should be published and linked to open data to ensure data users can easily find and understand the conditions of data access and reuse.
Open by default
“Open by default” policies mandate that data or information should be open and available for the public to find, access, and use under an open and unrestrictive license, unless there is a specific, pressing reason why that data or information cannot be made open, and that reason is clearly communicated to the public. Currently, most governments operate by asking whether there is any pressing, important reason why a data or information resource should be open (e.g. overwhelming public demand, legal requirement). Under an “open by default” policy, governments would instead operate by assuming that all data and information should be open, and asking whether there is any important, pressing reason why data or information cannot be made open (e.g. security or privacy considerations). In cases where data cannot be made open, it may instead be closed data (which can be accessed only by the data subject, owner, or holder) or shared data (which is accessible beyond its subject, owner, or holder, but is only accessible to a limited group of people or organizations).12
Open standards
Data standards which are publicly available and developed, refined, and/or maintained through a collaborative, transparent decision-making process. Open standards are published under an open license, are thoroughly documented, and are made publicly available at zero or low cost, so that they can be accessed and used by anyone.
11 Source: The Open Definition
12 For more information, see the Data Spectrum, which illustrates the different types or levels of openness of data.
Personal data (or personally-identifiable data)
Any data that, when used alone or in combination with other available data, may identify an individual. Most personal data cannot be open for reasons of privacy and confidentiality; such data may instead be closed, or shared with specific people or organizations. In some cases, personal data may be licensed as open data. This would include, for example, data concerning the identity, contact information, or expense claims of government officials or legislators.
Private sector
Any non-governmental organization, group, or network that works to generate profit.
Rule of law
The principle that all individuals, organizations, and institutions are subject to and accountable under clear, publicized law that is fairly applied and enforced.13
Socially and economically marginalized people
People who, as a result of their culture, ethnicity, gender, religion, or social or economic status, are limited in the influence or power they can exert in the public sphere, particularly as it relates to civic participation.
Source
The point of origin of data, which may be the originally published dataset (in the case of republished or reused data) or the individual or organizational author of the data (in the case of originally published datasets).
Standardized format
Standardized formats may include both file formats and data formats. A standardized file format should be machine-readable, and used consistently across projects or organizations or over time. Examples include CSV, JSON, and XML formats. A standardized data format is a guideline or series of guidelines that defines the way in which data should be collected or recorded, supporting comparability and interoperability between datasets. Examples include the General Transit Feed Specification (GTFS) and the International Aid Transparency Initiative (IATI) Standard.
13 Definition adapted from: World Justice Project
Structured
Data that is organized according to a fixed schema, and is often incorporated in a relational database.
Sustainable development
“Sustainable development is development that meets the needs of the present without compromising the ability of future generations to meet their own needs. It contains within it [...] the concept of needs, in particular the essential needs of the world's poor, to which overriding priority should be given; and the [concept] of limitations imposed by the state of technology and social organization on the environment's ability to meet present and future needs.”14
Traceability
Traceable data is data that can be “followed” across datasets. For example, international development aid funding may be traced by linking the funds to a unique project identifier in all datasets referring to the funds (e.g. the awarding government’s disclosures, the local recipient’s reporting, and the service deliverer’s contract); starting from any of these datasets, a user may trace the funds to the other datasets thanks to that unique project identifier.
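The aid funding example above can be sketched as a lookup on a shared project identifier. The three datasets, their field names, and the identifier "P-100" are hypothetical and exist only to illustrate the idea.

```python
# Hypothetical datasets published by three different parties (invented values).
donor_disclosures = [{"project_id": "P-100", "committed": 50000}]
recipient_reports = [{"project_id": "P-100", "received": 48000}]
service_contracts = [{"project_id": "P-100", "contractor": "Example Builders", "value": 45000}]


def trace(project_id, **datasets):
    """Collect, from each named dataset, the records that carry the given project identifier."""
    return {
        name: [row for row in rows if row["project_id"] == project_id]
        for name, rows in datasets.items()
    }


trail = trace(
    "P-100",
    donor=donor_disclosures,
    recipient=recipient_reports,
    contract=service_contracts,
)
```

Because every dataset records the same identifier, a user starting from any one of them can assemble the full trail; without it, the records would have to be matched on names or amounts, which is far less reliable.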
Transparency
Ensuring the public (including civil society and private sector organizations, academic and media representatives, and citizens) has the data and information needed to understand the workings of their government.15
Visualization
Any visual representation of data other than a dataset. Visualizations may include, but are not limited to, plots, tables, graphs, or infographics.
14 Source: Report of the World Commission on Environment and Development: Our Common Future (Brundtland Report)
15 Source: Open Government Guide