Open Data Contract Standard (ODCS)
The Open Data Contract Standard (ODCS) was formerly known as the Data Contract Template, which PayPal used to specify datasets. It is now governed by Bitol, a Linux Foundation AI & Data project.
We are members of Bitol's Technical Steering Committee, and we are committed to supporting ODCS in our products.
Starting with ODCS v3.0.0, Data Contract Manager supports ODCS for defining data contracts, as an alternative to the Data Contract Specification.
Details of a data contract in Data Contract Manager
Open Data Contract Standard Example
Let's start with an example of a data contract in ODCS v3 format:
apiVersion: v3.0.0
kind: DataContract
id: c176de03-8503-4859-bd0f-218cc413d958
name: Shipments
version: 1.0.0
domain: checkout
status: development
description:
  purpose: This data can be used for analytical purposes
schema:
  - name: my_table
    physicalType: table
    properties:
      - name: shipment_id
        description: Unique identifier for each shipment.
        logicalType: string
        logicalTypeOptions:
          format: uuid
        physicalType: uuid
        primaryKey: true
        examples:
          - 03c35ea7-9a26-475f-a38a-0dad96f6de10
      - name: order_id
        description: Identifier for the order associated with the shipment.
        logicalType: string
        physicalType: text
        required: true
        unique: false
        examples:
          - ORD789012
      - name: delivery_date
        description: "The actual or expected delivery date of the shipment."
        logicalType: date
        physicalType: timestamp_tz
        required: false
        examples:
          - "2024-09-05T17:00:00Z"
        quality:
          - type: text
            description: Must be set, when status is "delivered".
      - name: carrier
        description: "The shipping carrier used for the delivery."
        logicalType: string
        physicalType: text
        examples:
          - DHL
          - UPS
      - name: tracking_number
        description: Tracking number provided by the carrier.
        logicalType: string
        logicalTypeOptions:
          minLength: 10
          maxLength: 36
        physicalType: text
        classification: restricted
        quality:
          - rule: duplicateCount
            mustBeLessThan: 1
            unit: percent
        examples:
          - 1Z9999W99999999999
      - name: status
        description: "Current status of the shipment."
        logicalType: string
        physicalType: text
        examples: ["pending", "shipped", "in_transit", "delivered", "returned", "canceled"]
    quality:
      - rule: rowCount
        name: Verify row count range
        mustBeBetween: [1000000, 5000000]
team:
  - username: john.doe@example.com
    role: Data Product Owner
servers:
  - server: production
    environment: production
    type: bigquery
    project: acme_shipments_prod
    dataset: shipments_v1
roles:
  - role: analyst_us_read
    access: read
  - role: analyst_eu_read
    access: read
If you are familiar with the v2 format, you will notice some significant changes. The ODCS v3 format is more flexible and can be used for a broader range of data products.
Content
These are the building blocks of an ODCS data contract:
Fundamentals
The general information about the data contract, such as the ID, name, version, owner, and description.
Schema
The schema specifies the logical and, optionally, the physical representation of the data model. ODCS v3 also supports complex data structures (e.g., JSON and Avro models).
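As a rough illustration of a complex structure, the following sketch nests an object and an array inside a schema object. The object and property names (shipment_events, recipient_address, parcel_ids) are hypothetical and not part of the example above; the nesting via properties and items follows our reading of ODCS v3 and should be checked against the specification.

```yaml
schema:
  - name: shipment_events            # hypothetical object name
    physicalType: topic
    properties:
      - name: recipient_address      # hypothetical nested object
        logicalType: object
        properties:
          - name: city
            logicalType: string
          - name: postal_code
            logicalType: string
      - name: parcel_ids             # hypothetical array of strings
        logicalType: array
        items:
          logicalType: string
```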
Data Quality
Data quality guarantees can now be defined as plain text, SQL, or with a maintained library of commonly used predefined quality attributes such as rowCount, unique, freshness, and more.
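The three styles can sit side by side on one property. The sketch below is an assumption-laden illustration: the wording of the text rule, the SQL query, and the thresholds are invented for this example, and the ${object} and ${property} placeholders follow the ODCS v3 examples as we understand them.

```yaml
properties:
  - name: order_id
    logicalType: string
    quality:
      # Plain text: a human-readable expectation (hypothetical wording)
      - type: text
        description: Every shipment must reference an existing order.
      # SQL: a query whose result is compared against a threshold
      - type: sql
        description: No shipment may have an empty order_id.
        query: |
          SELECT COUNT(*) FROM ${object} WHERE ${property} IS NULL
        mustBe: 0
      # Library: a predefined rule, as used in the example above
      - type: library
        rule: duplicateCount
        mustBeLessThan: 1
        unit: percent
```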
Pricing
The price that data consumers have to pay for using the data product. Optional.
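A minimal sketch of a price section, assuming the price fields of ODCS v3; the amount, currency, and unit are illustrative values only.

```yaml
price:
  priceAmount: 9.95        # illustrative value
  priceCurrency: USD
  priceUnit: megabyte      # the unit the amount refers to
```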
Servers
The physical location of the data set, such as the actual host, database, and schema. Most technologies and data platforms are supported. Multiple servers can be defined for different environments or data product versions.
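For instance, the production server from the example above could be complemented by a second entry for another environment. The development server and its project name are hypothetical and serve only to show the shape of the list.

```yaml
servers:
  - server: production
    environment: production
    type: bigquery
    project: acme_shipments_prod
    dataset: shipments_v1
  - server: development              # hypothetical second environment
    environment: development
    type: bigquery
    project: acme_shipments_dev      # hypothetical project name
    dataset: shipments_v1
```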
Roles
A list of roles that data consumers can apply for to access the data. Different roles may provide different access rights for role-based access control (RBAC).
SLAs
Service-Level Agreements (SLAs) can be defined to specify the expected availability and performance of the data product.
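A sketch of what an SLA section might look like, assuming the slaProperties list of ODCS v3. The chosen properties, values, and the referenced element are illustrative; the exact set of supported property names should be taken from the specification.

```yaml
slaDefaultElement: my_table.delivery_date   # element the SLAs refer to by default
slaProperties:
  - property: latency        # data is expected within 4 days (illustrative)
    value: 4
    unit: d
  - property: retention      # data is kept for 3 years (illustrative)
    value: 3
    unit: y
  - property: frequency      # data is updated once per day (illustrative)
    value: 1
    unit: d
```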
Custom Properties
For custom needs or tooling-specific requirements, additional properties can be added.
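Custom properties are expressed as a list of property/value pairs. The property names and values below are hypothetical placeholders for whatever a team or tool needs.

```yaml
customProperties:
  - property: costCenter           # hypothetical property
    value: "4711"
  - property: retentionPolicyRef   # hypothetical property
    value: policies/shipments-default
```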
ODCS and Data Contract Specification
How does the Open Data Contract Standard (ODCS) relate to the Data Contract Specification (datacontract.com)? The Data Contract Specification is a format that we initially developed for tooling support, in particular for the Data Contract CLI and Data Contract Manager.
Let's start with the similarities: There are no fundamental or conceptual differences between these two major formats. Both are open standards, use YAML, and specify data sets in a similar way.
We are striving for harmonization, and we are an active member of Bitol's Technical Steering Committee. We have contributed many of our insights from consulting and from maintaining the Data Contract CLI and Data Contract Manager, and we are actively contributing to the design of ODCS v3 and future versions.
Shaping a standard in a governed committee has many advantages, but also some limitations, particularly with regard to scope, velocity, and simplicity. So, until we have a future version of a unified standard, we will continue to support both formats in our products.
We recommend considering Open Data Contract Standard (ODCS) if these aspects are important to you:
- Using a standard that is governed through a Linux Foundation project
- Vendor-neutral decision-making process
We recommend considering Data Contract Specification if these aspects are important to you:
- Support by the Data Contract CLI for data contract testing and code generation
- Multi-platform support with bindings to different data platforms for providers and consumers
- Business Definitions
- OpenLineage support
There is no wrong decision: We are committed to supporting both standards in our products, and we will provide migration tooling for an upcoming unified standard.
Data Contract Manager
Data Contract Manager is a frontend for managing data contracts and data products in an organization. It uses data contracts to create an enterprise data marketplace with advanced features for data discovery, version-controlled data contract editing, contract testing, and automated data governance.
A screenshot of the data contract editor for ODCS in Data Contract Manager
Data Contract Manager now supports both the Data Contract Specification and the Open Data Contract Standard (ODCS) v3.
Sign up now for free, or explore the clickable demo of Data Contract Manager.