Open Data Contract Standard (ODCS)
The Open Data Contract Standard (ODCS) was formerly known as the Data Contract Template, which PayPal used to specify datasets. Now, it is governed by Bitol, a Linux Foundation AI & Data project.
We are members of Bitol's Technical Steering Committee, and we are committed to supporting ODCS in our products.
Entropy Data supports ODCS for defining data contracts, starting with v3.0.0.
Details of a data contract in Entropy Data
Open Data Contract Standard Example
Let's start with an example of a data contract in ODCS v3 format:
apiVersion: v3.0.0
kind: DataContract
id: c176de03-8503-4859-bd0f-218cc413d958
name: Shipments
version: 1.0.0
domain: checkout
status: development
description:
  purpose: This data can be used for analytical purposes
schema:
  - name: my_table
    physicalType: table
    properties:
      - name: shipment_id
        description: Unique identifier for each shipment.
        logicalType: string
        logicalTypeOptions:
          format: uuid
        physicalType: uuid
        primaryKey: true
        examples:
          - 03c35ea7-9a26-475f-a38a-0dad96f6de10
      - name: order_id
        description: Identifier for the order associated with the shipment.
        logicalType: string
        physicalType: text
        required: true
        unique: false
        examples:
          - ORD789012
      - name: delivery_date
        description: "The actual or expected delivery date of the shipment."
        logicalType: date
        physicalType: timestamp_tz
        required: false
        examples:
          - "2024-09-05T17:00:00Z"
        quality:
          - type: text
            description: Must be set, when status is "delivered".
      - name: carrier
        description: "The shipping carrier used for the delivery."
        logicalType: string
        physicalType: text
        examples:
          - DHL
          - UPS
      - name: tracking_number
        description: Tracking number provided by the carrier.
        logicalType: string
        logicalTypeOptions:
          minLength: 10
          maxLength: 36
        physicalType: text
        classification: restricted
        quality:
          - rule: duplicateCount
            mustBeLessThan: 1
            unit: percent
        examples:
          - 1Z9999W99999999999
      - name: status
        description: "Current status of the shipment."
        logicalType: string
        physicalType: text
        examples: ["pending", "shipped", "in_transit", "delivered", "returned", "canceled"]
    quality:
      - rule: rowCount
        name: Verify row count range
        mustBeBetween: [1000000, 5000000]
team:
  - username: john.doe@example.com
    role: Data Product Owner
servers:
  - server: production
    environment: production
    type: bigquery
    project: acme_shipments_prod
    dataset: shipments_v1
roles:
  - role: analyst_us_read
    access: read
  - role: analyst_eu_read
    access: read
If you are familiar with the v2 format, you will notice some significant changes. The ODCS v3 format is more flexible and can be used for a broader range of data products.
Content
These are the building blocks of an ODCS data contract:
Fundamentals
The general information about the data contract, such as the ID, name, version, owner, and description.
Schema
The schema specifies the logical and, optionally, the physical representation of the data model. ODCS v3 also supports complex data structures (e.g., JSON and Avro models).
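As a rough illustration of such a complex structure, the sketch below nests an object and an array inside the example table. The property names `recipient` and `tracking_events` are hypothetical and not part of the example contract above; the fragment assumes the ODCS v3 convention of nested `properties` for `logicalType: object` and `items` for `logicalType: array`.

```yaml
schema:
  - name: my_table
    physicalType: table
    properties:
      - name: recipient            # hypothetical nested object
        logicalType: object
        properties:
          - name: name
            logicalType: string
          - name: postal_code
            logicalType: string
      - name: tracking_events      # hypothetical list of event dates
        logicalType: array
        items:
          logicalType: date
```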
Data Quality
Data quality guarantees can now be defined as plain text, as SQL, or with rules from a maintained library of commonly used predefined quality attributes such as rowCount, unique, freshness, and more.
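The three styles might sit side by side as in the sketch below. The text and library rules are taken from the example contract above; the SQL rule is a hypothetical addition that assumes the `type: sql` form with a `query` field and a `mustBe` comparison.

```yaml
quality:
  # Plain text: documented for humans, not executed automatically
  - type: text
    description: Must be set, when status is "delivered".
  # SQL (hypothetical rule): the query result is compared with the expected value
  - type: sql
    description: Delivered shipments must have a delivery date.
    query: |
      SELECT COUNT(*) FROM my_table
      WHERE status = 'delivered' AND delivery_date IS NULL
    mustBe: 0
  # Library: a predefined rule referenced by name
  - rule: rowCount
    mustBeBetween: [1000000, 5000000]
```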
Pricing
The price that data consumers have to pay for using the data product. Optional.
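A minimal pricing sketch, assuming the ODCS v3 `price` object with its `priceAmount`, `priceCurrency`, and `priceUnit` fields; the amount and unit are illustrative values only.

```yaml
price:
  priceAmount: 9.95      # illustrative value
  priceCurrency: USD
  priceUnit: megabyte    # the unit the amount refers to
```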
Servers
The physical location of the dataset, such as the actual host, database, and schema. Most technologies and data platforms are supported. Multiple servers can be defined for different environments or data product versions.
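For example, a contract could list one server per environment. The sketch below extends the `servers` section of the example contract above; the second entry, including the project name `acme_shipments_dev`, is hypothetical.

```yaml
servers:
  - server: production
    environment: production
    type: bigquery
    project: acme_shipments_prod
    dataset: shipments_v1
  - server: development            # hypothetical second server
    environment: development
    type: bigquery
    project: acme_shipments_dev    # hypothetical project name
    dataset: shipments_v1
```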
Roles
A list of roles that data consumers can apply for to access the data. Different roles may provide different access rights for role-based access control (RBAC).
SLAs
Service-Level Agreements (SLAs) can be defined to specify the expected availability and performance of the data product.
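A sketch of how SLAs might be expressed, assuming the ODCS v3 `slaProperties` list with `property`, `value`, `unit`, and `element` fields. The concrete values are illustrative and not taken from the example contract above.

```yaml
slaProperties:
  - property: latency
    value: 4
    unit: d                             # days
    element: my_table.delivery_date     # column the SLA refers to
  - property: retention
    value: 3
    unit: y                             # years
```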
Custom Properties
For custom needs or tooling-specific requirements, additional properties can be added.
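A minimal sketch, assuming the ODCS v3 `customProperties` list of `property`/`value` pairs; both property names below are hypothetical.

```yaml
customProperties:
  - property: costCenter          # hypothetical organization-specific field
    value: "4711"
  - property: reviewCycleMonths   # hypothetical tooling-specific field
    value: 6
```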
Entropy Data
Entropy Data is a frontend for managing data contracts and data products in an organization. It uses data contracts to create a data product marketplace with advanced features for data discovery, version-controlled data contract editing, contract testing, and automated data governance.
A screenshot of the data contract editor for ODCS in Entropy Data
Sign up now for free, or explore the clickable demo of Entropy Data.