Data Asset Schema
The standard for describing a Data Asset
Templates for Data Assets in ixo Documents use a structured data format based on open schemas (mainly schema.org) to describe any type of Data Asset existing within the Internet of Impact.
Types of data assets
A structured object, such as a Verifiable Claim, with a data model that can be processed using a specific tool or algorithm
An algorithm for processing or transforming data
A table or a CSV file with some data
An organised collection of tables
A search query
A collection of files which are related in a way that provides a meaningful dataset
Images capturing data
Files relating to machine learning, such as trained parameters or neural network structure definitions
Anything else that looks like a data asset!
The standard data model (schema) for data assets
The ixo standard for data assets is compatible with the Web 2.0 guidelines for dataset providers, which describe data so that search engines such as Google can better understand the content of pages. Data assets are easier to find and understand when they are described with metadata such as name, description, creator, and format.
The schema describing Data Assets within ixo Documents implements the schema.org Dataset structure.
Dataset example
For example, if the Data Asset is a Dataset, we use the schema.org/Dataset definition of Dataset, as described in the following table. It includes information about the publication of the dataset, such as the license, when it was published, and an identifier (DOI) or sameAs pointing to a canonical version of this Dataset object in a different repository.
Add identifier, license, and sameAs for Datasets that provide provenance and license information.
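The Dataset description can be sketched as a plain Python dictionary that serialises to JSON-LD. This is a minimal illustration, not the ixo template itself: the helper name build_dataset and every value passed to it (the name, DOI, licence URL, and sameAs URL) are hypothetical placeholders, while the property names come from schema.org/Dataset.

```python
import json


def build_dataset(name, description, identifier=None, license_url=None, same_as=None):
    """Return a schema.org Dataset description as a JSON-LD style dict.

    Hypothetical helper for illustration; only properties that are
    provided are included in the result.
    """
    dataset = {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "name": name,
        "description": description,
    }
    # Provenance and licence properties are optional, so skip missing ones.
    optional = {"identifier": identifier, "license": license_url, "sameAs": same_as}
    dataset.update({key: value for key, value in optional.items() if value is not None})
    return dataset


# All values below are placeholders, not real identifiers.
example_dataset = build_dataset(
    name="Example water quality readings",
    description="Hypothetical dataset used only to illustrate the structure.",
    identifier="https://doi.org/10.0000/example",
    license_url="https://creativecommons.org/licenses/by/4.0/",
    same_as="https://example.org/datasets/water-quality",
)
print(json.dumps(example_dataset, indent=2))
```

Leaving out identifier, license, or sameAs produces a smaller but still valid description, which suits Data Assets with no canonical copy elsewhere.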
DataCatalog
The full definition of DataCatalog is available at schema.org/DataCatalog.
Datasets are often published in repositories that contain many other datasets. The same dataset can be included in more than one such repository. You can refer to a data catalog that this dataset belongs to by referencing it directly.
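As a rough sketch, a dataset can point to its catalog through the schema.org includedInDataCatalog property. The helper with_catalog, the dataset, and the catalog name and URL below are hypothetical and only show the shape of the reference.

```python
def with_catalog(dataset, catalog_name, catalog_url=None):
    """Return a copy of a Dataset dict that references a DataCatalog.

    Hypothetical helper; the input dict is left unchanged.
    """
    catalog = {"@type": "DataCatalog", "name": catalog_name}
    if catalog_url is not None:
        catalog["url"] = catalog_url
    result = dict(dataset)
    result["includedInDataCatalog"] = catalog
    return result


# Placeholder dataset and catalog, for illustration only.
base_dataset = {
    "@context": "https://schema.org/",
    "@type": "Dataset",
    "name": "Example water quality readings",
}
catalogued = with_catalog(
    base_dataset,
    catalog_name="Example Impact Data Repository",
    catalog_url="https://example.org/catalog",
)
```

Returning a copy keeps the original description reusable when the same dataset appears in more than one repository.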
DataDownload
The full definition of DataDownload is available at schema.org/DataDownload. In addition to Dataset properties, add the following properties for datasets that provide download options.
The distribution property describes how to get the dataset itself, because the dataset URL often points to a landing page that only describes the dataset. It states where to get the data and in what format. This property can have several values: for instance, a CSV version is available at one URL and an Excel version at another.
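A dataset with a CSV and an Excel download could be described roughly as follows. The helper data_download and both URLs are hypothetical; encodingFormat and contentUrl are the schema.org properties assumed here to carry the format and the download location.

```python
def data_download(encoding_format, content_url):
    """Build one schema.org DataDownload entry (hypothetical helper)."""
    return {
        "@type": "DataDownload",
        "encodingFormat": encoding_format,
        "contentUrl": content_url,
    }


# Placeholder URLs: one entry per available format of the same data.
dataset_with_downloads = {
    "@context": "https://schema.org/",
    "@type": "Dataset",
    "name": "Example water quality readings",
    # Landing page that describes the dataset, not the data itself.
    "url": "https://example.org/datasets/water-quality",
    "distribution": [
        data_download("text/csv", "https://example.org/files/water-quality.csv"),
        data_download(
            "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
            "https://example.org/files/water-quality.xlsx",
        ),
    ],
}

available_formats = [d["encodingFormat"] for d in dataset_with_downloads["distribution"]]
```

Keeping distribution as a list lets a consumer pick the format it can process without parsing the landing page.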
Tabular datasets
A tabular dataset is one organised primarily in terms of a grid of rows and columns. For pages that embed tabular datasets, you can also create more explicit markup, building on the basic approach described above.
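One possible form of more explicit markup is sketched below. It assumes, which the text above does not state, that the table is described with the W3C CSV on the Web (CSVW) vocabulary and attached to the Dataset through mainEntity. The helper describe_table, the column names, and the datatypes are hypothetical.

```python
def describe_table(columns):
    """Describe a table's columns with CSVW-style terms.

    `columns` is a list of (name, datatype) pairs. The use of CSVW here is
    an assumption for illustration, not a confirmed ixo convention.
    """
    return {
        "@type": "csvw:Table",
        "csvw:tableSchema": {
            "csvw:columns": [
                {"csvw:name": name, "csvw:datatype": datatype}
                for name, datatype in columns
            ]
        },
    }


# Placeholder dataset with two illustrative columns.
tabular_dataset = {
    "@context": ["https://schema.org/", {"csvw": "https://www.w3.org/ns/csvw#"}],
    "@type": "Dataset",
    "name": "Example yearly readings",
    "mainEntity": describe_table([("year", "string"), ("reading", "number")]),
}

column_names = [
    column["csvw:name"]
    for column in tabular_dataset["mainEntity"]["csvw:tableSchema"]["csvw:columns"]
]
```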
Attribution and further resources
The structured data model for ixo data assets builds on schema.org and Google Developer guidelines. To build and test Data Asset templates, a great resource is Google's Structured Data Markup Helper.