Using ontologies

You will need to define the classes and properties (relations) used in your data. The easiest way is to reuse classes and properties from existing models (a.k.a. ontologies). Some properties are standard, such as rdf:type and rdfs:label, but for more specific concepts it is best to find an existing data model that matches yours.
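As a minimal sketch of what this looks like in practice, the snippet below writes two statements about a resource as N-Triples: one with rdf:type and one with rdfs:label. The subject IRI, the label, and the choice of schema:Person as the reused class are assumptions made for illustration only.

```python
# Full IRIs of the two standard properties named in the text.
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"

# Hypothetical identifiers, used for illustration only.
SUBJECT = "https://example.org/person/1"
SCHEMA_PERSON = "https://schema.org/Person"  # class reused from Schema.org


def iri(value: str) -> str:
    """Wrap an IRI in angle brackets, as N-Triples requires."""
    return f"<{value}>"


def literal(value: str) -> str:
    """Serialize a plain string literal, escaping backslashes and double quotes."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def triple(subject: str, predicate: str, obj: str) -> str:
    """Build one N-Triples statement from already-serialized terms."""
    return f"{subject} {predicate} {obj} ."


triples = [
    # The resource is an instance of a class taken from an existing ontology.
    triple(iri(SUBJECT), iri(RDF_TYPE), iri(SCHEMA_PERSON)),
    # The resource has a human-readable label.
    triple(iri(SUBJECT), iri(RDFS_LABEL), literal("Ada Lovelace")),
]

print("\n".join(triples))
```

In a real project you would likely use an RDF library rather than string formatting; the point here is only that the class and the properties come from existing vocabularies.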

Reuse existing ontologies ♻️

A number of ontologies have already been defined for different use cases and domains. Reusing existing ontologies is faster, as you don't need to build the ontology yourself, and it improves the interoperability of your data.

Ontology repositories

Search for relevant existing models in ontology repositories:

Linked Open Vocabulary (LOV)
BioPortal for biomedical concepts by the NCBO.
OntologyLookupService by the EBI
AgroPortal for agronomy by INRAE.
EcoPortal for ecology by LifeWatch Italy.

The BioPortal Recommender and Search services are an efficient way to look for concepts across most existing biomedical ontologies.
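Most of these repositories can also be queried programmatically. The sketch below only builds the search URLs, without sending any request. The endpoint addresses and the `q` and `apikey` parameter names are assumptions based on the public REST APIs of BioPortal and OLS, so check each service's API documentation before relying on them. `YOUR_API_KEY` is a placeholder.

```python
from urllib.parse import urlencode

# Assumed REST endpoints; verify them in each service's API documentation.
BIOPORTAL_SEARCH = "https://data.bioontology.org/search"
OLS_SEARCH = "https://www.ebi.ac.uk/ols4/api/search"


def bioportal_search_url(term: str, api_key: str) -> str:
    """Build a BioPortal search URL (BioPortal requires an API key)."""
    return f"{BIOPORTAL_SEARCH}?{urlencode({'q': term, 'apikey': api_key})}"


def ols_search_url(term: str) -> str:
    """Build an Ontology Lookup Service search URL."""
    return f"{OLS_SEARCH}?{urlencode({'q': term})}"


# Spaces in the search term are URL-encoded by urlencode.
print(bioportal_search_url("heart attack", "YOUR_API_KEY"))
print(ols_search_url("heart attack"))
```

The resulting URLs can be opened in a browser or fetched with any HTTP client to get the matching concepts as JSON.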

Popular ontologies

Semanticscience Integrated Ontology (SIO), a simple, integrated ontology of types and relations for rich description of objects, processes and their attributes.
BioLink Model, a high-level data model of biological entities (genes, diseases, phenotypes, pathways, individuals, substances, etc.) and their associations.
Schema.org, a collaborative project to define schemes for structured data on the Internet, on web pages, in email messages, and beyond.
- Various classes are described, such as schema:Person, schema:MedicalGuideline, schema:Review, schema:ScholarlyArticle, schema:MedicalScholarlyArticle, schema:Dataset, etc.
- Extensions are available, such as BioSchemas for biological data.
- Alternatively, you can look into Google Data Types, which are mainly built from schema.org and allow you to describe and index your website using RDF (JSON-LD).
DublinCore (dc, dct, dctypes), one of the most generic vocabularies (includes properties such as dc:identifier, dct:description, dct:creator, dct:license, dct:rights...).
PAV: Provenance, Authoring and Versioning ontology.
PROV: The Provenance Ontology, another ontology to describe provenance in more detail.
DCAT: Data Catalog Vocabulary, to describe datasets.
NCIT: National Cancer Institute Thesaurus, a vocabulary for clinical care, translational and basic research, and public information and administrative activities.

Define the schema

If you are reusing existing ontologies, it is best to define the schema your data will follow using SHACL shapes or ShEx expressions. This will allow you to validate the generated data, and it will help other users quickly understand your data.
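To give an idea of what such a schema constrains, here is a toy check written in plain Python. It is not SHACL or ShEx, and a real project would use a proper validator. The shape (every schema:Person needs an rdfs:label), the dictionary layout, and the example data are all hypothetical.

```python
RDF_TYPE = "rdf:type"

# Hypothetical shape: every schema:Person must have at least one rdfs:label.
# In SHACL this would roughly be a node shape targeting the class,
# with a property constraint requiring a minimum count of 1.
PERSON_SHAPE = {
    "target_class": "schema:Person",
    "required_properties": ["rdfs:label"],
}

# Hypothetical data as (subject, predicate, object) tuples.
DATA = [
    ("ex:alice", "rdf:type", "schema:Person"),
    ("ex:alice", "rdfs:label", "Alice"),
    ("ex:bob", "rdf:type", "schema:Person"),
]


def validate(triples, shape):
    """Return one violation message per missing required property."""
    # Nodes that the shape applies to: instances of the target class.
    targets = sorted(
        {s for s, p, o in triples if p == RDF_TYPE and o == shape["target_class"]}
    )
    violations = []
    for node in targets:
        # Properties actually used on this node.
        present = {p for s, p, o in triples if s == node}
        for prop in shape["required_properties"]:
            if prop not in present:
                violations.append(f"{node} is missing {prop}")
    return violations


print(validate(DATA, PERSON_SHAPE))
```

Here ex:bob is reported because it is typed as a schema:Person but has no label, which is the kind of problem shape validation is meant to catch before the data is published.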

Here are a few examples of tools and methods to generate SHACL or ShEx shapes:

SHACLGEN - Python library to generate SHACL shapes: https://pypi.org/project/shaclgen/
RDFShape - A Web app and library to generate SHACL/ShEx: http://rdfshape.weso.es
SheXer: A library to perform automatic extraction of SHACL/ShEx schemata in RDF graphs: http://shexer.weso.es
"Shape Designer for ShEx and SHACL constraints" by Boneva et al., presented at ISWC 2019: https://gitlab.inria.fr/jdusart/shexjapp
Astrea: Automatic generation of SHACL shapes from ontologies: https://astrea.linkeddata.es
TopBraid Composer: https://www.topquadrant.com/products/topbraid-composer/ & https://www.topquadrant.com/from-owl-to-shacl-in-an-automated-way/
"RDF shape induction using knowledge base profiling" to generate shapes by Mihindukulasooriya et al., presented at the Annual ACM Symposium on Applied Computing in 2018.
"Towards improving the quality of knowledge graphs with data-driven ontology patterns and SHACL" by Spahiu et al., presented as a workshop paper at ISWC in 2018.

Ontology design 🎨

If you don't find an ontology that fits, or if you need to edit an ontology, you can check out the following tools:

Protégé

You can use the Protégé ontology editor to build your ontology, using a tree view.

Install Protégé on your computer for better performance than the web hosted service.
Or use WebProtégé for its collaborative features.

VocBench

VocBench is a web-based, multilingual, collaborative development platform for managing OWL ontologies, SKOS(/XL) thesauri, and generic RDF datasets.

Gra.fo

Gra.fo is a commercial product, but you can use it for free to build simple RDFS/OWL ontologies with a diagram view and collaboration features.

Chowlk

Chowlk is a web service that automatically generates OWL code from an ontology diagram made with diagrams.net. You will need to follow its instructions so that the diagram blocks use the expected format.

OwlReady2

OwlReady2 is a Python library to work with OWL ontologies. It helps you build OWL ontologies with Python code and Jupyter notebooks.

TopBraid Composer

A free edition is now available: https://www.topquadrant.com/products/topbraid-composer/

StarDog

The StarDog triplestore includes an ontology editor, but it requires a license.

Resolve prefixes

http://prefix.cc is a handy service to resolve prefixes.

E.g. http://prefix.cc/bl
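Once you know the namespace behind a prefix, expanding a prefixed name such as rdfs:label into a full IRI is a simple lookup. The sketch below assumes a small local prefix map (its contents are chosen for illustration) and builds the prefix.cc lookup URL following the pattern shown above.

```python
# Hypothetical local prefix map; the namespaces could be looked up on prefix.cc.
PREFIXES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "dct": "http://purl.org/dc/terms/",
}


def prefix_cc_url(prefix: str) -> str:
    """URL of the prefix.cc page for a prefix, e.g. http://prefix.cc/bl."""
    return f"http://prefix.cc/{prefix}"


def expand_curie(curie: str, prefixes: dict) -> str:
    """Expand a prefixed name like 'rdfs:label' into a full IRI."""
    prefix, separator, local_name = curie.partition(":")
    if not separator:
        raise ValueError(f"not a prefixed name: {curie!r}")
    if prefix not in prefixes:
        # Point the user to where the unknown prefix can be resolved.
        raise KeyError(f"unknown prefix {prefix!r}, try {prefix_cc_url(prefix)}")
    return prefixes[prefix] + local_name


print(expand_curie("rdfs:label", PREFIXES))
print(expand_curie("dct:creator", PREFIXES))
```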

Publish the ontology 📰

The easiest place to publish your ontology is in a GitHub repository.

Publish documentation

Two options are available:

Widoco: generates ontology documentation following the W3C style
Ontospy: provides multiple choices for ontology documentation (more user-friendly for larger ontologies)

See this example workflow implementing Widoco and Ontospy: https://github.com/vemonet/semanticscience/blob/master/.github/workflows/generate-docs.yml

This workflow automatically generates and publishes documentation for your ontology using GitHub Actions and GitHub Pages:

The ontology is published in a GitHub repository, in our case in ontology/sio.owl.
The GitHub Actions workflow is triggered when there is a change in the ontology file.
The GitHub Actions workflow runs Ontospy or Widoco (your choice) on the latest committed ontology file (ontology/sio.owl in this example) and generates the HTML documentation in the gh-pages branch, in a different folder for each documentation type.
The gh-pages branch is published with GitHub Pages.

In this example we have a simple index.html file that lets users choose the documentation type they want to access.

Feel free to adapt this GitHub Actions workflow.

Use a persistent identifier

We recommend using the w3id.org system, as it allows any GitHub user to define and reserve a persistent namespace for free in a few minutes:

Fork the w3id.org repository: https://github.com/perma-id/w3id.org
Create a folder with your namespace name (e.g. my-onto)
Add a .htaccess file with the redirection to your ontology (and a README.md file briefly explaining the purpose of this namespace)
Send a pull request to the https://github.com/perma-id/w3id.org repository. It usually takes between a few hours and a few days to be accepted.

Examples:

See this example for a .htaccess passing the original w3id URI queries
Or this example to redirect to different websites depending on the path.

The persistent identifiers can easily be modified later if necessary; you just need to send a new pull request with the changes.
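As a rough idea of what the .htaccess file from the steps above can contain, the sketch below renders a minimal redirect using standard Apache mod_rewrite directives. The target URL and the 302 redirect status are assumptions for illustration; the w3id.org repository documents the conventions it expects.

```python
def render_htaccess(target_base: str) -> str:
    """Render a minimal .htaccess redirecting every path under the namespace."""
    base = target_base.rstrip("/")
    lines = [
        "RewriteEngine on",
        # Forward whatever follows the namespace to the same path on the target.
        f"RewriteRule ^(.*)$ {base}/$1 [R=302,L]",
    ]
    return "\n".join(lines) + "\n"


# Hypothetical location where the ontology is actually hosted.
content = render_htaccess("https://example.github.io/my-onto/")
print(content)
```

With this file in the my-onto folder, a request to the w3id.org namespace would be redirected to the matching path on the hosting site.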

Add it to an ontology repository

The right repository depends on the domain of your ontology (see the ontology repositories listed above).