Introduction
The Data Science Research Infrastructure (DSRI) is a cluster running OKD 4.6, the open source version of OpenShift, with Red Hat Ceph Storage.
The DSRI provides a graphical user interface on top of Kubernetes container orchestration, so you can easily deploy and manage services.
New DSRI version!
This documentation covers the new DSRI version using OKD 4.6, available at https://console-openshift-console.apps.dsri2.unimaas.nl
You can find the documentation for the legacy DSRI version using OKD 3.11 here.
Which DSRI version should you use?
New OKD 4.6 cluster
You need to start applications on CPU.
Storage of applications deployed in the new cluster is automated.
Legacy OKD 3.11 cluster
You need to run applications on GPU (TensorFlow, PyTorch...)
Storage of applications deployed in the legacy cluster needs to be manually configured.
If you need to run applications on GPU, visit the documentation for the legacy cluster.
Getting started
What can be done on the DSRI
Run Data Science applications in Docker containers on the UM network, such as:
- Multiple flavors of JupyterLab (scipy, tensorflow, all-spark, and more)
- JupyterHub with GitHub authentication
- RStudio, with a complementary Shiny server
- Visual Studio Code server
- TensorFlow or PyTorch on Nvidia GPU (with JupyterLab or Visual Studio Code)
- Apache Flink cluster for streaming applications
- Or any program installed in a Docker image!
Data storage
The DSRI is a computing infrastructure, built and used to run data science workloads. It stores data persistently, but any data stored on the DSRI can be altered by the workloads you are running, and we cannot guarantee its immutability.
Always keep a safe copy of your data outside the DSRI, and don't rely on the DSRI for long-term storage.
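One way to check that a copy kept outside the DSRI still matches the data on the DSRI is to compare checksums. The following is only a minimal sketch using the Python standard library; the function names `sha256_of` and `copies_match` are hypothetical helpers introduced for illustration and are not part of any DSRI tooling.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # iter() with a sentinel stops when read() returns an empty bytes object
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copies_match(original: Path, backup: Path) -> bool:
    """Hypothetical helper: True when both files have identical content."""
    return sha256_of(original) == sha256_of(backup)
```

For example, `copies_match(Path("results.csv"), Path("/backup/results.csv"))` (both paths are placeholders) returns `True` only if the two files are byte-for-byte identical.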
What cannot be done
- Since the DSRI can only be accessed from the physical UM network or through the UM VPN, deployed services will not be available on the public Internet.
- All activities must have a legal basis. You must closely examine and abide by the terms and conditions of any data, software, or web service that you use as part of your work.
Request an account
If you are working at Maastricht University, see this page to request an account, and run your services on the DSRI.
The DSRI in a nutshell
Here is a diagram providing a simplified explanation of how the DSRI works, using popular data science applications as examples (JupyterLab, RStudio, VSCode server).

The DSRI specifications
Software
- OKD 4.6 (open source version of Red Hat OpenShift) to run services and jobs.
- Red Hat Ceph Storage for distributed storage.
Hardware
- 16 CPU nodes
| | RAM (GB) | CPU (cores) | Storage (TB) |
|---|---|---|---|
| Node capacity | 512 GB | 64 cores (128 threads) | 120 TB |
| Total capacity | 8 192 GB | 1 024 cores | 1 920 TB |
- 1 GPU node: Nvidia DGX-1 with 8x Tesla V100 GPUs (32 GB each)
| | GPUs | RAM (GB) | CPU (cores) |
|---|---|---|---|
| GPU node capacity | 8 | 528 GB | 40 cores |
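The total capacity row of the CPU node table follows directly from multiplying the per-node figures by the 16 CPU nodes. As a small sanity check, the arithmetic can be sketched in Python; the `NodeCapacity` class and `total_capacity` function are hypothetical names used only for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeCapacity:
    ram_gb: int
    cpu_cores: int
    storage_tb: int


def total_capacity(node: NodeCapacity, node_count: int) -> NodeCapacity:
    """Scale the capacity of one node to a group of identical nodes."""
    return NodeCapacity(
        ram_gb=node.ram_gb * node_count,
        cpu_cores=node.cpu_cores * node_count,
        storage_tb=node.storage_tb * node_count,
    )


# Per-node figures and node count taken from the CPU node table above
CPU_NODE = NodeCapacity(ram_gb=512, cpu_cores=64, storage_tb=120)
CPU_NODE_COUNT = 16

total = total_capacity(CPU_NODE, CPU_NODE_COUNT)
print(total)  # NodeCapacity(ram_gb=8192, cpu_cores=1024, storage_tb=1920)
```

The result matches the table: 8 192 GB of RAM, 1 024 cores, and 1 920 TB of storage across the CPU nodes. The GPU node is not included in these totals.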

Learn more about DSRI
See the following presentation about the Data Science Research Infrastructure.
