The DSRI provides a graphical user interface on top of Kubernetes container orchestration to easily deploy and manage services.
New DSRI version!
This documentation covers the new DSRI version, which uses OKD 4.6 and is available at https://console-openshift-console.apps.dsri2.unimaas.nl
You can find the documentation for the legacy DSRI version using OKD 3.11 here.
- New cluster (OKD 4.6): use it if you need to run applications on CPU. Storage for applications deployed in the new cluster is automated.
- Legacy cluster (OKD 3.11): use it if you need to run applications on GPU (TensorFlow, PyTorch...). Storage for applications deployed in the legacy cluster needs to be configured manually.
If you need to run applications on GPU, visit the documentation for the legacy cluster.
- Multiple flavors of JupyterLab (scipy, tensorflow, all-spark, and more)
- JupyterHub with GitHub authentication
- RStudio, with a complementary Shiny server
- Visual Studio Code server
- TensorFlow or PyTorch on NVIDIA GPU (with JupyterLab or Visual Studio Code)
- Apache Flink cluster for streaming applications
- Or any program installed in a Docker image!
DSRI is a computing infrastructure built and used to run data science workloads. DSRI stores data persistently, but any data stored on the DSRI can be altered by the workloads you run, and we cannot guarantee its immutability.
Always keep a safe copy of your data outside the DSRI, and do not rely on the DSRI for long-term storage.
- Since DSRI can only be accessed when on the physical UM network or using the UM VPN, deployed services will not be available on the public Internet 🔒
- All activities must have a legal basis. You must closely examine and abide by the terms and conditions of any data, software, or web service that you use as part of your work 📜
Request an account
If you are working at Maastricht University, see this page to request an account, and run your services on the DSRI.
Here is a diagram providing a simplified explanation of how the DSRI works, using popular data science applications as examples (JupyterLab, RStudio, VSCode server)
- OKD 4.6 (the open source version of Red Hat OpenShift) to run services and jobs.
- Red Hat Ceph Storage for distributed storage.
- 16 CPU nodes
| | RAM (GB) | CPU (cores) | Storage (TB) |
|---|---|---|---|
| Node capacity | 512 GB | 64 cores (128 threads) | 120 TB |
| Total capacity | 8 192 GB | 1 024 cores | 1 920 TB |
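
The total capacity row is the per-node capacity multiplied by the 16 CPU nodes. As a quick sanity check, the arithmetic can be sketched in Python. The names `NODES`, `PER_NODE`, and `total_capacity` are illustrative only and are not part of any DSRI tooling:

```python
# Per-node figures copied from the table above; the node count comes from
# the "16 CPU nodes" bullet. All names here are hypothetical, for illustration.
NODES = 16
PER_NODE = {"ram_gb": 512, "cpu_cores": 64, "storage_tb": 120}


def total_capacity(nodes: int, per_node: dict) -> dict:
    """Multiply each per-node figure by the number of nodes."""
    return {name: nodes * value for name, value in per_node.items()}


totals = total_capacity(NODES, PER_NODE)
print(totals)  # {'ram_gb': 8192, 'cpu_cores': 1024, 'storage_tb': 1920}
```

The result matches the totals listed in the table: 8 192 GB of RAM, 1 024 cores, and 1 920 TB of storage.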
- 1 GPU node: NVIDIA DGX-1 with 8x Tesla V100 (32 GB) GPUs
| | GPUs | RAM (GB) | CPU (cores) |
|---|---|---|---|
| GPU node capacity | 8 | 528 GB | 40 cores |
See the following presentation about the Data Science Research Infrastructure