
A workshop to get started with the Data Science Research Infrastructure (DSRI) in an hour 🕐 (hopefully)!

During this workshop, you will:

Prerequisites:


Access the DSRI 🔑

📖 The DSRI documentation can be found at https://maastrichtu-ids.github.io/dsri-documentation

  1. Connect to the UM VPN.

     sudo openconnect --passwd-on-stdin -u YOUR.UM.USER --authgroup 01-Employees vpn-rw1.maastrichtuniversity.nl
    
  2. Access the DSRI OpenShift web UI

  3. 👩‍💻 Go to the workspace-workshop project in the OpenShift web UI


Start an application 🚀

Start a JupyterLab/RStudio/VSCode application from the DSRI catalog in ids-projects

📖 See how to deploy JupyterLab, RStudio, VSCode and lots more.

  1. 👨‍💻 Use your name to generate a unique Application name, e.g. rstudio-vemonet
  2. Persistent storage is created automatically.
  3. Access the application you just started
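
The naming step above can be sketched as a small shell helper. This is only an illustration: `make_app_name` is a hypothetical function, and it assumes application names should follow the Kubernetes DNS label rules (lowercase letters, digits and `-`, at most 63 characters), which may be stricter than what the catalog form actually enforces.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build a name such as "rstudio-vemonet" from an
# application prefix and a user name.
# Assumption: the name should be a valid DNS label
# (lowercase letters, digits and '-', at most 63 characters).
make_app_name() {
  local prefix="$1" user="$2" name
  # Lowercase, turn every run of other characters into one '-',
  # then strip leading and trailing dashes.
  name=$(printf '%s-%s' "$prefix" "$user" \
    | tr '[:upper:]' '[:lower:]' \
    | sed -e 's/[^a-z0-9]\{1,\}/-/g' -e 's/^-*//' -e 's/-*$//')
  # Keep at most 63 characters and drop a dash left at the cut point.
  name=$(printf '%s' "$name" | cut -c1-63 | sed 's/-*$//')
  printf '%s\n' "$name"
}

# Example: make_app_name rstudio "$USER"
```

Paste the printed value into the Application name field of the catalog form.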


Upload files 🗂️

👨‍💻 For small and medium-sized files, you can simply drag and drop files and folders into the application web UI, or use the Upload files button in RStudio.

This works for files up to a few hundred MB. The exact limit depends on the application, so use it until it fails!


Upload your code 📜

We recommend using git with GitHub or GitLab. You can use it directly from the terminal in all applications, or through the web UI integration each application offers.
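
From the terminal of an application, getting your code in place could look like the sketch below. The repository URL is a placeholder and `repo_dir` / `sync_repo` are hypothetical helper names, not part of the DSRI tooling.

```shell
#!/usr/bin/env bash
# Folder name git uses for a repository URL,
# e.g. ".../my-project.git" -> "my-project".
repo_dir() {
  basename "$1" .git
}

# Clone the repository the first time, otherwise fast-forward to the
# latest commits of the current branch.
sync_repo() {
  local url="$1" dir
  dir=$(repo_dir "$url")
  if [ -d "$dir/.git" ]; then
    git -C "$dir" pull --ff-only
  else
    git clone "$url" "$dir"
  fi
}

# Example (placeholder URL):
#   sync_repo https://github.com/YOUR_USER/YOUR_PROJECT.git
```

Commit and push your changes regularly (`git add`, `git commit`, `git push`) so your code also lives outside the application.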

📖 See the documentation for each application:


Upload large data files 📦

For large data files you will need to install the oc command line interface.

If you have time, it installs quickly on Linux (including WSL) and macOS:

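# Linux (also works in WSL): download the client, unpack it, and move oc and kubectl onto your PATH
wget https://github.com/openshift/origin/releases/download/v3.11.0/openshift-origin-client-tools-v3.11.0-0cbc58b-linux-64bit.tar.gz
tar xvf openshift-origin-client-tools*.tar.gz
cd openshift-origin-client*/
sudo mv oc kubectl /usr/local/bin/

# macOS: install with Homebrew instead
brew install openshift-cli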
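
Once oc is installed, a typical next step is to log in from the terminal. A minimal sketch, assuming you copy the server URL and token from the "Copy Login Command" entry of the OpenShift web UI. `OC_SERVER`, `OC_TOKEN` and `dsri_login` are hypothetical names chosen for this example.

```shell
#!/usr/bin/env bash
# Log in with a token, switch to a project, and print the logged-in user.
# OC_SERVER and OC_TOKEN are hypothetical variable names: set them to the
# values shown by "Copy Login Command" in the OpenShift web UI.
dsri_login() {
  if [ -z "${OC_SERVER:-}" ] || [ -z "${OC_TOKEN:-}" ]; then
    echo "Set OC_SERVER and OC_TOKEN first" >&2
    return 1
  fi
  # The project defaults to the one used in this workshop.
  oc login --server="$OC_SERVER" --token="$OC_TOKEN" &&
    oc project "${1:-workspace-workshop}" &&
    oc whoami
}

# Example: dsri_login    (or: dsri_login another-project)
```

Treat the token like a password: keep it out of scripts you commit to git.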

📖 See the complete documentation to upload large data files.

💡 You will get a better connection for uploading large data files when connected directly to the UMnet network (or eduroam at UM), and better still over a wired Ethernet connection.
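
With oc installed and logged in, a large folder can be copied into a running application with `oc rsync`. The sketch below is one possible way to do it, not the documented DSRI procedure. It assumes the application's pod carries the label `app=<Application name>`, and the destination path in the example is a placeholder. `pod_of` and `upload_dir` are hypothetical helper names.

```shell
#!/usr/bin/env bash
# Name of the first pod labelled app=<application name>.
# Assumption: the pod carries that label.
pod_of() {
  oc get pods --selector "app=$1" -o jsonpath='{.items[0].metadata.name}'
}

# Copy a local folder into the application's pod.
upload_dir() {
  local app="$1" src="$2" dest="$3" pod
  pod=$(pod_of "$app") || return 1
  if [ -z "$pod" ]; then
    echo "No pod found for app=$app" >&2
    return 1
  fi
  oc rsync "$src" "$pod:$dest"
}

# Example (placeholder destination path):
#   upload_dir rstudio-vemonet ./data/ /path/in/the/pod
```

`oc rsync` needs rsync or tar to be available inside the container, so check the application image if the transfer fails.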


Stop and delete your application ❌

👨‍💻 Stop your application from the OpenShift web UI Topology page:

You can use the Filter by name search box to quickly find your application based on the name you gave it.

Stop your application

Note: creating more than one pod (“Scale up”) is useless for most data science applications, such as RStudio, VSCode or JupyterLab. It is only relevant for applications running as a cluster, like Apache Flink or Apache Spark, or for web applications with a lot of traffic (OpenShift will redirect the traffic depending on pod availability and start new pods if required, also known as horizontal scaling).

👩‍💻 Delete your application:

oc delete all,secret,configmaps,serviceaccount,rolebinding --selector app=my-application

Replace my-application with the Application name you defined.
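
Because deletion cannot be undone, it can help to list what the selector matches before removing anything. The sketch below wraps the command above in a hypothetical `delete_app` function that asks for confirmation first. The resource kinds and the `app=` selector are taken from that command.

```shell
#!/usr/bin/env bash
# Resource kinds from the workshop's delete command.
KINDS="all,secret,configmaps,serviceaccount,rolebinding"

# Hypothetical helper: list the resources of an application, ask for
# confirmation, then delete them.
delete_app() {
  local app="${1:-}" answer
  if [ -z "$app" ]; then
    echo "usage: delete_app APPLICATION_NAME" >&2
    return 1
  fi
  # Show what the label selector matches before anything is removed.
  oc get "$KINDS" --selector "app=$app"
  printf 'Delete everything listed above? [y/N] '
  read -r answer
  if [ "$answer" != "y" ]; then
    echo "Nothing deleted."
    return 1
  fi
  oc delete "$KINDS" --selector "app=$app"
}

# Example: delete_app rstudio-vemonet
```

Only an exact `y` answer triggers the deletion; anything else leaves the application untouched.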

Delete application from the web UI

📖 See the complete documentation to delete an application.


See you soon! 👋

📝 Fill in this form to help us create a longer-term project for you on the Data Science Research Infrastructure!