
Below are brief descriptions of the participating groups that might be useful as you consider your presentation topics.  Please let me know if I've made any mistakes or feel free to correct/add detail as you see fit.

Projects and Descriptions

SciServer

(Kim, Lemson)

SciServer Compute supports interactive and batch access to multiple large public datasets across several domains (including the Sloan Digital Sky Survey) via containers. It provides RStudio/Jupyter/MATLAB interactive environments and a custom job scheduler for containers, each with supporting scripting libraries for SciServer component and data integration. SciServer Compute supports SSO, user-defined group access to shared storage, and access to centralized datasets, some in relational databases.
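
The custom container job scheduler is not described in detail above; as a rough illustration of the general pattern (a queue of job descriptors dispatched to a fixed pool of container slots), here is a minimal sketch in Python. All names here are hypothetical, not SciServer's actual API:

```python
from collections import deque
from dataclasses import dataclass, field

@dataclass
class ContainerJob:
    # Hypothetical job descriptor: container image plus the command to run in it
    image: str
    command: list = field(default_factory=list)

class ToyScheduler:
    """FIFO scheduler dispatching jobs to a fixed pool of container slots."""
    def __init__(self, slots: int):
        self.slots = slots
        self.queue = deque()
        self.running = []

    def submit(self, job: ContainerJob):
        self.queue.append(job)

    def dispatch(self):
        # Move queued jobs into free slots; a real scheduler would call the
        # container runtime here (e.g., start a Docker container per job)
        while self.queue and len(self.running) < self.slots:
            self.running.append(self.queue.popleft())
        return list(self.running)

sched = ToyScheduler(slots=2)
sched.submit(ContainerJob("compute-env", ["python", "analysis.py"]))
sched.submit(ContainerJob("compute-env", ["Rscript", "fit.R"]))
sched.submit(ContainerJob("compute-env", ["matlab", "-batch", "run"]))
running = sched.dispatch()
print(len(running), len(sched.queue))  # 2 slots filled, 1 job still queued
```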

See also:

Cyverse

(McEwan, Fronner)

The Cyverse Discovery Environment (DE) uses containers to support customizable, non-interactive workflows for data stored in Cyverse. They are interested in (or working on) supporting interactive access to the Cyverse Data Commons via containers and iRODS. Through Atmosphere, they support provisioning cloud resources on demand for researchers, as well as access to HPC resources through TACC.

TACC has installed Singularity container support on all of its HPC systems and is working with BioContainers to make 2400+ BioConda applications findable and accessible at TACC or on any HPC system that supports Singularity. The goal of these efforts is to support all BioConda packages across all Cyverse infrastructure. This already works using Docker on the Cyverse Condor cluster; a comparable solution for the other HPC systems using Singularity is about 90% complete.
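
BioContainers publishes BioConda packages as images under the quay.io/biocontainers registry, which Singularity can pull directly via its docker:// URI scheme. A small sketch of how a package name might be mapped to a pullable URI; the tag format is illustrative, since the exact tag for a given package/version must be looked up in the registry:

```python
def biocontainers_uri(package: str, tag: str) -> str:
    """Build a Singularity-pullable URI for a BioContainers image.

    The quay.io/biocontainers namespace is real; the tag argument is a
    placeholder for whatever tag the registry actually publishes.
    """
    return f"docker://quay.io/biocontainers/{package}:{tag}"

# Usage with Singularity would look like:
#   singularity exec <uri> samtools --version
uri = biocontainers_uri("samtools", "1.9")
print(uri)  # docker://quay.io/biocontainers/samtools:1.9
```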

See also:

Whole Tale, yt.Hub, RSL, Data Exploration Lab

(Turk, Kowalik)

The yt Hub provides access to very large datasets (both observational and simulation-based) via the integration of Girder and Jupyter Notebook/Lab. The entire data collection is mounted locally on the compute nodes of a Docker Swarm cluster via NFS. However, the physical location of the data is abstracted through a FUSE filesystem, which makes it possible to expose only the user-selected subset of the data inside the container running Jupyter Notebook/Lab.
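
The subset-exposure idea can be illustrated without a real FUSE layer: given a user's selection, a passthrough filesystem answers directory listings with only the paths leading to (or inside) the selection. A minimal, hypothetical sketch of that filtering logic:

```python
from pathlib import PurePosixPath

def visible_entries(selection: set, directory: str, all_entries: list) -> list:
    """Return only those entries of `directory` that lie inside some
    user-selected dataset, or on the path leading to one -- the filtering
    a FUSE passthrough layer would apply to readdir() results."""
    visible = []
    for name in all_entries:
        full = str(PurePosixPath(directory) / name)
        for sel in selection:
            # Keep entries inside a selection, or ancestors leading to one
            if full == sel or full.startswith(sel + "/") or sel.startswith(full + "/"):
                visible.append(name)
                break
    return visible

selection = {"/data/sdss/dr14"}
print(visible_entries(selection, "/data", ["sdss", "gaia"]))      # ['sdss']
print(visible_entries(selection, "/data/sdss", ["dr14", "dr7"]))  # ['dr14']
```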

The yt Hub's basic architecture (Girder plus a remote environment with data selection) is currently being extended as part of the Whole Tale project, which provides (among other things) the ability to launch containerized applications over a wide variety of *remote* datasets (e.g., via DataONE). They are addressing the complexity of exposing data to containers via a variety of underlying mechanisms (POSIX, S3, HTTP, Globus, etc.) through a data management framework. In contrast to the yt Hub, data is provided inside the computing environment on demand using a sync mechanism and a local cache, rather than being served locally. Containers also play a role in the provenance/preservation of scientific workflows and in the publication process.
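
The on-demand sync-and-cache behavior described above can be sketched as a small read-through cache; `fetch_remote` stands in for whichever transport (POSIX, S3, HTTP, Globus) the data management framework selects. All names here are hypothetical:

```python
class ReadThroughCache:
    """Fetch remote objects on first access; serve local copies afterward."""
    def __init__(self, fetch_remote):
        self.fetch_remote = fetch_remote  # transport-specific fetcher
        self.local = {}                   # stand-in for an on-disk cache
        self.fetches = 0                  # count of remote round-trips

    def read(self, path: str) -> bytes:
        if path not in self.local:        # cache miss: sync from remote
            self.local[path] = self.fetch_remote(path)
            self.fetches += 1
        return self.local[path]

remote = {"/dataone/obj1": b"payload"}
cache = ReadThroughCache(remote.__getitem__)
cache.read("/dataone/obj1")
cache.read("/dataone/obj1")   # second read served from the local cache
print(cache.fetches)          # 1
```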

The Renaissance Labs project will leverage this same approach to provide access to the Renaissance Simulations at SDSC – adding the ability to move analysis to HPC resources and adding a custom UI.

TERRA-REF

(LeBauer, Burnette)

Blue Waters


NDS 

(Willis, Lambert, Coakley)

The NDS Labs Workbench is a generic platform for launching containerized environments near remote datasets, built on Kubernetes. Labs Workbench is deployed on OpenStack as a Kubernetes cluster, with GlusterFS providing a shared user filesystem across containers (e.g., home directories). Workbench is used by the TERRA-REF project and, increasingly, for training/education environments (hackathons, workshops, bootcamps, etc.). The DataDNS project is an emerging vision for supporting access to remote computational environments; Workbench is a single, optional component of the DataDNS framework.
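
As an illustrative Kubernetes manifest (not Workbench's actual configuration), a pod mounting a GlusterFS-backed home directory might look like the following; the endpoint and volume names are placeholders:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: user-environment            # placeholder pod name
spec:
  containers:
  - name: notebook
    image: jupyter/base-notebook
    volumeMounts:
    - name: home
      mountPath: /home/jovyan       # shared home directory in the container
  volumes:
  - name: home
    glusterfs:                      # in-tree GlusterFS volume plugin
      endpoints: glusterfs-cluster  # placeholder Endpoints object
      path: user-home-volume        # placeholder Gluster volume name
```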

See also:

CyberGIS

(Liu, Terstriep)

See also:

SDSC (Zonca)

Deployment of JupyterHub with Docker Swarm and batch spawner support in HPC environments, serving science gateways, research, and training/education.
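
A minimal configuration along those lines, using the real JupyterHub batchspawner package (the Slurm options shown are illustrative, not SDSC's actual settings):

```python
# jupyterhub_config.py -- illustrative fragment only
c = get_config()  # noqa: F821  (provided by JupyterHub at load time)

# Spawn each single-user server as a batch job (e.g., via Slurm)
c.JupyterHub.spawner_class = 'batchspawner.SlurmSpawner'
c.SlurmSpawner.req_partition = 'compute'   # placeholder partition name
c.SlurmSpawner.req_runtime = '01:00:00'    # per-session walltime limit
```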

See also:
