Is Cloud a Good Fit for my Research Computing Needs? Lessons Learned from Triaging Use Cases to Deploy on AWS

Sandeep Giri
Svetlana Milter

UCEN - State Street Room
Tue, Jul 16 4:30pm - 5:15pm

In early 2018, our campus rolled out an AWS environment for research computing with PHI, limited to a set of approved use cases. The idea was that a researcher whose use case matches an approved one gets a much more streamlined path to start working on AWS while still meeting all security and compliance requirements. We also limited this to a few research projects at first so that we could refine the environment based on what we learned from those projects and develop a better-informed pipeline of the features and enhancements needed to scale the platform.

Initially, we required all computing work to be done within the confines of AWS. As a researcher, you needed a separate VPN to connect to this AWS environment. You had access only to the EC2, S3, and RDS services. You couldn’t make API calls to any systems outside of your AWS environment. Other than that, you could spin up as many compute nodes and as much storage as you needed, and scale them up or down.

For a number of researchers, this worked great. They could very quickly spin up their project in AWS and get going, while also managing their costs by using only as much computing and storage as they needed, and only when they needed it. However, there were a number of use cases that we could not support under the limitations we imposed.

We have projects that just want to use AWS for data storage. There are projects that want to make use of containers. Some want to share data with, and grant access to, research collaborators at other institutions. How do we provision access for them? Some want to host web applications that are accessible to patients participating in a research study so that patients can report data over the web, but we can’t ask patients to use special VPN software to do this. Some want their research applications to query data from the EHR using FHIR, or make similar API calls to other systems outside of our AWS confines. Some want to bring in PHI data sets from other institutions that may have their own unique data usage requirements.

These are the types of questions our group worked through in evaluating use cases and determining whether they are a good fit for AWS. This presentation will walk through how our team developed our intake process and how we have planned our roadmap for supporting new types of use cases. It will be more of an interactive discussion, so that audience members can learn from each other’s experiences and hopefully arrive at some common themes on which research computing use cases make the best candidates for the cloud, and on how to leverage cloud computing features to support them.

Previous Knowledge
Familiarity with AWS, GCP, or Azure, and with the typical types of computing projects done by university researchers.

Software Installation Expectation
None

Session Skill Level
Intermediate

Session Track
Supporting Research and Researchers