An introduction to the Penn Medicine Biobank

Welcome to the blog. In this first post, I describe the Penn Medicine Biobank. Although it remains an unfamiliar resource to many within Penn Medicine, it contains a fantastic array of electronic health record and genetic information with tremendous research potential.

Penn Medicine Biobank (PMBB)

First, PMBB has a great website here, which describes the database in far more detail than I can. Put simply, as of 2022 the database contains:

  • 60,232 patients mapped to 10,011 ICD-9 codes
  • 21,790 who have been genotyped so far

PMBB is not unique. Several other databases serve similar functions, such as the UK Biobank and the Million Veteran Program.

The exciting potential of PMBB comes from the ability to link phenotypic and genotypic data at a large scale.
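To make "linking" concrete, here is a minimal sketch of the idea in Python. Everything in it is an assumption for illustration: the patient IDs, the `variant_x_carrier` field, and the record layout are made up and do not reflect the actual PMBB schema. It simply groups ICD-9 codes by patient and keeps the patients who also have genotype data.

```python
from collections import defaultdict

# Hypothetical toy records; NOT real PMBB data or field names.
# Each phenotype row is (patient_id, ICD-9 code).
phenotypes = [
    ("P001", "250.00"),
    ("P001", "272.0"),
    ("P002", "272.0"),
    ("P003", "401.9"),
]

# Hypothetical genotype table keyed by patient ID.
# Only some patients have been genotyped.
genotypes = {
    "P001": {"variant_x_carrier": True},
    "P003": {"variant_x_carrier": False},
}


def link_records(phenotype_rows, genotype_by_patient):
    """Join phenotype codes to genotype data on patient ID (inner join)."""
    codes_by_patient = defaultdict(set)
    for patient_id, code in phenotype_rows:
        codes_by_patient[patient_id].add(code)

    linked = {}
    for patient_id, codes in codes_by_patient.items():
        # Keep only patients present in both tables.
        if patient_id in genotype_by_patient:
            linked[patient_id] = {
                "icd9_codes": sorted(codes),
                "genotype": genotype_by_patient[patient_id],
            }
    return linked


linked = link_records(phenotypes, genotypes)
```

In this toy example, patient `P002` drops out because they have diagnosis codes but no genotype record, which mirrors the gap between the total number of patients and the number genotyped so far.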

[Figures: PMBB demographics; PMBB ancestry]

Why this database?

During medical school, I developed an obsession with watching medical grand rounds lectures from across the US, which are frequently available on YouTube. Along the way, I watched a fascinating lecture by Dr. Daniel Rader on the PMBB and its applications to lipid physiology in cardiology. His medical grand rounds lecture to a Vanderbilt audience in November 2020 is available on YouTube here. Other lectures by Dr. Rader include grand rounds at UW in 2015, available here, and a lecture at Northwestern in 2019, available here.

How to establish access to the PMBB dataset

Based on my interest in genetic oncology, I have joined the lab of Dr. Kara Maxwell to start research using this database. After obtaining permission from multiple sources within Penn, I performed the following steps:

  1. Email psom-pmacshpc@pennmedicine.upenn.edu for access to the HPC (High Performance Computing) server
  2. Email DART_SIO@Pennmedicine.upenn.edu for access to the LPC server, or request a ticket through this helpdesk
  3. Have the PI of your lab fill out a “data access request form” on the PMBB website
  4. Be ready to show course certification in HIPAA training, which the above steps require
  5. Set up an account with PMACS by emailing medhelp@pennmedicine.upenn.edu
  6. Set up mobile dual authentication at this website
  7. Set up a remote VPN with PMACS and Pulse Secure at this website. It may also be possible to set up a VPN with FortiClient
  8. Set up login software. For Windows, options include SecureCRT, PuTTY, or MobaXterm. For Mac, options include XQuartz or Terminal.

The above steps will allow you to access the PMBB, which is located on the LPC and HPC servers.

  • LPC wiki is located here
  • HPC wiki is located here

As you can see, it all takes a lot of steps!

End product

After completing the above steps, the remote workflow should be:

  1. Log in to Pulse Connect Secure to establish a remote VPN
  2. Log in to LPC or HPC via software such as SecureCRT
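If you prefer a plain terminal over a GUI client, the second step comes down to an `ssh` call made while the VPN is connected. Below is a small sketch of how that command could be assembled in Python. The username and hostname are placeholders I made up for illustration; the real server addresses are listed in the LPC and HPC wikis. The `-X` flag is the standard OpenSSH option for X11 forwarding, which is only relevant if you use something like XQuartz.

```python
import shlex


def build_ssh_command(user, host, x11_forwarding=False):
    """Return the ssh command as an argument list (nothing is executed here)."""
    command = ["ssh"]
    if x11_forwarding:
        # -X enables X11 forwarding in OpenSSH (e.g. for use with XQuartz).
        command.append("-X")
    command.append(f"{user}@{host}")
    return command


# Placeholder values; replace with your PMACS username and the
# server hostname given in the LPC or HPC wiki.
example = build_ssh_command("your_pmacs_username", "lpc.example.org")

# Shell-safe string you could paste into a terminal once the VPN is up.
example_string = shlex.join(example)
```

To actually open the session from Python rather than pasting the string, the list could be passed to `subprocess.run(example)`, though typing the command directly into a terminal works just as well.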

Now, with access to the data, we can move on to the research!