Beginning of a quick project

4 minute read

Published:

Below is a post to identify the first few steps in a project in which I aim to identify the proportion of pathogenic variants present in our population.

Introduction

First, I’m just going to copy the information I have on this project so far from Dr. Maxwell.

“I had an idea for a relatively quick analysis that would probably get you a decent quick paper - asking the question what is the rate of DNA repair LP/P mutations in PMBB cancer patients (or PMBB lung cancer patients) with WES? And then can do burden testing to cancer-free patients. It does not appear anyone has published a similar analysis from UKBB. The reason we would like to do this is there are always all these papers coming out from genetic testing companies or MSKCC IMPACT reporting on fraction of patients with DNA repair mutations and they are fine studies but they have big caveats as the genetic testing companies have a referral bias for fam hx+ patients and diagnoses are reported on testing requisition forms and MSKCC has a high proportion of Ashkenazi Jewish individuals. For example I discussed this abstract today which showed a 15% mutation rate, which is really high, including 2.8% BRCA2 and 1.2% BRCA1. I just went through the PMBB data really quickly, we have 795 lung cancer patients with WES (confirmed lung cancer diagnosis with Penn Med Cancer Registry – PMCR) and only 0.8% had BRCA2, 0.4% BRCA1 - so much lower. The caveat with PMBB is that the WES was not that high depth (~60X). The other strong benefit will be that many of the patients have histology abstracted, so what’s crazy is all 6 BRCA2 carriers are squamous cell carcinomas, which I would think they would be adenocarcinomas, so that’s unexpected.”

“This is very exciting. So the same data could be presented in 2 useful ways for clinical use – one is the fraction of cancer X w/LPP variants in gene Y vs fraction of cancer-free w/LPP variants in gene Y – ie the burden testing question. Other is % of carriers of LPP variants in gene Y with cancer X. That second analysis is a penetrance question – vis a vis these UKBB and Geisinger analyses of BRCA1/2 mutations, which both suggest much lower rates of breast cancer than quotes we give to patients (ie 20% vs 60-90%). PMBB did publish an overall burden testing experiment for LOF mutations, but the difference for this would be a more in-depth analysis of the specific cancer risk genes we care about.

“We’re interested in the ACMG list here. The genes include BRIP1, RAD51C, RAD51D, HOXB13, ATM and CHEK2 (that could be argued, but would be a very reasonable list)”

Basically the steps are:

  1. Curating the LP/P variants from the WES data – one of Kate Nathanson’s bioinformaticians has already pulled the variants so we’d ask her permission to use those variant pulls and that curate which ones are LP/P (unless they’ve already done that)
  2. Match to Heena’s data of which PMBB IDs have cancer by ICD and by cancer registry.”
  3. Analyze

Additional information

In reading through this, I learned a few new things. First of all, P/LP variants refer to pathogenic or likely pathogenic variations. The other terminology available is B/LB, which refers to benign and likely benign, and VUS, which refers to variants of undetermined significance.

I also created a more concrete list of which genes we are interested in evaluating in a separate post. Of these genes, we want to find the proportion of pathogenic or likely pathogenic variants in our population.

If we need to start from scratch and identify pathogenic variants in our WES data, this project may take some time. If there is already a list containing the patients have pathogenic variants in each gene (created by other members of our lab), then this could be far more simple.