Overview of datasets

df3
A data.frame: 8 × 11
| Group | org_count | org_percent | filtered_count | filtered_percent | annovar_obs | annovar_obs_percent | bg_af_calibr | burden_af | bg_bf_calibr | burden_bf |
|---|---|---|---|---|---|---|---|---|---|---|
| <chr> | <chr> | <chr> | <chr> | <dbl> | <dbl> | <dbl> | <dbl> | <dbl> | <dbl> | <dbl> |
| PTV_Highest(pLI=0.995-1) | 366 | 5.13 | 289 | 4.69 | 150 | 3.14 | 52 | 2.8846 | 111 | 1.3514 |
| PTV_Middle(pLI=0.5-0.995) | 164 | 2.30 | 122 | 1.98 | 48 | 1.00 | 75 | 0.6400 | 36 | 1.3333 |
| PTV_Lowest(pLI=0-0.5) | 442 | 6.20 | 313 | 5.08 | 156 | 3.26 | 203 | 0.7685 | 116 | 1.3448 |
| Missense_Highest(MPC≥2) | 354 | 4.96 | 323 | 5.24 | 181 | 3.79 | 208 | 0.8702 | 134 | 1.3507 |
| Missense_Middle(MPC=1-2) | 894 | 12.54 | 789 | 12.80 | 584 | 12.22 | 677 | 0.8626 | 433 | 1.3487 |
| Missense_Lowest(MPC<1) | 3155 | 44.24 | 2804 | 45.48 | 2221 | 46.45 | 2774 | 0.8006 | 1646 | 1.3493 |
| Synonymous | 1756 | 24.62 | 1526 | 24.75 | 1441 | 30.14 | 1441 | 1.0000 | 1068 | 1.3493 |
| Total | 7131 | 100.00 | 6166 | 100.00 | 4781 | 100.00 | NA | NA | NA | NA |

org: dataset from the paper

  • 7131 DNMs from 6430 affected individuals
  • protein-coding autosomal
  • synonymous + missense + PTV (‘splice_donor_variant’, ‘splice_acceptor_variant’, ‘stop_gained’, ‘frameshift_variant’)

filter: dataset used in our model

  • 6166 DNMs filtered from the 7131 DNMs
  • filter criteria
    • within our coding windows
    • the ALT allele must be an SNV
  • sample size = 4059
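
The two filter criteria above can be sketched in base R as follows. This is only an illustration on toy data: the data frames `dnm` and `windows`, their column names, and the 1-based inclusive window coordinates are all assumptions, not the layout of the real files.

```r
# Hypothetical toy inputs; names and coordinates are assumptions for illustration.
dnm <- data.frame(
  chr = c("1", "1", "2", "2"),
  pos = c(150L, 900L, 300L, 305L),
  ref = c("A", "G", "CT", "T"),
  alt = c("G", "A", "C", "C"),
  stringsAsFactors = FALSE
)
windows <- data.frame(
  chr   = c("1", "2"),
  start = c(100L, 250L),  # assumed 1-based, inclusive
  end   = c(200L, 350L),
  stringsAsFactors = FALSE
)

# Criterion 1: REF and ALT are single, different bases (SNV).
bases  <- c("A", "C", "G", "T")
is_snv <- dnm$ref %in% bases & dnm$alt %in% bases & dnm$ref != dnm$alt

# Criterion 2: the position falls inside at least one coding window on the same chromosome.
in_window <- vapply(seq_len(nrow(dnm)), function(i) {
  any(windows$chr == dnm$chr[i] &
      windows$start <= dnm$pos[i] &
      dnm$pos[i] <= windows$end)
}, logical(1))

dnm_filtered <- dnm[is_snv & in_window, ]
nrow(dnm_filtered)  # 2 of the 4 toy variants pass both criteria
```

For the full set of DNMs, an interval-overlap package would likely be faster than this row-by-row check, but the logic would be the same.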

The paper annotated its variants with VEP, whereas ours were annotated with ANNOVAR, which may make the RR estimates differ. It will take a few more days to get VEP annotations.

bg_af(bf)_calibr: the expected number of background mutations after (before) calibration

$\mathrm{burden} = \frac{\mathrm{annovar\_obs}}{\mathrm{annovar\_background}}$
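
As a quick check of this ratio, the sketch below recomputes the burden for three rows of the table, using the observed and background counts as printed there. The vector names are only illustrative.

```r
# Observed ANNOVAR counts and background expectations copied from the table.
obs   <- c(PTV_Highest = 150, Missense_Highest = 181, Synonymous = 1441)
bg_af <- c(PTV_Highest = 52,  Missense_Highest = 208, Synonymous = 1441)  # after calibration
bg_bf <- c(PTV_Highest = 111, Missense_Highest = 134, Synonymous = 1068)  # before calibration

# burden = observed / expected background
burden_af <- round(obs / bg_af, 4)
burden_bf <- round(obs / bg_bf, 4)

burden_af  # 2.8846 0.8702 1.0000
burden_bf  # 1.3514 1.3507 1.3493
```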

I’m currently puzzled that the burden after calibration (burden_af) is below 1 for most groups.

Mutation rate calibration

  • calibrate using synonymous mutations (n_DNM = 6166, n_sample = 4059)
    • the expected number of background synonymous mutations is 1068, while the observed number is 1441
    • assuming the synonymous burden should be 1, we took 1441/1068 (≈ 1.349) as the scaling factor for calibration
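
The arithmetic behind the scaling factor can be sketched as below. It reproduces only the synonymous row. The helper `calibrate()` is a hypothetical name introduced for illustration, not a function from our pipeline.

```r
# Counts taken from the synonymous row of the table.
expected_syn <- 1068  # expected background synonymous DNMs before calibration
observed_syn <- 1441  # observed synonymous DNMs (ANNOVAR)

# Scaling factor under the assumption that the synonymous burden is 1.
scale_factor <- observed_syn / expected_syn

# Hypothetical helper: rescale an expected count by the factor.
calibrate <- function(expected, factor) expected * factor

calibrated_syn <- calibrate(expected_syn, scale_factor)
burden_syn     <- observed_syn / calibrated_syn

round(scale_factor, 4)  # 1.3493
round(burden_syn, 4)    # 1
```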

RR estimated separately vs. RR reported in the paper

  • RR separately estimated from our model (the first 4 columns)
  • RR reported in that paper (the last 2 columns)
rr2
A data.frame: 7 × 7
| logRR | lower_bound | upper_bound | RR | annota | RR_paper | logRR_paper |
|---|---|---|---|---|---|---|
| <dbl> | <dbl> | <dbl> | <dbl> | <chr> | <dbl> | <dbl> |
| 0.1557098 | -2.765621 | 3.077041 | 1.168487 | annovar_syn | 1.126390 | 0.119018 |
| 2.3474430 | 2.113385 | 2.581501 | 10.458792 | annovar_MPC>=2 | 22.149989 | 3.097837 |
| 1.4654350 | 1.200204 | 1.730666 | 4.329426 | annovar_1<=MPC<2 | 4.179995 | 1.430310 |
| 1.2471260 | 1.107593 | 1.386660 | 3.480326 | annovar_0<=MPC<1 | NA | NA |
| 3.1838690 | 2.913624 | 3.454113 | 24.139971 | annovar_pLI>=0.995 | 50.560417 | 3.923169 |
| 2.2840890 | 1.732580 | 2.835598 | 9.816739 | annovar_0.5>=pLI>0.995 | 6.836726 | 1.922309 |
| 1.7402540 | 1.241026 | 2.239483 | 5.698791 | annovar_0>=pLI>0.5 | NA | NA |
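
The RR and logRR columns are related by the natural exponential, and the bounds appear to be on the log scale, since they bracket logRR. The sketch below checks this on three rows copied from the table and forms the ratio of our RR to the paper's. The data frame `rr` and the names of the derived columns are assumptions for illustration.

```r
# Three rows copied from the RR table (log-scale estimate and bounds).
rr <- data.frame(
  annota      = c("annovar_syn", "annovar_MPC>=2", "annovar_pLI>=0.995"),
  logRR       = c(0.1557098, 2.3474430, 3.1838690),
  lower_bound = c(-2.765621, 2.113385, 2.913624),
  upper_bound = c(3.077041, 2.581501, 3.454113),
  RR_paper    = c(1.126390, 22.149989, 50.560417),
  stringsAsFactors = FALSE
)

# Back-transform the estimate and its bounds to the RR scale.
rr$RR_from_log <- exp(rr$logRR)
rr$RR_lower    <- exp(rr$lower_bound)
rr$RR_upper    <- exp(rr$upper_bound)

# Ratio of our RR to the paper's RR; values below 1 mean our estimate is smaller.
rr$ratio_to_paper <- rr$RR_from_log / rr$RR_paper

rr[, c("annota", "RR_from_log", "RR_lower", "RR_upper", "ratio_to_paper")]
```

In these three rows, the paper's RR lies inside our back-transformed interval for the synonymous class but above it for MPC>=2 and pLI>=0.995.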

Number of risk genes

  • paper: 65 genes
  • our model: 40 genes