Our work leads to a variety of output that can help facilitate research of others.
The Center for Biomarker Research and Precision Medicine (BPM) is housed in VCU Health Sciences Research Building and Annex on the university’s Medical College of Virginia Campus. The laboratory suite, covering over 2,000 square feet, includes a specialized genomics and epigenomics laboratory and a behavioral pharmacology laboratory. BPM investigators also have access to adjacent laboratory suite for epigenetic editing.
To minimize the risks of contamination and maximize throughout while keeping the quality of the data at the highest possible level, carefully optimized protocols and dedicated laboratory spaces are utilized for all projects
The epigenetic editing laboratory suite consists of two dedicated areas.
This laboratory is designed for studying alterations in behavior in response to treatment.
BPM is linked through a LAN and is hardwired to VCU’s high performance computing cluster (godel.vcu.edu).
The computing cluster consists of ~1200 Opteron 64 bit and Intel 64 bit cores, each with at least 3 GB RAM/core, and 4.2TB of total RAM. The servers are connected by QDR Infiniband connections. Connected to this cluster BPM has 130 TB storage space that is exclusively dedicated for the center’s use.
Read moreWe use a carefully optimized protocol for methyl-binding domain (MBD) enrichment in combination with high output next-generation sequencing to assess CpG methylation.
Chan RF, Shabalin AA, Xie LY, Adkins DE, Zhao M, Turecki G, Clark SL, Aberg KA, van den Oord EJCG. Enrichment methods provide a feasible approach to comprehensive and adequately powered investigations of the brain methylome. Nucleic Acids Res. 2017 Jun 20;45(11):e97.
Aberg KA, Chan RF, Shabalin AA, Zhao M, Turecki G, Staunstrup NH, Starnawska A, Mors O, Xie LY, van den Oord EJ. A MBD-seq protocol for large-scale methylome-wide studies with (very) low amounts of DNA. Epigenetics. 2017 Sep;12(9):743-750.
The following section includes modified protocols for extraction of DNA We have found the DNA extracted with these protocols to be of high quality and suitable for next-generation sequencing based approaches.
We have recently evaluated the ability to use DNA extracted from dry blood spots for methylome-wide investigations. We have published a modified version of a DNA protocol that allowed for extraction of the DNA from a complete blood spot in one reaction (Aberg et al. Epigenetics 2013 May;8(5):542-7.).
We have further modified our blood spot extraction protocol to also allow for efficient extraction of DNA from scrap material, i.e., the part of the blood spot that is left when punches, typically used for standard DNA extraction protocols, have been removed.
A Bioconductor package for Fast Methylome-Wide Association Study Pipeline for Enrichment Platforms
A complete toolset for methylome-wide association studies (MWAS). It is specifically designed for data from enrichment based methylation assays, but can be applied to other data as well. The analysis pipeline includes seven steps: (1) scanning aligned reads from BAM files, (2) calculation of quality control measures, (3) creation of methylation score (coverage) matrix, (4) principal component analysis for capturing batch effects and detection of outliers, (5) association analysis with respect to phenotypes of interest while correcting for top PCs and known covariates, (6) annotation of significant findings, and (7) multi-marker analysis (methylation risk score) using elastic net. Additionally, RaMWAS include tools for joint analysis of methlyation and genotype data.
A cross-platform application for optimizing LD studies
Because of the assays costs and large sample sizes that are required to discover effects while controlling false discoveries, large scale genetic association studies can be very expensive. Two-stage designs can be used to design these studies in the most cost-effective way. In two stage designs all the markers are assayed and tested in a first stage. Only the promising markers are subsequently assayed in the second stage using additional samples. Compared to single-stage studies, optimized multistage designs can achieve the same goals in terms of true and false discoveries with a 50-70% saving in the amount of genotyping. Furthermore, rather than using arbitrary rules (e.g. P-values smaller than 0.05 suggest a replication), use of multistage designs can provide statistically motivated decision rules for declaring significance.
lga972 is a cross-platform application with a graphical interface that uses a genetic algorithm for determining the design features of 2-stage genetic association studies that minimize the genotyping burden. The user can choose among a variety of case-control and family based tests where outcome may be scored as present versus absent or is a continuous variable. The text-based output can easily be exported to other programs such as word-processors and spreadsheets.
Robles, J & Van den Oord, EJCG (2004). lga972: A cross-platform application for optimizing LD studies via the genetic algorithm. Bioinformatics, 20, 3244-3245. Related papers:
Fast Enrichment Analysis via Circular Permutations
After testing many markers for association with an outcome, researcher often examine whether their top findings are enriched for specific (epi-)genomic features, pathways, or findings from other association studies. Standard statistical tools (e.g., Fisher’s exact test) may not be applicable as they would produce too optimistic P values due to the dependency of observations or use of multiple thresholds to select the top results.
To address these challenges we created shiftR, an R package for enrichment testing through fast circular permutations. shiftR is free, cross-platform, open source, and applicable for datasets of any type. A very fast algorithm based on bitwise operations is used to enable calculation of permutation P values with high dimensional data sets.