Our studies typically measure the methylation status of more than 20 million sites. Computer programs commonly hold all of their data in memory, but this is not feasible for data sets of this size: 20 million sites across 1000 samples would already occupy 160 GB, which far exceeds the memory capacity of most computers. Furthermore, existing software lacks important features needed for thorough data analyses.
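The 160 GB figure can be reproduced with simple arithmetic if each measurement is assumed to be stored as an 8-byte double-precision value and 1 GB is taken as 10^9 bytes. The sketch below makes that calculation explicit; the byte width is an assumption and the helper name `matrix_size_gb` is hypothetical.

```python
def matrix_size_gb(n_sites, n_samples, bytes_per_value=8):
    """Size in decimal gigabytes (1 GB = 10**9 bytes) of a dense sites-by-samples matrix.

    bytes_per_value=8 assumes double-precision storage (an assumption, not stated in the text).
    """
    return n_sites * n_samples * bytes_per_value / 1e9


size = matrix_size_gb(20_000_000, 1000)
print(size)  # 160.0
```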
We present a new software package called RaMWAS that addresses these challenges. RaMWAS uses a specially developed data-processing system that avoids loading all data into memory. We have made RaMWAS fast by employing efficient algorithms and by parallelizing most tasks across multiple CPU cores. Finally, we implemented a full set of tools for quality control of the data and for analyses aimed at detecting disease-associated sites, and added advanced options such as the ability to create risk scores that can be used as biomarkers to diagnose disease or predict drug response.
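The general idea of avoiding a full in-memory load can be illustrated with a minimal sketch that keeps the matrix in a flat binary file and reads only a few sites at a time. This is not RaMWAS's implementation; the file layout, the function names `write_matrix` and `site_means_chunked`, and the toy values are all assumptions made for illustration, and only the Python standard library is used.

```python
import os
import tempfile
from array import array


def write_matrix(path, rows):
    """Write rows of doubles to a flat binary file, one site (row) after another."""
    with open(path, "wb") as f:
        for row in rows:
            array("d", row).tofile(f)


def site_means_chunked(path, n_samples, chunk_sites=2):
    """Yield the mean of each site while holding at most chunk_sites rows in memory."""
    bytes_per_chunk = chunk_sites * n_samples * array("d").itemsize
    with open(path, "rb") as f:
        while True:
            raw = f.read(bytes_per_chunk)
            if not raw:
                break  # end of file reached
            buf = array("d")
            buf.frombytes(raw)
            # Split the chunk back into per-site rows of n_samples values each.
            for start in range(0, len(buf), n_samples):
                row = buf[start:start + n_samples]
                yield sum(row) / len(row)


# Toy data: 3 sites measured in 4 samples (hypothetical values).
rows = [[0.0, 1.0, 0.5, 0.5], [1.0, 1.0, 1.0, 1.0], [0.0, 0.0, 0.0, 1.0]]
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "methylation.bin")
    write_matrix(path, rows)
    means = list(site_means_chunked(path, n_samples=4, chunk_sites=2))
print(means)  # [0.5, 1.0, 0.25]
```

Because each chunk is processed independently of the others, chunks of this kind could in principle be handed to separate CPU cores, although the sketch above runs them sequentially.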