Alzheimer's disease (AD) is the most common cause of dementia affecting 36 million people worldwide. As the demographic transition in the developed countries progresses towards older population, the worsening ratio of workers per retirees and the growing number of patients with age-related illnesses such as AD will challenge the current healthcare systems and national economies. For these reasons AD has been identified as a health priority, and various methods for diagnosis and many candidates for therapies are under intense research. Even though there is currently no cure for AD, its effects can be managed. Today the significance of early and precise diagnosis of AD is emphasized in order to minimize its irreversible effects on the nervous system. When new drugs and therapies enter the market it is also vital to effectively identify the right candidates to benefit from these. The main objective of the PredictAD project was to find and integrate efficient biomarkers from heterogeneous patient data to make early diagnosis and to monitor the progress of AD in a more efficient, reliable and objective manner. The project focused on discovering biomarkers from biomolecular data, electrophysiological measurements of the brain and structural, functional and molecular brain images. We also designed and built a statistical model and a framework for exploiting these biomarkers with other available patient history and background data. We were able to discover several potential novel biomarker candidates and implement the framework in software. The results are currently used in several research projects, licensed to commercial use and being tested for clinical use in several trials.
Alzheimer's disease (AD) is the most common cause of dementia. Costs associated with dementia are equivalent to about 1 per cent of the world's gross domestic product. Currently, dementia affects approximately 36 million people worldwide, and this number is expected to double during the next 20 years. The prevalence of AD in Europe will rise rapidly with the ageing population. The current ratio of four working-age Europeans per retiree is projected to change to 2 : 1 by the year 2050. This will challenge the current healthcare systems and national economies with the problem of providing good-quality care to a growing number of people with fewer resources available per patient.
The current consensus is that the preventive actions and therapies for AD should be started as soon as possible in order to be effective. This emphasizes the importance of early diagnosis. In Europe, the diagnosis of AD is made about 20 months after the appearance of symptomatic memory problems . However, the disease is known to progress for several years or even decades prior to appearance of the first clear symptoms. Despite this long period of progression, early diagnosis of AD remains a great challenge.
Unfortunately, there is currently no cure for AD, although different strategies for treating and managing AD and its effects are under investigation. Some of these are therapies targeting the illness directly, such as anti-amyloid plaque agents, and others are prevention strategies, such as lifestyle changes. When new drugs or prevention strategies become available, early diagnosis becomes even more essential because the candidates who need and can benefit from these therapies must be identified at the earliest possible stage. Moreover, objective and evidence-based monitoring of treatment success is vital when following the efficacy of the chosen therapy. This is especially important when developing and testing new drugs and other candidate therapies.
Today AD is clinically diagnosed by taking physical and neurological examinations, and checking other signs of intellectual impairment through standard neuropsychological and cognitive tests. In addition to these measures, the current guidelines for the diagnostics of AD [4–6] emphasize the role of various biomarkers. These include measures from magnetic resonance imaging (MRI), positron emission tomography (PET), cerebrospinal fluid (CSF) protein profiles as well as genetic risk profiles.
The main objective of the PredictAD (www.predictad.eu) EU-funded VPH research project (June 2008 to November 2011) was to find and integrate efficient biomarkers from heterogeneous patient data in order to make early diagnosis and progress monitoring of AD more efficient, reliable and objective. This article summarizes the work done in biomarker discovery from different data sources, and in developing a data-driven clinical decision support tool for combining and effectively exploiting the information in these biomarkers. The main types of data and steps of analysis are shown in figure 1.
2. Data and methods
2.1. Biomarkers from biomolecular data
Reliable indicators of AD such as traces of beta-amyloid and protein tau in the brain can be extracted from CSF. Unfortunately, collecting CSF is a rather invasive and even somewhat risky procedure. This makes the extraction of CSF-based biomarkers challenging and impractical for early detection by large-scale screening. Developing CSF-based biomarkers is also challenging owing to the sensitive measurements required, which often lead to poor cross-institutional consistency. For these reasons, the possibilities of using easily available and low-cost blood samples are gaining more attention.
Metabolites are the small molecules involved in metabolism, the vital series of chemical reactions that take place in the living body. The presence of different metabolites in blood gives indications of how cells live, breathe and communicate, and whether the metabolic interactions follow their normal pathways. In the case of a disease such as AD, these interactions are presumably disturbed. In the project, we used metabolomics to reveal potential biomarkers for detecting AD. This was done by analysing blood serum samples from three different groups: healthy controls, patients with mild cognitive impairment (MCI), a possible pre-stage of AD, and patients with AD. The samples were analysed with sensitive mass spectrometer platforms capable of detecting tiny differences in concentrations of metabolites. The mass spectrometer data were analysed with a software platform designed to detect the presence of known and as yet unknown metabolite groups and to compare the differences in metabolite concentrations between the three groups of subjects. The results were used to test which kinds of metabolites have potential as biomarkers indicating the processes related to AD.
Proteins are large macromolecules that take part in most functions of the living cell, from structural formation to guiding the various biological processes in the cell. The synthesis and function of proteins, especially in the neurons of the brain, are affected by AD. During the project, in much the same way as for metabolites, protein concentrations were analysed with a sensitive mass spectrometer, and special software was developed and used to analyse and interpret the results. This analysis was first done by pooling the samples from four different groups: healthy, stable MCI (SMCI), progressive MCI (PMCI), a condition that progresses to AD, and AD. Pooling was used to find stronger signals of differences between these groups and to narrow the scope of potential proteins. These analyses were then repeated for individual samples to identify the best protein biomarker candidates for indicating AD.
2.2. Biomarkers from electrophysiology
AD is known to cause abnormal structural and functional degeneration of the brain. The brain goes through constant changes during a lifetime, and the challenge is to identify which of those are related to normal ageing and which are caused by AD or other diseases. Electroencephalography (EEG) can be used for characterizing the electrophysiology of the brain, which is known to be affected in AD. Measuring typical evoked responses requires interaction with the subject, which can be challenging with patients having severe symptoms. One approach to overcome this is to measure how the brain responds to an external electromagnetic stimulus. Transcranial magnetic stimulation combined with EEG (TMS/EEG) is a novel non-invasive tool for measuring the strength of the neural connectivity of the brain. In this method, a magnetic pulse is first applied to an area of the brain cortex and the response is immediately measured using EEG. Precise localization of the coils sending the TMS pulses and the electrodes recording EEG with respect to the brain allows the stimulation to be directed to the areas of interest and the responses to be mapped around the cortex with great spatial and temporal resolution. With the help of MRI, features of brain anatomy can be correlated with recorded potentials. TMS/EEG can be used to estimate the functional state of the brain unbiased by the cognitive impairment and subjective functional performance of patients with AD.
In the project, we studied the responses to stimulation of the frontal cortex in healthy young volunteers, healthy elderly volunteers and subjects with AD. In another study, the motor cortex of three subject groups (healthy, MCI and AD) was stimulated and recorded. Our purpose was to analyse the correlation between the thickness of the cortex and the strength of the electromagnetic field resulting from the stimulation. A unified mathematical framework was developed to analyse the TMS/EEG data, at the sensor as well as at the source level, including statistical analysis for identifying the spatio-temporal pattern of brain activity significantly evoked by TMS. The methods for the data analysis were implemented in Matlab.
2.3. Biomarkers from imaging data
The presence of medial temporal lobe atrophy (tissue loss) is a hallmark indicator of AD. MRI provides an excellent tool for quantifying early and disproportionate brain atrophy. Because manual delineation of different structures is highly laborious and subjective, automated and objective methods are needed. A large number of image analysis techniques have been developed for quantifying the size and shape of different brain structures, especially the hippocampus, amygdala and entorhinal cortex in the context of AD. Other commonly used measures are cortical thickness, atrophy rate from longitudinal data and voxel- and region-based characteristics using morphometric techniques. Although automated tools are being actively developed in many research groups, the development of robust, accurate and fast automatic methods has proved to be a very challenging problem and automatic methods are still very much lacking in clinical practice.
In addition to structural imaging, functional imaging offers complementary information about the status of the brain. For example, fludeoxyglucose (FDG) PET images show response to the glucose metabolism in the brain, which is highly correlated with the loss of neurons owing to AD, and is hence useful for monitoring the progression and for determining the severity of the disease. MRI and FDG-PET both show secondary effects of the disease process. It is hypothesized that AD is caused by accumulation of amyloid plaques in the brain. Because of this there has been an increased interest in in vivo imaging of amyloid deposits. One of the most promising methods in the early diagnosis of AD is PET imaging using a molecular tracer that binds to amyloid plaques.
During the project, we developed methods for segmenting spatio-temporal brain structures and extracting their features in MR and PET images. We focused on developing fast and robust segmentation of the hippocampus, brain atrophy-rate measurements, tensor-based morphometry and manifold learning [12–15]. We also developed tools for the analysis of FDG-PET and PET amyloid imaging [16,17]. The clinical usefulness of these methods was evaluated in terms of robustness, accuracy and computation times by analysing data from several patient cohorts.
2.4. Model and software for integrating biomarkers
The availability of data in massive scientific cohorts collected by large healthcare systems is a tremendous asset for scientific research. However, forming a holistic, truthful and objective view of the state of the patient becomes very challenging as the databases grow or when new databases or biomarkers become available. Effective use of the whole spectrum of data requires simple, reliable, fast and intuitive ways of comparing new, undiagnosed subject data against the existing databases.
Information sciences have developed a wealth of methods for processing and analysing multi-dimensional heterogeneous data. Most publications discussing the detection of AD from multiple patient measurements use the common classifiers of statistics such as support vector machines. These classifiers define the most probable class label for a given patient using a decision model derived from the training data. The use of such classifiers in the clinical practice requires either very high classification accuracy validated in large patient cohorts or an estimate about the reliability of the classification result for each respective case. When analysing complex diseases, the biomarkers may not contain enough information to separate the healthy from ill with high accuracy. Therefore, tools that enable the assessment of the reliability of the classification are needed. Another challenge is that the transition from the healthy state to the disease state is not typically a discrete event, especially in neurodegenerative diseases where the disease progresses very slowly. An index that reflects the disease probability or disease severity could provide a solution for both of these challenges.
We addressed these needs by developing a novel framework that combines heterogeneous multi-source data and provides an evidence-based index reflecting the probability of the disease for an individual subject. This index, called the Disease State Index (DSI), is computed by comparing the measurements of an as yet undiagnosed case against a large number of measurements from healthy and diseased subjects collected in databases. In addition, a graphical counterpart of DSI, the Disease State Fingerprint (DSF), was developed for visualizing the state of the patient in a way that lets a clinician easily see and understand how different measures of various biomarkers contribute to the index. A measure of relevance was defined to estimate the reliability of the DSI assessment in each individual decision. The framework was realized in the PredictAD software tool (figure 2) for the Microsoft Windows environment. The software was implemented with an intuitive user interface and the relevant components integrated so that it could be used as a stand-alone clinical decision support system (CDSS).
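To make the idea of such an index concrete, the following is a minimal sketch of one way a per-biomarker "fitness" and "relevance" could be combined into a single value between 0 and 1. It is an illustration under stated assumptions, not the PredictAD implementation: the function names, the use of empirical false-negative and false-positive rates, the use of the maximum of sensitivity + specificity - 1 as the relevance weight, and the biomarker names in the usage note are all hypothetical choices made for this example.

```python
from statistics import mean


def _rates(x, healthy, diseased, higher_is_diseased):
    """Empirical error rates if x were used as the decision threshold."""
    if higher_is_diseased:
        fn = sum(v < x for v in diseased) / len(diseased)   # diseased called healthy
        fp = sum(v >= x for v in healthy) / len(healthy)    # healthy called diseased
    else:
        fn = sum(v > x for v in diseased) / len(diseased)
        fp = sum(v <= x for v in healthy) / len(healthy)
    return fn, fp


def fitness(x, healthy, diseased):
    """0 means x resembles the healthy reference group, 1 the diseased group."""
    higher = mean(diseased) >= mean(healthy)
    fn, fp = _rates(x, healthy, diseased, higher)
    return 0.5 if fn + fp == 0 else fn / (fn + fp)


def relevance(healthy, diseased):
    """Assumed weight: best sensitivity + specificity - 1 over all thresholds."""
    higher = mean(diseased) >= mean(healthy)
    best = 0.0
    for t in set(healthy) | set(diseased):
        fn, fp = _rates(t, healthy, diseased, higher)
        best = max(best, (1 - fn) + (1 - fp) - 1)
    return best


def disease_state_index(patient, reference):
    """patient: {biomarker: value}; reference: {biomarker: (healthy, diseased)}."""
    num = den = 0.0
    for name, x in patient.items():
        healthy, diseased = reference[name]
        w = relevance(healthy, diseased)
        num += w * fitness(x, healthy, diseased)
        den += w
    return num / den if den else 0.5
```

With a hypothetical reference database such as `{"hippocampus_volume": ([3.9, 4.1, 4.3, 4.0], [2.9, 3.1, 3.3, 3.0])}`, a new case with a volume of 3.0 receives an index near 1 and a case with 4.2 an index near 0. The per-biomarker `fitness` and `relevance` values are the kind of quantities a fingerprint-style visualization could display side by side.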
3. Results
3.1. Biomarkers from biomolecular data
In our study of metabolites, we found groups of molecules that show significantly lower levels in subjects with AD. Our data included samples from the first entry to the hospital and from the follow-up visit several years later when the symptoms were severe enough to make the diagnosis. The best power (p = 0.0135) of separating the subjects with AD at the baseline was found in a group of ether phospholipids that are significant building blocks of cell membranes. The best model, which combined metabolites with age, was able to separate patients with AD with an area under the curve (AUC) of 0.81. The setting gave us the opportunity to test the power of different sets of metabolites in predicting which MCI patients will eventually progress to AD (PMCI) and which will remain stable (SMCI). The model with the best power of predicting the progression had three metabolites (2,4-dihydroxybutanoic acid, an unidentified carboxylic acid and a phosphatidylcholine). With this setting, we were able to achieve a reasonably good AUC of 0.77.
Our main discovery in proteomics was the high level of isoaspartyl (isoAsp) residues (p = 0.03) in blood plasma of patients with AD. Increased levels were discovered by pooling and analysing samples from healthy controls against samples from MCI and AD patients. These results were verified with individual samples (p ≈ 0.01). Higher levels of isoAsp were also found in females. This study, together with previous findings, gives grounds for the hypothesis that isoAsp might be a significant marker in early processes leading to AD.
To summarize, we achieved promising results by discovering novel molecular-level signatures and verifying previously hypothesized ones. Our findings suggest that hypoxia, oxidative stress, membrane lipid remodelling and abnormal protein aggregation might have a role in early AD. Establishing the pathogenic relevance of predictive biomarkers such as these may not only facilitate early diagnosis, but may also help identify new therapeutic avenues.
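For readers less familiar with the AUC figures quoted above, the sketch below shows how such a value can be computed from model scores, assuming AUC denotes the area under the receiver operating characteristic curve. It uses the equivalent rank-based definition: the probability that a randomly chosen case scores higher than a randomly chosen control, with ties counted as one half. The function name and the example scores are hypothetical and are not data from the study.

```python
def auc(case_scores, control_scores):
    """Area under the ROC curve via pairwise comparison of scores.

    Equals the fraction of (case, control) pairs in which the case has the
    higher score, counting tied pairs as one half.
    """
    if not case_scores or not control_scores:
        raise ValueError("both groups must contain at least one score")
    wins = 0.0
    for c in case_scores:
        for h in control_scores:
            if c > h:
                wins += 1.0
            elif c == h:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))
```

An AUC of 0.5 corresponds to chance-level separation and 1.0 to perfect separation, which is the scale on which the values 0.81 and 0.77 should be read.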
3.2. Biomarkers from electrophysiology
We were able to demonstrate that TMS-evoked potentials applied to the frontal cortex are reduced in AD patients when compared with age-matched or younger healthy controls. Interestingly, no significant difference in the TMS-evoked potentials between healthy young and healthy elderly controls was found. This suggests that normal age-related frontal lobe atrophy does not affect cortical excitability, whereas excitability is weakened in patients with AD. We also discovered a correlation between cortical thinning and weaker cortical excitability in the motor cortex. The previously reported effect of increased excitability in AD in the sensorimotor cortex, possibly compensating for the loss of cortical volume, was also recorded. This mechanism was not found in MCI patients, which might indicate that the disease proceeds with different dynamics in the structure and function of neuronal circuits from normal conditions via MCI to AD.
When combined, the results show that it is possible to extract synthetic indices of cortical excitability and effective connectivity from TMS/EEG data that significantly correlate with clinical measures of cognitive decline. Thus, TMS/EEG technology has clear potential in defining novel biomarkers for the diagnostics of AD.
3.3. Biomarkers from imaging data
We developed a method for segmenting the hippocampus, which is one of the first areas of the brain affected by AD, from MRI images. The reliability of the method was verified by comparing the results against a semi-automatic method. A Dice similarity of 0.87 between the methods was achieved. The speed of the method was found sufficient for use in clinical practice. The rates of hippocampal atrophy, i.e. tissue loss, in consecutive MRI scans were also studied. We developed a method which segments image pairs or triplets in a single step in order to compute an estimate of the rate. We found different patterns of atrophy between healthy subjects and subjects with AD, and significant (p < 0.01) correlations between atrophy measurements and other clinical variables such as the Mini Mental State Examination. Beyond the hippocampus, we studied whole-brain feature extraction using tensor-based morphometry. We found that the use of a selection of multiple templates gives significantly better classification accuracy (normal versus AD 86%, stable MCI versus progressive MCI 72.1%) than can be achieved with conventional single-template methods. Changes in a total of 84 brain areas were analysed and the best ones in classifying normal versus AD were found to be close to the hippocampus. In another study, modern manifold-learning-based techniques were used to map similar patients close to each other in the new manifold space. When brain image biomarkers were combined with non-imaging biomarkers, we found that the addition of biomarkers such as genotype and the concentration of beta-amyloid residue in CSF to the classifier improved the classification results (normal versus AD, stable MCI versus progressive MCI and normal versus progressive MCI).
In the study of Gray et al., we focused on combining functional FDG-PET data with MRI images. By co-registering FDG-PET and MRI images, and analysing FDG-PET signal strengths in several brain areas at two time points, we were able to achieve state-of-the-art classification accuracy (normal versus AD 88%, SMCI versus PMCI 65%). In another PET study, we explored the relation between hippocampal atrophy in MRI images and beta-amyloid deposit signals traced with [18F]flutemetamol. The study showed that MRI and PET provide complementary information for diagnostics.
Overall, we carried out a comprehensive validation of the MR image analysis methods used and developed: data from six cohorts consisting of almost 2000 subjects in total were used. In a specific comparison study, it was found that no single method appeared clearly superior to the others. It was shown that combining the results from all developed methods improves the diagnostic accuracy.
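The Dice similarity used above to compare the automatic and semi-automatic hippocampus segmentations measures the overlap of two binary masks as twice the number of shared voxels divided by the total number of voxels in both masks. A minimal sketch follows, assuming the two segmentations are available as flattened sequences of 0/1 voxel labels of equal length; the function name and this data layout are choices made for the example only.

```python
def dice_similarity(mask_a, mask_b):
    """Dice coefficient 2|A ∩ B| / (|A| + |B|) for two binary voxel masks."""
    if len(mask_a) != len(mask_b):
        raise ValueError("masks must cover the same voxels")
    size_a = sum(1 for v in mask_a if v)
    size_b = sum(1 for v in mask_b if v)
    overlap = sum(1 for a, b in zip(mask_a, mask_b) if a and b)
    if size_a + size_b == 0:
        return 1.0  # two empty masks are treated as identical by convention
    return 2.0 * overlap / (size_a + size_b)
```

A value of 1 means the two segmentations coincide exactly and 0 means they do not overlap at all, so the reported 0.87 indicates substantial but not complete agreement.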
3.4. Model and software
In the study of Mattila et al., we used the baseline data from the Alzheimer's disease neuroimaging initiative (ADNI; http://www.adni-info.org/, accessed 26 October 2012) to benchmark the ability of the DSI to model disease progression from elderly healthy controls to AD and its ability to predict conversion from MCI to AD. We found that the DSI provides well-behaved AD state estimates that correspond well with the existing diagnoses. For predicting conversion from MCI to AD, the DSI attains performance similar to state-of-the-art reference classifiers. The results suggest that the DSF establishes an effective decision support and data visualization framework for improving AD diagnostics, allowing clinicians to rapidly analyse large quantities of diverse patient data.
The validation of the implementation of the DSI/DSF framework in a software solution (figure 2) showed that its use improves clinicians' diagnostic accuracy and their confidence in their decisions compared with current diagnostic work-flows [20,21]. Our results show that, using the tool, a diagnosis could be made with high accuracy for 50 per cent of MCI cases about twelve months earlier than is currently possible. One of the innovations in PredictAD is the stratification of populations, i.e. patients can be categorized into clear and less-clear cases based on the continuous nature of the DSI. As the ADNI data from about 400 cases were used in this study, these numbers do not include the improvements that might be obtained by including the novel blood, amyloid-PET and TMS/EEG-based biomarkers. If these markers were used, the numbers would presumably be significantly better. Unfortunately, there is as yet no cohort having all these data available.
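The stratification into clear and less-clear cases can be pictured as thresholding a continuous index. The sketch below assumes an index scaled to the range 0 to 1 and uses two cut-offs, 0.2 and 0.8, which are purely hypothetical values chosen for illustration; the thresholds, labels and function names are not taken from the PredictAD software.

```python
def stratify(dsi, low=0.2, high=0.8):
    """Label a case by how decisively its index points to one group.

    low and high are illustrative cut-offs, not validated clinical values.
    """
    if not 0.0 <= dsi <= 1.0:
        raise ValueError("index is assumed to lie in [0, 1]")
    if dsi <= low:
        return "clear-negative"
    if dsi >= high:
        return "clear-positive"
    return "less-clear"


def clear_fraction(dsi_values, low=0.2, high=0.8):
    """Share of cases that fall outside the less-clear band."""
    labels = [stratify(v, low, high) for v in dsi_values]
    return sum(label != "less-clear" for label in labels) / len(labels)
```

Narrowing the band between the two cut-offs labels more cases as clear at the cost of lower accuracy within that group, which is the kind of trade-off behind a statement such as a high-accuracy diagnosis for a given share of MCI cases.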
The significance of early diagnosis of AD has been recognized and emphasized in many different forums and contexts. Success in developing effective means for the early diagnosis would have major impact first in the developed countries and later in the rest of the world owing to the demographic transition towards older population. Despite large past and on-going efforts in developing early diagnosis of AD the problem remains largely unsolved. It is the current understanding that no simple and easy solution will likely be discovered soon.
Our contribution to solving the problem was not to focus on finding a single indicator, but to develop a methodology capable of systematically using the small clues and hints collected along the patient history. The significance and relevance of the available information were tested against past data on similar patients whose outcome has been documented. In this way, decisions can be made on whether to take more tests or to give a diagnosis and estimate the state of the disease from the information already collected. This methodology was realized in a usable and intuitive way, such that the reliability and accuracy of the developed methods were verifiable. By putting together a team of experts from several disciplines such as molecular chemistry, bioinformatics, medical imaging, statistics, software engineering, business and clinical medicine, we were able to study, develop and oversee steps from fundamental research to clinically usable software implementation. We also believe that having scientists, engineers and clinicians work in collaboration gives significantly better chances of enabling results of fundamental research to benefit the clinical routine.
Many approaches to using multiple biomarkers in the early detection of AD have been and are currently being studied. For example, Fan et al. have successfully combined biomarkers from structural and molecular imaging to classify samples into MCI and AD patients. In recent work by Fonteijn et al., the disease progression is modelled successfully through a series of consecutive events where a number of measurements of different biomarkers have been taken. A rough disease progression model as proposed by Jack et al. was used as a basis. The same progression model is used in the work by Jedynak et al. There, a disease progression score was composed from a number of sigmoid functions, each of which was introduced to estimate the disease progression as indicated by a respective biomarker. In contrast to our work, the methods proposed in recent studies [24,26] assume a monotonic progression of the disease in light of the many measured biomarkers. As a potential advantage over our approach, the possible relative power of biomarkers in estimating the disease state during the course of the disease is taken into account. A particular benefit of our approach over many other disease models is the ability to visualize the individual contributions of different biomarkers to the final disease index value with the DSF (figure 2).
Utilization of clinical decision support systems (CDSSs), such as discussed in Mattila et al., has potential economic impact in several ways. Accurate and early diagnosis would delay the progression of AD, and thus likely reduce the need for institutional care significantly. This point will become critical when more effective therapies emerge. Efficient use of CDSSs would give more value to existing information collected in reference databases and to the data collected by testing the subject being diagnosed. CDSSs will help in objectively defining the point where enough evidence has been collected and no potentially expensive, laborious or even risky further tests are needed. Evidence suggesting solid cost versus benefit gains when using  was found based on the cost per procedure data from project partner hospitals. These results are yet to be published.
To conclude, PredictAD took various steps towards more objective and efficient diagnostics in AD on several fronts. We were able to make new discoveries by acquiring and analysing new data, by developing novel methods and refining current ones for reprocessing existing data, and by building a software framework capable of integrating and visualizing masses of heterogeneous data as the foundation of evidence-based clinical decisions. The project provided several novel tools for biomarker discovery and a novel data-driven and evidence-based approach to disease profiling. The results are currently used in several research projects, licensed for commercial use and being tested for clinical use in several trials.
One contribution of 25 to a Theme Issue ‘The virtual physiological human: integrative approaches to computational biomedicine’.
- Received November 1, 2012.
- Accepted December 26, 2012.
- © 2013 The Author(s) Published by the Royal Society. All rights reserved.