Interoperability boosts personalised medicine

Dmitry Etin

Globally, we are still far from solving our national eHealth challenges. Only a few countries share electronic health records between hospitals nationwide and let patients access their health data from home. From that perspective, developed countries are not that different from emerging economies. The latter struggle with underdeveloped infrastructure for their eHealth programmes. Meanwhile, the western world is fighting bureaucratic and legislative challenges.

Acute and specialised care are, however, visibly more advanced and extremely high tech, at least in the western world. For the last decade or two, ageing societies with GDP to spare have actively invested in R&D in major areas such as cancer, rare disease treatments and genetic studies. Pharmacogenomics and oncogenomics are applied combinations of genetic research, drug development and customised treatment that aim to improve individual cancer treatment plans and drug therapies. A well-known outcome is a better response to drugs, selected according to the individual's level of resistance. In general terms this is called personalised medicine, and we are seeing improvements from these techniques in some of the most common types of cancer.

Long-term investments

As a consequence, volumes of biomedical and bioinformatics data are growing rapidly. More than 10 years have passed since the Human Genome Project was formally completed. Investments in this project and in similar private and public initiatives advanced the technology both for genomic sequencing and for data interpretation. Pharmaceutical organisations, universities, high tech laboratories and hospitals generate ‘omics data privately and contribute research findings to open data libraries like PubMed and ClinTrials.

Information islands

For all the progress achieved in recent years, many of these investments created islands of information that lack data normalisation and use different ontologies from project to project. A particular laboratory or hospital, for example, might deal with 10-20 patients with a similar disease and a related gene or variant. This is not enough for evidence-based medicine and statistically valid interpretation. There is still no effective way to support a particular clinical study: biologists, bioinformaticians and clinicians manually navigate through articles and public databases of gene variants and mutations, and match those findings to their clinical context and phenotype data. It is still considered acceptable to create small-scale solutions, in-house IT solutions for example, which might speed up the research process. The resulting proprietary knowledge bases were at best shared across a few universities and hospitals in a partnership.

For those reasons, personalised medicine has remained an exclusive treatment until now. Healthcare authorities, especially in countries where the healthcare system is funded mostly by the government, have started looking for technologies that might help improve the accessibility of those medical techniques (e.g. the Israeli Genetic Database). Universities, pharmaceutical companies, technology vendors and not-for-profit organisations now co-operate to create initiatives like GA4GH, the Global Alliance for Genomics and Health.

New technologies

All those efforts target data sharing and interoperability challenges. One of the first GA4GH successes was the Beacon project, which lets a user ask whether a record of a particular gene and mutation variant exists in any remote organisation connected to the Beacon project. It doesn’t return the complete genetic and clinical records, but at least it solves the ‘needle in a haystack’ issue by pointing at the organisation which possesses the relevant data.
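
The yes-or-no nature of such a query can be sketched in a few lines of Python. This is only an illustrative in-memory model, not the actual Beacon interface: the class names, the organisation names and the variant coordinates are all invented for the example, and a real deployment would query remote services rather than local objects.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    """A queried variant; the fields are an assumption for this sketch."""
    chromosome: str
    position: int
    reference: str
    alternate: str


class Beacon:
    """Hypothetical stand-in for one organisation connected to the network."""

    def __init__(self, organisation, variants):
        self.organisation = organisation
        self._variants = set(variants)

    def has_variant(self, variant):
        # Only a yes/no answer leaves the organisation, never the records.
        return variant in self._variants


def find_holders(beacons, variant):
    """Return the names of the organisations that report holding the variant."""
    return [b.organisation for b in beacons if b.has_variant(variant)]


# Made-up data: coordinates and organisations are placeholders.
query = Variant("1", 1000, "A", "T")
beacons = [
    Beacon("Hospital A", [Variant("1", 1000, "A", "T")]),
    Beacon("Laboratory B", [Variant("2", 2000, "G", "C")]),
    Beacon("University C", []),
]
holders = find_holders(beacons, query)
print(holders)
```

The researcher learns only where to ask next; access to the full genetic and clinical records still has to be arranged with the organisation that holds them.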

APIs for standardised access to genomics data and genotype to phenotype correlation (G2P) projects are the next steps - and hot topics for the industry.

Collaboration challenges

The next wave of investment is now going into knowledge management techniques and normalised data models spanning different types of data: research studies, diagnostics, genetic and clinical records. Until recently these were seen as separate initiatives, and were both semantically and technologically different. The industry has achieved quite good systematisation and standardisation of phenotype data and, as stated above, has progressed well with genetic data. The major bottleneck to overcome now is how to unite these two knowledge domains with acceptable clinical relevance.

The interpretation bottleneck of personalised medicine

A typical cancer genomics workflow, from sequence to report, is illustrated above. The upstream, relatively automated steps (shown by their light colour here) involve (1) the production of millions of short sequence reads from a tumour sample; (2) alignment to the reference genome and application of event detection algorithms; (3) filtering, manual review and validation to identify high-quality events and (4) annotation of events and application of functional prediction algorithms. These steps culminate in (5) the production of dozens to thousands of potential tumour-driving events that must be interpreted by a skilled analyst and synthesised in a report. Each event must be researched in the context of current literature (PubMed), drug-gene interaction databases (DGIdb), relevant clinical trials (ClinTrials) and known clinical actionability from sources such as My Cancer Genome (MCG). In our opinion, this attempt to infer clinical actionability represents the most severe bottleneck of the process. The analyst must find their way through the dark by extensive manual curation before handing off (6) a report for clinical evaluation and application by medical professionals.

© Good et al.: Organising knowledge to enable personalisation of medicine in cancer. Genome Biology 2014 15:438.
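
To make the interpretation step (5) more concrete, the following Python sketch gathers whatever each knowledge source holds for every candidate event and flags the events for which nothing was found. It is a simplification under stated assumptions: the sources are plain dictionaries standing in for literature, drug-gene interaction and clinical trial databases, and all gene names and record identifiers are placeholders rather than real data.

```python
def gather_evidence(events, sources):
    """Collect, per event, the records each source holds for that gene."""
    report = {}
    for gene in events:
        report[gene] = {
            name: records[gene]
            for name, records in sources.items()
            if gene in records
        }
    return report


# Hypothetical stand-ins for the knowledge sources an analyst consults.
sources = {
    "literature": {"GENE_A": ["article-1", "article-2"]},
    "drug_gene": {"GENE_A": ["drug-x"], "GENE_B": ["drug-y"]},
    "trials": {"GENE_B": ["trial-9"]},
}
events = ["GENE_A", "GENE_B", "GENE_C"]

report = gather_evidence(events, sources)

# Events with no supporting record still need manual curation.
unresolved = [gene for gene, hits in report.items() if not hits]
print(unresolved)
```

Even in this toy form, the lookup only works because every source uses the same gene identifier, which is exactly what islands of information with differing ontologies fail to provide.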

Any given modern hospital bridges genetic research studies and clinical findings manually. It is still common for treating physicians and bioinformaticians to interact in person and via emails, office documents and presentations in order to exchange genetic test results and validate them against phenotype information. This neither helps knowledge sharing grow nor allows personalised medicine and treatment to scale and reach a more affordable cost level.

Emerging co-operation

Fortunately, we see breakthroughs in several dimensions. Standardisation organisations have started collaborating with each other, and we now have multiple cross-domain working groups defining ways to express genetic findings in clinical language. We see GA4GH considering HL7 and its emerging FHIR initiative as a way to build access to genetic data resources. We also see a lot of improvement in the clinical genomics space of the HL7 Information Model.

Technology companies test and implement trial standards and design new solutions to enable better communication between the two worlds. Integration capabilities are emerging quickly to unite:

  • medical records, clinical lab results, diagnostic images repositories;
  • sequenced genome variants repositories, annotations databases;
  • public and private sources for articles and studies;
  • genotype to phenotype normalised data models; and
  • algorithms to find clinically relevant dependencies between the genotype and phenotype data.
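
As a rough illustration of the last two items, the Python sketch below joins genotype and phenotype records on a shared patient identifier and counts how often each variant appears together with each phenotype term. The record layout, identifiers and term names are invented for the example, and simple co-occurrence counting is a stand-in for, not an instance of, a clinically validated algorithm.

```python
from collections import Counter

# Hypothetical normalised records: (patient id, variant) and (patient id, term).
genotypes = [("p1", "VAR_1"), ("p2", "VAR_1"), ("p3", "VAR_2")]
phenotypes = [
    ("p1", "TERM_X"),
    ("p2", "TERM_X"),
    ("p2", "TERM_Y"),
    ("p3", "TERM_Y"),
]


def cooccurrence(genotypes, phenotypes):
    """Count patients in whom a variant and a phenotype term occur together."""
    terms_by_patient = {}
    for patient, term in phenotypes:
        terms_by_patient.setdefault(patient, set()).add(term)

    counts = Counter()
    for patient, variant in genotypes:
        for term in terms_by_patient.get(patient, ()):
            counts[(variant, term)] += 1
    return counts


counts = cooccurrence(genotypes, phenotypes)
for (variant, term), n in sorted(counts.items()):
    print(variant, term, n)
```

With only a handful of patients per site such counts say very little, which is why pooling normalised data across organisations matters before any statistical test is applied.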


The longer we live, the more personalised care we need. Ageing societies require scalable technologies to address the growing need to stratify and personalise treatment and care processes. It does appear to be a good time to start evaluating what is available on the market.

Dmitry Etin is a member of the EMEA Strategic Pursuits Team at EMC and is responsible for national healthcare opportunities in Europe, Middle East and Africa.