The Phenotype Day is an initiative developed jointly with the Bio-Ontologies and BioLINK Special Interest Groups.

The systematic description of phenotype variation has gained increasing importance since the discovery of the causal relationship between a genotype placed in a certain environment and a phenotype. It plays a role not only in accessing and mining medical records but also in the analysis of model organism data, genome sequence analysis, and the translation of knowledge across species. Accurate phenotyping has the potential to be the bridge between studies that aim to advance the science of medicine (such as a better understanding of the genomic basis of diseases) and studies that aim to advance the practice of medicine (such as phase IV surveillance of approved drugs).

Various research activities attempt to understand the underlying domain knowledge, but they are applied narrowly and are poorly coordinated. With this Phenotype Day we aim to trigger a comprehensive and coherent approach to studying (and ultimately facilitating) the process of knowledge acquisition and support for Deep Phenotyping by bringing together researchers and practitioners from fields that include, but are not limited to, the following:

  • biology as well as computational biology
  • genomics, clinical genetics, pharmacogenomics, healthcare
  • text/data mining and knowledge discovery
  • knowledge representation and ontology engineering



Example topics include but are not limited to:

  • Representation of phenotypes
    • Controlled vocabularies
    • Ontologies (pre- and post-composed)
    • Data standards
  • Acquisition of phenotype descriptions
    • NLP annotation tools and pipelines
    • Tools and methods to support data curation for phenotypes
    • Integration of textual data and controlled vocabularies/ontologies
    • Phenotype discovery
    • Collaborative development and peer-review
    • Guidelines for phenotype data curation
    • Quality control and evaluation
  • Application of phenotypes to real world problems
    • Methods for phenotype alignment and interoperability
    • Drug repurposing / development
    • Genotype-environment/phenotype-genotype/phenotype-disorder relation discovery
    • Personalised medicine

Accepted papers


Full papers:

Short papers:

Position papers:


Single volume proceedings: PhenotypeDay2014.pdf

Draft program


(This schedule is tentative and may still change.)

Draft program for download: HERE

Invited speakers


Prof. Peter Robinson

Title: The Human Phenotype Ontology: Algorithms and Applications

Abstract: The Human Phenotype Ontology (HPO) is being developed to provide a standardized, controlled vocabulary that allows phenotypic information to be described in an unambiguous fashion in medical publications and databases. The use of an ontology to capture phenotypic information allows computational algorithms to exploit semantic similarity between related phenotypic abnormalities to define phenotypic similarity metrics, which can be used to perform database searches for clinical diagnostics or as a basis for incorporating the human phenome into large-scale computational analysis of gene expression patterns and other cellular phenomena associated with human disease. A major goal of the HPO is thus to make clinical data “computable” and to interlink clinical data with data from other domains of translational research. In this talk, I will present some of the main algorithms from the fields of semantic similarity and reasoning that are used to power HPO-based algorithms, and I will present some of the applications that we have developed using the HPO in the fields of clinical diagnostics, exome-based diagnostics, and analysis of copy-number variant diseases.

Bio: Prof. Robinson is a leading researcher in the field of phenotype knowledge representation and its application to human heritable diseases. Dr. Robinson leads a research group at the Institute of Medical Genetics and Human Genetics of the Charité – Universitätsmedizin Berlin. A major focus in his research has been to use mathematical and bioinformatic models to understand biology and hereditary disease. Dr. Robinson's computational group has developed the Human Phenotype Ontology (HPO), so he is in a unique position to offer insights into the challenges the community faces with phenotype vocabulary curation and knowledge integration. A major current focus lies in the development of algorithms for using phenotype and genotype information for diagnostics and computational biology.

Dr. Dietrich Rebholz-Schuhmann

Title: Semantic normalisation of phenotypes for biomedical data integration: requirements, status and caveats

Bio: Dr. Rebholz-Schuhmann holds a master's degree in medicine (Univ. Duesseldorf, 1988), a Ph.D. in immunology (Univ. Duesseldorf, 1989) and a master's degree in computer science (Univ. Passau, 1993). He was a research group leader at the European Bioinformatics Institute, Hinxton (UK), doing research in biomedical literature analysis. Since July 2012, he has been a senior researcher at the University of Zürich in the department of computational linguistics, heading the MANTRA project, which addresses the analysis of multilingual patient records. This work will help to link data in the scientific and clinical domains. He is also editor-in-chief of the Journal of Biomedical Semantics. As a leading researcher in the field of text mining for molecular biology, Dr. Rebholz-Schuhmann will provide a valuable perspective on the integration of phenotypes, genes and diseases in text with semantic resources such as biological databases and ontologies.



Nigel Collier

Nigel Collier, European Bioinformatics Institute, UK and National Institute of Informatics, Japan. Nigel is a Marie Curie Research Fellow at the European Bioinformatics Institute in Cambridge and Associate Professor at the National Institute of Informatics in Tokyo. He has been active in many projects related to natural language processing for biomedical knowledge acquisition and data integration. He developed the BioCaster system for early alerting of infectious diseases from Web and social media data. The system has been widely used by international human and animal health agencies. In 2012 he was awarded an EC Marie Curie Fellowship to conduct research into the acquisition and linking of phenotypes in scientific and clinical texts.

Anika Oellrich

Anika Oellrich, Wellcome Trust Sanger Institute, UK. Anika conducted PhD studies in Bioinformatics at the University of Cambridge under supervision at the European Bioinformatics Institute, Rebholz group. She was then appointed as a Senior Bioinformatician in the Mouse Genome Informatics group at the Wellcome Trust Sanger Institute, Hinxton. Her research focuses on aspects of phenotype mining in large data sets as well as the scientific literature. Having investigated the different representations of phenotypes, she applies this knowledge to data integration and human genetic disorders with the aim of improving the understanding of the molecular mechanisms underlying human diseases.

Tudor Groza

Tudor Groza is a Research Fellow in the e-Research Group of the School of ITEE at The University of Queensland. He currently works with experts from various organizations on the research and development of a community-driven national registry for rare disorders. His research focuses on representing, mining and using phenotype and genotype data to infer meaningful associations that could lead to a better description and management of rare disorders. Tudor received his PhD in Computer Science from the Digital Enterprise Research Institute (DERI) Galway, National University of Ireland, Galway in 2010. In 2012 he was awarded an ARC Discovery Early Career Researcher Award to investigate novel ways of extracting, consolidating and linking scientific artefacts present in biomedical publications, with a focus on evidence-based medicine.

Karin Verspoor

Karin Verspoor, National ICT Australia (NICTA) and The University of Melbourne. Karin is the Scientific Director for Health and Life Sciences and the leader of the Biomedical Informatics team, which focuses primarily on text and data analytics for clinical text and information extraction from the biomedical literature. Her research addresses the development of knowledge-based methods to support biological discovery and clinical decision making, with recent work in protein function prediction and genetic variant interpretation, in addition to projects investigating the role of structured vocabularies for information retrieval in the clinical context. Karin has also been active in efforts to develop text annotation standards, both in terms of software architectures and data representations, to facilitate interoperability and reuse of tools and resources.

Nigam H. Shah

Dr. Nigam H. Shah is an Assistant Professor of Medicine (Biomedical Informatics) at the Stanford School of Medicine. Dr. Shah's research is focused on developing applications of bio-ontologies to annotate, index and analyze large unstructured datasets available in biomedicine. A key focus is to combine machine learning and text mining with prior knowledge encoded in medical ontologies to discover hidden trends from the unstructured portion of the medical record and enable data-driven medicine. Dr. Shah holds an MBBS from Baroda Medical College, India, and a PhD from Penn State University, USA, and completed post-doctoral training at the Stanford Medical School.

Program Committee

  • Kevin Cohen, University of Colorado, US
  • Hong-Jie Dai, Taipei Medical University, Taiwan
  • Georgios V. Gkoutos, Aberystwyth University, UK
  • Melissa Haendel, Oregon Health & Science University, US
  • Eva Huala, Carnegie Institution for Science, US
  • Hilmar Lapp, National Evolutionary Synthesis Center (NESCent), US
  • Jin-Dong Kim, Database Center for Life Science, Japan
  • Jung-Jae Kim, Nanyang Technological University, Singapore
  • Hiroaki Kitano, Okinawa Institute of Science and Technology Graduate University, Japan
  • Sebastian Koehler, Charité Medical University Berlin, Germany
  • Suzanna Lewis, Berkeley Lab, US
  • Chris Mungall, Berkeley Lab, US
  • Jong Park, KAIST, Korea
  • Peter N. Robinson, Charité Medical University Berlin, Germany
  • Paul N. Schofield, University of Cambridge, UK
  • Guergana Savova, Children's Hospital Boston, MA, US
  • Damian Smedley, European Bioinformatics Institute, UK
  • Andreas Zankl, University of Sydney, Australia