Abstracts


Immunoinformatics—the new kid in town

Vladimir Brusic and Nikolai Petrovsky

Laboratories for Information Technology, 21 Heng Mui Keng Terrace, Singapore 119613, Centre for Medical 
Informatics, Division of Science and Design, University of Canberra, Bruce ACT 2617 and National Health Sciences 
Centre, Canberra Clinical School, Woden ACT 2606, Australia


The astounding diversity of immune system components (e.g. immunoglobulins, lymphocyte receptors, or cytokines) 
together with the complexity of the regulatory pathways and network-type interactions makes immunology a 
combinatorial science. Currently available data represent only a tiny fraction of possible situations and data 
continues to accrue at an exponential rate. Computational analysis has therefore become an essential element of 
immunology research with a main role of immunoinformatics being the management and analysis of immunological 
data. More advanced analyses of the immune system using computational models typically involve conversion of an 
immunological question to a computational problem, followed by solving of the computational problem and 
translation of these results into biologically meaningful answers. Major immunoinformatics developments include 
immunological databases, sequence analysis, structure modelling, mathematical modelling of the immune system, 
simulation of laboratory experiments, statistical support for immunological experimentation and immunogenomics. 
In this paper we describe the status and challenges within these sub-fields. We foresee the emergence of 
immunomics not only as a collective endeavour by researchers to decipher the sequences of T cell receptors, 
immunoglobulins, and other immune receptors, but also to functionally annotate the capacity of the immune system 
to interact with the whole array of self and non-self entities, including genome-to-genome interactions.

©2003 The Novartis Foundation


The future for computational modelling and prediction systems in clinical immunology

Nikolai Petrovsky, Diego Silva and Vladimir Brusic

Centre for Medical Informatics, Division of Science and Design, University of Canberra, Bruce ACT 2617, 
Australia, Autoimmunity Research Unit, The Canberra Hospital, Woden ACT 2606, John Curtin School of Medical 
Research, Canberra ACT 2606, Australia and Laboratories for Information Technology, 21 Heng Mui Keng Terrace, 
Singapore 119613

Advances in computational science, despite their enormous potential, have been surprisingly slow to impact on 
clinical practice. This paper examines the potential of bioinformatics to advance clinical immunology across a 
number of key examples including the use of computational immunology to improve renal transplantation outcomes, 
identify novel genes involved in immunological disorders, decipher the relationship between antigen presentation 
pathways and human disease, and predict allergenicity. These examples demonstrate the enormous potential for 
immunoinformatics to advance clinical and experimental immunology. The acceptance of immunoinformatic techniques 
by clinical and research immunologists will need robust standards of data quality, system integrity and properly 
validated immunoinformatic systems. Such validation, at a minimum, will require appropriately designed clinical 
studies conducted according to Good Clinical Practice standards. This strategy will enable immunoinformatics to 
achieve its full potential to advance and shape clinical immunology into the future.




Immunoinformatics in personalized medicine

Kamalakar Gulukota

gvk bioSciences Private Limited, #210, 'My Home Tycoon', 6-3-1192, Begumpet, Hyderabad 500 016, India

Diagnosis of human disease has been undergoing steady improvement over the past few centuries. Many ailments 
that were once considered a single entity have been classified into finer categories on the basis of response to 
therapy (e.g. type I and type II diabetes), inheritance (e.g. familial and non-familial polyposis coli), 
histology (e.g. small cell and adenocarcinoma of lung) and most recently transcriptional profiling (e.g. 
leukaemia, lymphoma). The next dimension in this finer categorization appears to be the typing of the patient 
rather than the disease i.e. disease X in person of type Y. The problem of personalized medicine is to devise 
tests which predict the type of individual, especially where the type is correlated with response to therapy. 
Immunology has been at the forefront of personalized medicine for quite a while, even though the term is not 
often used in this connection. Blood grouping and cross-matching (for blood transfusion), and anaphylaxis test 
(for penicillin) are just two examples. In this paper I will argue that immunological tests have an important 
place in the future of personalized medicine. I will describe methods we developed for personalizing vaccines 
based on MHC allele frequencies in human populations and methods for predicting peptide binding to class I MHC 
molecules. In conclusion, I will argue that immunological tests, and consequently immunoinformatics, will play a 
big role in making personalized medicine a reality.
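
The idea of choosing vaccine components from MHC allele frequencies can be illustrated with a minimal sketch. It assumes a single locus in Hardy-Weinberg equilibrium and uses hypothetical allele frequencies; the function name and the numbers are illustrative assumptions, not the method or data of the paper.

```python
def population_coverage(allele_freqs, covered_alleles):
    """Fraction of individuals carrying at least one covered allele at one locus.

    Assumes Hardy-Weinberg equilibrium: each person draws two alleles
    independently, so the chance of carrying no covered allele is (1 - F)**2,
    where F is the summed frequency of the covered alleles.
    """
    total = sum(allele_freqs.values())
    if not 0.0 <= total <= 1.0 + 1e-9:
        raise ValueError("allele frequencies must sum to at most 1")
    # Use a set so that an allele listed twice is not counted twice.
    covered = sum(allele_freqs[a] for a in set(covered_alleles))
    return 1.0 - (1.0 - covered) ** 2


# Hypothetical frequencies for illustration only (not measured values).
example_freqs = {"A*02": 0.25, "A*01": 0.15, "A*03": 0.10, "A*24": 0.10}
coverage = population_coverage(example_freqs, ["A*02", "A*24"])
print(f"{coverage:.4f}")
```

Under these assumptions, a set of epitopes restricted by alleles with a combined frequency F reaches a fraction 1 - (1 - F)^2 of the population, which is why a few common alleles can cover a large share of individuals.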



From immunome to vaccine: epitope mapping and vaccine design tools

Anne S. De Groot and William Martin

TB/HIV Research Laboratory, Brown University, International Health Institute, Box GB473, Providence, RI 02912, 
and EpiVax Inc, 16 Bassett Street, Providence RI 02903, USA

Since the publication of the complete genome of a pathogenic bacterium in 1995, more than 50 bacterial pathogens 
have been sequenced and at least 120 additional projects are currently underway. Faced with the expanding volume 
of information now available from genome databases, vaccinologists are turning to epitope mapping tools to screen 
vaccine candidates. Bioinformatics tools such as EpiMatrix and Conservatrix, which search for unique or 
multi-HLA-restricted (promiscuous) T cell epitopes and can find epitopes that are conserved across variant 
strains of the same pathogen, have accelerated the process of epitope mapping. Additional tools for screening 
epitopes for similarity to ‘self’ (BlastiMer) and for assembling putative epitopes into strings if they overlap 
(EpiAssembler) have been developed at EpiVax. Tools that map proteasome cleavage sites are available on the 
Internet. When used together, these bioinformatics tools offer a significant advantage over traditional methods 
of vaccine design since high throughput screening and design is performed in silico, followed by confirmatory 
studies in vitro. These new tools are being used to develop novel vaccines and therapeutics for the prevention 
and treatment of infectious diseases such as HIV, hepatitis C, tuberculosis, and some cancers. More recent 
applications of the tools involve deriving novel vaccine candidates directly from whole genomes, an approach that 
has been named ‘genome to vaccine’. 
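
Matrix-based epitope mapping can be pictured as sliding a window along a protein and summing position-specific weights. The sketch below is a generic illustration under that assumption; the weights, the sequence and the function name are hypothetical, and it does not reproduce the scoring used by EpiMatrix or any other tool named above.

```python
def scan_peptides(protein, matrix, length=9):
    """Score every window of `length` residues with a position-specific matrix.

    `matrix` maps (position, residue) -> weight; missing entries score 0.
    Returns (start, peptide, score) tuples sorted by descending score.
    """
    hits = []
    for start in range(len(protein) - length + 1):
        peptide = protein[start:start + length]
        score = sum(matrix.get((i, aa), 0.0) for i, aa in enumerate(peptide))
        hits.append((start, peptide, score))
    return sorted(hits, key=lambda h: h[2], reverse=True)


# Hypothetical weights: favour L or M at position 2 and V or L at position 9
# (0-based indices 1 and 8). Real matrices cover all positions and residues.
toy_matrix = {(1, "L"): 2.0, (1, "M"): 1.5, (8, "V"): 2.0, (8, "L"): 1.5}
toy_protein = "MSLLTEVETYVLSIV"  # arbitrary illustrative sequence
best = scan_peptides(toy_protein, toy_matrix)[0]
print(best)
```

In a genome-scale screen the same loop would run over every predicted protein, with top-scoring windows passed on to confirmatory studies in vitro.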


Insights from MHC-bound peptides

Hanah Margalit and Yael Altuvia

Department of Molecular Genetics and Biotechnology, The Hebrew University Hadassah Medical School, Jerusalem 
91120, Israel


Cytotoxic T cells recognize short antigenic peptides, the processing products of protein antigens, when they are 
bound to major histocompatibility complex (MHC) class I molecules. Peptide binding to MHC molecules has been 
studied extensively in numerous laboratories, providing vast amounts of sequence and structure data that have 
been used as a rich source for bioinformatic research. MHC-bound peptides and their flanking sequences provide 
information about the sequence requirements of the different processing stages, in particular, the cleavage by 
the proteasome and the binding to MHC molecules. Elucidation of these sequence requirements sheds light on the 
evolutionary forces that have shaped and designed these peptides, and should lead to the development of an 
integrative predictive algorithm. Remarkably, the peptide sequence and structure data are also valuable for the 
study of biological questions that are apparently unrelated to cellular immunity, namely, sequence–structure 
relationship and genome annotation. Here we describe our computational analyses of MHC-bound peptides, applied 
to all these biological topics. 



Computational vaccinology: quantitative approaches

Darren R. Flower, Helen McSparron, Martin J Blythe, Christianna Zygouri, Deborah Taylor, Pingping Guan, Shouzhan 
Wan, Peter Coveney, Valerie Walshe, Persephone Borrow and Irini A. Doytchinova

Edward Jenner Institute for Vaccine Research, High Street, Compton, Berkshire, RG20 7NN and Centre for 
Computational Science, Department of Chemistry, Queen Mary, University of London, Mile End Road, London E1 4NS, 
UK

The immune system is hierarchical and has many tiers, exhibiting much emergent behaviour. However, at its heart 
are molecular recognition events that are indistinguishable from other types of biomacromolecular interaction. 
These can be addressed well by quantitative experimental and theoretical biophysical techniques, and particularly 
by methods from drug design. We review here our approach to computational immunovaccinology. In particular, we 
describe the JenPep database and two new techniques for T cell epitope prediction. One is based on quantitative 
structure–activity relationships (a 3D-QSAR method based on CoMSIA and another 2D method based on the Free–Wilson 
approach) and the other on atomistic molecular dynamic simulations using high performance computing. JenPep 
(http://www.jenner.ac.uk/JenPep) is a relational database system supporting quantitative data on peptide binding 
to major histocompatibility complexes, TAP transporters, TCR-pMHC complexes, and an annotated list of B cell and 
T cell epitopes. Our 2D-QSAR method factors the contribution to peptide binding from individual amino acids as 
well as 1–2 and 1–3 residue interactions. In the 3D-QSAR approach, the influence of five physicochemical 
properties (volume, electrostatic potential, hydrophobicity, hydrogen-bond donor and acceptor abilities) on 
peptide affinity was considered. Both methods are exemplified through their application to the well-studied 
problem of peptide binding to the human class I MHC molecule HLA-A*0201.
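
An additive model of the kind described for the 2D-QSAR method can be sketched as a constant plus per-residue terms plus 1-2 and 1-3 neighbour terms. The coefficients, the peptide and the function name below are hypothetical placeholders chosen for illustration; they are not fitted values from the paper.

```python
def free_wilson_score(peptide, const, single, pair):
    """Additive prediction: constant + per-residue terms + 1-2 and 1-3 pair terms.

    single: {(pos, aa): contribution}
    pair:   {(pos_i, aa_i, pos_j, aa_j): contribution}, with pos_j - pos_i in (1, 2)
    Positions are 0-based; missing terms contribute 0.
    """
    score = const
    for i, aa in enumerate(peptide):
        score += single.get((i, aa), 0.0)
    for gap in (1, 2):  # adjacent (1-2) and next-nearest (1-3) neighbours
        for i in range(len(peptide) - gap):
            j = i + gap
            score += pair.get((i, peptide[i], j, peptide[j]), 0.0)
    return score


# Hypothetical coefficients, not fitted values from the paper.
single = {(1, "L"): 0.5, (8, "V"): 0.4}
pair = {(0, "G", 1, "L"): 0.1, (6, "F", 8, "V"): -0.2}
predicted = free_wilson_score("GLAAAAFAV", 6.0, single, pair)
print(round(predicted, 3))
```

In practice the coefficients would be estimated by regression against measured binding affinities; only the prediction step is shown here.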



IMGT, the international ImMunoGeneTics database®, http://imgt.cines.fr

Marie-Paule Lefranc

Université Montpellier II, Laboratoire d'ImmunoGénétique Moléculaire, LIGM, UPR CNRS 1142, Institut de Génétique 
Humaine, Montpellier, France

IMGT, the international ImMunoGeneTics database® (http://imgt.cines.fr), is a high quality integrated information 
system specializing in immunoglobulins (Ig), T cell receptors (TCR) and major histocompatibility complexes (MHC) 
of human and other vertebrates, created in 1989 by LIGM at the Université Montpellier II, CNRS, Montpellier, 
France. IMGT provides a common access to standardized data which include nucleotide and protein sequences, 
oligonucleotide primers, gene maps, genetic polymorphisms, specificities, and 2D and 3D structures. IMGT includes 
four databases (IMGT/LIGM-DB, IMGT/3Dstructure-DB, IMGT/HLA-DB, IMGT/PRIMER-DB), Web resources (‘IMGT Marie-Paule 
page’) and interactive tools (IMGT/V-QUEST, IMGT/JunctionAnalysis, IMGT/PhyloGene, IMGT/LocusView, IMGT/Geneview, 
IMGT/GeneSearch). IMGT data are expertly annotated according to the rules of the IMGT scientific chart based on 
IMGT-ONTOLOGY. IMGT tools are particularly useful for the analysis of the Ig and TCR repertoires in normal 
physiological and pathological situations. IMGT has important applications in medical research (autoimmune diseases, 
AIDS, leukaemias, lymphomas, myelomas), biotechnology related to antibody engineering (phage displays, 
combinatorial libraries) and therapeutic approaches (graft, immunotherapy). IMGT is freely available at 
http://imgt.cines.fr.



Generating data for databases—the peptide repertoire of HLA molecules

Stefan Stevanović, Claudia Lemmel, Maik Häntschel and Ute Eberle

Eberhard-Karls-Universität Tübingen, Institut für Zellbiologie, Abteilung Immunologie, Auf der Morgenstelle 15, 
D-72076 Tübingen, Germany

During the past few years, a huge amount of information about HLA-presented peptides has been compiled: several 
thousand naturally processed ligands of such cell surface receptors are already known. Nevertheless, our 
knowledge covers only a minute proportion of the total peptide repertoire. The total number of different 
peptides presented by a given HLA class I molecule lies between 1000 and 10000 individual sequences per 
cell. There is, however, no HLA molecule for which more than 100 ligands have been published so far. The situation 
is further complicated by the fact that different cells present different sets of peptides by the same HLA 
molecules, a feature that provides great hope for immunotherapy. We have been analysing HLA-presented peptides 
for many years for three reasons. First, the basic rules of peptide presentation (the ‘peptide motifs’) had to be 
established. Second, the listing of individual peptides presented by HLA molecules is steadily continuing, 
although a comprehensive catalogue of all possible HLA-presented peptides remains out of reach at present. Third, 
quantitative differences in the presentation of individual HLA ligands provide information about the dynamic 
state of the host cells. Comprehensive information about HLA-presented peptides enables accurate epitope 
prediction and provides a basis for diagnostic assessment and therapeutic intervention.


HLA nomenclature and the IMGT/HLA Sequence Database

Steven G. E. Marsh

Anthony Nolan Research Institute and Department of Haematology, Royal Free & University College Medical School, 
Hampstead, London NW3 2QG, UK

Early in their study it was recognized that the genes encoding the HLA molecules were highly polymorphic and that 
there was a need for a systematic nomenclature. The result was the WHO Nomenclature Committee for Factors of the 
HLA System, which first met in 1968, and laid down the criteria for successive meetings. This committee meets 
regularly to discuss issues of nomenclature and has published 16 major reports documenting firstly the HLA 
antigens and more recently the genes and alleles. The standardization of HLA antigenic specificities has been 
controlled by the exchange of typing reagents and cells in the International Histocompatibility Workshops. Since 
1989 when a large number of HLA allele sequences were first analysed and named, the job of curating and 
maintaining a database of sequences has been of prime importance. In 1998 the IMGT/HLA database became the 
official repository for HLA sequences. In addition to the nucleotide and protein sequences the database contains 
information on the cell from which each sequence was obtained. The database, which provides tools for sequence 
analysis and the submission of new data, is updated quarterly and now contains over 1500 HLA allele sequences.



From immunogenetics to immunomics: functional prospecting of genes and transcripts

Christian Schönbach

Biomedical Knowledge Discovery Team, Bioinformatics Group, RIKEN Genomic Sciences Center (GSC), 1-7-22 
Suehiro-cho, Tsurumi, Yokohama 230-0045, Japan

Human and mouse genome and transcriptome projects have expanded the field of ‘immunogenetics’ beyond the 
traditional study of the genetics and evolution of MHC, TCR and Ig loci into the new interdisciplinary area of 
‘immunomics’. Immunomics is the study of the molecular functions associated with all immune-related coding and 
non-coding mRNA transcripts. To unravel the function, regulation and diversity of the immunome requires that we 
identify and correctly categorize all immune-related transcripts. The importance of intercalated genes, antisense 
transcripts and non-coding RNAs and their potential role in regulation of immune development and function are 
only just starting to be appreciated. To better understand immune function and regulation, transcriptome projects 
(e.g. Functional Annotation of the Mouse, FANTOM) that focus on sequencing full-length transcripts from multiple 
tissue sources ideally should include specific immune cells (e.g. T cells, B cells, macrophages, dendritic cells) 
at various states of development, in activated and unactivated states and in different disease contexts. Progress 
in deciphering immune regulatory networks will require the cooperative efforts of immunologists, 
immunogeneticists, molecular biologists and bioinformaticians. Although primary sequence analysis remains useful 
for annotation of new transcripts it is less useful for identifying novel functions of known transcripts in a new 
context (protein interaction network or pathway). The most efficient approach to mine useful information from the 
vast a priori knowledge contained in biological databases and the scientific literature is to use a combination 
of computational and expert-driven knowledge discovery strategies. This paper will illustrate the challenges 
posed in attempts to functionally infer transcriptional regulation and interaction of immune-related genes from 
text and sequence-based data sources. 


Mathematical models of HIV and the immune system

Dominik Wodarz

Fred Hutchinson Cancer Research Center, 1100 Fairview Avenue North, MP-665, Seattle, WA 98109-1024, USA

I describe how mathematical models have been used to elucidate the principles which govern HIV and immune system 
dynamics in relation to antiviral drug therapy. The review starts by introducing a basic model of virus infection 
and demonstrates how it was used to study HIV dynamics and to measure crucial parameters which led to a new 
understanding of the disease process. Since this analysis indicates that eradication of the virus is not feasible 
during the lifetime of the patient, I go on to discuss mathematical models with the aim of exploring how drug 
therapy can be used to induce long-term immunological control of the infection.
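
The abstract does not state the equations of the basic model. A minimal sketch, assuming the widely used three-compartment formulation (uninfected cells x, infected cells y, free virus v) with hypothetical parameter values, is given here to show what such a model looks like; the function names and numbers are illustrative assumptions.

```python
def simulate_basic_model(lam, d, beta, a, k, u, x0, y0, v0, t_end, dt=0.01):
    """Forward-Euler integration of the three-compartment infection model.

    x: uninfected cells, y: infected cells, v: free virus.
    dx/dt = lam - d*x - beta*x*v
    dy/dt = beta*x*v - a*y
    dv/dt = k*y - u*v
    """
    x, y, v = x0, y0, v0
    for _ in range(int(round(t_end / dt))):
        infection = beta * x * v
        dx = lam - d * x - infection
        dy = infection - a * y
        dv = k * y - u * v
        x, y, v = x + dt * dx, y + dt * dy, v + dt * dv
    return x, y, v


def basic_reproductive_ratio(lam, d, beta, a, k, u):
    """R0 = beta*lam*k / (d*a*u); infection can establish itself only if R0 > 1."""
    return beta * lam * k / (d * a * u)


# Hypothetical parameter values chosen only to illustrate the behaviour.
params = dict(lam=10.0, d=0.1, beta=0.01, a=0.5, k=10.0, u=3.0)
r0 = basic_reproductive_ratio(**params)
x, y, v = simulate_basic_model(x0=100.0, y0=0.0, v0=1.0, t_end=200.0, **params)
print(f"R0={r0:.2f} x={x:.2f} y={y:.2f} v={v:.2f}")
```

In this formulation, drug therapy is commonly represented by reducing beta or k, which lowers R0; the long-run state then depends on whether R0 stays above 1.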



Immunogenomics: towards a digital immune system

Stephan Beck


Wellcome Trust Sanger Institute, Hinxton Genome Campus, Cambridge CB10 1SA, UK

One of the major differences that set apart vertebrates from non-vertebrates is the presence of a complex immune 
system. Over the past 400–500 million years, many novel immune genes and gene families have emerged and their 
products form sophisticated pathways providing protection against most pathogens. The Human Genome Project has 
laid the foundation to study these genes and pathways in unprecedented detail. Members of the immunoglobulin (Ig) 
superfamily alone were found to make up over 2% of human genes possibly constituting the largest gene family in 
the human genome. A subgroup of these human immune genes, those (among others) involved in antigen processing and 
presentation, are encoded in a single region, the major histocompatibility complex (MHC) on the short arm of 
chromosome 6. My laboratory has a long-standing interest in understanding the molecular organization and 
evolution of the MHC. To this end, we have been generating a range of MHC genomic resources that we make 
available in the form of maps and databases. Much of the complex data of the immune system can be reduced to binary 
(on/off) information that can easily be made available and analysed by bioinformatics approaches, thus 
contributing to a better understanding of immune function via a 'digital immune system'.



Viral bioinformatics: computational views of host and pathogen

Paul Kellam, Ria Holzerlandt, Eva Gramoustianou*, Richard Jenner and Antonia Kwan

Viral Genomics and Bioinformatics Group, Wohl Virion Centre, Department of Immunology and Molecular Pathology and 
Department of Virology, University College, Windeyer Institute of Medical Sciences, 46 Cleveland Street, London 
W1T 4JF, UK

Wherever cellular life occurs, viruses are also found. As a result, complex organism and cellular antiviral 
responses co-evolve with virally encoded countermeasures. Since viruses co-opt or interfere with specific 
cellular pathways during their replication, knowledge of viral genome sequences has helped fundamental 
understanding of host biology. During viral infection, shifts in the balance between host and viral biological 
processes result in acute or chronic viral disease pathology accompanied by either active viral replication, 
viral containment/persistence or viral clearance. Studying host–virus interactions at the level of single-gene 
effects, however, fails to produce a global systems-level understanding. This should now be achievable in the 
context of complete host and pathogen genome sequences. New experimental methods and computational approaches are 
rapidly developing, allowing global views of dynamic viral and cellular molecular mechanisms. Systems level 
virology using DNA microarrays and specific viral data resources will reveal the detailed cellular context in 
which viruses replicate, highlighting common and distinct antiviral mechanisms, the effect of different host cell 
gene expression programs and the response of cells to similar or diverse virus types. Ultimately, microbiology 
and immunology will tend towards a systems-level view of how host and pathogen interact.
