Table of Contents
General Structure of the nucleic acids
Molecular architecture and composition
Genotypes and Subtypes
Transcription and translation
Viruses have been killing people for thousands of years, and today they are a field of active research. This essay considers the hepatitis viruses. Because analyzing each of the six hepatitis viruses in depth would be too large a topic, this paper deals only with HBV and HCV. Even though the disease they cause, its severity and its symptoms are remarkably similar, these two viruses have completely different molecular biology.
After a brief survey of all known hepatitis viruses, HBV and HCV are dealt with in detail. For a better comparison, the two viruses are discussed in parallel, according to important aspects of their biology. They are first classified, and the diameter and density of their virions are described. The genome, as well as the genotypes and subtypes of each virus, is then discussed, including the genes and enzymes specific to each virus. Finally, HBV and HCV are compared according to their replication, which includes an in-depth description of the processes occurring during transcription and translation. Aspects such as open reading frames are also dealt with.
It is noted that the fact that HBV’s genetic material is DNA, while HCV’s is RNA, makes an important difference. Since the genetic material is the main constituent of a virus, the kind of nucleic acid a virus possesses affects all of its characteristics. Based on the observations made under the above-mentioned categories, it is concluded that the two viruses are highly different.
The Hepatitis B and C viruses and how they differ from each other
Commonly known as the agents of disease, viruses are the boundary between the living and the non-living. Living, because they exist, reproduce and make us sick; non-living, because they need a host cell to do so. Viruses are fragments of nucleic acid (either DNA or RNA) surrounded by a protein coat. However, because they are incomplete and lack the machinery required to reproduce, viruses must first introduce their genetic material into a host cell, thus taking over the cell and turning it into a virus factory. There are numerous kinds of viruses, most of which we don’t even know of yet. One of the main preoccupations of medicine is to categorize the diseases caused by viruses and to find cures for them. One of these diseases, which has been affecting people for centuries, is hepatitis, which means inflammation of the liver (Harrison, p. 7) and can be caused by a number of viruses. About twenty years ago, there were only two known hepatitis viruses. They were denoted by the letters A and B, in chronological order of their discovery. If the liver disease wasn’t caused by the hepatitis A virus (HAV) or by the hepatitis B virus (HBV), it was simply called non-A, non-B. Since then, a world of progress has been made. There are six known hepatitis viruses today: HAV, HBV, HCV, HDV, HEV and HGV (the letter F was indeed skipped when naming the viruses). Even though they are distinguished only by the letters in their names and all affect the liver, the viruses are different and independent of each other. Because looking into the molecular biology of all hepatitis viruses would be too large a project, this paper deals only with HBV and HCV. These viruses have also been picked because a large number of hepatitis patients are infected with one of them. Even though the disease they cause, its severity and its symptoms are remarkably similar, these two viruses have completely different molecular biology.
As mentioned above, the other four hepatitis viruses won’t be discussed in detail, but their main characteristics will be mentioned. A brief statement about these traits will give the discussion of HBV and HCV a larger context.
HAV is a member of the Picornaviridae family. It is a small, single-stranded RNA virus with a non-enveloped virion 27 nanometers (nm) in diameter. The genome is of positive polarity (Monjardino, p. 5).
Another RNA virus is HEV, belonging to the Caliciviridae family. Its genome is also of positive polarity, but it is larger than HAV by about 5 nm (Zuckerman, p. 76). Even so, HAV and HEV are more similar than any other two hepatitis viruses.
A third RNA virus is HDV, which has not been classified yet. It is known that it has a diameter of about 36 nm and that its genome is of negative polarity. Its most important feature, however, is that it isn’t an independent risk factor, since it can be transmitted only in association with HBV (Zuckerman, p. 82).
HGV is a new member of the hepatitis viruses, having been discovered only recently. It was found to be an RNA virus belonging to the Flaviviridae family (Lee, p. 849).
In the past, the unavailability of a cell culture system for propagating hepatitis viruses remained the main impediment to the study of the natural history of molecular events during infection. It was only with the development of genetic engineering techniques, such as the polymerase chain reaction (PCR) method in the early eighties, that a spectacular change occurred in this area. As techniques for cloning and amplification of DNA became available, the virus genomes were sequenced and genes could be identified (Monjardino, p. 3). That is why so much progress has been made in a fraction of the time that these viruses have been killing people or scarring them for life, emotionally if not in any other way.
The two remaining viruses, HBV and HCV, will be discussed in greater detail. For a better comparison, different categories will be considered, and under those headings the viruses will be discussed in parallel.
Because of the lack of genetic homology with the other known viruses, HBV became the prototype of a new family, the Hepadnaviridae, which was soon to acquire other members (Lee, p. 855). Just like HBV, HCV is a new genus within its family. Considering its overall genomic organization (in terms of gene ordering within the genome and gene size), it was classified as a new genus of the Flaviviridae (ibid).
Consequently, the two viruses are not similar. They belong to different families, and the fact that each was the first member of its genus is their only similarity up to this point.
The amount of information available about the virions of HBV and HCV is very different. This is due to the small amount of HCV available from infected serum and to the present inability to grow the virus in cultured cells (Monjardino, p. 86). Even though more is known about HBV, it is possible to sketch a comparison between this virus and HCV.
HBV is spherical and between 42 and 47 nm in diameter (Monjardino, p. 31), while the diameter of HCV ranges between 30 and 75 nm (Monjardino, p. 86). Because of the wide range reported for HCV, some HCV particles could have the same size as HBV particles. HBV consists of a core 25-27 nm in diameter, surrounded by a lipoprotein envelope. In addition, small spherical and tubular particles can be found. They are about 20 nm in diameter and are invariably present, being made up exclusively of envelope material. This envelope material contains the surface antigen (called HBsAg); these particles are not infectious (Monjardino, p. 34). The molecular architecture of HCV has yet to be elucidated.
The density of HBV virions varies between 1.19 and 1.24 g/ml in cesium chloride (CsCl). Virions thus have a higher density than the other particles and can therefore be purified (ibid). The reported particle density for HCV varies between 1.10 and 1.18 g/cm3, although different studies have yielded different results (Monjardino, p. 88).
The HBV genome is a circular, partly double-stranded molecule of DNA (Nishioka, p. 68). In contrast, the HCV genome consists of a single strand of RNA (ibid). This is a crucial difference between the two viruses because of the different properties of the two nucleic acids.
General structure of the nucleic acids
In DNA, the sugar is deoxyribose and the base adenine (A) pairs with thymine (T). The sugar in RNA is ribose and adenine pairs with uracil (U). The most significant difference, however, is that DNA is typically a double helix consisting of two strands, while RNA typically has just one strand. This feature also determines the way each virus replicates. Recalling that a virus’s only function is to replicate itself, we could say that the difference in nucleic acids in turn makes a difference in most of the characteristics of HBV and HCV.
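The pairing rules just described can be illustrated with a minimal sketch. The sequences below are made up for illustration and are not taken from either virus; the function name is likewise only an assumption of this example.

```python
# Base-pairing rules as stated in the text: A-T and C-G in DNA, A-U and C-G in RNA.
DNA_PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}
RNA_PAIRS = {"A": "U", "U": "A", "C": "G", "G": "C"}

def complement(strand: str, pairs: dict) -> str:
    """Return the base-by-base complement of a strand (no reversal)."""
    return "".join(pairs[base] for base in strand)

# Hypothetical toy sequences, for illustration only.
dna_template = "ATGCCGTA"
rna_template = "AUGCCGUA"

print(complement(dna_template, DNA_PAIRS))
print(complement(rna_template, RNA_PAIRS))
```

Applying the same function twice returns the original strand, which is the property that lets one strand serve as a template for the other.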
Molecular architecture and composition
Since HBV has two strands, it is in a way more complex. It has one complete strand (the so-called minus strand), which is about 3200 nucleotides long, and a complementary strand (also called the positive strand), which has one interruption. The positive strand varies in length: it has a fixed 5’ end but variable 3’ ends (or “termini”). That is why the positive strand makes up only about 50%-70% of the full size of the genome. A base-paired region (C-G and A-T) of about 200 base pairs between the 5’ termini of the two strands ensures the circular configuration of the HBV genome (Harrison, p. 88).
The HCV RNA is about 9600 nucleotides long and of positive polarity. It contains a long open reading frame (ORF) encoding 3011 amino acids, which spans almost the whole length of the genome (Monjardino, p. 91).
The HCV ORF encodes 10 genes, of which three are structural: core (C), envelope 1 (E1) and envelope 2 (E2); and seven are non-structural: NS2, NS3, NS4A, NS4B, NS5A and NS5B, plus an additional sequence, p7, which has been mapped between E2 and NS2 and which, in contrast to all the other genes, appears not to be cotranslationally processed (Reesink, p. 39).
The core gene has been mapped to the 5’ end of the ORF. The suggested boundaries of the gene are nucleotides 342-914. The predicted size of the core protein, which is positively charged, is 22.5 kD (Monjardino, p. 92).
The E1 gene of HCV has been mapped between amino acids 192 and 383. The predicted molecular weight is around 21 kD, rising to as much as 35 kD through modification of 4 of the 5 glycosylation sites (Monjardino, p. 94). The E2 gene maps between amino acids 383 and 729. The molecular weight of the gene product is 70 kD. Three E2-containing polypeptides that differ at the carboxy terminus have been identified, which reflects the complex processing of this region. The three forms include E2 itself and two E2 precursors, E2-NS2 and E2-p7, which are processed with variable efficiency according to genotype (Monjardino, p. 92).
The p7 sequence is located between E2 and NS2 and maps between amino acids 729 and 809. It is thought to be processed at both its amino and carboxyl termini by host signalases, and cleavage sites have been identified. The function of p7 is as yet unknown (Reesink, p. 43).
NS2 lies between residues 810 and 1027. The processing of the three structural proteins is thought to be carried out by host signalases, since it proceeds successfully in the absence of the non-structural genes, whereas the cleavage at the NS2/NS3 boundary is thought to be carried out by a virus-encoded protease (ibid). The NS3 gene is mapped between amino acids 1027 and 1657. The expressed protein has a molecular weight of 70 kD (ibid).
NS4A maps between amino acids 1657 and 1711 and has a molecular weight of about 6 kD. The polypeptide has been found to act as a co-factor of NS3 (Reesink, p. 44).
NS4B corresponds to amino acids 1712-1972 and the expressed protein has a molecular weight of about 27 kD. Processing is carried out by the NS3 protease. The function of NS4B is unknown (ibid).
NS5A maps between amino acids 1973 and 2419 and the expressed protein has a molecular weight of 56 kD. NS5A is processed efficiently by NS3 protease, possibly without a requirement for NS4A (Reesink, p. 46).
Lastly, NS5B maps between amino acids 2420 and 3011 and the expressed protein has a molecular weight of 65 kD (Reesink, p. 49).
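The map described in the preceding paragraphs can be collected into one small table and checked for consistency. The sketch below is only an illustration under stated assumptions: the core protein's amino acid range is derived here from the quoted nucleotide boundaries (342-914), the end of E2 is taken to be residue 729, where p7 begins, and the product mapped to residues 1657-1711 is labeled NS4A. The variable and function names are hypothetical.

```python
def codons(first_nt: int, last_nt: int) -> int:
    """Number of whole codons in an inclusive nucleotide range."""
    return (last_nt - first_nt + 1) // 3

# Core gene boundaries in nucleotides, as quoted in the text.
core_length = codons(342, 914)

# (name, first residue, last residue), 1-based and inclusive, as quoted.
# Some quoted boundaries share a residue with their neighbour (e.g. 383),
# so the lengths printed below are approximate.
PRODUCTS = [
    ("C", 1, core_length),     # assumption: derived from nucleotides 342-914
    ("E1", 192, 383),
    ("E2", 383, 729),          # assumption: end taken as the start of p7
    ("p7", 729, 809),
    ("NS2", 810, 1027),
    ("NS3", 1027, 1657),
    ("NS4A", 1657, 1711),
    ("NS4B", 1712, 1972),
    ("NS5A", 1973, 2419),
    ("NS5B", 2420, 3011),
]

for name, start, end in PRODUCTS:
    print(f"{name:5s} {start:5d}-{end:5d}  about {end - start + 1} aa")
```

The 573 nucleotides of the core gene give 191 codons, which agrees with E1 beginning at amino acid 192, and the last product ends at residue 3011, the quoted length of the polyprotein.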
The HBV genome has four major ORFs. Because these ORFs overlap, together they account for about one and a half times the information content of the genome, all encoded on the same strand in overlapping reading frames. Using the same stretch of sequence more than once in this way allows little genetic variation in individual genes (Koshy, p. 95).
HBV is made up of an envelope and a core. The envelope consists of three major proteins or surface antigens known as preS1, preS2 and s. PreS1 initiates at the ATG codon at position 2848 and is 409 nucleotides long (Nishioka, p. 73). PreS2 initiates at the ATG codon at position 3172 and is 281 amino acids long. Finally, the s protein initiates at position 155 and is 226 amino acids long. A similar alternative use of initiation codons is seen in the core ORF, where the core protein initiates at the ATG at position 1901 and terminates at position 2450. Another reading frame, P, extends over more than three quarters of the genome, from the ATG codon at position 2357 to the termination codon at position 1621. It is 832 amino acids long. The remaining reading frame, X, initiates at position 1374 and terminates at position 1836, encoding a protein 154 amino acids long (Nishioka, p. 76).
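Because the genome is circular, a reading frame such as P can start at a higher position than the one at which it ends. The following sketch shows the wrap-around arithmetic, assuming a genome length of exactly 3200 nucleotides (the text gives only an approximate figure) and treating the quoted positions as inclusive; the function name is hypothetical.

```python
GENOME_LENGTH = 3200  # assumption: the approximate length quoted in the text

def circular_span(start: int, end: int, genome_length: int = GENOME_LENGTH) -> int:
    """Nucleotides covered from start to end (inclusive) on a circular map,
    wrapping past the origin when end is smaller than start."""
    return (end - start) % genome_length + 1

# Start and end positions as quoted in the text.
ORFS = {
    "core": (1901, 2450),
    "P": (2357, 1621),   # wraps around the origin of the circular map
    "X": (1374, 1836),
}

for name, (start, end) in ORFS.items():
    span = circular_span(start, end)
    print(f"{name}: {span} nt, about {span // 3} codons, "
          f"{span / GENOME_LENGTH:.0%} of the genome")
```

Under these assumptions P covers a little over three quarters of the genome and X comes out at 154 codons, in line with the figures quoted above. The codon count for P is somewhat lower than the quoted 832 amino acids, which is to be expected when the genome length is only a round estimate.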
Genotypes and Subtypes
Yet another difference created by the difference in nucleic acids is HCV’s heterogeneity. Because DNA is a double helix, its replication is semi-conservative. After the two strands unwind, each of the old strands serves as a template for the production of a new one. Since A always pairs with T and C with G, two identical double helices are obtained, each containing one of the original strands. For this reason, the genetic information stays essentially the same however many times the DNA is replicated. RNA replication, on the other hand, is very error-prone: there is only a single strand, and the copying enzyme has no proofreading activity. This absence of a checking mechanism leads to the formation of various, slightly different genomes. Even in a single infected individual, the HCV genome exists as a quasispecies distribution of closely related but nevertheless heterogeneous genomes (Spallanzani, p. 9). The reported HCV sequence variants have been organized into six main genotypes, each comprising different subtypes. The genotypes differ from each other by approximately 30% over the whole genome. Different subtypes are dominant in different parts of the globe. The most important subtypes are those of Group 1, 1a and 1b. Sequence diversity varies throughout the genome, with the core and the non-structural genes (especially NS4B) being the most conserved, and E1 and E2 the most variable (Monjardino, p. 93).
HBV can be classified according to the antibodies against HBsAg serotypes. Six different genotypes have been identified, designated A-F, differing by 8% or more. The differences lie in the protein sequence, which includes genotype-specific amino acids. The serotypes share a common determinant, a, and carry one member of each of two pairs of determinants, d or y and w or r, which gives the subtypes adw, adr, ayw and ayr. These have been further subdivided into nine subtypes: ayw1, ayw2, ayw3, ayw4, ayr, adw2, adw4, adrq+ and adrq- (Harrison, p. 83). As in the case of HCV, different subtypes are dominant in different parts of the globe (Feitelson, p. 88).
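A figure such as "approximately 30% over the whole genome" is a pairwise sequence difference. As a minimal sketch of how such a percentage is computed, assuming two sequences that are already aligned and of equal length, one can count mismatched positions. The two short sequences below are invented for illustration and are not real HCV isolates.

```python
def percent_difference(seq_a: str, seq_b: str) -> float:
    """Percentage of positions at which two aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    mismatches = sum(a != b for a, b in zip(seq_a, seq_b))
    return 100.0 * mismatches / len(seq_a)

# Hypothetical 20-nucleotide fragments differing at two positions.
isolate_1 = "AUGAGCACGAAUCCUAAACC"
isolate_2 = "AUGAGCACAAAUCCUAAGCC"

print(f"{percent_difference(isolate_1, isolate_2):.1f}% different")
```

Real comparisons run over whole genomes and require an alignment step first, which this sketch leaves out.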
The replication mechanism of HBV seems to be unique to hepadnaviruses. After infection and uncoating, the genome migrates to the nucleus, where the partially double-stranded molecule is first filled in, probably by the DNA polymerase present inside the virion. Then the genome is ligated into a supercoiled molecule by host ligase. This molecule (also called CCC, for covalently closed circular molecule) then becomes the template for transcription of the minus strand (McLachlan, p. 17).
HCV replication involves the synthesis of an RNA strand of negative polarity, which generates a duplex replicative form. This double-stranded RNA in turn becomes a template for the synthesis of plus strands (Monjardino, p. 95).
Thanks to PCR amplification techniques it is now possible to detect the minus strand. This has shown that HCV replication occurs in the liver and not in peripheral lymphocytes, a popular misconception in the past that was recorded in numerous studies (Monjardino, p. 87).
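The plus-to-minus-to-plus cycle can be sketched with a toy template-copying function. This is an idealized, error-free illustration: the fragment is invented, and the function name is an assumption of this example rather than a description of the viral enzyme.

```python
# RNA pairing table: A-U and C-G.
RNA_PAIRS = str.maketrans("AUCG", "UAGC")

def copy_strand(template: str) -> str:
    """Return the antiparallel complementary RNA strand, written 5' to 3'."""
    return template.translate(RNA_PAIRS)[::-1]

plus_strand = "AUGAGCACGAAUCC"             # hypothetical genomic fragment
minus_strand = copy_strand(plus_strand)     # negative-polarity intermediate
progeny_plus = copy_strand(minus_strand)    # new plus strand copied from it

print(minus_strand)
print(progeny_plus == plus_strand)
```

Copying the minus strand regenerates the original plus-strand sequence, which is why the negative-polarity intermediate can serve as the template for new genomes.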
Transcription and Translation
This is another part where HBV and HCV are completely different solely because of their nucleic acids. For example, the DNA double helix needs to be unwound by the enzyme helicase, before another enzyme, polymerase, can start reading the template. Obviously, this procedure isn’t necessary in the case of RNA.
Generally, during transcription a complementary mRNA strand is formed, and translation is the production of the desired polypeptide at the ribosome.
The enzyme that transcribes HBV DNA is presumed to be RNA polymerase II (Feitelson, p. 109). The most abundant mRNA species produced is about 2.1 kb long and is the template for the synthesis of either the pre-S2 protein or antigen s. Heterogeneity at the 5’ end accounts for different transcripts, which become templates for one or the other form of the antigen (Harrison, p. 99).
Transcripts of about 3.4 kb serve as mRNAs for both the core and the multi-functional polymerase protein. They are also used as the pregenomic RNA template for reverse transcription during replication, and as mRNA for precore proteins, which are ultimately processed in the ER into e antigens. The 5’ termini of these transcripts are heterogeneous, and a stretch of 15 nucleotides was reported to be sufficient for their correct initiation at nucleotides 1794 and 1823 respectively. However, when transcribing the 3.4 kb RNA, the polymerase ignores the termination site during the first passage but recognizes it during its second passage, thus generating a transcript which is longer than the full-length genome (3.4 rather than 3.2 kb). The molecular basis for this termination ‘leakage’ is not established, but it is thought to involve the obligatory co-operation of other DNA sequences upstream of the promoter in order to generate an effective termination signal. Initiation of transcription, with binding of the polymerase to the promoter, may also interfere with the formation of such a termination complex, which therefore only forms during the second passage. From this single transcript, both core and polymerase are synthesized (Harrison, p. 101).
The third transcript, 2.6 kb long, is the template for the L surface antigen. It is a transcript of very low abundance in naturally infected tissues, transcribed weakly after transfection into differentiated hepatoma cell lines and almost absent in non-hepatic cell lines. At the 3’ end the 2.6 kb transcript is co-terminal with both the 2.1 and the 3.4 kb transcripts (Harrison, p. 105).
Transcription of the X gene is less well understood. Although the gene is copied in all three other transcripts (2.1, 2.6 and 3.4 kb), translation from an internal promoter is normally expected to be very inefficient. It is currently assumed that translation of the X protein occurs (ibid).
In the case of HCV, a large ORF extends throughout most of the RNA genomic sequence and could encode a polypeptide between 3010 and 3033 amino acids long. This ORF encodes a polyprotein precursor that is processed co- and post-translationally to yield a variety of structural and non-structural proteins (Zuckerman, p. 230).
The non-structural (NS) protein region of HCV contains short amino acid sequences that are conserved among protease, helicase and replicase enzymes and are collinear in all three types of viral-encoded polyproteins. Experimental data on the characterization and function of the HCV non-structural proteins are not yet available. However, the hydropathicity profiles of all three viral non-structural regions are similar, even though distinctive protein domains can be discerned within the HCV polyprotein (ibid).
The viral protease mediates many of the cleavages that release the individual NS proteins from the precursor. The helicases are capable of unwinding RNA templates and can be operational during replication, translation and splicing of RNA. It is probable that HCV replicates in the cytoplasm, not in the nucleus, of the host cell. Since the splicing enzymes are located in the nucleus, it is probable that the HCV helicase is involved in aspects of RNA replication and/or translation. One of the HCV domains, NS5, has primary sequence motifs conserved among all viral RNA-dependent polymerases. Therefore it is assumed that this replicase activity represents at least one of its functions. However, this domain is very large (approximately 1000 amino acids) and it is likely that it has multiple functions (Harrison, p. 235).
It seems that the structural proteins are processed from the N-terminal region of the HCV polyprotein precursor, beginning with the RNA-binding nucleocapsid polypeptide, followed by the E1 and E2/NS1 glycoproteins. The processing of the structural region of the HCV polyprotein is partly mediated by the enzyme signalase, also known as the host signal peptidase. Experiments showed that internal signal sequences upstream of the two glycoproteins direct the precursor polyprotein to the ER, where translocation and signal cleavage take place (Harrison, p. 241).
Based on analogies with pesti- and flaviviruses and on hydrophilicity/hydrophobicity predictions of gene products, the HCV structural genes could be mapped to the 5’ end of the ORF, whereas the non-structural genes follow downstream (from E2) to the 3’ end of the ORF (Monjardino, p. 105).
The long HCV open reading frame (ORF) is flanked by two non-coding regions. One of them is located at the 5’ end and is 341 nucleotides long. It is thought to contain essential regulatory sequences and is well conserved amongst isolates. The other non-coding region is located at the 3’ end and appears to be more variable both in length and in primary structure (Zuckerman, p. 230).
In conclusion, it is evident from all of the categories discussed above that the hepatitis B and C viruses are highly dissimilar. Even though they both cause a disease of the same organ, the liver, their molecular biology is very different. The basis of this difference is the fact that HBV’s genetic material is DNA, while HCV possesses RNA. The genetic material is the most important component of any organism. In viruses it plays an even more important role, since it is nearly all they consist of (besides the envelope). Therefore, the form of nucleic acid each virus possesses influences all other aspects of that virus’s life.
Even though many discoveries have been made and the amount of information added to what we already know is growing every day, there are still many aspects of HBV and HCV that have not yet been elucidated. We can only hope that, if the current pace is kept up, we will one day find a cure for these viruses.
Lee, Richard G. and others. Wintrobe’s Clinical Hematology: Volume 1. Baltimore: Williams & Wilkins, 1999.
Monjardino, J. Molecular Biology of Human Hepatitis Viruses. London: Imperial College Press, 1998.
Harrison, Tim J. and Zuckerman, Arie J. The Molecular Medicine of Viral Hepatitis. London: John Wiley & Sons Ltd, 1997.
Zuckerman, Arie J. and Thomas, Howard C. Viral Hepatitis - Scientific Basis and Clinical Management. New York: Longman Group UK Limited, 1993.
Spallanzani, L. Hepatitis C Virus: Genetic Heterogeneity and Viral Load. Paris: John Libbey Eurotext, 1997.
McLachlan, Alan. Molecular Biology of the Hepatitis B Virus. Boca Raton: CRC Press Inc, 1991.
Koshy, Rajen and Caselman, Wolfgang H. Hepatitis B Virus - Molecular Mechanisms in Disease and Novel Strategies for Therapy. London: Imperial College Press, 1998.
Nishioka, Kusuga and others. Hepatitis Viruses and Hepatocellular Carcinoma - Approaches through Molecular Biology and Ecology. Tokyo: Academic Press Inc, 1985.
Reesink, H.W. Hepatitis C Virus. Basel: S Karger AG, 1998.
Feitelson, Mark. Molecular Component of Hepatitis B Virus. Boston: Martinus Nijhoff Publishing, 1985.