Biochemistry, Tertiary Protein Structure

Ibraheem Rehman; Connor Kerndt; Salome Botelho

Introduction

Proteins are often referred to as the workhorses of the cell because of their numerous crucial roles, including providing structural stability, acting as motors for movement, carrying out metabolic activities, participating in the expression of genetic information, and mediating communication between the cell and its environment. Although proteins are synthesized in various shapes and sizes, they are all assembled from the same building blocks, the amino acids.

Twenty amino acids are commonly found in proteins. An amino acid is a small molecule containing an amino group (NH₃+), a carboxylate group (COO−), and a variable side chain called the R group. These amino acids are formally called alpha-amino acids because both the amino and carboxylate (acid) groups are attached to a central carbon atom known as the α carbon (Cα). At physiological pH, the carboxyl group is unprotonated (negatively charged), and the amino group is protonated (positively charged). Therefore, an isolated amino acid bears both a negative and a positive charge, a form known as a zwitterion.
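The charge state described above follows from the Henderson-Hasselbalch relation. The sketch below is illustrative only: the pKa values (about 2.3 for the α-carboxyl group and about 9.6 for the α-amino group, roughly those of glycine) are approximate textbook figures rather than values taken from this article, and the function names are hypothetical.

```python
def fraction_deprotonated(ph: float, pka: float) -> float:
    """Fraction of an ionizable group in its deprotonated form
    (Henderson-Hasselbalch relation)."""
    return 1.0 / (1.0 + 10 ** (pka - ph))


def backbone_net_charge(ph: float,
                        pka_carboxyl: float = 2.3,  # assumed, approximate
                        pka_amino: float = 9.6      # assumed, approximate
                        ) -> float:
    """Average net charge contributed by the alpha-carboxyl and alpha-amino
    groups of a free amino acid with an uncharged side chain."""
    # The carboxyl group carries -1 when deprotonated (COO-), 0 when protonated.
    carboxyl_charge = -fraction_deprotonated(ph, pka_carboxyl)
    # The amino group carries +1 when protonated (NH3+), 0 when deprotonated.
    amino_charge = 1.0 - fraction_deprotonated(ph, pka_amino)
    return carboxyl_charge + amino_charge


if __name__ == "__main__":
    for ph in (1.0, 7.4, 12.0):
        print(f"pH {ph:>4}: net charge = {backbone_net_charge(ph):+.3f}")
```

Under these assumptions, both groups are almost fully ionized at pH 7.4, so the two charges nearly cancel, whereas the molecule is net positive in strong acid and net negative in strong base.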

Amino acids are grouped into 3 major groups: amino acids with hydrophobic R groups, amino acids with polar (hydrophilic) R groups, and amino acids with charged R groups. Within proteins, amino acids are linked together through peptide bonds. The chemical and physical properties of proteins depend on their constituent amino acids. The sequence of amino acids in a polypeptide chain is called the primary structure. The local conformation of the polypeptide backbone, excluding the R groups, is known as the secondary structure. The complete 3-dimensional conformation of the protein, including its backbone atoms and all its side chains, is called the tertiary structure.

In proteins with more than 1 polypeptide chain, the spatial arrangement of all the chains is referred to as the quaternary structure.[1][2][3][4] The quaternary structure is essential for maintaining the function and stability of multi-chain proteins.

Fundamentals

The structure of proteins is fundamental to their diverse functions in living organisms. This structure encompasses 4 levels (primary, secondary, tertiary, and quaternary), each contributing to the overall shape and function of the protein.

Primary Structure of Proteins

A protein's primary structure, defined by its unique sequence of amino acids, is essential to its function. For example, a single mutation in hemoglobin that replaces glutamic acid at residue 6 in β-globin with valine (β6 Glu→Val) results in sickle-cell anemia.[5] This primary structure forms the foundation for higher levels of protein structure, including secondary, tertiary, and quaternary.
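The single-residue substitution described above can be represented as a simple string operation on the primary structure. This is a minimal sketch: the 8-residue fragment is assumed to be the start of mature human β-globin in one-letter code (it is not given in this article), and the helper name substitute is hypothetical.

```python
# Assumed first 8 residues of mature human beta-globin, one-letter code:
# Val-His-Leu-Thr-Pro-Glu-Glu-Lys
BETA_GLOBIN_START = "VHLTPEEK"


def substitute(sequence: str, position: int, new_residue: str) -> str:
    """Return a copy of `sequence` with the residue at the 1-based
    `position` replaced by `new_residue`."""
    if not 1 <= position <= len(sequence):
        raise ValueError("position is outside the sequence")
    index = position - 1  # biochemists count residues from 1
    return sequence[:index] + new_residue + sequence[index + 1:]


# Glutamic acid (E) at residue 6 is replaced by valine (V).
sickle_fragment = substitute(BETA_GLOBIN_START, 6, "V")

if __name__ == "__main__":
    print("normal:", BETA_GLOBIN_START)
    print("sickle:", sickle_fragment)
```

Only 1 character differs between the two strings, which illustrates how small a change in primary structure can underlie a clinically significant disease.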

Secondary Structure of Proteins

The secondary structure of proteins is formed by interactions between functional groups in the peptide-bonded backbone. These structures are distinctively shaped segments stabilized mainly by hydrogen bonding between the oxygen of the C=O group of one amino acid residue and the hydrogen of the N-H group of another residue.

In most proteins, these polar groups form hydrogen bonds when the backbone bends, resulting in either an α-helix, where the polypeptide's backbone is coiled, or a β-pleated sheet, where segments of a peptide chain bend 180° and then fold in the same plane. In both α-helix and β-pleated sheet structures, the hydrogen-bonded residues are often located close together in the linear sequence of a polypeptide's primary structure.

In an α-helix, hydrogen bonds form between residues that are only 4 positions apart in the linear sequence. The distance between residues that form a β-pleated sheet can be much greater because folds in the chain bring them close enough to bond in 3-dimensional space.
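The 4-residue spacing of α-helix hydrogen bonds can be enumerated directly. The sketch below assumes an idealized helix in which the C=O of residue i pairs with the N-H of residue i + 4 along the entire segment; it ignores end effects and irregularities found in real proteins, and the function name is hypothetical.

```python
def helix_hbond_pairs(n_residues: int, spacing: int = 4) -> list[tuple[int, int]]:
    """List (acceptor, donor) residue numbers for an idealized alpha-helix.

    The C=O of residue i accepts a hydrogen bond from the N-H of
    residue i + spacing. Residues are numbered from 1.
    """
    return [(i, i + spacing) for i in range(1, n_residues - spacing + 1)]


if __name__ == "__main__":
    for acceptor, donor in helix_hbond_pairs(10):
        print(f"C=O of residue {acceptor} ... H-N of residue {donor}")
```

A 10-residue idealized helix therefore contains 6 such backbone hydrogen bonds, and a segment of 4 or fewer residues contains none.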

The primary structure of a protein determines whether a segment forms α-helix or β-pleated sheet structures. Certain amino acids are more likely to be found in α-helices than in β-pleated sheets, and vice versa. For instance, proline is rarely found in α-helices because its unusual R-group bonds to both the α carbon and the nitrogen of the backbone amino group, forming a ring that restricts backbone rotation and leaves no N-H hydrogen available for hydrogen bonding.

Tertiary Structure of Proteins

A protein's distinctive 3-dimensional configuration, or tertiary structure, arises from interactions between residues as the chain bends and folds in a 3-dimensional space. These interacting residues are often distant from each other in the linear sequence.

Unlike secondary structures, which involve only hydrogen bonds between backbone components, tertiary structures result from diverse bonds and interactions between R-groups or between R-groups and the backbone. The key interactions include hydrogen bonding, which occurs between polar side chains and groups carrying opposite partial charges, either in the peptide backbone or in other R-groups; hydrophobic interactions, which arise in aqueous solution when water molecules interact with the hydrophilic polar side chains of a polypeptide and force the hydrophobic nonpolar side chains to cluster into globular masses; and van der Waals interactions, weak electrical attractions that further stabilize the association of hydrophobic side chains once they are close together.[6][7][8]

Covalent bonding can occur between the side chains of 2 cysteines through a reaction involving their sulfhydryl groups. These disulfide bonds are often called bridges because they create strong links between distinct regions of the same polypeptide or 2 separate polypeptides. Ionic bonding, on the other hand, forms between groups with full and opposing charges, such as the ionized acidic and basic R groups.

As a polypeptide folds into its correct shape, amino acids with nonpolar side chains typically cluster at the core of the protein, avoiding contact with water. Once these nonpolar amino acids have formed the core, weak van der Waals forces stabilize the protein. In addition, hydrogen bonds and ionic interactions between polar and charged amino acids contribute to the tertiary structure. Although each of these interactions is individually weak in the cellular environment, their cumulative effect is crucial in determining the protein's distinctive shape.[9]
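The tendency of a segment to be buried in the core or exposed to water can be estimated from its average side-chain hydropathy. The sketch below uses the widely tabulated Kyte-Doolittle scale, which is not part of this article; the values should be verified against a primary source before any real use. The example peptides, the zero threshold, and the function names are hypothetical and chosen only for illustration.

```python
# Kyte-Doolittle hydropathy values (assumed; positive = more hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydropathy(sequence: str) -> float:
    """Average hydropathy of a peptide given in one-letter code."""
    if not sequence:
        raise ValueError("sequence must not be empty")
    return sum(KYTE_DOOLITTLE[residue] for residue in sequence) / len(sequence)


def likely_location(sequence: str, threshold: float = 0.0) -> str:
    """Crude guess: segments above the (arbitrary) threshold are labeled
    core-like, the rest surface-like."""
    return "core-like" if mean_hydropathy(sequence) > threshold else "surface-like"


if __name__ == "__main__":
    for peptide in ("ILVF", "DEKR"):  # hypothetical example peptides
        print(peptide, round(mean_hydropathy(peptide), 3), likely_location(peptide))
```

A score of this kind is only a rough heuristic: the actual position of a residue depends on the full 3-dimensional fold, not on local composition alone.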

The primary structure plays a critical role in determining the tertiary structure and the protein's overall function. Experiments have demonstrated that protein denaturation can be a reversible process. Some proteins denatured by heat, extreme pH, or denaturing reagents regain their native structure and original biological function when returned to conditions favoring the native conformation.

For example, ribonuclease undergoes denaturation in the presence of urea and mercaptoethanol, where disulfide bridges break apart in the reducing environment. However, after removing urea and mercaptoethanol, ribonuclease regains its native conformation. This example underscores how the primary structure determines the tertiary structure.

Most proteins likely pass through several intermediate structures as they fold toward their most stable conformation. However, these intermediates are not visible in the final folded protein. Chaperonins, protein molecules that assist in proper protein folding, are crucial in this process. These molecules shield proteins from disruptive chemical conditions in the cytoplasmic environment and allow polypeptides to fold spontaneously. Defects in protein folding provide the molecular basis for several genetic disorders.

Quaternary Structure of Proteins

The primary, secondary, and tertiary structures of proteins involve single polypeptides. However, some proteins contain multiple polypeptides that interact to form a single functional structure. These combinations of polypeptides, called subunits, give some proteins a quaternary structure, where individual polypeptides are held together by the same types of bonds and interactions found in the tertiary level of structure.

Clinical Significance

The structure of proteins determines their function. Therefore, incorrectly folded proteins in the human body can have catastrophic effects. Misfolding or alterations in a protein's primary structure can affect its tertiary structure. Protein misfolding can lead to conditions such as type 2 diabetes, Alzheimer's disease, Huntington's disease, and Parkinson's disease. In these conditions, a normally soluble protein misfolds and accumulates in an insoluble form called an amyloid fiber. Diseases characterized by such deposits are collectively known as amyloidoses.

Mutations in the gene encoding a specific protein can lead to incorrect tertiary structure of the protein. One example of a disease resulting from this mechanism is cystic fibrosis, an inherited disorder characterized by thick and sticky mucus. Typically, the mucus secreted by epithelial cells is watery and serves as a lubricant to protect tissues. However, the thick and sticky mucus associated with cystic fibrosis does not move easily, leading to infections, difficulty absorbing nutrients in children, and infertility in men. All these issues stem from a single misfolded protein.

Cystic fibrosis arises from mutations in the gene that codes for a protein called the cystic fibrosis transmembrane conductance regulator (CFTR). Individuals with cystic fibrosis produce abnormal CFTR protein or sometimes none at all. Consequently, the body produces thick, sticky mucus instead of the thin, watery type.[10][11]

The primary structure of a protein determines its secondary and tertiary structure. Mutations in the gene encoding a protein can occur due to environmental factors such as ultraviolet radiation or errors in DNA replication. Furthermore, the primary structure can be altered if translation on the ribosome is not entirely accurate. Because a protein's entire structure and function rely on its primary structure, an incorrect sequence can have devastating effects on the human body.

References

[1]

Aprahamian ML, Lindert S. Utility of Covalent Labeling Mass Spectrometry Data in Protein Structure Prediction with Rosetta. Journal of chemical theory and computation. 2019 May 14:15(5):3410-3424. doi: 10.1021/acs.jctc.9b00101. Epub 2019 Apr 4 [PubMed PMID: 30946594]

[2]

Banach M, Konieczny L, Roterman I. Secondary and Supersecondary Structure of Proteins in Light of the Structure of Hydrophobic Cores. Methods in molecular biology (Clifton, N.J.). 2019:1958():347-378. doi: 10.1007/978-1-4939-9161-7_19. Epub [PubMed PMID: 30945229]

[3]

Bayse CA, Pollard DB. Conformation dynamics of cyclic disulfides and selenosulfides in CXXC(U) (X = Gly, Ala) tetrapeptide redox motifs. Journal of peptide science : an official publication of the European Peptide Society. 2019 Jun:25(6):e3160. doi: 10.1002/psc.3160. Epub 2019 Mar 14 [PubMed PMID: 30873692]

[4]

Bleiholder C, Liu FC. Structure Relaxation Approximation (SRA) for Elucidation of Protein Structures from Ion Mobility Measurements. The journal of physical chemistry. B. 2019 Apr 4:123(13):2756-2769. doi: 10.1021/acs.jpcb.8b11818. Epub 2019 Mar 25 [PubMed PMID: 30866623]

[5]

Gambari R, Waziri AD, Goonasekera H, Peprah E. Pharmacogenomics of Drugs Used in β-Thalassemia and Sickle-Cell Disease: From Basic Research to Clinical Applications. International journal of molecular sciences. 2024 Apr 12:25(8):. doi: 10.3390/ijms25084263. Epub 2024 Apr 12 [PubMed PMID: 38673849]

[6]

Rittle J, Field MJ, Green MT, Tezcan FA. An efficient, step-economical strategy for the design of functional metalloproteins. Nature chemistry. 2019 May:11(5):434-441. doi: 10.1038/s41557-019-0218-9. Epub 2019 Feb 18 [PubMed PMID: 30778140]

[7]

Snead D, Eliezer D. Intrinsically disordered proteins in synaptic vesicle trafficking and release. The Journal of biological chemistry. 2019 Mar 8:294(10):3325-3342. doi: 10.1074/jbc.REV118.006493. Epub 2019 Jan 30 [PubMed PMID: 30700558]

[8]

Gligorijević N, Minić S, Robajac D, Nikolić M, Ćirković Veličković T, Nedić O. Characterisation and the effects of bilirubin binding to human fibrinogen. International journal of biological macromolecules. 2019 May 1:128():74-79. doi: 10.1016/j.ijbiomac.2019.01.124. Epub 2019 Jan 23 [PubMed PMID: 30684573]

[9]

Scholten NR, Haandrikman D, Tolhuis JO, Morandi E, Incarnato D. SHAPEwarp-web: sequence-agnostic search for structurally homologous RNA regions across databases of chemical probing data. Nucleic acids research. 2024 May 6:():. pii: gkae348. doi: 10.1093/nar/gkae348. Epub 2024 May 6 [PubMed PMID: 38709889]

[10]

Molinski SV, Shahani VM, Subramanian AS, MacKinnon SS, Woollard G, Laforet M, Laselva O, Morayniss LD, Bear CE, Windemuth A. Comprehensive mapping of cystic fibrosis mutations to CFTR protein identifies mutation clusters and molecular docking predicts corrector binding site. Proteins. 2018 Aug:86(8):833-843. doi: 10.1002/prot.25496. Epub 2018 Apr 10 [PubMed PMID: 29569753]

[11]

Bulcaen M, Kortleven P, Liu RB, Maule G, Dreano E, Kelly M, Ensinck MM, Thierie S, Smits M, Ciciani M, Hatton A, Chevalier B, Ramalho AS, Casadevall I Solvas X, Debyser Z, Vermeulen F, Gijsbers R, Sermet-Gaudelus I, Cereseto A, Carlon MS. Prime editing functionally corrects cystic fibrosis-causing CFTR mutations in human organoids and airway epithelial cells. Cell reports. Medicine. 2024 May 21:5(5):101544. doi: 10.1016/j.xcrm.2024.101544. Epub 2024 May 1 [PubMed PMID: 38697102]