Protein structure

[ temporary import ]
please note:
- the content below is remote from Wikipedia
- it has been imported raw for GetWiki
Protein structure is the three-dimensional arrangement of atoms in an amino acid-chain molecule. Proteins are polymers, specifically polypeptides, formed from sequences of amino acids, the monomers of the polymer. A single amino acid monomer may also be called a residue, indicating a repeating unit of a polymer. Proteins form by amino acids undergoing condensation reactions, in which the amino acids lose one water molecule per reaction in order to attach to one another with a peptide bond. By convention, a chain under 30 amino acids is often identified as a peptide rather than a protein [Stoker HS. Organic and Biological Chemistry. Cengage Learning, 2015. ISBN 978-1-305-68645-8, p. 371]. To be able to perform their biological function, proteins fold into one or more specific spatial conformations driven by a number of non-covalent interactions such as hydrogen bonding, ionic interactions, Van der Waals forces, and hydrophobic packing. To understand the functions of proteins at a molecular level, it is often necessary to determine their three-dimensional structure. This is the topic of the scientific field of structural biology, which employs techniques such as X-ray crystallography, NMR spectroscopy, and dual polarisation interferometry to determine the structure of proteins.

Protein structures range in size from tens to several thousand amino acids [Brocchieri L, Karlin S. Protein length in eukaryotic and prokaryotic proteomes. Nucleic Acids Research 33(10): 3390–3400, 2005. doi:10.1093/nar/gki615; PMID 15951512; PMC 1150220]. By physical size, proteins are classified as nanoparticles, between 1 and 100 nm. Very large aggregates can be formed from protein subunits. For example, many thousands of actin molecules assemble into a microfilament.

A protein generally undergoes reversible structural changes in performing its biological function. The alternative structures of the same protein are referred to as different conformational isomers, or simply conformations, and transitions between them are called conformational changes.

Levels of protein structure

There are four distinct levels of protein structure.

Primary structure

The primary structure of a protein refers to the sequence of amino acids in the polypeptide chain. The primary structure is held together by peptide bonds that are made during the process of protein biosynthesis. The two ends of the polypeptide chain are referred to as the carboxyl terminus (C-terminus) and the amino terminus (N-terminus), based on the nature of the free group on each extremity. Counting of residues always starts at the N-terminal end (NH2 group), which is the end where the amino group is not involved in a peptide bond. The primary structure of a protein is determined by the gene corresponding to the protein. A specific sequence of nucleotides in DNA is transcribed into mRNA, which is read by the ribosome in a process called translation.

The sequence of amino acids in insulin was discovered by Frederick Sanger, establishing that proteins have defining amino acid sequences [Sanger F, Tuppy H. The amino-acid sequence in the phenylalanyl chain of insulin. I. The identification of lower peptides from partial hydrolysates. The Biochemical Journal 49(4): 463–481, 1951. doi:10.1042/bj0490463; PMID 14886310; PMC 1197535] [Sanger F. Chemistry of Insulin. Science 129(3359): 1340–1344, 1959. doi:10.1126/science.129.3359.1340; PMID 13658959]. For example, insulin is composed of 51 amino acids in 2 chains. One chain (the A chain) has 21 amino acids, and the other (the B chain) has 30 amino acids.

The sequence of a protein is unique to that protein, and defines the structure and function of the protein. The sequence of a protein can be determined by methods such as Edman degradation or tandem mass spectrometry. Often, however, it is read directly from the sequence of the gene using the genetic code. The term "amino acid residues" is strongly preferred when discussing proteins: a water molecule is lost each time a peptide bond is formed, so a protein is made up of amino acid residues rather than intact amino acids. Post-translational modifications such as phosphorylation and glycosylation are usually also considered part of the primary structure, and cannot be read from the gene.
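Reading a protein sequence from a gene with the genetic code can be sketched in a few lines of Python. This is a minimal illustration, not a production translator: it assumes the standard genetic code, a coding strand that starts in frame, and no introns. The function names and the example DNA string are hypothetical.

```python
from itertools import product

# Standard genetic code, written in the conventional T, C, A, G base order.
# "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(coding_dna: str) -> str:
    """Translate an in-frame coding sequence, N-terminus first, up to the first stop."""
    seq = coding_dna.upper().replace("U", "T")  # accept mRNA as well as DNA
    residues = []
    for i in range(0, len(seq) - 2, 3):
        amino_acid = CODON_TABLE[seq[i:i + 3]]
        if amino_acid == "*":
            break
        residues.append(amino_acid)
    return "".join(residues)


def peptide_bond_count(n_residues: int) -> int:
    """A linear chain of n residues has n - 1 peptide bonds (one water lost per bond)."""
    return max(n_residues - 1, 0)


# Hypothetical example sequence: ATG GGC ATT GTG GAA CAG TAA
protein = translate("ATGGGCATTGTGGAACAGTAA")
print(protein)                           # one-letter codes, N- to C-terminus
print(peptide_bond_count(len(protein)))  # bonds formed by condensation
```

Post-translational modifications are not captured by this kind of lookup, which is why they cannot be read from the gene.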

Secondary structure

[Figure: An α-helix with hydrogen bonds (yellow dots).]

Secondary structure refers to highly regular local sub-structures on the actual polypeptide backbone chain. Two main types of secondary structure, the α-helix and the β-strand or β-sheet, were suggested in 1951 by Linus Pauling et al. [Pauling L, Corey RB, Branson HR. The structure of proteins; two hydrogen-bonded helical configurations of the polypeptide chain. Proc Natl Acad Sci USA 37(4): 205–211, 1951. doi:10.1073/pnas.37.4.205; PMID 14816373; PMC 1063337]. These secondary structures are defined by patterns of hydrogen bonds between the main-chain peptide groups. They have a regular geometry, being constrained to specific values of the dihedral angles ψ and φ on the Ramachandran plot. Both the α-helix and the β-sheet represent a way of saturating all the hydrogen bond donors and acceptors in the peptide backbone. Some parts of the protein are ordered but do not form any regular structures. They should not be confused with random coil, an unfolded polypeptide chain lacking any fixed three-dimensional structure. Several sequential secondary structures may form a "supersecondary unit" [Chiang YS, Gelfand TI, Kister AE, Gelfand IM. New classification of supersecondary structures of sandwich-like proteins uncovers strict patterns of strand assemblage. Proteins 68(4): 915–921, 2007. doi:10.1002/prot.21473; PMID 17557333].
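The dihedral angles φ and ψ mentioned above are each defined by four consecutive backbone atoms: φ by C(i−1), N(i), Cα(i), C(i), and ψ by N(i), Cα(i), C(i), N(i+1). The sketch below shows one common way to compute such an angle from Cartesian coordinates. The helper names are hypothetical and the test points are simple geometric cases, not real protein coordinates.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def dihedral_deg(p0, p1, p2, p3):
    """Dihedral angle in degrees defined by four points, in the range (-180, 180]."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)

    # Unit vector along the central bond.
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(c / norm for c in b1)

    # Components of the outer bonds perpendicular to the central bond.
    v = tuple(b0[i] - _dot(b0, b1) * b1[i] for i in range(3))
    w = tuple(b2[i] - _dot(b2, b1) * b1[i] for i in range(3))

    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))


# Simple geometric checks: eclipsed (cis) and anti (trans) arrangements.
cis = dihedral_deg((1, 0, 0), (0, 0, 0), (0, 0, 1), (1, 0, 1))
trans = dihedral_deg((1, 0, 0), (0, 0, 0), (0, 0, 1), (-1, 0, 1))
print(cis, trans)
```

Applying this function to every residue of a structure gives the (φ, ψ) pairs that are plotted on a Ramachandran plot.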

Tertiary structure

Tertiary structure refers to the three-dimensional structure of monomeric and multimeric protein molecules. The α-helices and β-pleated sheets are folded into a compact globular structure. Folding is driven by non-specific hydrophobic interactions (the burial of hydrophobic residues away from water), but the structure is stable only when the parts of a protein domain are locked into place by specific tertiary interactions, such as salt bridges, hydrogen bonds, the tight packing of side chains, and disulfide bonds. Disulfide bonds are extremely rare in cytosolic proteins, since the cytosol (intracellular fluid) is generally a reducing environment.

Quaternary structure

Quaternary structure is the three-dimensional structure consisting of the aggregation of two or more individual polypeptide chains (subunits) that operate as a single functional unit (multimer). The resulting multimer is stabilized by the same non-covalent interactions and disulfide bonds as in tertiary structure. There are many possible quaternary structure organisations [Moutevelis E, Woolfson DN. A periodic table of coiled-coil protein structures. J. Mol. Biol. 385(3): 726–32, January 2009. doi:10.1016/j.jmb.2008.11.028; PMID 19059267; ISSN 0022-2836]. A multimer is called a dimer if it contains two subunits, a trimer if it contains three, a tetramer if it contains four, and a pentamer if it contains five. The subunits are frequently related to one another by symmetry operations, such as a 2-fold axis in a dimer. Multimers made up of identical subunits are referred to with the prefix "homo-" (e.g. a homotetramer), and those made up of different subunits with the prefix "hetero-" (e.g. a heterotetramer, such as the two alpha and two beta chains of hemoglobin).
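The naming rules above are mechanical enough to express as a short function. The following is an illustrative sketch only: the function name and the chain labels are hypothetical, and the "n-mer" fallback for larger assemblies is a common convention, not something stated in the text.

```python
# Size names listed in the text; larger assemblies fall back to "n-mer".
SIZE_NAMES = {2: "dimer", 3: "trimer", 4: "tetramer", 5: "pentamer"}


def name_multimer(subunits):
    """Name a complex from a list of subunit labels, e.g. ["alpha", "alpha"]."""
    count = len(subunits)
    if count < 2:
        raise ValueError("a multimer needs at least two subunits")
    size = SIZE_NAMES.get(count, f"{count}-mer")
    # Identical subunits give "homo-", any mixture gives "hetero-".
    prefix = "homo" if len(set(subunits)) == 1 else "hetero"
    return prefix + size


# Hemoglobin: two alpha and two beta chains.
print(name_multimer(["alpha", "alpha", "beta", "beta"]))
print(name_multimer(["A", "A"]))
```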

Domains, motifs, and folds in protein structure

[Figure: Protein domains. The two protein structures shown share a common domain (maroon), the PH domain, which is involved in phosphatidylinositol (3,4,5)-trisphosphate binding.]

Proteins are frequently described as consisting of several structural units. These units include domains, motifs, and folds. Although about 100,000 different proteins are expressed in eukaryotic systems, there are many fewer different domains, structural motifs and folds.

Structural domain

A structural domain is an element of the protein's overall structure that is self-stabilizing and often folds independently of the rest of the protein chain. Many domains are not unique to the protein products of one gene or one gene family but instead appear in a variety of proteins. Domains often are named and singled out because they figure prominently in the biological function of the protein they belong to; for example, the "calcium-binding domain of calmodulin". Because they are independently stable, domains can be "swapped" by genetic engineering between one protein and another to make chimera proteins.

Structural and sequence motif

Structural and sequence motifs are short segments of protein three-dimensional structure or amino acid sequence that are found in a large number of different proteins.

Supersecondary structure

Supersecondary structure refers to a specific combination of secondary structure elements, such as β-α-β units or a helix-turn-helix motif. Some of these may also be referred to as structural motifs.

Protein fold

A protein fold refers to the general protein architecture, such as a helix bundle, β-barrel, Rossmann fold, or the different "folds" provided in the Structural Classification of Proteins database [Govindarajan S, Recabarren R, Goldstein RA. Estimating the total number of protein folds. Proteins 35(4): 408–414, 17 September 1999. doi:10.1002/(SICI)1097-0134(19990601)35:43.0.CO;2-A; PMID 10382668]. A related concept is protein topology, which refers to the arrangement of contacts within the protein.


A superdomain consists of two or more nominally unrelated structural domains that are inherited as a single unit and occur in different proteins [Haynie DT, Xue B. Superdomain in the protein structure hierarchy: the case of PTP-C2. Protein Science 24(5): 874–82, 2015. doi:10.1002/pro.2664; PMID 25694109; PMC 4420535]. An example is provided by the protein tyrosine phosphatase domain and C2 domain pair in PTEN, several tensin proteins, auxilin, and proteins in plants and fungi. The PTP-C2 superdomain evidently came into existence prior to the divergence of fungi, plants and animals, and is therefore likely to be about 1.5 billion years old [citation needed].

Protein dynamics

[Figure: A ribosome is a biological machine that utilizes protein dynamics on the nanoscale.]

Proteins are not, however, strictly static objects, but rather populate ensembles of conformational states. Transitions between these states typically occur on nanoscales, and have been linked to functionally relevant phenomena such as allosteric signaling [Bu Z, Callaway DJ. Proteins MOVE! Protein dynamics and long-range allostery in cell signaling. In: Protein Structure and Diseases, Advances in Protein Chemistry and Structural Biology 83: 163–221, 2011. doi:10.1016/B978-0-12-381262-9.00005-7; PMID 21570668; ISBN 9780123812629] and enzyme catalysis [Fraser JS, Clarkson MW, Degnan SC, Erion R, Kern D, Alber T. Hidden alternative structures of proline isomerase essential for catalysis. Nature 462(7273): 669–673, Dec 2009. doi:10.1038/nature08615; PMID 19956261; PMC 2805857; Bibcode 2009Natur.462..669F].

Protein dynamics and conformational changes allow proteins to function as nanoscale biological machines within cells, often in the form of multi-protein complexes [Voet D, Voet JG. Biochemistry, 4th ed. John Wiley & Sons, Hoboken, NJ, 2011. ISBN 9780470570951; OCLC 690489261]. Examples include motor proteins, such as myosin, which is responsible for muscle contraction, kinesin, which moves cargo inside cells away from the nucleus along microtubules, and dynein, which moves cargo inside cells towards the nucleus and produces the axonemal beating of motile cilia and flagella. "[I]n effect, the [motile cilium] is a nanomachine composed of perhaps over 600 proteins in molecular complexes, many of which also function independently as nanomachines... Flexible linkers allow the mobile protein domains connected by them to recruit their binding partners and induce long-range allostery via protein domain dynamics." [Satir P, Christensen ST. Structure and function of mammalian cilia. Histochemistry and Cell Biology 129(6): 687–93, 2008. doi:10.1007/s00418-008-0416-9; PMID 18365235; PMC 2386530; ISSN 1432-119X]

Protein folding

As it is translated, a polypeptide exits the ribosome as a random coil and folds into its native state [Zhang G, Ignatova Z. Folding at the birth of the nascent chain: coordinating translation with co-translational folding. Current Opinion in Structural Biology 21(1): 25–31, 2011. doi:10.1016/; PMID 21111607; ISSN 0959-440X] [Alberts B, Johnson A, Lewis J, Raff M, Roberts K, Walter P. The Shape and Structure of Proteins. In: Molecular Biology of the Cell, 4th ed. Garland Science, New York and London, 2002. ISBN 978-0-8153-3218-3]. Since the fold is determined by a network of interactions between amino acids in the polypeptide, the final structure of the protein chain is determined by its amino acid sequence (Anfinsen's dogma) [Anfinsen CB. The formation and stabilization of protein structure. Biochem. J. 128(4): 737–49, 1972. doi:10.1042/bj1280737; PMID 4565129; PMC 1173893].

Protein stability

Protein stability depends on several factors, such as (1) non-covalent electrostatic interactions and (2) hydrophobic interactions. These interaction energies are on the order of 20–40 kJ/mol [citation needed]. Proteins are very sensitive to changes in temperature, and a change in temperature may result in unfolding or denaturation. Protein denaturation may result in loss of function and loss of the native state.

X-ray crystallography and calorimetry indicate that there is no general mechanism that describes the effect of temperature change on the function and structure of proteins. This is because proteins do not represent a uniform class of chemical entities from an energetic point of view. The structure and stability of an individual protein depend on the ratio of its polar and non-polar residues, which contribute to the conformational and the net enthalpies of local and non-local interactions.

Given the weak intermolecular interactions responsible for structural integrity, it is hard to predict the effects of temperature, because too many unknown factors contribute to the hypothetical free energy balance and its temperature dependence. Internal salt linkages produce thermal stability, and whether cold temperature destabilizes these linkages is unknown.

In principle, the free energy of stabilization of soluble globular proteins does not exceed 50–100 kJ/mol [citation needed]. This is equivalent to only a few hydrogen bonds, ion pairs, or hydrophobic interactions, even though numerous intramolecular interactions contribute to stabilization. Considering the large number of hydrogen bonds that stabilize secondary structures, and the stabilization of the inner core through hydrophobic interactions, the free energy of stabilization emerges as a small difference between large numbers. Therefore, the structure of a native protein is not optimized for maximum stability [Jaenicke R, Heber U, Franks F, Chapman D, Griffin MCA, Hvidt A, Cowan DA. Protein Structure and Function at Low Temperatures [and Discussion]. Philosophical Transactions of the Royal Society of London. Series B, Biological Sciences 326(1237): 535–553, 1990. doi:10.1098/rstb.1990.0030; PMID 1969647; JSTOR 2398703].
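To give a feel for what a stabilization free energy of a few tens of kJ/mol means, the sketch below converts a free energy of unfolding into a folded fraction. It assumes a simple two-state (folded/unfolded) equilibrium, which is a textbook simplification and not a model stated in the text; the function name and the example contributions are hypothetical.

```python
import math

GAS_CONSTANT = 8.314462618e-3  # kJ / (mol * K)


def fraction_folded(dg_unfold_kj_mol: float, temperature_k: float = 298.15) -> float:
    """Folded fraction for a two-state equilibrium.

    dg_unfold_kj_mol is the free energy of unfolding (positive = folded state favoured).
    K_fold = exp(dG_unfold / RT), fraction folded = K_fold / (1 + K_fold).
    """
    x = dg_unfold_kj_mol / (GAS_CONSTANT * temperature_k)
    # Numerically stable logistic form of K / (1 + K).
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


# "A small difference between large numbers": hypothetical, illustrative totals.
stabilizing_kj_mol = 2000.0    # assumed sum of favourable contributions
destabilizing_kj_mol = 1950.0  # assumed sum of opposing contributions
net_kj_mol = stabilizing_kj_mol - destabilizing_kj_mol

for dg in (0.0, 5.0, net_kj_mol):
    print(f"dG_unfold = {dg:5.1f} kJ/mol -> fraction folded = {fraction_folded(dg):.6f}")
```

In this picture a modest shift in either large total changes the net value substantially, which is one way to read the statement that the native structure is only marginally stable.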

Protein structure determination

[Figure: Examples of protein structures from the PDB.]

[Figure: Rate of protein structure determination by method and year.]

Around 90% of the protein structures available in the Protein Data Bank have been determined by X-ray crystallography [Kendrew JC, Bodo G, Dintzis HM, Parrish RG, Wyckoff H, Phillips DC. A Three-Dimensional Model of the Myoglobin Molecule Obtained by X-Ray Analysis. Nature 181(4610): 662–666, 1958. doi:10.1038/181662a0; PMID 13517261]. This method allows one to measure the three-dimensional (3-D) density distribution of electrons in the protein, in the crystallized state, and thereby infer the 3-D coordinates of all the atoms to a certain resolution. Roughly 9% of the known protein structures have been obtained by nuclear magnetic resonance (NMR) techniques [citation needed]. For larger protein complexes, cryo-electron microscopy can determine protein structures. The resolution is typically lower than that of X-ray crystallography or NMR, but the maximum resolution is steadily increasing. This technique is particularly valuable for very large protein complexes such as virus coat proteins and amyloid fibers.

General secondary structure composition can be determined via circular dichroism. Vibrational spectroscopy can also be used to characterize the conformation of peptides, polypeptides, and proteins [Krimm S, Bandekar J. Vibrational spectroscopy and conformation of peptides, polypeptides, and proteins. Advances in Protein Chemistry 38: 181–364, 1986. doi:10.1016/S0065-3233(08)60528-8; PMID 3541539; ISBN 9780120342389]. Two-dimensional infrared spectroscopy has become a valuable method to investigate the structures of flexible peptides and proteins that cannot be studied with other methods [Lessing J, Roy S, Reppert M, Baer M, Marx D, Jansen TLC, Knoester J, Tokmakoff A. Identifying Residual Structure in Intrinsically Disordered Systems: A 2D IR Spectroscopic Study of the GVGXPGVG Peptide. J. Am. Chem. Soc. 134(11): 5032–5035, 2012. doi:10.1021/ja2114135; PMID 22356513] [Jansen TLC, Knoester J. Two-dimensional infrared population transfer spectroscopy for enhancing structural markers of proteins. Biophys. J. 94(5): 1818–1825, 2008. doi:10.1529/biophysj.107.118851; PMID 17981904; PMC 2242754]. A more qualitative picture of protein structure is often obtained by proteolysis, which is also useful to screen for more crystallizable protein samples. Novel implementations of this approach, including fast parallel proteolysis (FASTpp), can probe the structured fraction and its stability without the need for purification [Minde DP, Maurice MM, Rüdiger SG. Determining biophysical protein stability in lysates by a fast proteolysis assay, FASTpp. PLoS ONE 7(10): e46147, 2012. doi:10.1371/journal.pone.0046147; PMID 23056252; PMC 3463568].

Once a protein's structure has been experimentally determined, further detailed studies can be done computationally, using molecular dynamics simulations of that structure [Kumari I, Sandhu P, Ahmed M, Akhter Y. Molecular Dynamics Simulations, Challenges and Opportunities: A Biologist's Prospective. Curr. Protein Pept. Sci. 18(11): 1163–1179, August 2017. doi:10.2174/1389203718666170622074741; PMID 28637405].
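The atomic coordinates produced by these methods have traditionally been distributed by the Protein Data Bank as fixed-column text records. The sketch below reads the coordinate fields of two ATOM records and measures the distance between the atoms. It is a minimal illustration that assumes the legacy PDB column layout; the two sample records are illustrative values, not taken from a specific entry, and the function name is hypothetical.

```python
import math

# Two illustrative ATOM records in legacy PDB fixed-column layout (assumed values).
SAMPLE_RECORDS = [
    "ATOM      1  N   MET A   1      27.340  24.430   2.614",
    "ATOM      2  CA  MET A   1      26.266  25.413   2.842",
]


def parse_atom_line(line: str) -> dict:
    """Extract a few fields from an ATOM record using 0-based slices of the fixed columns."""
    return {
        "name": line[12:16].strip(),      # atom name, columns 13-16
        "res_name": line[17:20].strip(),  # residue name, columns 18-20
        "chain": line[21],                # chain identifier, column 22
        "res_seq": int(line[22:26]),      # residue number, columns 23-26
        "xyz": (
            float(line[30:38]),           # x in angstroms, columns 31-38
            float(line[38:46]),           # y, columns 39-46
            float(line[46:54]),           # z, columns 47-54
        ),
    }


atoms = [parse_atom_line(rec) for rec in SAMPLE_RECORDS if rec.startswith("ATOM")]
n_to_ca = math.dist(atoms[0]["xyz"], atoms[1]["xyz"])
print(atoms[0]["name"], atoms[1]["name"], round(n_to_ca, 3))
```

For real work a maintained parser is a better choice than hand-written slicing, since actual files contain many more record types and optional fields.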

Protein Sequence Analysis: Ensembles

[Figure: Schematic view of the two main ensemble modeling approaches (adapted from image in "Computational approaches for inferring the functions of intrinsically disordered proteins").]

Proteins are often thought of as relatively stable structures that have a set tertiary structure and experience conformational changes as a result of being modified by other proteins or as part of enzymatic activity. However, proteins have varying degrees of stability, and some of the less stable variants are intrinsically disordered proteins. These proteins exist and function in a relatively "disordered" state lacking a stable tertiary structure. As a result, they are difficult to describe in a standard protein structure model that was designed for proteins with a fixed tertiary structure. Conformational ensembles have been devised as a way to provide a more accurate and "dynamic" representation of the conformational state of intrinsically disordered proteins. They attempt to represent the various conformations of an intrinsically disordered protein within an ensemble file (the type found at the Protein Ensemble Database).

Protein ensemble files are a representation of a protein that can be considered to have a flexible structure. Creating these files requires determining which of the various theoretically possible protein conformations actually exist. One approach is to apply computational algorithms to the protein data in order to determine the most likely set of conformations for an ensemble file.

There are multiple methods for preparing data for the Protein Ensemble Database, and they fall into two general methodologies: pool and molecular dynamics (MD) approaches (diagrammed in the figure). The pool-based approach uses the protein's amino acid sequence to create a massive pool of random conformations. This pool is then subjected to further computational processing that creates a set of theoretical parameters for each conformation based on its structure. Conformational subsets from this pool whose average theoretical parameters closely match known experimental data for this protein are selected [Varadi M, Vranken W, Guharoy M, Tompa P. Computational approaches for inferring the functions of intrinsically disordered proteins. Frontiers in Molecular Biosciences 2: 45, 2015. doi:10.3389/fmolb.2015.00045; PMID 26301226; PMC 4525029]. The molecular dynamics approach takes multiple random conformations at a time and subjects all of them to experimental data. Here the experimental data serve as restraints placed on the conformations (e.g. known distances between atoms). Only conformations that remain within the limits set by the experimental data are accepted. This approach often applies large amounts of experimental data to the conformations, which is a very computationally demanding task.

{| class="wikitable"
! Protein !! Data Type !! Protocol !! PED ID !! References
|-
| Sic1/Cdc4 || Nuclear magnetic resonance and SAXS || || PED9AAA || Eight authors (first names Tanja, Joseph, Alexander, Stephen, Hong, Frank, Mike; last author Julie D. Forman-Kay), 2010-03-14, issue 4, pp. 494–506, PMC 2924144
|-
| p27 KID || Nuclear magnetic resonance || || PED2AAA || Sivakolundu, Bashford, Kriwacki, 2005-11-11, vol. 353, doi:10.1016/j.jmb.2005.08.074, PMID 16214166
|}
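The selection step of the pool-based approach can be illustrated with a toy example. The sketch below searches a tiny pool for the subset whose averaged predicted parameter best matches an experimental value. Everything here is a hypothetical stand-in: the conformer names, the predicted values (imagined as radii of gyration in ångströms), the experimental target, and the brute-force search, which real pipelines replace with far more scalable optimisation.

```python
from itertools import combinations


def select_subensemble(pool, experimental_value, subset_size):
    """Return (sorted conformer names, subset mean) minimising |mean - experimental value|.

    pool maps a conformer name to one predicted parameter for that conformer.
    Brute force over all subsets of the given size; only sensible for toy pools.
    """
    best_names, best_mean, best_error = None, None, float("inf")
    for subset in combinations(pool.items(), subset_size):
        mean = sum(value for _, value in subset) / subset_size
        error = abs(mean - experimental_value)
        if error < best_error:
            best_names = sorted(name for name, _ in subset)
            best_mean, best_error = mean, error
    return best_names, best_mean


# Hypothetical pool of predicted radii of gyration, one per random conformation.
toy_pool = {"c1": 18.0, "c2": 22.0, "c3": 25.0, "c4": 30.0, "c5": 35.0}
names, mean = select_subensemble(toy_pool, experimental_value=27.5, subset_size=2)
print(names, mean)
```

The MD-style alternative described above would instead reject, during sampling, any conformation that violates the experimental restraints, rather than selecting from a finished pool.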

Protein structure databases

A protein structure database is a database that is modeled around the various experimentally determined protein structures. The aim of most protein structure databases is to organize and annotate the protein structures, providing the biological community access to the experimental data in a useful way. Data included in protein structure databases often include 3D coordinates as well as experimental information, such as unit cell dimensions and angles for structures determined by X-ray crystallography. Most entries, which may be either proteins or specific structure determinations of a protein, also contain sequence information, and some databases even provide means for performing sequence-based queries. Nevertheless, the primary attribute of a structure database is structural information, whereas sequence databases focus on sequence information and contain no structural information for the majority of entries. Protein structure databases are critical for many efforts in computational biology such as structure-based drug design, both in developing the computational methods used and in providing a large experimental dataset used by some methods to provide insights about the function of a protein [Laskowski RA. Protein structure databases. Mol Biotechnol 48(2): 183–98, 2011. doi:10.1007/s12033-010-9372-4; PMID 21225378].

Structure classification

Protein structures can be grouped based on their similarity or a common evolutionary origin. The Structural Classification of Proteins database [Murzin AG, Brenner SE, Hubbard T, Chothia C. SCOP: A structural classification of proteins database for the investigation of sequences and structures. Journal of Molecular Biology 247(4): 536–540, 1995. doi:10.1016/S0022-2836(05)80134-2; PMID 7723011] and the CATH database [Orengo CA, Thornton JM, et al. CATH--a hierarchic classification of protein domain structures. Structure 5(8): 1093–1108, 1997. doi:10.1016/S0969-2126(97)00260-8] provide two different structural classifications of proteins. Shared structure between proteins is considered evidence of evolutionary relatedness between proteins and is used to group proteins together into protein superfamilies [Holm L, Rosenström P. Dali server: conservation mapping in 3D. Nucleic Acids Research 38 (Web Server issue): W545–9, July 2010. doi:10.1093/nar/gkq366; PMID 20457744; PMC 2896194].

Computational prediction of protein structure

Generating a protein sequence is much easier than determining a protein structure. However, the structure of a protein gives much more insight into the function of the protein than its sequence. Therefore, a number of methods for the computational prediction of protein structure from its sequence have been developed [Zhang Y. Progress and challenges in protein structure prediction. Curr Opin Struct Biol 18(3): 342–348, 2008. doi:10.1016/; PMID 18436442; PMC 2680823]. Ab initio prediction methods use just the sequence of the protein. Threading and homology modeling methods can build a 3-D model for a protein of unknown structure from the experimental structures of evolutionarily related proteins, which together are called a protein family.
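Homology modeling depends on finding related proteins with known structures, and one simple first measure of relatedness between two already-aligned sequences is their percent identity. The sketch below computes it over aligned, ungapped positions. This is an illustrative assumption about how a template might be screened, not a method described in the text; the function name and example sequences are hypothetical, and producing the alignment itself is outside its scope.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of identical residues over positions where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Keep only columns where both sequences have a residue.
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != gap and b != gap]
    if not pairs:
        return 0.0
    matches = sum(1 for a, b in pairs if a == b)
    return 100.0 * matches / len(pairs)


# Hypothetical aligned fragments in one-letter amino acid codes.
print(percent_identity("ACDEFGHIKL", "ACDQFGHIKL"))
print(percent_identity("ACDE-", "ACDQF"))
```

Different tools normalise identity differently (for example by alignment length or by the shorter sequence), so values from this sketch are only comparable with each other.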

- time: 2:32pm EDT - Tue, Aug 20 2019