A3. Amino Acid Charges - Biology

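Monomeric amino acids have an alpha amino group and a carboxyl group, both of which may be protonated or deprotonated, and an R group, which in some amino acids may also be protonated or deprotonated. For a generic weak acid HA that dissociates in water to H3O+ and its conjugate base A, the Ka for the reaction is: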

\[\mathrm{K_a = \dfrac{[H_3O^+][A]}{[HA]}}\]

\[\mathrm{[H_3O^+] = K_a\dfrac{[HA]}{[A]}}\]

\[\mathrm{-\log [H_3O^+] = -\log K_a + \log \dfrac{[A]}{[HA]}}\]

\[\mathrm{pH = pK_a + \log \dfrac{[A]}{[HA]}}\]

This is the (in)famous Henderson-Hasselbalch (HH) equation.

The properties of a protein will be determined partly by whether the side chain functional groups, the N terminal, and the C terminal are charged or not. The HH equation tells us that this will depend on the pH and the pKa of the functional group.

  • If the pH is 2 units below the pKa, the HH equation becomes -2 = log ([A]/[HA]), or 0.01 = [A]/[HA]. This means that the functional group will be about 99% protonated (with either 0 or +1 charge, depending on the functional group).
  • If the pH is 2 units above the pKa, the HH equation becomes 2 = log ([A]/[HA]), or 100 = [A]/[HA]. Therefore the functional group will be about 99% deprotonated.
  • If the pH = pKa, the HH equation becomes 0 = log ([A]/[HA]), or 1 = [A]/[HA]. Therefore the functional group will be 50% deprotonated.
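The three cases above can be checked numerically. The following is a minimal sketch, not part of the original notes: the function name `fraction_deprotonated` and the example pKa of 4.7 are illustrative choices, and the calculation assumes a single, independent ionizable group that follows the HH equation.

```python
def fraction_deprotonated(pH: float, pKa: float) -> float:
    """Fraction of a group in the deprotonated form A.

    Rearranged from pH = pKa + log([A]/[HA]):
    [A]/[HA] = 10**(pH - pKa), and fraction = ratio / (1 + ratio).
    """
    ratio = 10 ** (pH - pKa)  # [A]/[HA]
    return ratio / (1 + ratio)


# Illustrative pKa (hypothetical group); only the difference pH - pKa matters.
EXAMPLE_PKA = 4.7

for offset in (-2, -1, 0, 1, 2):
    f = fraction_deprotonated(EXAMPLE_PKA + offset, EXAMPLE_PKA)
    print(f"pH - pKa = {offset:+d}: {100 * f:.1f}% deprotonated")
```

Running the loop gives roughly 1%, 9%, 50%, 91%, and 99% deprotonated, which matches the bullet points: two pH units on either side of the pKa is enough to treat the group as essentially fully protonated or fully deprotonated.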

From these simple examples, we have derived the ±2 rule: a functional group is essentially fully protonated when the pH is 2 units below its pKa and essentially fully deprotonated when the pH is 2 units above it. This rule is used to quickly determine protonation, and hence charge state, and is extremely important to know (and easy to derive). Titration curves for Gly (no ionizable side chain), Glu (carboxylic acid side chain) and Lys (amine side chain) are shown below. You should be able to associate various sections of these curves with titration of specific ionizable groups in the amino acids.

Figure: Titration curves for Gly, Glu, and Lys

Buffer Review

The Henderson-Hasselbalch equation is also useful in calculating the composition of buffer solutions. Remember that buffer solutions are composed of a weak acid and its conjugate base. Consider the equilibrium for a weak acid, like acetic acid, and its conjugate base, acetate:

\[\mathrm{CH_3CO_2H + H_2O \Leftrightarrow H_3O^+ + CH_3CO_2^-}\]

If the buffer solution contains equal concentrations of acetic acid and acetate, the pH of the solution is:

\[\mathrm{pH = pK_a + \log \dfrac{[A]}{[HA]} = 4.7 + \log 1 = 4.7}\]

A look at the titration curve for the carboxyl group of Gly (see above) shows that when the pH = pKa, the slope of the curve (i.e. the change in pH with addition of base or acid) is at a minimum. As a general rule of thumb, a buffer solution can be made from a weak acid and its conjugate base within about +/- 1 pH unit of the pKa of the weak acid. At pH = pKa, the buffer solution best resists addition of either acid or base, and hence has its greatest buffering ability. The weak acid can react with added strong base to form the weak conjugate base, and the conjugate base can react with added strong acid to form the weak acid (as shown below), so pH changes on addition of strong acid or base are minimized.

  • addition of strong base produces weak conjugate base: \(\mathrm{CH_3CO_2H + OH^- \rightarrow CH_3CO_2^- + H_2O}\)
  • addition of strong acid produces weak acid: \(\mathrm{H_3O^+ + CH_3CO_2^- \rightarrow CH_3CO_2H + H_2O}\)

There are two simple ways to make a buffered solution. Consider an acetic acid/acetate buffer solution.

  • make equimolar solutions of acetic acid and sodium acetate, and mix them, monitoring pH with a pH meter, until the desired pH is reached (+/- 1 unit from the pKa).
  • take a solution of acetic acid and add NaOH in substoichiometric amounts until the desired pH is reached (+/- 1 unit from the pKa). In this method you are forming the conjugate base, acetate, on addition of the strong base:

\[\mathrm{CH_3CO_2H + OH^- \rightarrow CH_3CO_2^- + H_2O}\]
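The second recipe can be estimated with the HH equation before going to the bench. This is a rough sketch under stated assumptions, not a laboratory protocol: the function name `ph_after_naoh` is hypothetical, each mole of NaOH is assumed to convert exactly one mole of acetic acid to acetate, and activity effects and water autoionization are ignored. The pKa of 4.7 is the value used in the text.

```python
import math


def ph_after_naoh(mol_acid: float, mol_naoh: float, pKa: float = 4.7) -> float:
    """Estimate the pH of a weak acid partly neutralized by NaOH.

    Assumes CH3CO2H + OH- -> CH3CO2- + H2O goes to completion, so
    moles of conjugate base formed = moles of NaOH added.
    """
    if not 0 < mol_naoh < mol_acid:
        raise ValueError("NaOH must be non-zero and substoichiometric")
    mol_base = mol_naoh                    # acetate formed
    mol_acid_left = mol_acid - mol_naoh    # acetic acid remaining
    # HH equation; the shared volume cancels in the ratio [A]/[HA].
    return pKa + math.log10(mol_base / mol_acid_left)


# Half-neutralization gives [A] = [HA], so the estimate equals the pKa.
print(round(ph_after_naoh(mol_acid=0.10, mol_naoh=0.05), 2))
```

Because only the ratio of base to acid enters the equation, the estimate does not depend on the total volume. In practice the final adjustment is still made with a pH meter, as described above.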

  • Buffers for pH control: Recipes based on pKas for acids, temperature, and ionic strength

Isoelectric Point

What happens if you have many ionizable groups in a single molecule, as is the case with a polypeptide or protein? Consider a protein. At a pH of 2, all ionizable groups would be protonated, and the overall charge of the protein would be positive. (Remember, when carboxylic acid side chains are protonated, their net charge is 0.) As the pH is increased, the most acidic groups will start to deprotonate and the net charge will become less positive. At high pH, all the ionizable groups will become deprotonated in the strong base, and the overall charge of the protein will be negative. At some pH, then, the net charge will be 0. This pH is called the isoelectric point (pI). The pI can be determined by averaging the pKa values of the two groups which are closest to and straddle the pI. One of the online problems will address this in more detail.
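The idea of a pH at which the net charge is 0 can be illustrated numerically. The sketch below is an assumption-laden illustration rather than the method used by any particular pI calculator: it treats every ionizable group as independent and obeying the HH equation, the function names are hypothetical, and the pKa values in the example are illustrative (2.2, 9.0, and 10.5 for lysine; 2.3 and 9.6 for a glycine-like amino acid with no ionizable side chain).

```python
def net_charge(pH, acidic_pKas, basic_pKas):
    """Average net charge, treating each group as an independent HH equilibrium."""
    charge = 0.0
    for pKa in acidic_pKas:   # HA (0) <-> A- (-1), e.g. a carboxyl group
        charge -= 1 / (1 + 10 ** (pKa - pH))   # fraction deprotonated
    for pKa in basic_pKas:    # HA+ (+1) <-> A (0), e.g. an amino group
        charge += 1 / (1 + 10 ** (pH - pKa))   # fraction protonated
    return charge


def isoelectric_point(acidic_pKas, basic_pKas, lo=0.0, hi=14.0, tol=1e-6):
    """Find the pH of zero net charge by bisection.

    Net charge only decreases as pH rises, so a sign change brackets the pI.
    """
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if net_charge(mid, acidic_pKas, basic_pKas) > 0:
            lo = mid   # still positive: pI lies at higher pH
        else:
            hi = mid
    return (lo + hi) / 2


# Illustrative pKa values (assumptions, see lead-in).
print(round(isoelectric_point([2.3], [9.6]), 2))        # glycine-like
print(round(isoelectric_point([2.2], [9.0, 10.5]), 2))  # lysine
```

For these inputs the numerical answer agrees with the averaging rule: the glycine-like case lands at the mean of 2.3 and 9.6, and lysine lands at the mean of 9.0 and 10.5, the two pKa values that straddle the zero-charge species.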

  • List of pI and MW for proteins derived from 2D gels

Remember that pKa is really a measure of the equilibrium constant for the reaction. And of course, you remember that ΔG° = -RT ln Keq. Therefore, pKa is independent of concentration, and depends only on the intrinsic stability of reactants with respect to the products. This is true only at a given set of conditions, such as temperature, pressure, and solvent.

Consider, for example, acetic acid, which in aqueous solution has a pKa of about 4.7. It is a weak acid, which dissociates only slightly to form H+ (in water the hydronium ion, H3O+, is formed) and acetate (Ac-). These ions are moderately stable in water, but reassociate readily to re-form acetic acid. The pKa of acetic acid in 80% ethanol is 6.87. This can be accounted for by the decrease in stability of the charged products, which are less shielded from each other by the less polar ethanol. Ethanol has a lower dielectric constant than does water. The pKa increases to 10.32 in 100% ethanol, and to a whopping 130 in air!
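The link between pKa and ΔG° can be made concrete. Since ΔG° = -RT ln Ka and pKa = -log Ka, it follows that ΔG° = RT ln(10) × pKa. The sketch below applies this to the acetic acid pKa values quoted above; the temperature of 298.15 K is an assumption (the text does not state one), and the function name `delta_g_standard` is hypothetical.

```python
import math

R = 8.314  # gas constant, J/(mol K)


def delta_g_standard(pKa: float, temperature_K: float = 298.15) -> float:
    """Standard free energy of dissociation in kJ/mol.

    From ΔG° = -RT ln Ka with Ka = 10**(-pKa), so ΔG° = RT ln(10) * pKa.
    The default temperature (25 °C) is an assumption, not from the text.
    """
    return R * temperature_K * math.log(10) * pKa / 1000


# pKa values for acetic acid as quoted in the text.
for solvent, pKa in [("water", 4.7), ("80% ethanol", 6.87), ("100% ethanol", 10.32)]:
    print(f"{solvent}: pKa {pKa} -> {delta_g_standard(pKa):.1f} kJ/mol")
```

Each pKa unit corresponds to about 5.7 kJ/mol at this temperature, so the rise in pKa on going from water to ethanol reflects a dissociation that becomes progressively more unfavorable as the solvent stabilizes the ions less.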

  • A great interactive web site: Amino Acid Acid/Base Titration Curves
  • pI calculator for any protein sequence
  • Amino Acid Repository: Properties of Amino Acids

The Biology of Nutrients

Supratim Choudhuri, Ronald F. Chanderbhan, in Nutraceuticals, 2016

Amino Acid Transporters

Amino acid (AA) transporters accept groups of AAs rather than individual AAs, such as small neutral AAs, large neutral AAs, anionic AAs, and cationic AAs. Some transporters are Na+-dependent while others are Na+-independent. Initially, AA transporters were classified as “systems,” such as System L, System A, System ASC, System X AG, System y+, System b0,+, and many others. In this nomenclature, names in uppercase indicate Na+-dependent transporters, whereas names in lowercase indicate Na+-independent transporters. AA transporters can be symporters, antiporters, or uniporters. Neutral and anionic AAs are taken up into intestinal epithelial cells by Na+-coupled cotransport, whereas cationic/dibasic AAs are transported largely by an AA exchange mechanism (Poncet and Taylor, 2013).

The modern classification of AA transporters in mammalian cells, based on similarity between transporter gene sequences, has largely replaced the classical “systems”-based classification (Taylor, 2014). There are seven families of AA transporters in the SLC gene superfamily: SLC1A (high-affinity glutamate and neutral AA transporters), SLC6A (Na+-dependent neurotransmitter transporters), SLC7A (cationic and neutral AA transporters), SLC16A (monocarboxylate, such as lactate, pyruvate, and others, and aromatic AA transporters; these can be H+-dependent or facilitative), SLC36A (H+-dependent AA transporters), SLC38A (Na+-dependent neutral AA transporters), and SLC43A (facilitative branched-chain AA and amino alcohol transporters). Each family has many isoforms. The monocarboxylate transporter (MCT1–9; SLC16A1–9) is important in fatty acid transport as well (discussed later). The protein names of the AA transporters frequently reflect the substrates they transport. For example, ASCT1 (SLC1A4) transports A (Ala), S (Ser), C (Cys), and T (Thr). AA transporters usually contain 10–12 TMDs.

What is the structural formula for lysine with pH of 4?

The first proton to be lost as the pH is raised (when base is added) is the proton of the alpha-carboxyl group (pKa1 = 2.2). The next proton to be removed is in the alpha-aminium group (pKa2 = 9.0). The final proton to be removed is at the side chain's aminium group (pKa3 = 10.5).

The state of the amino acid at any given pH is determined by a combination of two equilibria.

Referring to the Henderson–Hasselbalch equation,

\[\mathrm{pH = pK_a + \log \dfrac{[A^-]}{[HA]}}\]

when the pKa is the same as the pH, it means

\[\mathrm{\log \dfrac{[A^-]}{[HA]} = 0}\]

which means [HA] = [A-], meaning we have an equal amount of the 2 forms of the amino acid.

Starting at very low pH, the predominant structure of lysine is the dicationic form. As more base is added (pH increases), some of the monocationic form appears. At pH = pKa1 = 2.2, we have an equal amount of the dicationic and monocationic forms.

As the pH increases (past pH 2.2), there will be more of the monocationic form than of the dicationic form.

At pH = pKa2 = 9.0, there will be an equal amount of the monocationic form and the dipolar ion.

Since the question is asking for the structure at pH 4, that's after passing pKa1 but well before reaching pKa2. That means the predominant structure of lysine will be the monocationic form: the alpha-carboxyl group is deprotonated (-COO-), while both the alpha-amino group and the side-chain amino group are still protonated (-NH3+).
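The conclusion for pH 4 can be checked with a short calculation. This is a minimal sketch that assumes the three groups ionize independently and follow the Henderson–Hasselbalch equation; the pKa values are the ones quoted in this answer, and the function names are hypothetical.

```python
# pKa values for lysine as quoted above.
PKA_CARBOXYL = 2.2     # alpha-carboxyl: -COOH (0) <-> -COO- (-1)
PKA_ALPHA_AMINO = 9.0  # alpha-aminium: -NH3+ (+1) <-> -NH2 (0)
PKA_SIDE_CHAIN = 10.5  # side-chain aminium: -NH3+ (+1) <-> -NH2 (0)


def fraction_protonated(pH: float, pKa: float) -> float:
    """Fraction of a group still holding its proton, from the HH equation."""
    return 1 / (1 + 10 ** (pH - pKa))


def lysine_net_charge(pH: float) -> float:
    """Average net charge of free lysine, assuming independent groups."""
    carboxyl = -(1 - fraction_protonated(pH, PKA_CARBOXYL))
    alpha_amino = fraction_protonated(pH, PKA_ALPHA_AMINO)
    side_chain = fraction_protonated(pH, PKA_SIDE_CHAIN)
    return carboxyl + alpha_amino + side_chain


print(f"carboxyl protonated at pH 4: {fraction_protonated(4.0, PKA_CARBOXYL):.3f}")
print(f"net charge at pH 4: {lysine_net_charge(4.0):+.2f}")
```

At pH 4 fewer than 2% of the carboxyl groups still carry their proton while both aminium groups are essentially fully protonated, so the average net charge is very close to +1, consistent with the monocationic form being predominant.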

Site-directed mutagenesis of an HLA-A3 gene identifies amino acid 152 as crucial for major-histocompatibility-complex-restricted and alloreactive cytotoxic-T-lymphocyte recognition

Major histocompatibility complex-restricted and alloreactive cytotoxic T lymphocytes (CTL) can discriminate between the HLA-A3.1 and HLA-A3.2 antigens. HLA-A3.1 and the rare variant HLA-A3.2 have been shown to differ by two amino acids in the alpha 2 domain at positions 152 (A3.1, glutamic acid; A3.2, valine) and 156 (A3.1, leucine; A3.2, glutamine). To determine the structural basis for the ability of CTL to differentiate A3.1 from A3.2, two site-directed mutants of the HLA-A3.2 gene were produced, 152A3.1-156A3.2 and 152A3.2-156A3.1, that have the indicated codons for positions 152 and 156. These mutated HLA-A3 genes, as well as the nonmutated HLA-A3.1 and HLA-A3.2 genes, were then transfected into the murine cell line P815-HTR and used as targets for human CTL. Influenza virus-specific HLA-A3.1-restricted CTL lysed virus-infected P815 cells transformed with the HLA-A3.1 and 152A3.1-156A3.2 genes, but not P815 cells transformed with the HLA-A3.2 and 152A3.2-156A3.1 genes. HLA-A3.2-allospecific CTL lysed the P815 cells transformed with the HLA-A3.2 and 152A3.2-156A3.1 genes but did not lyse P815 cells transformed with the HLA-A3.1 or 152A3.1-156A3.2 genes. Thus, a single amino acid change at position 152, substituting valine for glutamic acid and thereby introducing a charge difference, produces major structural changes in the epitopes recognized by major histocompatibility complex-restricted and alloreactive CTL.


Practice: Which shows the proper structure of Leu at physiological pH?

Concept #2: Determining Predominant Amino Acid Structures

Example #1: At pH 2, which is the predominant structure of Val?

Practice: Fill in the groups for the predominant structure of Ala at pH 13.

Practice: Fill in the appropriate groups for Asp at pH 4.3.

Practice: Draw the predominant structure of Arg at pH 6.5. (pKa1 = 9.04, pKa2 = 2.17, pKa3 = 12.48)

Practice: At what pH would an amino acid bear both a neutral -COOH and a -NH2 group?



The introduction of antibiotics in the 20th century contributed hugely to extend the human life span. However, the increase in antibiotic resistance and the concomitant steep decline in the number of new compounds discovered via high-throughput screening [1,2] means that we again face huge challenges to treat infections by multidrug resistant bacteria [3]. The low return of investment of high-throughput screening is due to dereplication, in other words, the rediscovery of bioactive compounds that have been identified before [4,5]. A revolution in our understanding was brought about by the development of next-generation sequencing technologies. Actinobacteria are the most prolific producers of bioactive compounds, including some two-thirds of the clinical antibiotics [6,7]. Mining of the genome sequences of these bacteria revealed a huge repository of previously unseen biosynthetic gene clusters (BGCs), highlighting that their potential as producers of bioactive molecules had been grossly underestimated [6,8,9]. However, these BGCs are often not expressed under laboratory conditions, most likely because the environmental cues that activate their expression in their original habitat are missing [10,11]. To circumvent these issues, a common strategy is to select a candidate BGC and force its expression by expression of the pathway-specific activator or via expression of the BGC in a heterologous host [12]. However, these methods are time-consuming, while it is hard to predict the novelty and utility of the compounds they produce.

To improve the success of genome mining-based drug discovery, many bioinformatic tools have been developed for identification and prioritization of BGCs. These tools often rely on conserved genetic markers present in BGCs of certain natural products, such as polyketides (PKs), non-ribosomal peptides (NRPs), and terpenes [13–15]. While these methods have unearthed vast amounts of uncharacterized BGCs, they further expand on previously characterized classes of natural products. This raises the question of whether entirely novel classes of natural products could still be discovered. A few genome mining methods, such as ClusterFinder [16] and EvoMining [17,18], have tried to tackle this problem. These methods either use criteria true of all BGCs or build around the evolutionary properties of gene families found in BGCs, rather than using BGC-class-specific genetic markers. While the lack of clear genetic markers may result in a higher number of false positives, these methods have indeed charted previously uncovered biochemical space and led to the discovery of new natural products.

One class of natural products whose expansion has been fueled by the increased amount of genomic sequences available is that of the ribosomally synthesized and post-translationally modified peptides (RiPPs) [19]. RiPPs are characterized by a unifying biosynthetic theme: A small gene encodes a short precursor peptide, which is extensively modified by a series of enzymes that typically recognize the N-terminal part of the precursor called the leader peptide, and finally cleaved to yield the mature product [20]. Despite this common biosynthetic logic, RiPP modifications are highly diverse. The latest comprehensive review categorizes RiPPs into roughly 20 different classes [19], such as lanthipeptides, lasso peptides, and thiopeptides. Each of these classes is characterized by one or more specific modifications, such as the thioether bridge in lanthipeptides or the knot-like structure of lasso peptides. Despite the extensive list of known classes and modifications, new RiPP classes are still being found. Newly identified RiPP classes often carry unusual modifications, such as D-amino acids [21], addition of unnatural amino acids [22,23], β-amino acids [24], or new variants of thioether crosslinks [25,26]. These discoveries strongly indicate that the RiPP genomic landscape remains far from completely charted and that novel types of RiPPs with new and unique biological activities may yet be uncovered. However, RiPPs pose a unique and major challenge to genome-based pathway identification attempts: Unlike in the case of nonribosomal peptide synthetases (NRPSs) and polyketide synthetases (PKSs), there are no universally conserved enzyme families or enzymatic domains that are found across all RiPP pathways. Rather, each class of RiPPs comprises its own unique set of enzyme families to post-translationally modify the precursor peptides belonging to that class. 
Hence, while BGCs for known RiPP classes can be identified using conventional genome mining algorithms, a much more elaborate strategy is required to automate the identification of novel RiPP classes.

Several methods have made progress in tackling this challenge. “Bait-based” approaches such as RODEO [26–31] and RiPPer [32] identify RiPP BGCs by looking for homologs of RiPP modifying enzymes of interest, and by finding the genes encoding these enzymes in novel contexts they have uncovered many new RiPP BGCs. One study instead used a transporter gene as a query, which makes the search less dependent on a specific RiPP subclass [33]. However, these methods still require a known query gene from a known RiPP subclass. Another recently described tool, NeuRiPP, is capable of predicting precursors independent of RiPP subclass but is limited to precursor analysis [34]. Yet another tool, DeepRiPP, can detect novel RiPP BGCs that are chemically far removed from known examples but is mainly designed to identify new members of known classes [35]. In the end, an algorithm for the discovery of BGCs encoding novel RiPP classes will need to integrate various sources of information to reliably identify genomic regions that are likely to encode RiPP precursors along with previously undiscovered modifying enzymes.

Here, we present decRiPPter (Data-driven Exploratory Class-independent RiPP TrackER), an integrative algorithm for the discovery of novel classes of RiPPs, without requiring prior knowledge of their specific modifications or core enzymatic machinery. DecRiPPter employs a Support Vector Machine (SVM) classifier that predicts RiPP precursors regardless of RiPP subclass, and combines this with pan-genomic analysis to identify which putative precursor genes are located within specialized genomic regions that encode multiple enzymes and are part of the accessory genome of a genus. Sequence similarity networking of the resulting precursors and gene clusters then facilitates further prioritization. Applying this method to the gifted natural product producer genus Streptomyces, we identified 42 new RiPP family candidates. Experimental characterization of a widely distributed candidate RiPP BGC led to the discovery of a novel lanthipeptide that was produced by a previously unknown enzymatic machinery.


We thank Ian Molineux, Priscilla Kemp, and Heather Keller for discussions and advice throughout the work. We thank John Dunn and Barbara Lade for the pSCANS-5 vector. We thank Roger Brent, Eric Eisenstadt, Tom Knight, and members of the Endy group for additional discussions and sustained encouragement. We thank Jorge Borges and Adolfo Casares for ‘On Exactitude in Science’ (Davis, 1946). We thank Austin Che, Heather Keller, Alex Mallet, Kathleen McGinness, Samantha Sutton, Ty Thomson, Elizabeth Vesilind, and Rebecca Ward for comments on the manuscript. We thank Felice Frankel for plaque photography and encouragement. This work was funded by grants to DE from the US Office of Naval Research, DARPA, and NIH. SK was supported by an NIH MIT BPEC training fellowship. Additional support was provided by MIT.