<!------------Discussion----------->
Cysteine is not a common amino acid, with an abundance of 2.3% genome-wide, 
being most often found paired in disulfides.  Disulfides in human proteins 
are found almost exclusively in oxidative environments, such as on the cell 
surface or in lysosomes, but not in the cytoplasm or the nucleus. 
Therefore,  high cysteine content, after discounting for iron-sulfur cluster 
proteins, is contra-indicative of a cytoplasmic location.  Similarly, the 
presence of glycosylation sites, as cell-surface indicators, suggests that 
cysteines in a protein will usually be paired.
<P>
The Cysteines track shows where cysteines occur along the peptide chain,
but does not predict whether a given cysteine occurs in a disulfide or
where its partner is located. The two cysteines 
linked in a disulfide are not necessarily 
located near one another in the amino acid sequence or even in the same chain.  
Rather, the correct cysteine residues are brought into proximity during 
protein folding, which may bring remote cysteine residues together. Mammals 
have an extensive disulfide isomerase activity 
in the endoplasmic reticulum believed to chaperon disulfides toward the 
correct pairing (and thus correct folding). 
<P>
Cystine-pair information is available when the 3-D structure is known, 
when the selected protein can be threaded to a known structure, when a 
satisfactory <em>ab initio</em> model exists, or by sequence alignment
homology. UniProtKB has also collected 
experimentally-determined disulfides from the literature. 
<P>
Various web tools can predict disulfide bonds with varying degrees of
success [<A HREF="#disulf_tools">1-4</A>]. Intermolecular disulfides are fairly common,
yet the partners often unknown and so very difficult to take into account.
Where reliably predictable, disulfides provide a strong geometrical
constraint on <em>ab initio</em> structure prediction to applicable proteins.
<P>
Disulfide bonds are fairly well-conserved evolutionarily in many protein 
families, even as the percent identity drops below 35%. However, 
in some deeply diverged families such as sulfatases, new disulfides have emerged
in subfamily lineages and no ancient disulfides are retained. Conserved
cysteines that are not part of an active site are distinguishable from sporadic
cysteines and are likely in a disulfide.
If the family is large enough and the number of cysteines 
fairly small, the pairing pattern can sometimes be inferred, starting with the
two 
best-conserved cysteines found in the deepest alignment. Other proteins, with 
complex disulfide knots, are intractable to homology methods. 
<P>
Certain protein domains have characteristic disulfide motifs, for example,
the CxxxCxxC 4Fe4S clusters in 
<A HREF="http://nar.oxfordjournals.org/cgi/content/full/29/5/1097" 
TARGET=_blank>radical SAM enzymes</A>. These are often preserved
even as the domain finds itself in a much larger protein with many
additional cysteines. The domain tool 
<A HREF="http://www.sanger.ac.uk/Software/Pfam/" TARGET=_blank>Pfam</A> 
provides 28 domain listings under disulfide.
<P>
Another special case occurs in transmembrane proteins. For example, ectodomain 
cysteines will not be paired with transmembrane or endodomain cysteines. Being 
external to the cell, they are likely in disulfides although the pairing is not 
always resolved and indeed may be intermolecular. 
<P>
<CENTER>
<IMG height=266 width=277 src="cys.jpg">
</CENTER>

<P>
<A NAME=disulf_tools></A>
Disulfide bond prediction references:
<OL>
<LI>Fariselli P, Casadio R.
<A HREF="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;list_uids=11673241&amp;dopt=Abstract"
TARGET=_blank>Prediction of disulfide connectivity in proteins
</A>. <em>Bioinformatics</em> 2001 Oct;17(10):957-64. 
<LI>Fariselli P, Riccobelli P, Casadio R.
<A HREF="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&amp;db=PubMed&amp;list_uids=10409827&amp;dopt=Abstract"
TARGET=_blank>Role of evolutionary information in predicting the
disulfide-bonding state of cysteine in proteins</A>. <em>Proteins</em> 1999 Aug
15;36(3):340-6.
<LI>Martelli PL, Fariselli P, Malaguti L, Casadio R.
<A HREF="http://www.ncbi.nlm.nih.gov:80/entrez/query.fcgi?cmd=Retrieve&db=PubMed&list_uids=12601133&dopt=Abstract"
TARGET=_blank>Prediction of the disulfide bonding state of cysteines in proteins 
with hidden neural networks</A>. <em>Protein Eng.</em> 2002 Dec;15(12):951-3.
<LI>Mucchielli-Giorgi MH, Hazout S, Tuffery P. <A
HREF="http://www.ncbi.nlm.nih.gov:80/entrez/query.fcgi?cmd=Retrieve&db=PubMed&list_uids=11835499&dopt=Abstract"
TARGET=_blank>Predicting the disulfide bonding state of cysteines using
protein descriptors</A>. <em>Proteins</em> 2002 Feb 15;46(3):243-9.
</OL>
