The glycan list is under construction, but you can search glycan structures registered in GlyTouCan.
GlycoCT is an encoding scheme for carbohydrate sequences based on a connection table approach. It adopts IUPAC rules to generate a consistent, machine-readable nomenclature and uses a block concept to describe features such as repeating units. It has two variants, a condensed format and an XML format. The condensed format allows glycan structures to be identified uniquely in a compact form.

Monosaccharide names follow the pattern a-bccc-DDD-e:f|g:h, where a is the anomeric configuration (one of a, b, o, x), b is the configurational symbol (one of d, l, x), ccc is the three-letter code for the monosaccharide as listed in Table 1.1, DDD is the base type or superclass indicating the number of consecutive carbon atoms (such as HEX, PEN, NON), e and f are the carbon numbers involved in closing the ring, g is the position of the modifier, and h is the type of modifier. For a, b, e, f and g, an x can be used to specify an unknown value. bccc and g:h may be repeated if necessary.

Substituents of monosaccharides are treated as separate residues attached to the base residue. The residue type is specified by one of the following codes immediately after the residue number: b = basetype, s = substituent, r = repeating unit, a = alternative unit. The list of substituents handled by GlycoCT is given in Table 1.2. Like the KCF format, GlycoCT specifies the residues in a RES section and the linkages in a LIN section.
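The naming pattern a-bccc-DDD-e:f|g:h can be illustrated with a small parser. This is a minimal sketch, not part of any GlycoCT tool: the names `NAME_RE`, `MonosaccharideName` and `parse_name` are hypothetical, and the regular expression assumes upper-case superclass codes, lower-case modifier types, and positions that are either digits or x. Names outside that shape are rejected rather than guessed at.

```python
import re
from dataclasses import dataclass

# Matches a-bccc-DDD-e:f|g:h, where "-bccc" and "|g:h" may repeat.
# Assumption: superclasses are upper-case codes, modifier types are
# lower-case words, and positions are digits or "x" for unknown.
NAME_RE = re.compile(
    r"^(?P<anomer>[abox])"
    r"(?P<stems>(?:-[dlx][a-z]{3})+)"
    r"-(?P<superclass>[A-Z]+)"
    r"-(?P<ring_start>\d+|x):(?P<ring_end>\d+|x)"
    r"(?P<mods>(?:\|(?:\d+|x):[a-z]+)*)$"
)


@dataclass
class MonosaccharideName:
    anomer: str
    stems: list       # [(configuration, three-letter code), ...]
    superclass: str
    ring: tuple       # (e, f) kept as strings, since either may be "x"
    modifiers: list   # [(position, type), ...]


def parse_name(name: str) -> MonosaccharideName:
    """Split a monosaccharide name into its components."""
    m = NAME_RE.match(name)
    if m is None:
        raise ValueError(f"unrecognised monosaccharide name: {name!r}")
    # "-dgro-dgal" -> [("d", "gro"), ("d", "gal")]
    stems = [(s[0], s[1:]) for s in m["stems"].strip("-").split("-")]
    # "|6:d" -> [("6", "d")]; an empty modifier list stays empty
    mods = [tuple(p.split(":")) for p in m["mods"].split("|") if p]
    return MonosaccharideName(
        m["anomer"], stems, m["superclass"],
        (m["ring_start"], m["ring_end"]), mods,
    )


parsed = parse_name("a-lgal-HEX-1:5|6:d")
print(parsed.stems, parsed.ring, parsed.modifiers)
```

Ring and modifier positions are kept as strings so that an unknown value (x) needs no special casing.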
List of monosaccharides and their three-letter codes used in GlycoCT.
|Monosaccharide name||Three-letter code||Superclass|
List of substituents used in GlycoCT.
Example of GlycoCT format:
RES
1b:b-dglc-HEX-1:5
2b:b-dgal-HEX-1:5
3b:b-dglc-HEX-1:5
4s:n-acetyl
5b:b-dgal-HEX-1:5
6b:a-lgal-HEX-1:5|6:d
LIN
1:1o(4+1)2d
2:2o(3+1)3d
3:3d(2+1)4n
4:3o(3+1)5d
5:5o(2+1)6d
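The RES and LIN sections of the example can be read into simple Python structures as follows. This is a sketch under stated assumptions: the function name `parse_glycoct` and the dictionary keys are hypothetical, and each LIN entry is read here as `id:parent,type(parent position+child position)child,type`, which is an interpretation the text above does not spell out. Only plain integer positions are handled.

```python
import re

EXAMPLE = """\
RES
1b:b-dglc-HEX-1:5
2b:b-dgal-HEX-1:5
3b:b-dglc-HEX-1:5
4s:n-acetyl
5b:b-dgal-HEX-1:5
6b:a-lgal-HEX-1:5|6:d
LIN
1:1o(4+1)2d
2:2o(3+1)3d
3:3d(2+1)4n
4:3o(3+1)5d
5:5o(2+1)6d
"""

# Residue type codes as listed in the text.
RESIDUE_KINDS = {
    "b": "basetype",
    "s": "substituent",
    "r": "repeating unit",
    "a": "alternative unit",
}

RES_RE = re.compile(r"^(\d+)([bsra]):(.+)$")
# Assumed reading of a linkage: id:parent,type(parent_pos+child_pos)child,type
LIN_RE = re.compile(r"^(\d+):(\d+)([a-z])\((-?\d+)\+(-?\d+)\)(\d+)([a-z])$")


def parse_glycoct(text):
    """Return (residues, linkages) from condensed GlycoCT text."""
    residues, linkages, section = {}, [], None
    # Entries contain no whitespace, so splitting on whitespace works
    # whether the input is one entry per line or on a single line.
    for token in text.split():
        if token in ("RES", "LIN"):
            section = token
        elif section == "RES":
            m = RES_RE.match(token)
            if m is None:
                raise ValueError(f"bad residue entry: {token!r}")
            residues[int(m[1])] = (RESIDUE_KINDS[m[2]], m[3])
        elif section == "LIN":
            m = LIN_RE.match(token)
            if m is None:
                raise ValueError(f"bad linkage entry: {token!r}")
            linkages.append({
                "id": int(m[1]),
                "parent": int(m[2]), "parent_type": m[3],
                "parent_pos": int(m[4]), "child_pos": int(m[5]),
                "child": int(m[6]), "child_type": m[7],
            })
        else:
            raise ValueError(f"entry before any section header: {token!r}")
    return residues, linkages


residues, linkages = parse_glycoct(EXAMPLE)
for link in linkages:
    print(residues[link["parent"]][1], "->", residues[link["child"]][1])
```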
IUPAC suggests an extended IUPAC form in which structures are written across multiple lines. This is the format originally used by CarbBank, so it is sometimes referred to as the CarbBank format. Monosaccharides are represented as in the IUPAC format: each monosaccharide residue is preceded by the anomeric descriptor and the configuration symbol, and the ring size is indicated by an italic f or p. If any of α/β, D/L, or f/p is omitted, that structural detail is assumed to be unknown. Branches are written on a second line, or in brackets on the same line. This format may substitute a and b for α and β, respectively. Arrows (→) may also be replaced by hyphens (-), and up (↑) and down (↓) arrows by bars (|).
Example of CarbBank format: The N-glycan core structure represented in CarbBank (extended IUPAC) format.
a-D-Manp-(1-6)+
              |
              b-D-Manp-(1-4)-b-D-GlcpNAc-(1-4)-a-D-GlcpNAc
              |
a-D-Manp-(1-3)+
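The character substitutions permitted by this format (a and b for α and β, hyphens for arrows, bars for vertical arrows) amount to a one-to-one character mapping. A minimal sketch, where the names `ASCII_MAP` and `to_ascii` are hypothetical:

```python
# One-to-one replacements described for the extended IUPAC format.
ASCII_MAP = str.maketrans({
    "α": "a",
    "β": "b",
    "→": "-",
    "↑": "|",
    "↓": "|",
})


def to_ascii(line: str) -> str:
    """Rewrite one line of extended IUPAC text using ASCII substitutes."""
    return line.translate(ASCII_MAP)


print(to_ascii("α-D-Manp-(1→6)+"))
```

Because every replacement is a single character, column alignment between the lines of a branched structure is preserved.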
Linear Code® is a carbohydrate format that uses a single-letter nomenclature for monosaccharides and includes a condensed description of the glycosidic linkages. Each monosaccharide is represented by its common structure, and modifications to that common structure are indicated by specific symbols, as follows (Banin et al. (2002)). Stereoisomers (D or L) that differ from the common isomer are indicated by an apostrophe ('). Monosaccharides whose ring size (furanose or pyranose) differs from the common form are indicated by a caret (^). Monosaccharides that differ in both respects are indicated by a tilde (~).
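The three symbols can be summarised as a small decision rule. This sketch makes two assumptions that the text above does not state: the symbol is appended directly after the monosaccharide's code, and the code itself (the letter `A` below is only a placeholder) is supplied by the caller. The function name `with_modification` is hypothetical.

```python
def with_modification(code: str, isomer_differs: bool, ring_differs: bool) -> str:
    """Append the symbol for deviations from the common structure.

    Assumption: the symbol is written as a suffix to the code.
    """
    if isomer_differs and ring_differs:
        return code + "~"   # differs in both stereoisomer and ring size
    if isomer_differs:
        return code + "'"   # stereoisomer differs from the common isomer
    if ring_differs:
        return code + "^"   # ring size differs from the common form
    return code             # common structure, no symbol


for flags in [(False, False), (True, False), (False, True), (True, True)]:
    print(flags, with_modification("A", *flags))
```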
List of common modifications as used in the Linear Code® format.
|Modification Type||Linear Code®|
Example of Linear Code®:
The KEGG Chemical Function (KCF) format for representing glycan structures was originally used to represent chemical structures (thus the name) in KEGG. KCF uses a graph notation in which nodes are monosaccharides and edges are glycosidic linkages. To represent a glycan, at least three sections are therefore required: ENTRY, NODE, and EDGE, followed by three slashes (///) at the end.
Example of KCF format: The N-glycan core structure represented in KCF format.
ENTRY       XYZ          Glycan
NODE        5
            1   GlcNAc   15.0    7.0
            2   GlcNAc    8.0    7.0
            3   Man       1.0    7.0
            4   Man      -6.0   12.0
            5   Man      -6.0    2.0
EDGE        4
            1   2:b1     1:4
            2   3:b1     2:4
            3   5:a1     3:3
            4   4:a1     3:6
///
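Because KCF is a graph notation, the example maps naturally onto a node table and an edge list. The following is a minimal sketch rather than a full KCF reader: the names `parse_kcf` and `_endpoint` are hypothetical, and it assumes exactly the three sections shown, an ENTRY line with two fields, and node names without whitespace. Each edge endpoint is kept as a (node number, descriptor) pair without further interpretation.

```python
KCF_EXAMPLE = """\
ENTRY       XYZ          Glycan
NODE        5
            1   GlcNAc   15.0    7.0
            2   GlcNAc    8.0    7.0
            3   Man       1.0    7.0
            4   Man      -6.0   12.0
            5   Man      -6.0    2.0
EDGE        4
            1   2:b1     1:4
            2   3:b1     2:4
            3   5:a1     3:3
            4   4:a1     3:6
///
"""


def _endpoint(token):
    """Split '2:b1' into (2, 'b1'); the descriptor is left as text."""
    node, descriptor = token.split(":")
    return int(node), descriptor


def parse_kcf(text):
    """Return (entry, nodes, edges) for a single KCF glycan record."""
    tokens = iter(text.split())

    def expect(keyword):
        token = next(tokens)
        if token != keyword:
            raise ValueError(f"expected {keyword!r}, found {token!r}")

    expect("ENTRY")
    entry = (next(tokens), next(tokens))          # (identifier, type)

    expect("NODE")
    nodes = {}
    for _ in range(int(next(tokens))):            # declared node count
        number, name = int(next(tokens)), next(tokens)
        nodes[number] = (name, float(next(tokens)), float(next(tokens)))

    expect("EDGE")
    edges = []
    for _ in range(int(next(tokens))):            # declared edge count
        next(tokens)                              # edge number, unused here
        edges.append((_endpoint(next(tokens)), _endpoint(next(tokens))))

    expect("///")
    return entry, nodes, edges


entry, nodes, edges = parse_kcf(KCF_EXAMPLE)
for (a, a_desc), (b, b_desc) in edges:
    print(f"{nodes[a][0]}({a_desc}) - {nodes[b][0]}({b_desc})")
```

Reading the record as a token stream, driven by the counts that follow NODE and EDGE, means the same code accepts the record whether or not its line breaks have been preserved.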
Web3 Unique Representation of Carbohydrate Structures (WURCS) is a linear notation for representing carbohydrates for the Semantic Web.
Example of WURCS format:
This work is licensed under a Creative Commons Attribution 4.0 International License.
Copyright © 2019 GlyCosmos Portal v1.1.0