PDB Chemical Component Dictionary Format Description: mmCIF Format

mmCIF format combines collections of related data items (tokens) into categories. A category is essentially a table in which each token represents a column and each set of values forms a row. The question mark (?) marks an item value as missing. A period (.) indicates that there is no appropriate value for the item or that a value has been intentionally omitted.
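The two placeholder markers can be told apart from real values with a small helper. This is a minimal sketch in Python, not part of any PDB toolkit; the function name classify_value and the labels it returns are illustrative assumptions, and it only looks at unquoted tokens.

```python
def classify_value(raw: str) -> str:
    """Classify one unquoted mmCIF value token.

    The labels returned here are made up for this example.
    """
    if raw == "?":
        return "missing"          # value is not known or not provided
    if raw == ".":
        return "not applicable"   # no appropriate value, or intentionally omitted
    return "value"                # anything else is ordinary data


for token in ["60.052", "?", "."]:
    print(token, "->", classify_value(token))
```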

 

Vectors and tables of data may be encoded in mmCIF using a loop_ directive. To build a table, the data item names corresponding to the table columns are preceded by the loop_ directive, and followed by the corresponding rows of data.
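The loop_ layout described above can be read back into rows with a few lines of Python. The sketch below is an illustration under stated assumptions rather than a full mmCIF parser: the function name parse_loop is hypothetical, it handles a single loop_ block, and it borrows shell-style quoting from the standard shlex module, which differs from CIF quoting in edge cases (for example, atom names containing an apostrophe). The sample rows are bond records from the acetic acid entry shown later in this document.

```python
import shlex

LOOP_TEXT = """\
loop_
_chem_comp_bond.comp_id
_chem_comp_bond.atom_id_1
_chem_comp_bond.atom_id_2
_chem_comp_bond.value_order
ACY C   O   DOUB
ACY C   OXT SING
ACY C   CH3 SING
"""


def parse_loop(text):
    """Parse one simple loop_ block into a list of dicts, one per table row."""
    names = []   # column headers (data item names)
    values = []  # flat list of values, read left to right
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or line == "loop_":
            continue
        if line.startswith("_") and not values:
            names.append(line)                # still in the header section
        else:
            values.extend(shlex.split(line))  # data section
    if not names or len(values) % len(names) != 0:
        raise ValueError("value count does not fill a whole number of rows")
    width = len(names)
    return [dict(zip(names, values[i:i + width]))
            for i in range(0, len(values), width)]


rows = parse_loop(LOOP_TEXT)
print(rows[0]["_chem_comp_bond.value_order"])  # DOUB
```

Because values are collected into one flat list before being cut into rows, the sketch does not depend on each row sitting on its own line, which matches how loop_ data is laid out.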

 

Note: Further description of mmCIF syntax and structure can be found here.

 

 

 

In an mmCIF format coordinate file, the chem_comp category is used to describe the chemical components in an entry. The chemical name for the chemical component is given by chem_comp.name, the chemical formula by chem_comp.formula, and the molecular weight by chem_comp.formula_weight.

 

For example, entry 1a7p contains the non-standard residue acetic acid (ID code: ACY):

 

loop_

_chem_comp.id

_chem_comp.type

_chem_comp.mon_nstd_flag

_chem_comp.name

_chem_comp.pdbx_synonyms

_chem_comp.formula

_chem_comp.formula_weight

ACY non-polymer         . 'ACETIC ACID'   ? 'C2 H4 O2'        60.052  

#
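The formula weight in the row above can be cross-checked against the formula string. The following Python sketch assumes the formula is written as space-separated element-count terms, as in 'C2 H4 O2', and uses abridged standard atomic weights for a handful of elements only. The function name formula_weight and the small weight table are assumptions made for this example; charged formulas and other elements are not handled.

```python
# Abridged standard atomic weights; only the elements needed here are listed.
ATOMIC_WEIGHT = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999}


def formula_weight(formula: str) -> float:
    """Sum atomic weights for a formula such as 'C2 H4 O2'."""
    total = 0.0
    for term in formula.split():
        symbol = term.rstrip("0123456789")     # element symbol, e.g. "C"
        count = int(term[len(symbol):] or 1)   # a missing count means 1
        total += ATOMIC_WEIGHT[symbol] * count
    return round(total, 3)


print(formula_weight("C2 H4 O2"))  # 60.052
```

The result agrees with the formula_weight value listed for ACY.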

 

Further information describing each non-standard residue is then provided in the Chemical Component Dictionary (see Dictionary Record Format).

 

Note: Please see the mmCIF format dictionary for more information about the chem_comp category. Further description of mmCIF syntax can be found here.

 

 

 

Dictionary Record Format

In the mmCIF format Chemical Component Dictionary, each chemical component is defined by sets of tokens in three categories: chem_comp (Table 1), chem_comp_atom (Table 2), and chem_comp_bond (Table 3).

 

 

Table 1: chem_comp category

Token | Definition | Example
_chem_comp.id | The alphanumeric code for the chemical component. | HYP
_chem_comp.name | The name of the chemical component. | 4-HYDROXYPROLINE
_chem_comp.type | The type of monomer. | L-peptide linking
_chem_comp.pdbx_type | A preliminary internal classification used by PDB. | ATOMP
_chem_comp.formula | The chemical formula of the chemical component. | 'C5 H9 N1 O3'
_chem_comp.pdbx_synonyms | Synonym list for the non-standard residue. | HYDROXYPROLINE
_chem_comp.mon_nstd_parent | A name of the parent monomer of the chemical component if the entry results from a modification of a standard monomer. | PRO
_chem_comp.pdbx_formal_charge | The formal charge on the chemical component. | +1
_chem_comp.mon_nstd_flag | Flag indicating whether or not the chemical component is a "standard" monomer. | n
_chem_comp.formula_weight | Formula mass in daltons of the chemical component. | 131.131

Table 2: chem_comp_atom category: tokens in this section are looped through for each atom in the chemical component

Token | Definition | Example
_chem_comp_atom.comp_id | Same as _chem_comp.id | HYP
_chem_comp_atom.atom_id | Identifier for each atom in the chemical component. | CA
_chem_comp_atom.type_symbol | The atom type for each atom in the chemical component. | C
_chem_comp_atom.charge | The formal charge assigned to each atom in the chemical component. | 0
_chem_comp_atom.model_Cartn_x | The x component of the coordinates for each atom in the chemical component specified as orthogonal angstroms. | 26.052
_chem_comp_atom.model_Cartn_y | The y component of the coordinates for each atom in the chemical component specified as orthogonal angstroms. | 5.609
_chem_comp_atom.model_Cartn_z | The z component of the coordinates for each atom in the chemical component specified as orthogonal angstroms. | 5.594
_chem_comp_atom.pdbx_align | Determines the column in which the atom name appears in PDB coordinate files. The possible values are 0 or 1. | 1

 

Table 3: chem_comp_bond category: tokens in this section are looped through for each bond in the chemical component

Token | Definition | Example
_chem_comp_bond.comp_id | Same as _chem_comp.id | HYP
_chem_comp_bond.atom_id_1 | The id of the first of the two atoms that define the bond. | N
_chem_comp_bond.atom_id_2 | The id of the second of the two atoms that define the bond. | CA
_chem_comp_bond.value_order | The bond order of the chemical bond associated with the specified atoms. | SING

Example: Acetic Acid

Diagram of Acetic Acid

 

Note: Diagrams are not included in the Chemical Component Dictionary. This one is included here for illustrative purposes.

 

 

mmCIF Format Chemical Component Dictionary Entry for Acetic Acid  

data_ACY

#

_chem_comp.id               ACY

_chem_comp.name             'ACETIC ACID'

_chem_comp.type             non-polymer

_chem_comp.pdbx_type        HETAS

_chem_comp.formula          'C2 H4 O2'

_chem_comp.mon_nstd_flag    .

_chem_comp.formula_weight   60.052

#

loop_

_chem_comp_atom.comp_id

_chem_comp_atom.atom_id

_chem_comp_atom.type_symbol

_chem_comp_atom.charge

_chem_comp_atom.model_Cartn_x

_chem_comp_atom.model_Cartn_y

_chem_comp_atom.model_Cartn_z

_chem_comp_atom.pdbx_align

ACY C   C 0 ? ? ? 1
ACY O   O 0 ? ? ? 1
ACY OXT O 0 ? ? ? 1
ACY CH3 C 0 ? ? ? 1
ACY 1H  H 0 ? ? ? 0
ACY 2H  H 0 ? ? ? 0
ACY 3H  H 0 ? ? ? 0
ACY HXT H 0 ? ? ? 1

#

loop_

_chem_comp_bond.comp_id

_chem_comp_bond.atom_id_1

_chem_comp_bond.atom_id_2

_chem_comp_bond.value_order

ACY C   O   DOUB

ACY C   OXT SING

ACY C   CH3 SING

ACY OXT HXT SING

ACY CH3 1H  SING

ACY CH3 2H  SING

ACY CH3 3H  SING
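To show how the three categories fit together, the sketch below reads an abridged copy of the acetic acid entry into Python dictionaries, checks that every bond refers to atoms defined in the atom table, and sums bond orders per atom. It is an illustration under stated assumptions, not the dictionary's reference parser: parse_entry is a hypothetical name, tokenising relies on shell-style quoting from the standard shlex module, only the SING, DOUB and TRIP bond codes are mapped, and the coordinate and alignment columns are left out to keep the example short. Note that loop_ must sit on its own line, because anything after a # on the same line is a comment.

```python
import shlex

# Abridged copy of the ACY entry (coordinate and alignment columns dropped).
ENTRY = """\
data_ACY
#
_chem_comp.id               ACY
_chem_comp.name             'ACETIC ACID'
_chem_comp.formula          'C2 H4 O2'
#
loop_
_chem_comp_atom.comp_id
_chem_comp_atom.atom_id
_chem_comp_atom.type_symbol
_chem_comp_atom.charge
ACY C   C 0
ACY O   O 0
ACY OXT O 0
ACY CH3 C 0
ACY 1H  H 0
ACY 2H  H 0
ACY 3H  H 0
ACY HXT H 0
#
loop_
_chem_comp_bond.comp_id
_chem_comp_bond.atom_id_1
_chem_comp_bond.atom_id_2
_chem_comp_bond.value_order
ACY C   O   DOUB
ACY C   OXT SING
ACY C   CH3 SING
ACY OXT HXT SING
ACY CH3 1H  SING
ACY CH3 2H  SING
ACY CH3 3H  SING
"""


def parse_entry(text):
    """Return {category: [row, ...]}, each row a dict of item name -> value."""
    tokens = []
    for line in text.splitlines():
        if not line.lstrip().startswith("#"):      # skip comment lines
            tokens.extend(shlex.split(line))
    categories = {}
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        if tok.startswith("data_"):                # data block header
            i += 1
        elif tok == "loop_":                       # table: names, then values
            i += 1
            names = []
            while i < len(tokens) and tokens[i].startswith("_"):
                names.append(tokens[i])
                i += 1
            values = []
            while i < len(tokens) and not tokens[i].startswith(("_", "loop_", "data_")):
                values.append(tokens[i])
                i += 1
            category = names[0][1:].split(".", 1)[0]
            keys = [name.split(".", 1)[1] for name in names]
            categories[category] = [dict(zip(keys, values[j:j + len(keys)]))
                                    for j in range(0, len(values), len(keys))]
        elif tok.startswith("_"):                  # single name/value pair
            category, key = tok[1:].split(".", 1)
            categories.setdefault(category, [{}])[0][key] = tokens[i + 1]
            i += 2
        else:
            raise ValueError(f"unexpected token: {tok!r}")
    return categories


entry = parse_entry(ENTRY)
atom_ids = {atom["atom_id"] for atom in entry["chem_comp_atom"]}

# Every bond must join two atoms that the atom table defines.
for bond in entry["chem_comp_bond"]:
    assert bond["atom_id_1"] in atom_ids and bond["atom_id_2"] in atom_ids

# Sum bond orders per atom; only the three codes below are handled.
ORDER = {"SING": 1, "DOUB": 2, "TRIP": 3}
bond_order_sum = dict.fromkeys(atom_ids, 0)
for bond in entry["chem_comp_bond"]:
    for end in ("atom_id_1", "atom_id_2"):
        bond_order_sum[bond[end]] += ORDER[bond["value_order"]]

print(bond_order_sum["C"], bond_order_sum["CH3"], bond_order_sum["O"])  # 4 4 2
```

Both carbon atoms come out with a bond order sum of 4 and the carbonyl oxygen with 2, which is a quick sanity check that the bond table is complete for this component.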

 
