Diamond Version 5 User Manual: Structure files
File formats
Previous article: Importing Registry settings and recent pictures
Overview
The native data format used by Diamond versions 3, 4, and 5 is the "Diamond Document" format. The files have the extension ".diamdoc". A Diamond document file may contain one or more structure data sets, each with one or more structure pictures.
The native data format used by Diamond version 2 was the "Diamond Structure File" format, with the file extension ".DSF". Although the older version 1 of Diamond used the same file extension, the old and new DSF formats are quite different. See "Diamond Structure File Format (DSF 2)" and "Compatibility with Diamond 1.x File Format" for more details about these Diamond file formats.
Foreign (non-Diamond) structure data formats are recognized automatically and converted into a temporary DSF 2 file, from which the data are read into the Diamond document. The file format recognition and conversion is optionally (default: yes) assisted by a "File Import Assistant".
Please note: The Diamond Document format (.diamdoc) is the only file format that can store all kinds of data entered or changed in a Diamond session, whereas the other file formats supported by Diamond usually store structural parameters and some textual data only. The DSF format also stores structure picture data, but only data that are compatible with version 2 of Diamond.
Diamond Version 3, 4, 5 Document Format (diamdoc)
This is the native file format of Diamond versions 3, 4, and 5, in which Diamond stores all structure and picture data the user may have changed during a Diamond session. It is a proprietary binary format into which the contents of the classes Diamond uses to define (crystal/molecular) structure and picture objects are streamed directly. A "diamdoc" file begins with a specific header that identifies the file format and tells Diamond how many structure and picture objects are stored in the file.
The header is followed by a series of objects describing the first structure data set, then by N series of objects describing the N pictures associated with the first structure data set. This pattern is repeated if multiple structure data sets are defined in the "diamdoc" file.
Diamond Structure File Format (DSF 2)
The ancestor of the Diamond document format is the "Diamond Structure File" format (DSF 2).
It was the native structure file format used by Diamond version 2.
Like the Diamond document format, this format is binary, which means that it cannot be read with a text processor (unlike, for example, a file in CIF format). The current Diamond version still uses the DSF 2 format as a temporary format for structure data (but not picture data) after conversion from foreign files, e.g. CIF.
Diamond 1.x format
This format is binary like DSF 2, but its layout is quite different from that of Diamond 2, and it is thus treated as a "foreign" format like the other formats below.
For information about the compatibility with the DSF 1 format, see the chapter "Compatibility with Diamond 1.x File Format".
The DSF 1 format is recognized by its special header.
Endeavour File formats
Diamond also supports some of the file formats used by Endeavour, Crystal Impact's application for structure solution from powder diffraction data.
Endeavour Document Format
This format is the native structure (and picture) file format used by Endeavour and uses the file extension *.edf. Like the "diamdoc" and "dsf" formats, this format is binary and proprietary, which means that it cannot be read with a text processor (in contrast to e.g. files in CIF format). The EDF format is fully compatible with the DSF 2 format, though it usually contains some special data from or for structure solution calculations, which Diamond ignores when reading an EDF file.
Endeavour Configuration File format
This format is an ASCII text file with the file extension "*.cfg". The file contains the crystal structure data (space group, cell, atomic parameters) of a so-called Endeavour configuration, which is, simply put, the result of a structure solution calculation. The first line contains the name of the corresponding space group symmetry file (*.sdt). The second line contains Endeavour's internal space group number (which is identical to the internal space group number used by Diamond) and the Hermann-Mauguin symbol. The third line gives the unit cell data (a, b, c, α, β, γ) and the overall temperature factor. The remaining lines contain the atomic data (one atom per line): the element, charge, Wyckoff symbol, atom coordinates (x, y, z), a parameter that tells whether the corresponding atom is fixed to its position or coordinates, and the number of the rigid body this atom belongs to (0 means single atom). Please find more details in the Endeavour manual and tutorial, accessible from Endeavour through the Help menu.
Endeavour Molecule format
This format is also an ASCII text file.
An Endeavour Molecule file (extension *.emo) has three parts: The first part (first line) gives an identifier for the molecule (e.g. its trivial name). The second part (starting with the second line) lists the atoms (element and charge) and their positions (given as Cartesian coordinates). The third part contains the list of the bonds between the atoms. Each bond is given as the number of the first atom, the number of the second atom (both referring to the atom list), and a bond order number that is currently only used to distinguish between fixed (-1.0) and rotatable (1.0) bonds. The Endeavour manual and tutorial, accessible from Endeavour through the Help menu, describes the creation of emo files in detail in the chapter “Solution of molecular structure (2,4,6-Triisopropylbenzenesulfonamide)”.
Foreign File Formats
Diamond supports many additional ("foreign") crystal structure and molecular file formats and offers an optional data import assistant to confirm the (automatically detected) file format.
Foreign files are opened in basically the same way as native Diamond structure files. That means there is no separate "Import" command as in some competing applications. Instead, Diamond recognizes the format automatically and creates a temporary DSF 2 file (as a copy) from the original file, from which it later reads the data. Read the article "Opening and importing structure files" about several ways to open a structure file or to insert selected structure data into a Diamond document. If the file contains only one structure description, a structure window is created for that structure automatically. If there are multiple structure descriptions, a list of all entries in the file is displayed, from which you can select one or more structures. For details, see the article "Multiple Structure Descriptions in One File".
Foreign file formats supported by Diamond and how they are recognized
The following file formats can be recognized and read by Diamond version 5:
Crystallographic Information File (CIF)
CRYSTIN Download format
Cambridge Structural Database (CSD) FDAT format
Brookhaven Protein Data Bank (PDB) format
Philips X'Pert Plus (IDF) format
SHELX-93 format
KPlot format
XYZ format
MDL Molfile format
SYBYL MOL format
SYBYL MOL2 format
Cerius 2 CSSR format
Schakal format (only output)
Data items read from a Crystallographic Information File
Information about the Crystallographic Information File (CIF) format is available from the web site of the International Union of Crystallography (IUCr) in Chester, U.K. The original paper describing the CIF standard was published in Acta Cryst. (1991). A47, 655-685. The latest core dictionary is available from http://www.iucr.org/cif/cif_core/index.html.
The following CIF data items are evaluated by Diamond 2:
Data read from the CRYSTIN Download format
A "CRYSTIN Download" file can contain multiple structure descriptions, usually of entries of the Inorganic Crystal Structure Database (ICSD). Diamond reads the following data from each entry of the file. (This information is taken from the ICSD-CRYSTIN user's manual, by G.Bergerhoff, B.Kilger, C.Witthauer, R.Hundt, R.Sievers, Institut fuer Anorganische Chemie der Universitaet Bonn, 1986):
1st record of every entry:
Byte
1- 8 ' 0000 ' Record label
9-14 Collection code
15 free
16-21 Relative entry number
22-80 free
This record is followed by other records with various formats,
according to the selected categories.
Categories with unstructured text:
Byte
1 free
2- 3 figures according to the categories:
1 - NAME (N) Compound name
2 - RFCD (V) Refcode (only in CCDF)
3 - MINR (M) Mineral name
4 - FORM (F) Chemical formula
5 - TITL (T) Title of publication
7 - AUT (A) Author's name
9 - SGR (R) Space group symbol after Hermann-Mauguin
16 - REM (Z) Additional remarks
19 - PICT (K) Information to produce a drawing in
optimized view
4 0
5 Number of continuation records. 0 if the text contains
less than 73 characters.
6- 8 free
9-80 Text. After each 72 characters a continuation record is
used.
Category REF (Q):
Byte
1- 8 ' 600 ' Record label
9-10 Year of publication
11-15 CODEN
16-22 Volume
23-29 First page
30-36 Last page (in CCDF: repetition of first page)
37 free
38 Character (if the page numbers consist of figures
with a preceding alphabetic character);
otherwise free
39-42 Issue, if necessary. Otherwise 0.
Category CELL (E). Missing fields are indicated by -1.0.
Standard deviations are given in an absolute scale.
Byte
1- 3 ' 800 ' Record label
9-17 a (A)
18-25 Sigma(a)
26-34 b (A)
35-42 Sigma(b)
43-51 c (A)
52-59 Sigma(c)
60-66 Alpha (degrees)
67-74 Sigma(alpha)
75-80 free
Byte
1- 3 ' 810 ' Record label
9-15 Beta (degrees)
16-23 Sigma(beta)
24-30 Gamma (degrees)
31-38 Sigma(gamma)
39-41 Number of formula units
42-47 Measured density
48-55 Sigma(density)
56-66 Unit cell volume (A**3)
67-80 free
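The fixed-column layout of the two CELL records lends itself to simple string slicing. The following Python lines are a minimal sketch of that idea, not Diamond's actual reader: the function names `field`, `number`, and `parse_cell` are hypothetical, and the sketch assumes each record is available as one text line of up to 80 characters.

```python
from typing import Dict, Optional


def field(record: str, first: int, last: int) -> str:
    """Return bytes first..last (1-based, inclusive) of a fixed-width record."""
    return record[first - 1:last].strip()


def number(record: str, first: int, last: int) -> Optional[float]:
    """Parse a numeric field; blank fields and the -1.0 marker count as missing."""
    text = field(record, first, last)
    if not text:
        return None
    value = float(text)
    return None if value == -1.0 else value


def parse_cell(rec800: str, rec810: str) -> Dict[str, Optional[float]]:
    """Read the cell parameters from an '800' record and its '810' companion."""
    rec800 = rec800.ljust(80)  # pad short lines so every slice is valid
    rec810 = rec810.ljust(80)
    return {
        "a": number(rec800, 9, 17),
        "b": number(rec800, 26, 34),
        "c": number(rec800, 43, 51),
        "alpha": number(rec800, 60, 66),
        "beta": number(rec810, 9, 15),
        "gamma": number(rec810, 24, 30),
        "formula_units": number(rec810, 39, 41),
        "volume": number(rec810, 56, 66),
    }
```

The standard deviation fields could be read in the same way by adding their byte ranges to the dictionary.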
Category SYM (S):
Byte
1- 8 ' 100 ' Record label
9-16 1st line of symmetry matrix
17-24 2nd line
25-32 3rd line
33-80 free
Category PARM (P):
Byte
1- 8 ' 1200 ' Record label
9-10 Atomic symbol
11-13 Number
14-19 Oxidation state
20-22 Number of positions. In CCDF: Multiplicity of
the general position.
23 Wyckoff symbol
24-31 x
32-39 Sigma(x)
40-47 y
48-55 Sigma(y)
56-63 z
64-71 Sigma(z)
72-80 free
Byte
1- 8 ' 1210 ' Record label
9 Type of temperature factor (B, U, A or L)
10-20 Isotropic temperature factor if byte 9 is B or U
21-29 Standard deviation of isotropic temperature factor
30-36 Site occupation, related to the multiplicity of the
special position
37-44 Standard deviation of the site occupation
45 Number of non-located hydrogen atoms (H or D) bonded
to the atom
46 H or D, if byte 45 is not 0
47-80 free
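As an illustration of how the '1200' and '1210' records of one atom fit together, the sketch below describes both layouts as tables of byte ranges and slices the records accordingly. It is only a sketch under assumptions: the names `REC_1200`, `REC_1210`, and `parse_atom` are hypothetical, numeric fields are assumed to be plain decimal text, and no attempt is made to reproduce how Diamond itself maps these values.

```python
from typing import Dict, Tuple

Layout = Dict[str, Tuple[int, int]]  # field name -> (first byte, last byte), 1-based

# Byte ranges copied from the PARM record descriptions above.
REC_1200: Layout = {
    "symbol": (9, 10), "number": (11, 13), "oxidation_state": (14, 19),
    "positions": (20, 22), "wyckoff": (23, 23),
    "x": (24, 31), "sigma_x": (32, 39),
    "y": (40, 47), "sigma_y": (48, 55),
    "z": (56, 63), "sigma_z": (64, 71),
}
REC_1210: Layout = {
    "tf_type": (9, 9), "tf_iso": (10, 20), "sigma_tf_iso": (21, 29),
    "occupancy": (30, 36), "sigma_occupancy": (37, 44),
}
TEXT_FIELDS = {"symbol", "number", "wyckoff", "tf_type"}  # kept as strings


def slice_record(record: str, layout: Layout) -> Dict[str, str]:
    """Cut a fixed-width record into named, stripped text fields."""
    padded = record.ljust(80)
    return {name: padded[a - 1:b].strip() for name, (a, b) in layout.items()}


def parse_atom(rec1200: str, rec1210: str) -> dict:
    """Combine the two PARM records of one atom into a single dictionary."""
    raw = {**slice_record(rec1200, REC_1200), **slice_record(rec1210, REC_1210)}
    atom = {}
    for name, text in raw.items():
        if name in TEXT_FIELDS:
            atom[name] = text
        else:
            atom[name] = float(text) if text else None  # blank means not given
    # The isotropic value is only meaningful for types B and U (see byte 9).
    if atom["tf_type"] not in ("B", "U"):
        atom["tf_iso"] = None
    return atom
```

Keeping the byte ranges in a table makes it easy to compare the code against the record description line by line.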
Categories BETA (B), BIJ (C) or UIJ (D):
Byte
1- 8 ' 1300 ' = Temperature factor type "Beta(i,j)"
' 1400 ' = Type "B(i,j)"
' 1500 ' = Type "U(i,j)"
9-10 Atomic symbol
11-13 Number
14-24 Factor (1,1)
25-33 Sigma
34-44 Factor (2,2)
45-53 Sigma
54-64 Factor (3,3)
65-73 Sigma
74-80 free
Byte
1- 8 ' 1310 ' = Temperature factor type "Beta(i,j)"
' 1410 ' = Type "B(i,j)"
' 1510 ' = Type "U(i,j)"
9-19 Factor (1,2)
20-28 Sigma
29-39 Factor (1,3)
40-48 Sigma
49-59 Factor (2,3)
60-68 Sigma
69-80 free
Category RVAL (I):
Byte
1- 8 ' 1800 ' Record label
9-17 R value
18-80 free
Category TEST (Y):
Byte
1- 8 ' 2000 ' Record label
9-10, 11-12, 13-14 etc.: Code figures
Category SMAT:
Byte
1- 4 ' 240' Record label
5- 6 Number of SMAT Records which follow immediately
7- 8 free
9 Keys for the Bravais translations:
0 = P
1 = F
2 = A
3 = B
4 = C
5 = I
7 = R (obverse)
8 = R (reverse)
(7 or 8 only for space groups in hexagonal setting)
10 free
11-18 t1
19-22 r11
23-26 r12
27-30 r13
31-38 t2
39-42 r21
43-46 r22
47-50 r23
51-58 t3
59-62 r31
63-66 r32
67-70 r33
71-73 Space group number (Int. Tables) or 0
Data read from a Philips X'Pert Plus IDF file
Diamond recognizes the IDF format by the keyword "[ANZAHLATOME]". Since an IDF file may contain multiple (N) phases, Diamond extracts N sets of structural parameters. If N is greater than one, the Load dialog is displayed after you have opened an IDF file; in this dialog you can select one or more crystal structure descriptions. This behaviour is comparable to the CIF/Crystin Import Structures dialog in X'Pert Plus. If N is one, a new structure window is opened automatically without further request.
Diamond uses the "[PHASE n]" keyword in the IDF file as the beginning of the n-th structural description. For every structure, Diamond extracts data from the following sections (keywords given in brackets) into the mentioned Diamond data fields
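The recognition by keyword and the splitting into phases described in this section could be sketched as follows. This is an illustration under assumptions, not the actual import code: the function names are hypothetical, and the sketch assumes that each phase marker stands alone on its line in the literal form "[PHASE 1]", "[PHASE 2]", and so on.

```python
import re
from typing import Dict

# Assumed literal form of the marker: "[PHASE n]" on a line of its own.
PHASE_RE = re.compile(r"^\[PHASE\s+(\d+)\]\s*$", re.MULTILINE)


def looks_like_idf(text: str) -> bool:
    """Apply the recognition rule from the text: the file contains "[ANZAHLATOME]"."""
    return "[ANZAHLATOME]" in text


def split_phases(text: str) -> Dict[int, str]:
    """Map each phase number n to the text between "[PHASE n]" and the next marker."""
    matches = list(PHASE_RE.finditer(text))
    phases = {}
    for i, match in enumerate(matches):
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        phases[int(match.group(1))] = text[match.end():end].strip()
    return phases
```

A caller would open a structure window directly if `split_phases` returns one entry and offer a selection list otherwise.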
Data read from a SHELX file
The following data are read from a SHELX file:
Other data items, such as UNIT, AXIS, etc. are not read by Diamond. If the first four characters of a line do not represent a SHELX keyword, the line is interpreted as an atom description, whereas four blanks mark a comment.
Atomic parameters
Diamond reads at least four parameters following the atom symbol. The first parameter is used to derive the element symbol from the previous SFAC definitions. The following parameters are the x, y, and z value of the atomic position in fractional coordinates. These parameters may be followed by the site occupation factor, the isotropic or equivalent displacement parameter of type U, or the six values of the anisotropic displacement parameters of type U.
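The line classification and the atom parameters described above can be illustrated with a short Python sketch. It is hedged in several ways: the function names are hypothetical, the keyword set is deliberately incomplete, the scattering factor number is assumed to be a 1-based index into the SFAC list, and continuation lines are assumed to have been joined beforehand.

```python
from typing import List

# A small, deliberately incomplete sample of SHELX keywords for illustration.
KNOWN_KEYWORDS = {"TITL", "CELL", "ZERR", "LATT", "SYMM", "SFAC", "UNIT"}


def classify(line: str) -> str:
    """Classify a line by its first four characters, as described in the text."""
    if line.startswith("    "):
        return "comment"
    if not line.strip():
        return "blank"
    if line[:4].strip().upper() in KNOWN_KEYWORDS:
        return "keyword"
    return "atom"


def parse_atom_line(line: str, sfac: List[str]) -> dict:
    """Parse 'name sfac x y z [sof [U | six Uij values]]' into a dictionary."""
    tokens = line.split()
    if len(tokens) < 5:
        raise ValueError("an atom line needs a name and at least four parameters")
    rest = [float(t) for t in tokens[5:]]
    atom = {
        "name": tokens[0],
        "element": sfac[int(tokens[1]) - 1],  # assumed 1-based SFAC index
        "x": float(tokens[2]),
        "y": float(tokens[3]),
        "z": float(tokens[4]),
    }
    if rest:
        atom["sof"] = rest[0]  # stored as given, not interpreted
    if len(rest) == 2:
        atom["u_iso"] = rest[1]
    elif len(rest) >= 7:
        atom["u_aniso"] = rest[1:7]  # six values, order as in the file
    return atom
```

For example, `parse_atom_line("Na1 1 0.0 0.5 0.25 11.0 0.02", ["Na", "Cl"])` yields an atom of element "Na" with an isotropic U value of 0.02.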
Data read from CSD FDAT format
A file in the FDAT format of the Cambridge Structural Database (CSD) may contain multiple entries. The '#' character followed by a maximum of eight characters or digits at the beginning of a line defines the entry point of the first or a following structure description.
The characters #2 through #9 are taken as the CSD Reference Code (the first character of a line has number #1). The following information is read from the first line of an entry:
· The recording date of the entry, which has the format "yymmdd".
· The year of the publication of the entry, which has two digits only.
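To make the entry detection concrete, here is a minimal Python sketch that splits an FDAT text into entries at lines starting with '#' and takes characters 2 through 9 as the reference code. The function name `split_fdat_entries` is hypothetical, and the sketch does not decode the date or year fields because their column positions are not given here.

```python
from typing import Dict, List


def split_fdat_entries(text: str) -> List[Dict[str, object]]:
    """Group the lines of an FDAT file into entries that start at a '#' line."""
    entries: List[Dict[str, object]] = []
    current = None
    for line in text.splitlines():
        if line.startswith("#"):
            # Characters 2..9 (1-based) hold the reference code.
            current = {"refcode": line[1:9].strip(), "lines": [line]}
            entries.append(current)
        elif current is not None:
            current["lines"].append(line)
        # Lines before the first '#' do not belong to any entry and are skipped.
    return entries
```

A file with more than one entry in the returned list would lead to the selection list for multiple structure descriptions.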
The following line, which is not mandatory in CSD FDAT format, contains the cell parameters and space group information:
· The cell parameters a, b, c, alpha, beta, and gamma in angstroms and degrees, respectively.
· The standard uncertainties of the cell parameters.
· The measured density in grams per cubic centimeter.
· The space group number according to International Tables for Crystallography, Vol. A.
· The Hermann-Mauguin space group symbol.
· The number of formula units per unit cell.
The following line(s) define several numeric and text fields:
· The R value.
· Remarks including "Disorder field" and "Error field".
The next lines contain the coded symmetry matrices, if symmetry information is given.
If radii are given in the CSD FDAT file, these values will be read from the following line. These definitions consist of pairs of element symbol and radii in picometers. These values will be used as effective radii for the atom type definitions in Diamond.
For each atom in the parameter list only element symbol and the coordinates x, y, and z are given. These coordinates are fractional, unless no cell parameters have been defined.
Bond distances will be ignored.
Bibliographic information
When converting a CSD FDAT file, Diamond searches for a file with the same name as the FDAT file but with the extension JNL, to retrieve additional bibliographic information:
· The name of the compound.
· The structural formula.
· The name(s) of the author(s).
· The citation.
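The lookup of the companion file can be pictured with a few lines of Python. This is a sketch only: the function name `companion_jnl` is hypothetical, and trying both an upper-case and a lower-case extension is an assumption made for case-sensitive file systems.

```python
from pathlib import Path
from typing import Optional, Union


def companion_jnl(fdat_path: Union[str, Path]) -> Optional[Path]:
    """Return the JNL file next to an FDAT file, or None if there is none."""
    path = Path(fdat_path)
    for suffix in (".JNL", ".jnl"):  # assumption: accept either spelling
        candidate = path.with_suffix(suffix)
        if candidate.exists():
            return candidate
    return None
```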
Data read from a PDB file
The first line of a Protein Data Bank (PDB) file begins with "HEADER". The recording date of the PDB entry and the database code are also read from this first line.
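As a rough illustration of reading the first line, the sketch below slices a HEADER record by column. The column positions (classification in columns 11-50, date in 51-59, ID code in 63-66) are taken from the published PDB file format rather than from this manual, and the function name `parse_pdb_header` is hypothetical.

```python
from typing import Dict, Optional


def parse_pdb_header(first_line: str) -> Optional[Dict[str, str]]:
    """Return classification, date, and ID code, or None if this is not a HEADER line."""
    if not first_line.startswith("HEADER"):
        return None
    line = first_line.ljust(80)  # pad short lines so every slice is valid
    return {
        "classification": line[10:50].strip(),  # columns 11-50
        "date": line[50:59].strip(),            # columns 51-59, e.g. DD-MON-YY
        "id_code": line[62:66].strip(),         # columns 63-66
    }
```

Returning None for any other first line mirrors the idea that the "HEADER" keyword is what identifies the format.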
The following table lists the keywords that are interpreted by Diamond when reading from a PDB file:
Data read from the XYZ format
The XYZ format is rather simple. The content of the second line is treated as the title. The following lines are atomic parameters containing the element symbol and the Cartesian coordinates x, y, and z.
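A reader for this layout fits in a few lines. The Python sketch below follows the description above; the function name `parse_xyz` is hypothetical, fields are assumed to be separated by whitespace, and the first line is simply skipped (in the common XYZ convention it holds the atom count, which this text does not specify).

```python
from typing import List, Tuple

Atom = Tuple[str, float, float, float]  # element symbol, x, y, z


def parse_xyz(text: str) -> Tuple[str, List[Atom]]:
    """Return the title (second line) and the atoms (all following lines)."""
    lines = text.splitlines()
    title = lines[1].strip() if len(lines) > 1 else ""
    atoms: List[Atom] = []
    for line in lines[2:]:
        parts = line.split()
        if len(parts) < 4:
            continue  # skip blank or incomplete lines
        atoms.append((parts[0], float(parts[1]), float(parts[2]), float(parts[3])))
    return title, atoms
```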
Compatibility with Diamond 1.x File Format
Version 2.0 or higher of Diamond uses a file format which is different from that format used by version 1.x of Diamond (DSF 1 format).
Since there are some new options in Diamond 2.0, such as rendered representation, the new file version can store more information than the older file version. In other words, the new file version can store all data that can be stored in the old file version, but not vice versa. That means that if you open an old Diamond 1.x file, some settings take default values (like the ones for rendered representation). If you save a structure document in the old file format, these settings will be lost. (Only the Diamond 2.0 format can store all data you have edited with Diamond version 2.0.)
The following sections show where Diamond uses default settings if you open a file that was created with Diamond 1.x, and where data would be discarded if you saved data from a Diamond 2.0 document in the old file format.
Rendered representation
Since Diamond 1.x uses no rendered representation for the structure picture, there are no items for light properties, material properties, transparency, etc. in the DSF 1 format. When converting to DSF 2 format, default settings will be used. The default settings are described in the article "Rendered Representation: Overview".
Designs of atoms in the structure picture
Diamond 2 uses three different colors (interior color, edge color, pattern color) for an atom, whereas Diamond 1.x uses one main color and a few options for a "secondary color". The radius of every atom can be changed individually in Diamond 2, whereas Diamond 1.x uses the radius that has been defined for the corresponding atom type.
Designs of bonds
Diamond 2 uses two different colors (interior color, edge color) for thick bonds, which are represented as tubes or truncated cones, whereas Diamond 1.x uses one main color and a few options for a "secondary color" (compare atom designs).
Atom and bond labels
Diamond 2 uses a third coordinate (z) for the distance between label and object reference point (e.g. the center of the atom), which is necessary for rendered representation. When converting label data from DSF 1 format, Diamond 2 sets the z-value to 0.5 angstroms to prevent the label text from penetrating the atom sphere.
The DSF 2 format does not limit the label distance, whereas Diamond 1.x restricts it to +/-1.27 angstroms from the reference point.
In Diamond 2, every single label can have its own color, whereas DSF 1 offers two choices only: one color for all labels, or each label has the same color as the corresponding atom or bond.
Polyhedra
Since no material properties are defined for polyhedron faces in DSF 1 format, there is no transparency. Transparency will be set to zero, when DSF 1 data are converted to DSF 2 format, which results in opaque polyhedron faces.
Limitations for bibliographic data
The sum of the lengths of all bibliographic texts must not exceed 65,532 bytes in DSF 1 format, whereas DSF 2 has no restrictions for that.
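Before saving in the old format, the size limit could be checked along the following lines. This is a sketch under assumptions: the function name `fits_dsf1_biblio` is hypothetical, and because the byte encoding of the texts in DSF 1 is not stated here, a single-byte encoding (Latin-1) is assumed.

```python
from typing import Iterable

DSF1_BIBLIO_LIMIT = 65_532  # maximum total size in bytes, as stated above


def fits_dsf1_biblio(texts: Iterable[str], encoding: str = "latin-1") -> bool:
    """Check whether all bibliographic texts together stay within the DSF 1 limit."""
    # The encoding is an assumption; unencodable characters count as one byte each.
    total = sum(len(t.encode(encoding, errors="replace")) for t in texts)
    return total <= DSF1_BIBLIO_LIMIT
```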
Page last modified September 20, 2023. Copyright © 2023 Crystal Impact GbR. All rights reserved.