
Diamond Version 5 User Manual: Structure files

Searching of Files

Search Diamond documents and structure files on your hard disk

In this article:
- File, folder, and format selection.
- Defining restraints for chemical composition, crystallographic data, and bibliographic references.
- Search for text strings and fragments in files.

Previous article: Access to COD ("Crystallography Open Database")
Next article: Database with inorganic structure types


Searching Structure Files

Structure files in several directories, or even on your complete hard disk, can be searched for selected data. These files may be Diamond document files (*.diamdoc), Diamond Structure Files (DSF 2 format, extension "*.dsf"), or foreign files that Diamond is capable of reading. Foreign files are first converted to temporary DSF 2 files and then searched.

To perform a search in selected files, choose the Search command from the File menu, or click the button in the standard toolbar. This will show the Search Files dialog as already described in the previous article "Access to COD ("Crystallography Open Database")":

File Search dialog

To define which files to search and what to search for, open the File Search Options dialog by clicking on the Search criteria... button. This dialog has multiple pages (tabs). The folders and file types to be searched are defined on the Location page of the dialog.


Location page of the "File Search Options" dialog

The lower half of this page refers to search of files on your hard disk. Set the checkmark in the checkbox "Search files and/or folders" to enable searching selected types of files in selected parts of your hard disk.

Remove the checkmark in the checkbox "Search database (sub-)files" unless you also want to extend the search to the database (COD). Add paths and file masks to the list "Files and folders to be searched". A path/file mask typically has the form "<drive><path>\*.<extension>", but it can also be more specific, e.g. "C:\xxx\Smith-G-????.cif". Specify whether sub-directories of the listed paths are to be included in the search.
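How a file mask with the wildcards * and ? selects file names can be illustrated with a short Python sketch. It is not Diamond's own implementation: the helper name matches_mask is hypothetical, and the case-insensitive comparison is an assumption based on the usual Windows behaviour.

```python
from fnmatch import fnmatchcase

def matches_mask(file_name: str, mask: str) -> bool:
    """Return True if file_name matches a mask containing * and ? wildcards.

    Both strings are lower-cased first, on the assumption that file names
    are compared case-insensitively (as is usual on Windows).
    """
    return fnmatchcase(file_name.lower(), mask.lower())

# "?" stands for exactly one character, "*" for any number of characters.
print(matches_mask("Smith-G-2019.cif", "Smith-G-????.cif"))  # four characters for four "?"
print(matches_mask("Smith-G-19.cif", "Smith-G-????.cif"))    # only two characters
print(matches_mask("AL2O3.DSF", "*.dsf"))                    # extension differs only in case
```

Note that Python's fnmatch module also gives square brackets a special meaning, which may differ from the masks Diamond accepts.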

Choose the file formats that may be considered during the file search. Make sure that the selected file formats agree with the extensions used in the "Files and folders to be searched" list above.

Choosing the directory/ies and file type(s)

The file search can be performed in one directory or in all sub-directories of a selected directory. Enter the search path in the Path input field.

The search path defined in Path and the setting Include sub-directories determine which files will be searched. The setting Foreign formats, too determines the formats of the files to be searched. If neither Include sub-directories nor Foreign formats, too is active, Diamond searches only files in DSF 2 format that match one of the names defined in the search path. The names in the path can contain the wildcards * and/or ? and should be preceded by the drive letter and directory. Example: "C:\PROGRAMS\DIAMOND2\ICSDDATA\AL*.DSF" searches all files in the directory "\PROGRAMS\DIAMOND2\ICSDDATA" of drive C: whose names begin with "AL" and have the extension "DSF".

To browse through the directories or to select other (network) drives, choose the Browse button. In the Browse dialog, choose the directory where you want to start the search, enter the file name mask (e.g. "*.CIF" or "AL*.DSF") in the input field File name, and then press OK.

The setting Include sub-directories enables you to search entire directory trees, even complete disk drives. Example: "C:\*.DSF" searches all DSF files of drive C:.
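The effect of Include sub-directories can be sketched in Python as follows. This is only an illustration under assumptions: the function find_files and its parameters are hypothetical names, and the demo runs on a throwaway directory rather than on real structure files.

```python
import os
import tempfile
from fnmatch import fnmatchcase

def find_files(start_dir: str, mask: str, include_subdirs: bool) -> list[str]:
    """Collect the paths below start_dir whose file name matches mask."""
    matches = []
    for folder, subfolders, file_names in os.walk(start_dir):
        for name in file_names:
            if fnmatchcase(name.lower(), mask.lower()):
                matches.append(os.path.join(folder, name))
        if not include_subdirs:
            subfolders.clear()  # keeps os.walk from descending any further
    return sorted(matches)

# Demo on a temporary directory tree with one sub-directory.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "sub"))
    for relative in ("AL2O3.DSF", "NaCl.cif", os.path.join("sub", "AlN.dsf")):
        open(os.path.join(root, relative), "w").close()
    top_only = find_files(root, "AL*.DSF", include_subdirs=False)
    whole_tree = find_files(root, "AL*.DSF", include_subdirs=True)
    print(len(top_only), len(whole_tree))
```

With the setting switched off, only the start directory is inspected; with it switched on, the file in the sub-directory is found as well.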

If the setting Foreign formats, too is active, not only files in DSF 2 format are searched but also files in foreign (i.e. non-DSF) formats - as far as they are defined in the search path. The same file formats as in the file open procedure of Diamond are supported: CIF, SHELX, CSD-FDAT, PDB, Crystin, and XYZ. Foreign formats are recognized automatically. Diamond creates temporary DSF 2 files for that search.


Searching for DSF (Diamond 2.x format) files in "C:\Temp\COD" and CIF files in "C:\Temp\COD-source". When you edit a file path, you can use the small button with the "..." to open the "Open File" dialog window. (Screenshot made with version 4 of Diamond.)


Choosing the search criterion

To choose what Diamond shall find in the selected files, use the input fields on the tabs Restraints and References, where the former contains chemical and crystallographic criteria, and the latter bibliographic criteria.

"Restraints" and "References" pages of the "File Search Options" dialog

Here you define restraints for chemical composition (elements, formula) and crystallographic data (space group, cell parameters), as well as for bibliographic data, entry code, and recording or update date.

Restraints page of File Search Options dialog

References page of File Search Options dialog

Since we do not have enough sample files (but do have the COD database), please refer to the article "Access to COD ("Crystallography Open Database")" for details and for exercises on how to search.

In any case you should bear in mind that the search process in files is slower than with a database, because the database files contain index entries for search acceleration, whereas simple files (Diamond documents, CIF etc.) are searched sequentially. Non-Diamond formats - like CIF - must be converted before the actual sequential search takes place.
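The difference between a sequential and an indexed search can be made concrete with a toy example in Python. The record layout and both functions are purely hypothetical; the actual index format of the database files is not described here.

```python
from collections import defaultdict

# Three made-up entries, each with the set of elements it contains.
records = [
    {"id": 1, "elements": {"Na", "Cl"}},
    {"id": 2, "elements": {"Al", "O"}},
    {"id": 3, "elements": {"Na", "Al", "Si", "O"}},
]

def sequential_search(records, element):
    """Inspect every record in turn, as is necessary for plain files."""
    return [r["id"] for r in records if element in r["elements"]]

def build_index(records):
    """Map each element to the ids of the records that contain it."""
    index = defaultdict(list)
    for r in records:
        for element in r["elements"]:
            index[element].append(r["id"])
    return index

index = build_index(records)               # built once, reused for every query
print(sequential_search(records, "Al"))    # touches all records
print(index["Al"])                         # a single lookup
```

Both calls return the same ids, but the indexed lookup does not need to read every record again for each query.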

 


"Find text" page of the "File Search Options" dialog

The search for text or a text fragment differs from the search on the other two pages, "Restraints" and "References". Here you define a text (fragment) and select the text fields that Diamond shall search, whereas on "Restraints" and "References" every searchable field can have its own individual search item (text, number, formula, etc.). The search for a text (fragment) always runs sequentially, so it is likely to take a long time.

The screenshot shows the available settings. Don't forget to set the checkmark in the checkbox "Find text or text fragment" to enable the sequential search for the text (fragment) in "Text (fragment)" in the fields with checkmarks set.


Searching for text fragment 'diazaphospholene' in publication title, common name and systematic name 


Starting and stopping the search

Now back to the Search Files dialog. Once the search criteria and path have been defined or checked, start the search with the Start search button. The search is performed in a separate thread, which means that you can continue working with Diamond while the file search is running. (In that case, you can minimize the Search Files window.)

You may stop the search thread before it has finished by clicking the same button again (it is labeled "Stop search" while the search is running).
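The general pattern of a search that runs in a separate thread and can be stopped early might look like the following Python sketch. It is not taken from Diamond; the names search_worker and stop_requested are hypothetical, and the short sleep merely stands in for reading one file.

```python
import threading
import time

def search_worker(file_names, stop_requested, results):
    """Scan the files one by one until finished or asked to stop."""
    for name in file_names:
        if stop_requested.is_set():
            break
        time.sleep(0.01)  # stands in for reading and checking one file
        if name.lower().endswith(".dsf"):
            results.append(name)

stop_requested = threading.Event()
results = []
files = [f"file{i}.dsf" for i in range(1000)]

worker = threading.Thread(target=search_worker,
                          args=(files, stop_requested, results))
worker.start()          # "Start search": the main thread stays free for other work
time.sleep(0.05)        # ... the user keeps working in the meantime ...
stop_requested.set()    # "Stop search": ask the worker to finish early
worker.join()
print(f"stopped after {len(results)} of {len(files)} files")
```

The worker checks the stop flag between files, so a stop request takes effect once the file currently being read has been processed.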

Opening matching files

The number of scanned files as well as the number of files with at least one matching structural description is displayed above the result list, where all matching files will be listed. To open one or more files from the list of matching files, select the files and push the Open button. Otherwise push the Done button. The settings of the Search Files dialog (but not the result itself) will be stored in the Diamond section of the Windows Registry, including the search criteria.

Some hints for text search

If a text search has been selected, Diamond searches for the text specified in the Text (fragment) input field. It searches all text categories that are selected with the corresponding checkboxes, such as the authors' names, the title, several names, etc. For example, if both the checkbox for the common name and the checkbox for the systematic name are checked and "chloride" has been entered in the text fragment field, all structures match that contain "chloride" in either the common or the systematic name. If you do not know where the text (fragment) may be stored, select all text fields to be searched. (Please note: Diamond makes no text search in other fields, for example in space group symbols.)
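The "chloride" example can be sketched in Python as follows. The field names and the function entry_matches are hypothetical, and the comparison ignores case on the assumption that this is the behaviour when the case-sensitive option is not checked.

```python
def entry_matches(entry: dict, fragment: str, selected_fields: list[str]) -> bool:
    """Return True if fragment occurs in any of the selected text fields."""
    needle = fragment.lower()
    return any(needle in entry.get(field, "").lower() for field in selected_fields)

entry = {
    "common_name": "Rock salt",
    "systematic_name": "Sodium chloride",
    "space_group": "F m -3 m",
}

# Both name fields selected: the systematic name contains the fragment.
print(entry_matches(entry, "chloride", ["common_name", "systematic_name"]))
# Only the common name selected: no match.
print(entry_matches(entry, "chloride", ["common_name"]))
# Fields that are not selected (here the space group) are never inspected.
print(entry_matches(entry, "F m", ["common_name", "systematic_name"]))
```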

If the checkbox as word is checked, a text fragment that is found counts as a match only if it represents a whole word, not if it is merely part of a longer word. That is, neither the character immediately before nor the character immediately after the matching fragment may be a letter.

If the checkbox case-sensitive is checked, the comparison is case-sensitive, that is, "Structure" and "structure" are treated as different.
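The two options can be modelled with a small Python function. This is a sketch of the rules as stated above, not Diamond's actual code; the function name fragment_matches and its parameters are hypothetical.

```python
def fragment_matches(text: str, fragment: str,
                     as_word: bool = False, case_sensitive: bool = False) -> bool:
    """Return True if fragment occurs in text under the given options."""
    if not fragment:
        return False
    if not case_sensitive:
        text, fragment = text.lower(), fragment.lower()
    start = text.find(fragment)
    while start != -1:
        end = start + len(fragment)
        # "as word": the neighbouring characters must not be letters.
        before_ok = start == 0 or not text[start - 1].isalpha()
        after_ok = end == len(text) or not text[end].isalpha()
        if not as_word or (before_ok and after_ok):
            return True
        start = text.find(fragment, start + 1)  # try the next occurrence
    return False

print(fragment_matches("Sodium chloride", "chloride", as_word=True))
print(fragment_matches("Hydrochlorides", "chloride", as_word=True))
print(fragment_matches("Crystal Structure", "structure", case_sensitive=True))
```

In the second call the fragment is found, but it is surrounded by letters and therefore does not count as a whole word. In the third call the different capitalization prevents a match.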

Separate search thread

The search process, which may take several minutes, runs in a separate thread. You can minimize the "Search Files" window and continue working with Diamond on other documents. This thread separation is available in the current version as well as in older versions of Diamond.

