CCP4i Documentation for Programmers: Core Documentation

Map Handling Utilities (utils/map_utils.tcl)

These procedures are usually used in run scripts but they could be used within the main ccp4i process.

Many of the utilities require functions from $CCP4I_TOP/src/CCP4_utils.tcl and the data in $CCP4I_TOP/etc/crystal.lib. Both are initialised by commands at the top of map_utils.tcl, which are run at the global level when the file is first sourced.

ExtendMap Interface to mapmask program to extend map to cover molecule

Argument list: <mapin> <mapout> <xyzin> <border>

mapin Full path name of input map

mapout Full path name of output map

xyzin Full path name of molecule coordinate file

border Border in Angstrom around the molecule

ConvertMapAsu Run mapmask to convert the map to cover the standard CCP4 asymmetric unit

Argument list: <mapin> <mapout>

mapin Full path name of input map

mapout Full path name of output map

ConvertMapFormat Convert from CCP4 to alternative map formats

Argument list: <format> <mapin> <mapout> <LOG_FILE> <args>

Beware that this uses non-CCP4 programs: mapman from Uppsala and mbkall from MSI. The CCP4i installer must have specified, in the CCP4i configure window, the command used to run these programs on their installation.

NB There is no XtalView map format: the SF file is imported into XtalView, which generates the maps.

format Required format: O, QUANTA

mapin Full path name of input map

mapout Full path name of output map

LOG_FILE Optional name of a log file for log output

-nonorm

When outputting O format files do not normalise

CreateMap Create a map of optional type in various formats

Argument list: <format> <HKLIN> <mapVar> <title> <prog_labin> <labin> <args>

The FFT program is run to create the map and, if necessary, a file

format Required file format: CCP4, O, QUANTA, XtalView

HKLIN Input MTZ file

mapVar Output map file name; the file extension is reset if necessary.

title A title for the FFT job and the map file

prog_labin The input program labels for FFT (in TCL list format)

labin The input MTZ labels for FFT (in TCL list format)

-cover xyzin border

Extend map to cover molecule in coordinate file xyzin with border (in Angstrom)

-xtal xtal_labin

If output format is XtalView then xtal_labin is list of XtalView 'column labels'

-fftargs fftargs

Additional command file arguments for running FFT. fftargs should be a text string which may include line breaks

-tmp

Make the output file a temporary file with extension .tmp

-saveccp4 ccp4map

If the output format is not CCP4 then also save the CCP4 map as a file ccp4map.

MakeOMapMacro Create a macro for O to display map(s)

Argument list: <macro_file> <map_list> <label_list> <sigma_list> <colour_list> <extent> <args>

macro_file Name of macro file

map_list List of maps to be displayed (in TCL list format)

label_list List of titles for maps (TCL list format same length as map_list)

sigma_list List of contouring sigma levels (TCL list format same length as map_list)

colour_list List of colours for maps (TCL list format same length as map_list)

extent ?Radius of map to display within O

-mtz mtzfile

Name of MTZ file from which to extract information on asymmetric unit so macro can be set up to display whole asymmetric unit

-mol mollist mollabel

Also display molecules from mollist (a list of PDB files) which will be assigned labels from mollabel. mollist and mollabel are TCL lists of the same length.

MakeXtalMacro Create a macro and crystal file for xtalview to display map(s)

Argument list: <hklin> <crystal_file> <macro_file> <xtalfiles> <coefficients_list> <radius_list>

hklin MTZ file containing relevant crystal data

crystal_file Name of the crystal file which will contain cell and symmetry info

macro_file Name of macro file

xtalfiles List of XtalView SF files (in TCL list format)

coefficients_list List of contouring coefficients (TCL list same length as xtalfiles)

radius_list List of map radii (TCL list same length as xtalfiles)

GetCellfromMtz Extract header information from an MTZ file

Argument list: <mtzfile> <space_groupVar> <cellVar> <latticeVar>

Runs mtzdump script

mtzfile Input MTZ file

space_groupVar Returned space group

cellVar Returned cell dimensions (as Tcl list)

latticeVar Returned cell lattice type

GetMapHeader Extract header information from a map file

Argument list: <mapfile> <space_groupVar> <cellVar> <xyzlimVar> <gridVar> <axesVar>

Runs the mapdmp script

mapfile Input map file

space_groupVar Returned space group

cellVar Returned cell dimensions (as Tcl list)

xyzlimVar Returned xyz limits of map (as Tcl list)

gridVar Returned number of grid points in xyz (as Tcl list)

axesVar Return order of axes in map (as Tcl list of X Y Z)

ConvertMapCell Extend map to cover unit cell

Argument list: <mapin> <mapout>

mapin Full path name of input map

mapout Full path name of output map

WatPeak UNTESTED Search input difference map for water peaks

Argument list: <mapin> <xyzin> <peakout> <symmetry> <args>

Uses the mapmask program to extend map to cover (only) the volume of the protein and then uses peakmax to find peaks in map without atoms. Watpeak then trims the peaklist to around the protein.

mapin Full path name of input map

xyzin Full path name of input coordinate file

peakout List of 'water' peaks in PDB format

symmetry Space group name input to watpeak

-title title

Input a title line for watpeak (as output to PDB peaks file)

AddXtalMapFom Add an extra column of dummy FOMS to an XtalView phases file

Argument list: <hkl_file>

An XtalView phases file must have three columns: F, FOM and phase. This procedure inserts a FOM column into any file created without one.

hkl_file Input/output XtalView phases file - the file is overwritten

CalcCellVolume Calculate the cell volume

Argument list: <cell> <volumeVar>

cell Cell dimensions

volumeVar Returned cell volume
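
The <cell> <volumeVar> calling pattern, in which the result is returned through a variable named by the caller, can be sketched as below. This is not the CCP4i source: the procedure name cell_volume_sketch is hypothetical, and the sketch assumes that cell is a six-item Tcl list of a, b, c in Angstrom and alpha, beta, gamma in degrees. It uses the general triclinic volume formula.

```tcl
# Hypothetical helper, not the CCP4i CalcCellVolume source.
# cell      : assumed to be a Tcl list { a b c alpha beta gamma }
# volumeVar : name of the caller's variable that receives the volume
proc cell_volume_sketch { cell volumeVar } {
    upvar 1 $volumeVar volume
    foreach { a b c alpha beta gamma } $cell break
    set deg2rad [expr { acos(-1.0) / 180.0 }]
    set ca [expr { cos($alpha * $deg2rad) }]
    set cb [expr { cos($beta  * $deg2rad) }]
    set cg [expr { cos($gamma * $deg2rad) }]
    # General triclinic cell volume
    set volume [expr { $a * $b * $c * \
        sqrt(1.0 - $ca*$ca - $cb*$cb - $cg*$cg + 2.0*$ca*$cb*$cg) }]
    return $volume
}

# Orthorhombic example: the volume is simply a*b*c
cell_volume_sketch { 10.0 20.0 30.0 90.0 90.0 90.0 } vol
puts [format "%.1f" $vol]   ;# prints 6000.0
```

The upvar call is what lets a procedure write into a variable whose name ends in Var in the argument lists of this document.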

Coordinate Handling Utilities (utils/pdb_utils.tcl)

CalcCellVolume Get the atomic contents of PDB file

Argument list: <pdb_file> <nresVar> <contentsVar> <total_heavyVar>

This is based on the rwcontents program.

Returns the content in the format of a nested list { { element_type_1 n_atoms_type_1 } .... { element_type_n n_atoms_type_n } }

pdb_file Input PDB file

nresVar Returned number of residues in the file

contentsVar Returned the content of the PDB by element

total_heavyVar Returned the total number of non-hydrogen atoms
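
A minimal sketch of walking the nested contents list described above. The element counts are invented for illustration, and the variable names are those of the example only; nothing here calls the CCP4i procedure.

```tcl
# Made-up contents list in the documented { element n_atoms } layout
set contents { { C 120 } { N 32 } { O 40 } { S 2 } { H 200 } }

# Sum the non-hydrogen atoms, one { element n_atoms } pair at a time
set total_heavy 0
foreach item $contents {
    foreach { element n_atoms } $item break
    if { $element ne "H" } {
        incr total_heavy $n_atoms
    }
}
puts "non-hydrogen atoms: $total_heavy"   ;# prints non-hydrogen atoms: 194
```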

AtomType Return the Element name for a given atomic number

Argument list: <atomic_number>

atomic_number Atomic number

GetAminoInfo Return some information for a given input amino acid type

Argument list: <AA> <nhVar>

This currently will only return the number of non-hydrogen atoms in the residue, but could easily be extended with alternative information

AA Input 3-letter amino acid code (upper case)

nhVar Return the number of non-hydrogen atoms

ParsePDBId Extract the cards relevant to the atom id from a PDB ATOM line

Argument list: <line> <atnamVar> <resnamVar> <residVar> <segidVar>

line Input line of PDB file (must be ATOM/HETATM card)

atnamVar Returned atom name

resnamVar Returned residue type

residVar Returned residue id

segidVar Returned segment id

ParsePDBIz Extract atomic number from PDB atom card (or element type from name)

Argument list: <line> <izVar>

If the atomic number is missing from columns 69-70 then guesstimate the element type from the atom name and return the atomic number

line Input line of PDB file (must be ATOM/HETATM card)

izVar Return the atomic number

PDBRemoveZeroOcc Remove atoms with zero occupancy from PDB file

Argument list: <filein> <fileout> <nzerosVar> <args>

filein Input PDB file name

fileout Output PDB file name

nzerosVar Return the number of zero occupancy atoms

-chain sel_chain

Apply the edit only to the specified chain

ExtractPdbColumns Extract columns from the ATOM records of a PDB file

Argument list: <file> <column_list> <outputVar> <args>

Extract columns from the ATOM records of a PDB file. Beware: this assumes a very consistent PDB file format with spaces between all columns.

file Input file name

column_list Tcl list of the number(s) of the column(s) to return (NB 0=first column)

outputVar Output as a Tcl list of lists
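
The idea of space-separated column extraction can be sketched as follows. This is an illustration under stated assumptions, not the CCP4i code: the name extract_columns_sketch is hypothetical, and it takes a list of records rather than a file name so that the example is self-contained.

```tcl
# Hypothetical sketch of whitespace-based column extraction.
# lines       : list of PDB records (strings)
# column_list : Tcl list of column indices to return, 0 = first column
proc extract_columns_sketch { lines column_list } {
    set output {}
    foreach line $lines {
        # Only ATOM records are considered
        if { ![string match "ATOM*" $line] } { continue }
        # One list item per run of non-space characters
        set fields [regexp -all -inline {\S+} $line]
        set row {}
        foreach col $column_list {
            lappend row [lindex $fields $col]
        }
        lappend output $row
    }
    return $output
}

set lines [list \
    "ATOM      1  N   ALA A   1      11.104   6.134  -6.504  1.00  0.00           N" \
    "ATOM      2  CA  ALA A   1      11.639   6.071  -5.147  1.00  0.00           C" \
    "HETATM    3  O   HOH A   2       1.000   2.000   3.000  1.00  0.00           O" ]

# Columns 2, 3 and 5 are atom name, residue type and residue number here
puts [extract_columns_sketch $lines { 2 3 5 }]   ;# prints {N ALA 1} {CA ALA 1}
```

The warning about consistent formatting applies directly to this approach: if two fields run together with no space between them (for example large negative coordinates), the column numbering shifts.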

MergePdbFiles Merge 2 or more PDB files

Argument list: <fileout> <filesin>

This procedure removes the header info from the second and subsequent files. It does not check for unique chain names.

fileout Output file name

filesin A list of input files.

PdbGetChains Get a list of the chains and the first & last residue ids in a PDB file

Argument list: <file> <chainVar> <chain_rangeVar> <args>

file Input PDB file

chainVar Returned list of chain ids

chain_rangeVar Returned nested list of first and last residues in chains

-nowat

Ignore atoms of type HOH|WAT|H2O|SOL

-atomid

Return the ids of the first and last atoms in a chain

ReadSequence Read a sequence file

Argument list: <seqfile> <sequenceVar> <nresVar>

Skip lines beginning with >, # or ;. Ignore all white space and any characters outside the range A to Z.

seqfile Input sequence file

sequenceVar Return text string with one letter sequence code

nresVar Returned number of residues
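
The filtering rules above can be sketched as below. The name read_sequence_sketch is hypothetical, and the sketch takes the file contents as a text string rather than a file name so that it is self-contained. It assumes that lower-case letters count as characters outside A to Z and are dropped; whether the real procedure converts case first is not stated here.

```tcl
# Hypothetical sketch of the sequence filtering rules, not the CCP4i source.
# text        : contents of a sequence file
# sequenceVar : name of variable to receive the one-letter sequence string
# nresVar     : name of variable to receive the number of residues
proc read_sequence_sketch { text sequenceVar nresVar } {
    upvar 1 $sequenceVar sequence $nresVar nres
    set sequence ""
    foreach line [split $text "\n"] {
        # Skip title and comment lines
        if { [string match {[>#;]*} $line] } { continue }
        # Drop white space and anything else outside A-Z
        regsub -all {[^A-Z]} $line "" residues
        append sequence $residues
    }
    set nres [string length $sequence]
    return $nres
}

set text ">example title\nMKV LAG\n; a comment\nTPE*\n"
read_sequence_sketch $text seq nres
puts "$seq $nres"   ;# prints MKVLAGTPE 9
```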

HandleHarvestFile Add the output harvest file to the output file list for a job

Argument list: <mode> <pname> <dname> <program>

Depending on the mode, add the name of a harvest file to the output file list for the job. This makes the harvest file visible to the user. The file name is generated automatically by the program from the environment variables (see $CCP4/html/harvesting.html); the same file name generation is reproduced here.

mode Harvest mode - should be NOHARVEST, PROJECT or HARVEST -see

pname project name

dname dataset name

program name of the program as understood by the harvest mechanism

FindNCSTransforms For multi-chain PDB find the inter-chain transformations using Lsqkab program

Argument list: <xyzin> <chains> <chain_range>

The procedure is useful in a job script; it is currently used for analysis in the Molrep job.

For all pairwise combinations of chains, find the transformation to superpose the two chains and output it in a short table. The full output of Lsqkab is not reported in the log file.

xyzin Input PDB file

chains A list of chain names

chain_range A two element list of the first and last residue ids which will be superposed

Mol Procedures for Handling Coordinate Data

The series of procedures Mol* handle coordinate data. They are used particularly in the Sketcher module but could be helpful elsewhere. The coordinate data is loaded into a global array which is usually called Mol, but the name is passed into the utility procedures so different molecules could be loaded into different arrays. NOTE: These utilities are fine for a small number of atoms but will become slow, or appear to freeze, for anything like a real protein. The procedures for reading the data can extract a specified residue from a large PDB file efficiently.

The elements of the Mol array are indexed by the atom number na which starts at 1 (not 0):

Name,$na the atom name

Element,$na the atom element type

Type,$na the atom type (default is 'no_type')

Charge,$na the formal, unit charge on the atom

Coor,$na the coordinates as a tcl list [list $x $y $z]

frag,$na an integer indicating which fragment of the molecule the atom belongs to - this is only set after calling MolFindFragments

The elements indexed by the bond number nb:

Bonds,$nb a list of the two atom numbers of the connected atoms

Bondtype,$nb the bondtype - an integer for the formal bond order

nAtoms number of atoms

nBonds number of bonds

chem_comp_id the monomer id as used in CIF libraries

XY A list of coordinates - the coordinate of each atom is a 3 item list.

The first item of the list is 'NULL' so the atom numbers which start at 1 can index into the list.
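
As an illustration of the layout described above, the following builds a small array by hand and walks its atoms and bonds. The molecule and all values are made up for the example; in practice the array would be filled by one of the MolRead* procedures. The XY element is left out of the sketch.

```tcl
# Illustrative only: a hand-built array following the documented layout
array unset Mol
set Mol(chem_comp_id) HOH
set Mol(nAtoms) 3
set Mol(nBonds) 2

foreach { na name element x y z } {
    1 O  O  0.000 0.000 0.000
    2 H1 H  0.957 0.000 0.000
    3 H2 H -0.240 0.927 0.000
} {
    set Mol(Name,$na)    $name
    set Mol(Element,$na) $element
    set Mol(Type,$na)    no_type
    set Mol(Charge,$na)  0
    set Mol(Coor,$na)    [list $x $y $z]
}

# Each bond is a pair of atom numbers plus a formal bond order
set Mol(Bonds,1) [list 1 2] ; set Mol(Bondtype,1) 1
set Mol(Bonds,2) [list 1 3] ; set Mol(Bondtype,2) 1

# Atom numbers start at 1, not 0
for { set na 1 } { $na <= $Mol(nAtoms) } { incr na } {
    puts "$na $Mol(Name,$na) $Mol(Element,$na) $Mol(Coor,$na)"
}
for { set nb 1 } { $nb <= $Mol(nBonds) } { incr nb } {
    foreach { a1 a2 } $Mol(Bonds,$nb) break
    puts "bond $nb: $Mol(Name,$a1)-$Mol(Name,$a2) order $Mol(Bondtype,$nb)"
}
```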

MolReadPDB Read some or all atoms from a PDB file into the Mol array

Argument list: <MolVar> <file> <args>

Reading the PDB file of a whole protein will be very slow. It is possible to read an individual residue using the select option. Note that the procedure sketch_open_file in sketch.tcl shows how to create a file selection window with extra options to select a residue.

MolVar The name of the global array to be loaded

file Name of PDB file

-select [list restype chain_id resid_id]

Read only the first residue which matches the input residue type (restype), chain id and residue id (resid_id). One or two of the three elements in the input can be null - and there will be no requirement to match that identifier.

-noh

Do not read in hydrogen atoms.

MolReadCif Read atoms from a CIF file into the Mol array

Argument list: <MolVar> <file> <args>

Reading many atoms will be slow.

MolVar The name of the global array to be loaded

file Name of CIF file

-noh

Do not read in hydrogen atoms.

read_cif_name Strip the inverted commas (quotes) from a cif atom name

Argument list: <name>

CIF definitions require quotes about any atom name containing a quote character. This function returns the name stripped of surrounding quotes.

name Atom name from CIF file
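
A minimal sketch of stripping the surrounding quotes, assuming that a quoted name is wrapped in a matching pair of either double or single quotes. The name strip_cif_quotes_sketch is hypothetical and this is not the CCP4i implementation.

```tcl
# Hypothetical sketch: remove one matching pair of surrounding quotes
proc strip_cif_quotes_sketch { name } {
    set name [string trim $name]
    foreach q [list "\"" "'"] {
        if { [string length $name] >= 2 &&
             [string index $name 0] eq $q &&
             [string index $name end] eq $q } {
            return [string range $name 1 end-1]
        }
    }
    # Unquoted names are returned unchanged
    return $name
}

puts [strip_cif_quotes_sketch {"O5'"}]   ;# prints O5'
puts [strip_cif_quotes_sketch CA]        ;# prints CA
```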

MolReadSybyl Read some or all atoms from a Sybyl file into the Mol array

Argument list: <MolVar> <file> <args>

Reading many atoms will be slow. This procedure has not been seriously used or tested.

MolVar The name of the global array to be loaded

file Name of Sybyl file

-noh

Do not read in hydrogen atoms.

MolConvertSybylType Placeholder for code to convert Sybyl atom types

Argument list: <type>

MolConvertSybylBond Placeholder for code to convert Sybyl bond types

Argument list: <type>

MolBoundingBox Find centre of mass and bounding box for coordinates in Mol

Argument list: <MolVar> <minboundVar> <maxboundVar> <aveVar> <args>

Useful for centering molecule in display

MolVar Name of Mol array

minboundVar Returned list of minimum values of x,y,z

maxboundVar Returned list of maximum values of x,y,z

aveVar Returned list of average coordinates, x,y,z (not weighted by mol wt.)
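
The calculation can be sketched as below for an array in the Mol layout. The name mol_bounding_box_sketch is hypothetical; the sketch assumes only the nAtoms and Coor,$na elements and ignores any options passed in args.

```tcl
# Hypothetical sketch, not the CCP4i source.
# MolVar is the name of a global array; results go to the named variables.
proc mol_bounding_box_sketch { MolVar minboundVar maxboundVar aveVar } {
    upvar #0 $MolVar Mol
    upvar 1 $minboundVar minb $maxboundVar maxb $aveVar ave
    set minb $Mol(Coor,1)
    set maxb $Mol(Coor,1)
    set sum  [list 0.0 0.0 0.0]
    for { set na 1 } { $na <= $Mol(nAtoms) } { incr na } {
        set newmin {} ; set newmax {} ; set newsum {}
        # Process x, y and z in step with the running min, max and sum
        foreach v $Mol(Coor,$na) lo $minb hi $maxb s $sum {
            lappend newmin [expr { $v < $lo ? $v : $lo }]
            lappend newmax [expr { $v > $hi ? $v : $hi }]
            lappend newsum [expr { $s + $v }]
        }
        set minb $newmin ; set maxb $newmax ; set sum $newsum
    }
    # Unweighted average of the coordinates
    set ave {}
    foreach s $sum {
        lappend ave [expr { $s / double($Mol(nAtoms)) }]
    }
    return
}

array unset Demo
set Demo(nAtoms) 2
set Demo(Coor,1) [list 0.0 0.0 0.0]
set Demo(Coor,2) [list 2.0 4.0 6.0]
mol_bounding_box_sketch Demo minb maxb ave
puts "min $minb max $maxb ave $ave"
```

For centring a display, the average (or the midpoint of the two bounds) would be the natural translation to subtract from every atom.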

MolTranslate Apply a translation to all coordinates in Mol array

Argument list: <MolVar> <translate> <args>

MolVar Name of Mol array

translate Input list of translation vector x,y,z

-first first_atom

Only apply translation to range of atoms starting at first_atom

-last last_atom

Only apply translation to range of atoms finishing at last_atom
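
A sketch of the translation over an atom range. The name mol_translate_sketch is hypothetical, and for brevity the range is given as optional positional arguments rather than the -first and -last options of the real procedure.

```tcl
# Hypothetical sketch, not the CCP4i source.
# first/last default to the whole molecule.
proc mol_translate_sketch { MolVar translate { first 1 } { last "" } } {
    upvar #0 $MolVar Mol
    if { $last eq "" } { set last $Mol(nAtoms) }
    for { set na $first } { $na <= $last } { incr na } {
        set new {}
        # Add the translation component by component
        foreach v $Mol(Coor,$na) t $translate {
            lappend new [expr { $v + $t }]
        }
        set Mol(Coor,$na) $new
    }
    return
}

array unset Shift
set Shift(nAtoms) 2
set Shift(Coor,1) [list 0.0 0.0 0.0]
set Shift(Coor,2) [list 1.0 1.0 1.0]
# Move only atom 2
mol_translate_sketch Shift { 1.0 2.0 3.0 } 2 2
puts "$Shift(Coor,1) / $Shift(Coor,2)"
```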

MolWriteCifCoords Write out a CIF coordinate format file

Argument list: <MolVar> <file> <args>

File will have minimal pdb style info and crystal parameters set arbitrarily

MolVar Name of Mol array

file Output file name

-id monomer_id

Put the monomer id monomer_id in the CIF file

-tran translation

Apply a translation given by translation - a list of xtran,ytran,ztran. This option is NOT implemented.

write_cif_name Put quotes about name if it contains a quote character

Argument list: <name>

Function returns an atom name suitable for output to a CIF file

name Input atom name
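
A minimal sketch of the quoting rule. It assumes that a name containing a single quote is wrapped in double quotes and a name containing a double quote is wrapped in single quotes; the choice of quote character in the real procedure is not stated here, and the name quote_cif_name_sketch is hypothetical.

```tcl
# Hypothetical sketch: quote a name only if it contains a quote character
proc quote_cif_name_sketch { name } {
    if { [string first "'" $name] >= 0 } {
        return "\"$name\""
    } elseif { [string first "\"" $name] >= 0 } {
        return "'$name'"
    }
    return $name
}

puts [quote_cif_name_sketch "O5'"]   ;# prints "O5'"
puts [quote_cif_name_sketch CA]      ;# prints CA
```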

MolReadPdbRestraints Read MODRES/SSBOND/LINK/CISPEP cards from pdb file

Argument list: <MolVar> <file> <args>

The parameters of PDB restraints cards are read into a Mol array. The actual name of the array is defined externally so does not have to be, and should not be, the same array used for coordinates.

The array will contain elements nModres, nSsbond, nLink and nCispep which hold the number of each type of restraint. The other parameter names are best seen in the code. The expected format of the PDB file is slightly extended from the PDB standard and is defined in the Libcheck documentation.

MolCheckPdbRestraints Not implemented - should check sensible restraints defined

Argument list: <restrainstVar>

MolCheckReadFile Not implemented - should check sensible atom name input etc.

Argument list: <MolVar> <args>

MolFindFragments Find all the separate non-bonded fragments in Mol array

Argument list: None

If not all bonds are defined, the molecule will appear to be more than one fragment. This procedure returns the number of fragments (ideally this should be 1) and for each atom na assigns a value to Mol(frag,$na) which is an integer value - all atoms with the same value are in the same fragment.
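
One way to find the fragments is to propagate the lowest atom number along the bonds until nothing changes, as sketched below. This is an illustration of the idea and not necessarily the algorithm used in CCP4i; the name find_fragments_sketch is hypothetical and, unlike the documented procedure, it takes the array name as an argument.

```tcl
# Hypothetical sketch: label connected fragments in a Mol-style global array
proc find_fragments_sketch { MolVar } {
    upvar #0 $MolVar Mol
    # Start with every atom in its own fragment
    for { set na 1 } { $na <= $Mol(nAtoms) } { incr na } {
        set Mol(frag,$na) $na
    }
    # Give both atoms of each bond the lower label; repeat until stable
    set changed 1
    while { $changed } {
        set changed 0
        for { set nb 1 } { $nb <= $Mol(nBonds) } { incr nb } {
            foreach { a1 a2 } $Mol(Bonds,$nb) break
            set f1 $Mol(frag,$a1)
            set f2 $Mol(frag,$a2)
            if { $f1 < $f2 } {
                set Mol(frag,$a2) $f1 ; set changed 1
            } elseif { $f2 < $f1 } {
                set Mol(frag,$a1) $f2 ; set changed 1
            }
        }
    }
    # Renumber the labels as 1..n and count the fragments
    set nfrag 0
    for { set na 1 } { $na <= $Mol(nAtoms) } { incr na } {
        set label $Mol(frag,$na)
        if { ![info exists renumber($label)] } {
            incr nfrag
            set renumber($label) $nfrag
        }
        set Mol(frag,$na) $renumber($label)
    }
    return $nfrag
}

# Four atoms with bonds 1-2 and 3-4 only: two fragments
array unset Frag
set Frag(nAtoms) 4
set Frag(nBonds) 2
set Frag(Bonds,1) [list 1 2]
set Frag(Bonds,2) [list 3 4]
puts [find_fragments_sketch Frag]   ;# prints 2
```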

MolChiralVolume Find the chiral volume of four atoms in the Mol array

Argument list: <MolVar> <at1> <at2> <at3> <at4> <volobsVar>

Find the chiral volume for atom at1 and its three neighbours

The sign of the chirality needs to conform with that used by Libcheck.

MolVar Name of global Mol array

at1 Atom number of chiral centre

at2,at3,at4 Three neighbouring atoms of chiral centre

volobsVar returned value of chiral volume
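
A chiral volume is commonly computed as the scalar triple product of the three vectors from the central atom to its neighbours. The sketch below assumes that definition and that atom order (at2, at3, at4); whether this sign convention matches Libcheck is not verified here. The name chiral_volume_sketch is hypothetical.

```tcl
# Hypothetical sketch, not the CCP4i source.
# Computes a . (b x c) for vectors a, b, c from at1 to at2, at3, at4.
proc chiral_volume_sketch { MolVar at1 at2 at3 at4 volobsVar } {
    upvar #0 $MolVar Mol
    upvar 1 $volobsVar vol
    foreach { x1 y1 z1 } $Mol(Coor,$at1) break
    # Vectors from the central atom to its three neighbours
    set i 0
    foreach at [list $at2 $at3 $at4] {
        foreach { x y z } $Mol(Coor,$at) break
        set v($i) [list [expr { $x - $x1 }] [expr { $y - $y1 }] [expr { $z - $z1 }]]
        incr i
    }
    foreach { ax ay az } $v(0) break
    foreach { bx by bz } $v(1) break
    foreach { cx cy cz } $v(2) break
    # Scalar triple product
    set vol [expr { $ax * ($by*$cz - $bz*$cy)
                  + $ay * ($bz*$cx - $bx*$cz)
                  + $az * ($bx*$cy - $by*$cx) }]
    return $vol
}

array unset Chi
set Chi(Coor,1) [list 0.0 0.0 0.0]
set Chi(Coor,2) [list 1.0 0.0 0.0]
set Chi(Coor,3) [list 0.0 1.0 0.0]
set Chi(Coor,4) [list 0.0 0.0 1.0]
chiral_volume_sketch Chi 1 2 3 4 vol
puts $vol   ;# prints 1.0
```

Swapping any two of the neighbour atoms changes the sign of the result, which is why the order of at2, at3 and at4 matters.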

MolSaveParam Make a backup of Mol array values

Argument list: <MolVar> <store> <paramlist>

Save some of the current Mol params to the Mol element store

MolVar the name of the global Mol array

store the name of the element used to store the parameters

paramlist the list of parameters to be stored.

Utilities for Phasing Tasks (utils/phasing_utils.tcl)

Mostly used by run scripts for experimental phasing. Only Uniqueify may have some general use.

ExtractShelxLog UNTESTED - Extract info from the Shelx log file

Argument list: <filename> <attype>

This is intended to be used to create a 'crossword table' of distances between heavy atom sites. Need to discuss this with Eleanor.

filename Name of shelx log file

attype Name of heavy atom type expected in log file

MakeShelxDismat Convert the distance list from shelx log file into a neat distance matrix

Argument list: <dist_list> <dismatVar>

NpoMapScale Devise scale for NPO plots with orthogonal sections on same scale

Argument list: <space_group> <cell> <N_SECTIONS> <axes_list> <NPO_MAX_SIZE> <NPO_SCALEVar>

space_group space group

cell Unit cell

N_SECTIONS Not used

axes_list List of axes along which sections are required (as X, Y or Z)

NPO_MAX_SIZE Maximum acceptable size of plot

NPO_SCALEVar Returned proposed scale

FindScaleitDiff Run scaleit to find optimal value for exclude cutoff for Patterson

Argument list: <HKLIN> <F1> <SIG1> <F2> <SIG2> <diffVar>

HKLIN Input MTZ file

F1 Input MTZ FP column

SIG1 Input MTZ SIGFP column

F2 Input MTZ FPH column

SIG2 Input MTZ SIGFPH column

diffVar Returned proposed exclude difference cutoff

scaleit_analysis Extract data from scaleit log file for correlation analysis

Argument list: <mode> <arrayname> <log> <n> <m>

mode Can be all, pair or disp

arrayname Name of array used to carry output data

log Name of input log file

n Optional identifier for the first dataset in pairwise analysis

m Optional identifier for the second dataset in pairwise analysis

scaleit_write_tab Write the summary of the scaleit log file(s) to the table file

Argument list: <arrayname> <tab> <nsets> <disp_diff>

arrayname Name of array used to carry analysis data

tab name of the table output file

nsets Number of sets of derivative data

disp_diff Logical 1= output anomalous differences table

Uniqueify Run the Tcl version of 'uniqueify' script to standardise MTZ and add FreeR column

Argument list: <HKLIN> <HKLOUT> <args>

HKLIN Input MTZ file

HKLOUT Output MTZ file

-extend RESOLUTION_MAX

Extend the resolution to RESOLUTION_MAX

-import IMPORT_FREER_MTZ IMPORT_FREER_LABIN

Import the FreeR column from another MTZ file

-keep LABIN_FREER

Keep the FreeR column in the input file

-frac FREER_FRACTION

Override the usual fraction for FreeR reflections (0.05)

-sysa

Keep the systematically absent reflections in the MTZ file

AMoRe Utilities (utils/amore_utils.tcl)

Used in the amore run script to handle the .mr files and to interact with the mr model database.

amore_get_tabling_data Extract data from the amore tabling log file

Argument list: <file> <boxVar> <rcomVar> <comVar> <eulerVar>

The position and extent of the model in the unit cell are extracted from the log file. This procedure is called from amore.script, which then saves the information in the amore database.

file Amore tabling log file

boxVar Returned minimal box (a Tcl list of 3 elements)

rcomVar Returned minimal Sphere (one value)

comVar Returned centre of mass (a Tcl list of 3 elements)

eulerVar Returned rotation applied to orient molecule (a Tcl list of 3 elements)

amore_calc_model_cell Use the model extent and radius to calculate reasonable model cell

Argument list: <xtl_cell> <box> <radius> <cellVar> <irmaxVar>

Use formula that irmax should be the minimum of ( 0.75 * smallest model box dimension) and ( 0.5 * smallest xtl cell dimension)

xtl_cell The crystal cell lengths

box The extent of the model

radius The minimal sphere enclosing the model

cellVar Returned estimated model cell

irmaxVar Returned the proposed radius for the integration sphere
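
The irmax rule quoted above can be sketched as below. The name amore_irmax_sketch is hypothetical, and only irmax is computed, because the formula for the model cell itself is not given in this document. The sketch assumes the first three items of xtl_cell are the cell lengths and that box holds three extents.

```tcl
# Hypothetical sketch of the stated irmax rule, not the CCP4i source:
# irmax = min( 0.75 * smallest box dimension, 0.5 * smallest cell length )
proc amore_irmax_sketch { xtl_cell box } {
    set min_cell [lindex [lsort -real [lrange $xtl_cell 0 2]] 0]
    set min_box  [lindex [lsort -real $box] 0]
    set from_box  [expr { 0.75 * $min_box }]
    set from_cell [expr { 0.5 * $min_cell }]
    return [expr { $from_box < $from_cell ? $from_box : $from_cell }]
}

# 0.75 * 30 = 22.5 is smaller than 0.5 * 60 = 30.0
puts [amore_irmax_sketch { 60 70 80 } { 30 40 50 }]   ;# prints 22.5
```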

amore_get_log_solution Extract solutions from log file

Argument list: <mode> <model_list> <log_file> <solution_file>

Finds rotation/translation and fitting solutions and writes them to the CCP4i 'mr' file

mode Should be rot, tra, fit, shift or self.

model_list A list of the models used in this run of Amore - indicates how many solution lines make up one solution

log_file name of input log file

solution_file Name of output 'mr' solution file

amore_update_database Update the mr_database.def file

Argument list: <def_file> <model_title> <update_list> <args>

Data extracted from log files is saved in the amore database.

def_file Name of the amore mr database file

model_title Name of the model in the mr database

update_list List of parameters in name & value pairs

-noreport

Do not report progress to log file

amore_get_solution Extract solutions from an 'mr' file and write in input format for Amore

Argument list: <file> <fix> <model_listVar> <output_model_listVar> <nsol> <solutionVar> <solution0>

All lines beginning with # in the mr file are ignored. The FIX keyword and model numbers are inserted. If this is the second solution file read, then the solutions from the first are entered into this procedure as solution0. All solutions in solution0 need to be permuted with all solutions input in file.

file Input mr solution file

fix Will be keyword 'FIX' or blank

model_listVar Returned list of models (i.e. the names of models in the mr database), for the current 'known' structure.

output_model_listVar Returned list of models, for the structure to be output by the next run of Amore.

nsol Number of solutions to be read (default is 'all')

solutionVar Returned the text input for Amore listing 'known' solutions.

solution0 Optional input of list of 'known' solutions read from another mr file

amore_block_mr_database send command to main ccp4i to block update of the mr database

Argument list: <mode>

mode 1= initialise block, 0= release the block