WaterNetworkAnalysis
WaterNetworkAnalysis functions
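- align_trajectory: Align the trajectory.
- align_and_extract_waters: Align and extract waters from trajectory.
- extract_waters_from_trajectory: Extract waters for clustering analysis.
- read_results_and_make_pdb: Read results from files and generate a pdb file.
- make_results_pdb_MDA: Generate pdb file with clustering results.
- get_selection_string_from_resnums: Return selection string for given residue ids.
- get_center_of_selection: Compute centre of selection with MDAnalysis.
- calculate_oxygen_density_map: Generate oxygen density maps.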
WaterNetworkAnalysis: module for preparing raw trajectories for the analysis of conserved waters with ConservedWaterSearch.
- WaterNetworkAnalysis.align_and_extract_waters(center_for_water_selection: np.ndarray, trajectory: str, aligned_trajectory_filename: str, align_target_file_name: str, topology: str | None = None, every: int = 1, align_mode: str = 'mda', align_target: int | None = -1, align_selection: str = 'protein', probis_exec: str | None = None, dist: float = 12.0, SOL: str = 'SOL', OW: str = 'OW', HW: str = 'HW') -> tuple[np.ndarray]
Align and extract waters from trajectory.
Aligns the trajectory first and then extracts water molecules for further water clustering analysis. If the trajectory has already been aligned, use extract_waters_from_trajectory() to extract the water molecules for water clustering analysis.
- Parameters:
center_for_water_selection (np.ndarray) – Coordinates around which all water molecules inside a radius dist will be selected for water clustering analysis.
trajectory (str) – File name of the trajectory from which waters will be extracted.
aligned_trajectory_filename (str) – File name to which the aligned trajectory will be saved.
align_target_file_name (str) – File name for saving the align target (usually pdb) if align_target is int. If align_target is None, the align target will be read from this file instead.
topology (str | None, optional) – Topology file name. Defaults to None.
every (int, optional) – Take every every-th snapshot instead of all the snapshots (every = 1) for alignment. Defaults to 1.
align_mode (str, optional) – Align algorithm to use. “mda” uses MDAnalysis while “probis” uses the probis algorithm. Defaults to “mda”.
align_target (int | None, optional) – Align target. If None, the align target is read from align_target_file_name. If a number is given, the given snapshot of the trajectory is used as the align target. If -1, the last snapshot is used. Defaults to -1.
align_selection (str, optional) – Selection to align to. Defaults to “protein”.
probis_exec (str | None, optional) – Location of the probis executable if probis is used. If None, it is downloaded from the internet. Defaults to None.
dist (float, optional) – Radius around center_for_water_selection to be used for extraction of water molecules. Defaults to 12.0.
SOL (str, optional) – Residue name for waters. Defaults to “SOL”.
OW (str, optional) – Name of the oxygen atom. Defaults to “OW”.
HW (str, optional) – Name of the hydrogen atom. Defaults to “HW”.
- Returns:
Returns coordinates of oxygen atoms, first hydrogen atoms and second hydrogen atoms in three separate numpy arrays. Each row in each array holds the coordinates of a single water molecule.
- Return type:
tuple[np.ndarray, np.ndarray]
Example:
    # Generate water coordinates for clustering analysis from unaligned trajectory
    resids = [8, 12, 143, 144]
    align_and_extract_waters(
        # get_center_of_selection requires the trajectory (see its signature)
        get_center_of_selection(
            get_selection_string_from_resnums(resids),
            trajectory='trajectory.xtc',
            topology='topology.tpr',
        ),
        trajectory='trajectory.xtc',
        aligned_trajectory_filename='aligned_trj.xtc',
        align_target_file_name='aligned.pdb',
        topology='topology.tpr',
        every=1,
        align_mode="mda",
        align_target=0,
        align_selection="protein",
        dist=10.0,
    )
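The role of center_for_water_selection and dist can be pictured as a simple sphere test. The following is a minimal sketch in plain Python, not the library's implementation: the helper name waters_within_radius is hypothetical, and treating the boundary as inclusive is an assumption.

```python
import math


def waters_within_radius(oxygens, center, dist=12.0):
    """Return indices of oxygen positions lying within `dist` of `center`.

    Hypothetical helper for illustration; the inclusive boundary (<=) is an
    assumption, not taken from WaterNetworkAnalysis.
    """
    selected = []
    for index, xyz in enumerate(oxygens):
        if math.dist(xyz, center) <= dist:
            selected.append(index)
    return selected


# Three made-up oxygen positions (same length unit as `dist`)
oxygens = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (20.0, 0.0, 0.0)]
center = (0.0, 0.0, 0.0)
print(waters_within_radius(oxygens, center, dist=10.0))  # prints [0, 1]
```

A smaller dist gives fewer waters to cluster, which is why the example above lowers it from the default 12.0 to 10.0.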
- WaterNetworkAnalysis.align_trajectory(trajectory: str, output_trj_file: str, align_target_file_name: str, topology: str | None = None, every: int = 1, align_mode: str = 'mda', align_target: int | None = -1, align_selection: str = 'protein', probis_exec: str | None = None) -> None
Align the trajectory.
Before running water clustering to identify conserved water molecules, the trajectory should be aligned. Alignment can be done via MDAnalysis or with the probis algorithm. The whole protein is aligned by default. To select the align reference state, either pass an integer for align_target and specify with align_target_file_name a file name to which the align target will be saved, or set align_target to None, in which case align_target_file_name will be read and used as the align target.
The trajectory or topology should contain information on bond topology for alignment. Supported topology file types:
DATA DMS GSD MMTF MOL2 PARMED PDB ENT PSF TOP PRMTOP PARM7 TPR TXYZ ARC XML XPDB
Alternatively the whole trajectory can be provided in some of the above given file types as well.
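A quick pre-flight check against the list above might look like the sketch below. The helper has_supported_topology_extension is hypothetical and only compares the file extension with the listed type names; matching by extension is an assumption, and some listed types (for example PARMED) are not file extensions at all.

```python
from pathlib import Path

# Type names copied from the list of supported topology file types above
SUPPORTED_TOPOLOGY_TYPES = {
    "DATA", "DMS", "GSD", "MMTF", "MOL2", "PARMED", "PDB", "ENT", "PSF",
    "TOP", "PRMTOP", "PARM7", "TPR", "TXYZ", "ARC", "XML", "XPDB",
}


def has_supported_topology_extension(filename):
    """Heuristic: True if the file extension matches a listed topology type."""
    extension = Path(filename).suffix.lstrip(".").upper()
    return extension in SUPPORTED_TOPOLOGY_TYPES


print(has_supported_topology_extension("topology.tpr"))    # prints True
print(has_supported_topology_extension("trajectory.xtc"))  # prints False
```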
- Parameters:
trajectory (str) – File name containing unaligned trajectory.
output_trj_file (str) – output file name for aligned trajectory.
align_target_file_name (str) – File name for saving the align target (usually pdb) if align_target is int. If align_target is None, the align target will be read from this file instead.
topology (str | None, optional) – Topology file name. Defaults to None.
every (int, optional) – Take every every-th snapshot instead of all the snapshots (every = 1) for alignment. Defaults to 1.
align_mode (str, optional) – Align algorithm to use. “mda” uses MDAnalysis while “probis” uses the probis algorithm. Defaults to “mda”.
align_target (int | None, optional) – Align target. If None, the align target is read from align_target_file_name. If a number is given, the given snapshot of the trajectory is used as the align target. If -1, the last snapshot is used. Defaults to -1.
align_selection (str, optional) – Selection to align to. Defaults to “protein”.
probis_exec (str | None, optional) – Location of the probis executable if probis is used. If None, it is downloaded from the internet. Defaults to None.
Example:
    # align the trajectory and save to a file
    align_trajectory(
        trajectory="trajectory.xtc",
        output_trj_file="aligned_trajectory.xtc",
        align_target_file_name='aligned.pdb',
        align_mode="mda",
        align_target=0,
        align_selection="protein",
        topology="topology.tpr",
    )
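The interplay of align_target and every described in the parameter list can be summarised in a few lines. This is a sketch under stated assumptions, not the library's code: resolve_align_target and frames_to_process are hypothetical helpers, and starting the stride at frame 0 is an assumption.

```python
def resolve_align_target(align_target, n_frames):
    """Map the documented align_target values onto a frame index.

    Returns None when the reference must be read from align_target_file_name.
    """
    if align_target is None:
        return None              # reference structure comes from the file
    if align_target == -1:
        return n_frames - 1      # -1 means the last snapshot
    if not 0 <= align_target < n_frames:
        raise IndexError(f"align_target {align_target} is outside the trajectory")
    return align_target


def frames_to_process(n_frames, every=1):
    """Frame indices visited when taking every `every`-th snapshot.

    Assumes the stride starts at frame 0.
    """
    return list(range(0, n_frames, every))


print(resolve_align_target(-1, n_frames=100))  # prints 99
print(frames_to_process(10, every=3))          # prints [0, 3, 6, 9]
```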
- WaterNetworkAnalysis.calculate_oxygen_density_map(selection_center: np.ndarray, trajectory: str, topology: str | None = None, dist: float = 12.0, delta: float = 0.4, every: int = 1, SOL: str | None = None, OW: str | None = None, output_name: str = 'water.dx') -> Density
Generate oxygen density maps.
Generate oxygen density maps using MDAnalysis.
- Parameters:
selection_center (np.ndarray) – center of selection around which waters will be selected.
trajectory (str) – trajectory filename.
topology (str | None, optional) – Topology filename if available. Defaults to None.
dist (float, optional) – distance around selection center inside which the oxygen will be selected. Defaults to 12.0.
delta (float, optional) – bin size for density map. Defaults to 0.4 Angstroms.
every (int, optional) – Take every every-th snapshot instead of all the snapshots (every = 1). Defaults to 1.
SOL (str | None, optional) – Residue name of the water residue. If None, it will be determined automatically. Defaults to None.
OW (str | None, optional) – Name of the oxygen atom. If None, it will be determined automatically. Defaults to None.
output_name (str, optional) – Name of the output file; it should end with ‘.dx’. Defaults to “water.dx”.
- Returns:
returns MDA Density object containing the density map
- Return type:
Density
Example:
    # Generate water oxygen density map near active site
    resids = [8, 12, 143, 144]
    calculate_oxygen_density_map(
        # get_center_of_selection requires the trajectory (see its signature)
        get_center_of_selection(
            get_selection_string_from_resnums(resids),
            trajectory='trajectory.pdb',
        ),
        trajectory='trajectory.pdb',
    )
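To make the meaning of delta concrete, the sketch below bins oxygen positions into cubic cells with edge length delta and counts how many fall into each cell. It is an illustration only: the function bin_positions and the sample coordinates are made up, and an actual density map is additionally normalised (for example per frame and per volume), which is omitted here.

```python
import math
from collections import Counter


def bin_positions(positions, origin=(0.0, 0.0, 0.0), delta=0.4):
    """Count positions per cubic grid cell of edge length `delta`.

    Hypothetical helper; returns raw counts, not a normalised density.
    """
    counts = Counter()
    for position in positions:
        cell = tuple(
            math.floor((coordinate - start) / delta)
            for coordinate, start in zip(position, origin)
        )
        counts[cell] += 1
    return counts


# Made-up oxygen positions collected over several frames
oxygen_positions = [(0.1, 0.1, 0.1), (0.3, 0.2, 0.1), (0.5, 0.1, 0.1)]
grid = bin_positions(oxygen_positions, delta=0.4)
print(grid[(0, 0, 0)], grid[(1, 0, 0)])  # prints 2 1
```

A smaller delta gives a finer grid but fewer samples per cell, so noisier maps for the same trajectory length.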
- WaterNetworkAnalysis.extract_waters_from_trajectory(selection_center: np.ndarray, trajectory: str, topology: str | None = None, dist: float = 12.0, every: int = 1, SOL: str | None = None, OW: str | None = None, HW: str | None = None, extract_only_O: bool = False, save_file: str | None = None) -> tuple[np.ndarray, np.ndarray]
Extract waters for clustering analysis.
Calculates water (oxygen and hydrogen) coordinates for all the waters in the aligned trajectory using MDAnalysis for further use in water clustering. The trajectory should be aligned previously.
- Parameters:
selection_center (np.ndarray) – coordinates of selection center around which waters will be selected.
trajectory (str) – Trajectory file name.
topology (str | None, optional) – Topology file name. Defaults to None.
dist (float, optional) – Distance around the center of selection inside which water molecules will be sampled. Defaults to 12.0.
every (int, optional) – Take every every-th snapshot instead of all the snapshots (every = 1). Defaults to 1.
SOL (str | None, optional) – Residue name of the water residue. If None, it will be determined automatically. Defaults to None.
OW (str | None, optional) – Name of the oxygen atom. If None, it will be determined automatically. Defaults to None.
HW (str | None, optional) – Name of the hydrogen atom in water. Names checked will be the provided name and the name with a 1 or 2 appended. If None, it will be determined automatically. Defaults to None.
extract_only_O (bool, optional) – If True, only oxygen atom positions are extracted. Defaults to False.
save_file (str | None, optional) – File to which coordinates will be saved. If None, nothing is saved to a file. Defaults to None.
- Returns:
Returns xyz numpy arrays that contain the coordinates of oxygens and a combined array of hydrogen 1 and hydrogen 2 coordinates. If extract_only_O is True, returns only oxygen coordinates.
- Return type:
tuple[np.ndarray, np.ndarray]
Example:
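    # Generate water coordinates for clustering analysis
    resids = [8, 12, 143, 144]
    coordO, coordH = extract_waters_from_trajectory(
        # get_center_of_selection requires the trajectory (see its signature)
        get_center_of_selection(
            get_selection_string_from_resnums(resids),
            trajectory='trajectory.xtc',
            topology='topology.tpr',
        ),
        trajectory='trajectory.xtc',
        topology='topology.tpr',
    )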
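The HW parameter states that the provided name is checked together with the name with a 1 or 2 appended. A minimal sketch of that lookup, using hypothetical helper names and a made-up atom list, could be:

```python
def candidate_hydrogen_names(hw="HW"):
    """Names checked for water hydrogens: the name itself, then with 1 or 2 appended."""
    return [hw, f"{hw}1", f"{hw}2"]


def pick_hydrogen_atoms(atom_names, hw="HW"):
    """Keep only the atom names that match one of the candidate hydrogen names."""
    candidates = set(candidate_hydrogen_names(hw))
    return [name for name in atom_names if name in candidates]


print(candidate_hydrogen_names("HW"))                  # prints ['HW', 'HW1', 'HW2']
print(pick_hydrogen_atoms(["OW", "HW1", "HW2"], "HW"))  # prints ['HW1', 'HW2']
```

How the library matches names internally is not shown in this documentation, so the exact matching rule here is an assumption.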
- WaterNetworkAnalysis.get_center_of_selection(selection: str, trajectory: str, topology: str | None = None) -> np.ndarray
Compute centre of selection with MDAnalysis.
Calculates coordinates in xyz of the centre of selection using MDAnalysis.
- Parameters:
- Returns:
returns array that contains coordinates of center of selection
- Return type:
np.ndarray
Example:
    # find center of active site defined by residue ids
    resids = [8, 12, 143, 144]
    get_center_of_selection(
        get_selection_string_from_resnums(resids),
        trajectory='trajectory.xtc',
        topology='topology.tpr',
    )
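Conceptually, the centre of a selection is an average over the selected atoms' coordinates. The sketch below computes an unweighted geometric centre in plain Python; whether the library uses the geometric centre or a mass-weighted one is not stated here, so the unweighted mean is an assumption, and geometric_center is a hypothetical helper.

```python
def geometric_center(coords):
    """Unweighted mean of a list of (x, y, z) coordinates."""
    count = len(coords)
    if count == 0:
        raise ValueError("cannot compute the centre of an empty selection")
    return tuple(sum(axis) / count for axis in zip(*coords))


# Four made-up atom positions forming a square in the xy plane
atoms = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (2.0, 2.0, 0.0), (0.0, 2.0, 0.0)]
print(geometric_center(atoms))  # prints (1.0, 1.0, 0.0)
```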
- WaterNetworkAnalysis.get_selection_string_from_resnums(resids: list[int], selection_type: str = 'MDA') -> str
Return selection string for given residue ids.
Returns the selection command string for different programs based on the given list of amino acid residue IDs.
- Parameters:
- Returns:
selection command in form of a string
- Return type:
str
Example:
    # list of resids
    resids = [8, 12, 143, 144]
    # print PYMOL selection string
    get_selection_string_from_resnums(resids, selection_type="PYMOL")
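For orientation, selection strings for the two programs could be built as in the sketch below. The helper selection_string is hypothetical; it uses selection syntax that MDAnalysis ("resid 8 12 ...") and PyMOL ("resi 8+12+...") accept, but the exact string returned by get_selection_string_from_resnums may be formatted differently, and other selection_type values it may support are not covered.

```python
def selection_string(resids, selection_type="MDA"):
    """Build a residue-id selection string for MDAnalysis or PyMOL.

    Illustrative only; not the output format of the library function.
    """
    if selection_type == "MDA":
        return "resid " + " ".join(str(resid) for resid in resids)
    if selection_type == "PYMOL":
        return "resi " + "+".join(str(resid) for resid in resids)
    raise ValueError(f"unsupported selection_type: {selection_type!r}")


resids = [8, 12, 143, 144]
print(selection_string(resids))                          # prints resid 8 12 143 144
print(selection_string(resids, selection_type="PYMOL"))  # prints resi 8+12+143+144
```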
- WaterNetworkAnalysis.make_results_pdb_MDA(water_type: list[str], waterO: np.ndarray, waterH1: np.ndarray, waterH2: np.ndarray, output_fname: str, protein_file: str | None = None, ligand_name: str | None = None, mode: str = 'SOL') -> None
Generate pdb file with clustering results.
The water molecules determined by the clustering procedure are written to a pdb file which also contains the protein and the ligand. Waters are labeled based on their hydrogen orientations (FCW for fully conserved, HCW for half conserved and WCW for weakly conserved). Uses MDAnalysis for construction of the pdb file. The first 4 arguments of the function can be read from the results file by using cws.utils.read_results() or taken directly from the cws.water_clustering.WaterClustering class.
- Parameters:
waterO (np.ndarray) – numpy array containing coordinates of conserved waters’ oxygens.
waterH1 (np.ndarray) – numpy array containing coordinates of conserved waters’ first hydrogen
waterH2 (np.ndarray) – numpy array containing coordinates of conserved waters’ second hydrogen
output_fname (str) – name of the output pdb file. Must end in ‘.pdb’.
protein_file (str | None, optional) – file which contains protein and ligand. It should be aligned in the same way as the trajectory used for calculation of conserved waters. If None no protein is saved. Defaults to None.
ligand_name (str | None, optional) – residue name of the ligand. If None no ligand is saved. Defaults to None.
mode (str, optional) –
mode in which conserved waters will be saved. Options:
"SOL" - default mode. Saves water molecules as SOL so that visualisation software can recognise them as waters. No distinction is made between different types of conserved waters.
"cathegorise" - categorises the waters according to hydrogen orientation into fully conserved (FCW), half-conserved (HCW) and weakly conserved (WCW). In this mode visualisers cannot recognise the waters as water/SOL, but it is useful for interpreting results.
Example:
    # Generate pdb results file
    make_results_pdb_MDA(
        *cws.utils.read_results(),
        output_fname='results.pdb',
        mode='cathegorise',
    )
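The difference between the two mode values comes down to which residue name each water receives in the output file. A minimal sketch of that mapping follows; residue_name_for_water is a hypothetical helper, and how the library actually writes the labels into the pdb file is not shown in this documentation.

```python
WATER_TYPES = ("FCW", "HCW", "WCW")  # fully, half and weakly conserved


def residue_name_for_water(water_type, mode="SOL"):
    """Residue name a conserved water would get in each documented mode."""
    if water_type not in WATER_TYPES:
        raise ValueError(f"unknown water type: {water_type!r}")
    if mode == "SOL":
        return "SOL"           # every water looks the same to a visualiser
    if mode == "cathegorise":  # spelling follows the documented option value
        return water_type      # label reveals the conservation class
    raise ValueError(f"unknown mode: {mode!r}")


print([residue_name_for_water(t, "SOL") for t in WATER_TYPES])
# prints ['SOL', 'SOL', 'SOL']
print([residue_name_for_water(t, "cathegorise") for t in WATER_TYPES])
# prints ['FCW', 'HCW', 'WCW']
```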
- WaterNetworkAnalysis.read_results_and_make_pdb(fname: str = 'Clustering_results.dat', typefname: str = 'Type_Clustering_results.dat', protein_file: str = 'aligned.pdb', ligand_name: str | None = None, output_fname: str | None = None, mode: str = 'SOL') -> None
Read results from files and generate a pdb file.
- Parameters:
fname (str, optional) – File name with clustering results - coordinates of water molecules. Defaults to “Clustering_results.dat”.
typefname (str, optional) – File name which contains the types of each water molecule. Defaults to “Type_Clustering_results.dat”.
protein_file (str, optional) – Name of the file containing the reference structure to which the trajectory has been aligned. Defaults to “aligned.pdb”.
ligand_name (str | None, optional) – Residue name for the ligand. If none is given, no ligand is extracted and visualised/saved in the pdb file. Defaults to None.
output_fname (str | None, optional) – Name of the output file (pdb preferred). Defaults to None.
mode (str, optional) –
mode in which conserved waters will be saved. Options:
"SOL" - default mode. Saves water molecules as SOL so that visualisation software can recognise them as waters. No distinction is made between different types of conserved waters.
"cathegorise" - categorises the waters according to hydrogen orientation into fully conserved (FCW), half-conserved (HCW) and weakly conserved (WCW). In this mode visualisers cannot recognise the waters as water/SOL, but it is useful for interpreting results.
Example:
    # Generate pdb results file
    read_results_and_make_pdb(
        fname='Results.dat',
        typefname='TypeResults.dat',
        protein_file='aligned_protein.pdb',
        ligand_name='UBY',
        output_fname='results.pdb',
        mode='cathegorise',
    )