atom numbering??

04-03-2005 23:09:03

This method returns the standardized molecule.

See

http://www.chemaxon.com/marvin/doc/api/chemaxon/marvin/plugin/CalculatorPlugin.html#setMolecule(chemaxon.struc.Molecule,boolean,boolean)

In PMapper the built-in standardization is configured in the <StandardizerConfiguration> subsection of the configuration XML.

Unfortunately, in the current version the standardized molecule is used for atom indexing, but it is not returned to the caller.

This is a bug which will be fixed in the next major JChem release. I am planning to implement a solution similar to the one used for plugins: pmaps will be returned either in the original atom order or in the atom order of the standardized molecule, depending on a parameter.

There is a related topic on this subject:

http://www.chemaxon.hu/forum/viewpost1502.html#1502

However, there are two more problems that remain:

(1) If you use external standardization with a separate Standardizer object, then PMapper and the plugins have no way of knowing the original molecule, and therefore the standardized molecule will determine the atom order.

(2) In the case of microspecies, the plugin returns the microspecies molecules generated from the standardized molecule. Here standardization includes dehydrogenization, which changes the atom order. Therefore, if you use these microspecies as input to PMapper, the output cannot be returned in the original atom order (that is, the atom order of the input molecule, which is unknown to PMapper) - and since these microspecies exist only as molecule objects in memory, there is no atom order known to the user. I have no solution to this - suggestions are welcome.

PS: you did not attach your PMapper and standardizer config XMLs - but I could run your program with my sample XMLs.

07-03-2005 12:21:36

I agree that it would be useful to implement some sort of atom identification mechanism. We have already considered this, but it would require a lot of work, and it is not clear to me how to generate these IDs for symmetrical atoms. The SMILES form is not a good basis because not every molecule can be exported to SMILES, and the order in the SMILES string may change as the molecule changes, even if the atoms are left unchanged.

An obvious identifier is the MolAtom object pointer itself. I know this is very programmatic and not very user-friendly for a printout, but you can use it in your program. Plugins also use the object pointers to identify atoms in the molecule before and after standardization; in this way you can refer to atoms by their index in the original molecule.

The atom symbol, charge, coordinates can be retrieved through the MolAtom API:

http://www.chemaxon.com/marvin/doc/api/chemaxon/struc/MolAtom.html

Symbol:

String getSymbol()

http://www.chemaxon.com/marvin/doc/api/chemaxon/struc/MolAtom.html#getSymbol()

Charge:

int getCharge()

http://www.chemaxon.com/marvin/doc/api/chemaxon/struc/MolAtom.html#getCharge()

Coordinates:

double getX(), getY(), getZ()

http://www.chemaxon.com/marvin/doc/api/chemaxon/struc/MolAtom.html#getX()

http://www.chemaxon.com/marvin/doc/api/chemaxon/struc/MolAtom.html#getY()

http://www.chemaxon.com/marvin/doc/api/chemaxon/struc/MolAtom.html#getZ()

How these are changed by standardization depends on your standardization tasks. Basically, standardization modifies your original molecule in place, that is, it keeps the original Molecule object, so identification by MolAtom object pointers will work. MolAtom objects may disappear or change some of their properties (e.g. charge), and new MolAtom objects may be created. Atom coordinates are changed by Clean actions.
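The idea of identifying atoms by object pointer can be sketched in plain Java. This is only an illustration under stated assumptions: the Atom class and the toyStandardize method below are hypothetical stand-ins, not chemaxon.struc.MolAtom or the real Standardizer. The sketch records each atom object's index before an in-place change and looks it up afterwards by reference identity.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

class AtomIdentityDemo {

    /** Hypothetical stand-in for an atom object; this is NOT chemaxon.struc.MolAtom. */
    static final class Atom {
        final String symbol;
        int charge;

        Atom(String symbol, int charge) {
            this.symbol = symbol;
            this.charge = charge;
        }
    }

    /** Remembers each atom object's index, keyed by reference identity (==). */
    static Map<Atom, Integer> recordIndices(List<Atom> atoms) {
        Map<Atom, Integer> indexOf = new IdentityHashMap<>();
        for (int i = 0; i < atoms.size(); i++) {
            indexOf.put(atoms.get(i), i);
        }
        return indexOf;
    }

    /** Toy in-place "standardization" (an assumption): drops explicit H, neutralizes O. */
    static void toyStandardize(List<Atom> atoms) {
        atoms.removeIf(atom -> atom.symbol.equals("H"));
        for (Atom atom : atoms) {
            if (atom.symbol.equals("O")) {
                atom.charge = 0;
            }
        }
    }

    public static void main(String[] args) {
        List<Atom> molecule = new ArrayList<>(Arrays.asList(
                new Atom("H", 0), new Atom("C", 0), new Atom("H", 0),
                new Atom("O", -1), new Atom("N", 0)));
        Map<Atom, Integer> originalIndex = recordIndices(molecule);

        toyStandardize(molecule);

        for (int i = 0; i < molecule.size(); i++) {
            Atom atom = molecule.get(i);
            // Atoms created during standardization are not in the map; report -1 for them.
            int before = originalIndex.getOrDefault(atom, -1);
            System.out.println(atom.symbol + ": index " + i + " now, " + before + " originally");
        }
    }
}
```

An IdentityHashMap is used here so that the lookup depends only on the object reference, never on property values such as charge that standardization may change.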

07-03-2005 15:37:44

To get the atom objects:

http://www.chemaxon.com/marvin/doc/api/chemaxon/struc/MoleculeGraph.html#getAtomArray()

http://www.chemaxon.com/marvin/doc/api/chemaxon/struc/MoleculeGraph.html#getAtomCount()

http://www.chemaxon.com/marvin/doc/api/chemaxon/struc/MoleculeGraph.html#getAtom(int)

Back to the microspecies problem: unfortunately the microspecies are generated molecules having different atom objects from the original molecule.

I attach some "tricky" example code that shows the correspondence between the original molecule atoms and the atoms in the microspecies. The tricky part is that I rely on the fact that, in our case, the only standardization task that changes the atom order is dehydrogenize. The correspondence is based on the following facts:

1) Although the microspecies contain different atoms from the original molecule, the microspecies atom order corresponds to the atom order in the dehydrogenized input molecule.

2) The correspondence between atoms in the original molecule and atoms in the dehydrogenized molecule can be established by first collecting the atoms in a list, then performing dehydrogenize, and finally looking up the list positions of the atoms that remain in the dehydrogenized molecule.
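The two facts above can be sketched in plain Java as follows. This is not the attached example, only a minimal illustration under stated assumptions: the Atom class, toyDehydrogenize and the hand-built microspecies list are hypothetical stand-ins for the ChemAxon molecule, the dehydrogenize task and the plugin output.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class MicrospeciesMappingDemo {

    /** Hypothetical stand-in for an atom object; this is NOT chemaxon.struc.MolAtom. */
    static final class Atom {
        final String symbol;
        final int charge;

        Atom(String symbol, int charge) {
            this.symbol = symbol;
            this.charge = charge;
        }
    }

    /** Position of the given atom object in the list by reference identity, or -1. */
    static int identityIndexOf(List<Atom> atoms, Atom target) {
        for (int i = 0; i < atoms.size(); i++) {
            if (atoms.get(i) == target) {
                return i;
            }
        }
        return -1;
    }

    /** Toy dehydrogenize (an assumption): removes explicit hydrogens in place. */
    static void toyDehydrogenize(List<Atom> atoms) {
        atoms.removeIf(atom -> atom.symbol.equals("H"));
    }

    /** For each position in the dehydrogenized list, the atom's index in the snapshot. */
    static int[] originalIndices(List<Atom> snapshot, List<Atom> dehydrogenized) {
        int[] result = new int[dehydrogenized.size()];
        for (int pos = 0; pos < dehydrogenized.size(); pos++) {
            result[pos] = identityIndexOf(snapshot, dehydrogenized.get(pos));
        }
        return result;
    }

    public static void main(String[] args) {
        List<Atom> molecule = new ArrayList<>(Arrays.asList(
                new Atom("H", 0), new Atom("N", 0), new Atom("C", 0),
                new Atom("H", 0), new Atom("O", 0)));

        List<Atom> snapshot = new ArrayList<>(molecule);   // collect the atoms first
        toyDehydrogenize(molecule);                        // then dehydrogenize in place
        int[] original = originalIndices(snapshot, molecule); // then look up positions

        // Stand-in microspecies: different atom objects, but in the same order
        // as the dehydrogenized molecule (fact 1).
        List<Atom> microspecies = Arrays.asList(
                new Atom("N", 1), new Atom("C", 0), new Atom("O", -1));

        for (int pos = 0; pos < microspecies.size(); pos++) {
            Atom atom = microspecies.get(pos);
            if (atom.symbol.equals("N") || atom.symbol.equals("O")) {
                System.out.println("original index " + original[pos] + " -> "
                        + atom.symbol + " charge " + atom.charge);
            }
        }
    }
}
```

The position in the dehydrogenized molecule is the only link between the two sets of atom objects, so the sketch breaks down if any other step reorders the atoms.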

The attached example code maps oxygen and nitrogen atom objects in the original molecule to the corresponding atom objects in the microspecies, and outputs the atom object pointer, the atom symbol and the charge.

Run it by