clean function

ChemAxon 60ee1f1328

14-06-2005 13:48:25

Hello,





I would like to learn more about the molecule clean function in Marvin.





Firstly, when I clean a molecule I see that I always obtain a particular absolute stereochemistry. Can this structure be considered the lowest energy conformer? Or is this simply a single random absolute diastereoisomer?





If my latter assumption is correct, can I use/call clean in order to generate all the diastereoisomers for a given "unclean" molecule?


(I am happy that I can complete this directly after a single clean but would like to know if it is possible or going to be possible)





Can you outline the difference between clean 2D and 3D and how I can invoke my choice of either?

ChemAxon 25dcd765a3

21-06-2005 18:49:01

Quote:
Can this structure be considered the lowest energy conformer?
In 2D we cannot speak about energy. It is just a "nice" arrangement of the atoms in the molecule. It has no connection to the 3D coordinates of the atoms.
Quote:
Can you outline the difference between clean 2D and 3D and how I can invoke my choice of either?
Clean 2D arranges the molecule in 2D, whereas Clean 3D tries to find a minimum-energy geometry based on the atom coordinates.

2D is used by chemists to show molecules on a sheet of paper.

3D is used for calculation.
Quote:
can I use/call clean in order to generate all the diastereoisomers
In 3D cleaning yes, but not in 2D.





all the best


Andras

ChemAxon d76e6e95eb

22-06-2005 16:53:34

We can develop a new plugin able to generate all stereoisomers of a given molecule. Then you can generate the conformers of any of the stereoisomers. Will this solution help you?

ChemAxon 60ee1f1328

23-06-2005 07:54:42

Hello,





I've just finished writing a Java class that resolves an input molecule into all its diastereoisomers in 2D. It relies on the clean 3D function to identify the chiral centres and then determines all combinations, so don't worry about providing this as a plug-in for now, thanks.
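The "determine all combinations" step can be sketched without any toolkit. What follows is only a minimal illustration, assuming the chiral centres are already marked with @ or @@ in the SMILES string. The class and method names (StereoEnumerator, enumerate) are hypothetical, and nothing here uses the Marvin API. Every marker is flipped independently, so n marked centres give 2^n strings, enantiomers included, with no attempt to detect meso forms or duplicates.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: plain string manipulation, not a ChemAxon class.
final class StereoEnumerator {
    // Matches the tetrahedral markers "@" and "@@" (the longer form is tried first).
    private static final Pattern MARK = Pattern.compile("@@?");

    /** Returns every @/@@ combination of the marked centres in the given SMILES. */
    static List<String> enumerate(String smiles) {
        // Split the SMILES into the text pieces that surround each marker.
        List<String> pieces = new ArrayList<>();
        Matcher m = MARK.matcher(smiles);
        int last = 0;
        while (m.find()) {
            pieces.add(smiles.substring(last, m.start()));
            last = m.end();
        }
        pieces.add(smiles.substring(last));

        int centres = pieces.size() - 1; // assumed small (fewer than 31 centres)
        List<String> result = new ArrayList<>();
        // Bit i of the mask decides whether centre i is written as "@" (0) or "@@" (1).
        for (int mask = 0; mask < (1 << centres); mask++) {
            StringBuilder sb = new StringBuilder(pieces.get(0));
            for (int i = 0; i < centres; i++) {
                sb.append(((mask >> i) & 1) == 0 ? "@" : "@@");
                sb.append(pieces.get(i + 1));
            }
            result.add(sb.toString());
        }
        return result;
    }

    public static void main(String[] args) {
        for (String s : enumerate("C[C@H]([C@H](C)[C@](C)(N)O)c1:c:c:c:c:c1")) {
            System.out.println(s);
        }
    }
}
```

Each resulting string can then be passed through a 3D clean separately, since the 3D step is what turns the stereo flags into distinct coordinates.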





It would be useful to know how to access all the 3D co-ordinate information most efficiently from an instantiated molecule, maybe I copy to a clob and then read sequentially from the clob?





Cheers,


Daniel.

ChemAxon 60ee1f1328

23-06-2005 10:41:06

OK - so I've noticed now that for all my diastereoisomers I have the same underlying co-ordinates in 2D even though all the chiral centres are flagged differently!





So in fact, if you can advise me how to take my 2D SMILES and generate the 3D co-ordinate data on a per-diastereoisomer basis, I would be much obliged. Maybe this is the calculator plug-in you referred to previously?

ChemAxon 60ee1f1328

23-06-2005 10:52:16

Hello again,





It appears that each of my diastereoisomers requires a further clean in 3D in order to generate the relevant set of 3D co-ordinates





i.e. I am seeing different 3D co-ordinates for each of my diastereoisomers after a further clean 3D step (directly after the 2D resolve)?


Can you confirm?





Thanks for your help!

ChemAxon 9c0afc9aaf

23-06-2005 11:35:21

Hi,





Some quick comments (my colleagues will correct me if I'm wrong)





1. We do not really understand the question about clob. You are working inside a Java program, right ? (not using Oracle Cartridge, etc.)





2. In general you do not need to clean in 2D before you clean in 3D.





3. Clean3D will use the stereo information stored in the structure, so the 3D coordinates will differ even if cleaning in 2D gave back the same coordinates with another wedge direction (e.g. F[C@@H](Cl)Br and F[C@H](Cl)Br).





Best regards,





Szilard

ChemAxon 60ee1f1328

23-06-2005 12:42:10

1) Sorry ignore my clob reference I am working in Java and just want to extract the 3D co-ordinate information from the molecule object. So indication of how this could be achieved would be helpful (Java data type and Molecule.methods).





2) I don't think I am cleaning in 2D before 3D. I am only using Clean 3D to identify chiral centres and then have my own code to resolve in 2D.





3) "Clean3D will use the stereo information stored in the structure" - great this is exactly what I want and referred to this in my last note. I'm not sure I understand the rest of your statement.





Can you confirm correlation between the following co-ordinates (generated using Clean 3D) with the SMILES string, which represent 3 separate diastereoisomers?


If so, I think I am a very happy chappy!?





1) C[C@H]([C@H](C)[C@](C)(N)O)c1:c:c:c:c:c1

-0.6933 -0.6718 -1.9398 C 0 0 0 0 0 0 0 0 0 0 0 0
0.0367 -0.5071 -0.5759 C 0 0 2 0 0 0 0 0 0 0 0 0
-0.8683 -0.7205 0.7230 C 0 0 2 0 0 0 0 0 0 0 0 0
0.0031 -0.5233 2.0094 C 0 0 0 0 0 0 0 0 0 0 0 0
-1.6905 -2.0766 0.7916 C 0 0 2 0 0 0 0 0 0 0 0 0
-2.6541 -2.2719 1.9884 C 0 0 0 0 0 0 0 0 0 0 0 0
-0.8107 -3.2517 0.7478 N 0 0 0 0 0 0 0 0 0 0 0 0
-2.5834 -2.2011 -0.3084 O 0 0 0 0 0 0 0 0 0 0 0 0
0.8000 0.8066 -0.6028 C 0 0 0 0 0 0 0 0 0 0 0 0
2.2456 0.7868 -0.6133 C 0 0 0 0 0 0 0 0 0 0 0 0
2.9923 2.0044 -0.5827 C 0 0 0 0 0 0 0 0 0 0 0 0
2.2705 3.2384 -0.5431 C 0 0 0 0 0 0 0 0 0 0 0 0
0.8419 3.3067 -0.5336 C 0 0 0 0 0 0 0 0 0 0 0 0
0.1101 2.0812 -0.5607 C 0 0 0 0 0 0 0 0 0 0 0 0

2) C[C@H]([C@H](C)[C@@](C)(N)O)c1:c:c:c:c:c1

-0.6752 -0.6703 -1.9470 C 0 0 0 0 0 0 0 0 0 0 0 0
0.0411 -0.5090 -0.5752 C 0 0 2 0 0 0 0 0 0 0 0 0
-0.8706 -0.7200 0.7179 C 0 0 2 0 0 0 0 0 0 0 0 0
0.0027 -0.5305 2.0046 C 0 0 0 0 0 0 0 0 0 0 0 0
-1.6942 -2.0773 0.7919 C 0 0 1 0 0 0 0 0 0 0 0 0
-2.6436 -2.2787 1.9987 C 0 0 0 0 0 0 0 0 0 0 0 0
-2.6315 -2.2262 -0.3244 N 0 0 0 0 0 0 0 0 0 0 0 0
-0.8226 -3.2037 0.7450 O 0 0 0 0 0 0 0 0 0 0 0 0
0.8057 0.8055 -0.5997 C 0 0 0 0 0 0 0 0 0 0 0 0
2.2514 0.7858 -0.6096 C 0 0 0 0 0 0 0 0 0 0 0 0
2.9978 2.0022 -0.5785 C 0 0 0 0 0 0 0 0 0 0 0 0
2.2758 3.2373 -0.5384 C 0 0 0 0 0 0 0 0 0 0 0 0
0.8475 3.3051 -0.5288 C 0 0 0 0 0 0 0 0 0 0 0 0
0.1158 2.0797 -0.5565 C 0 0 0 0 0 0 0 0 0 0 0 0

3) C[C@H]([C@@H](C)[C@](C)(N)O)c1:c:c:c:c:c1

-1.2108 -1.0936 -2.3365 C 0 0 0 0 0 0 0 0 0 0 0 0
-0.4713 -0.9368 -0.9659 C 0 0 2 0 0 0 0 0 0 0 0 0
-1.3436 -1.1318 0.3601 C 0 0 1 0 0 0 0 0 0 0 0 0
-2.1897 -2.4521 0.3327 C 0 0 0 0 0 0 0 0 0 0 0 0
-0.4974 -1.0428 1.7051 C 0 0 2 0 0 0 0 0 0 0 0 0
-1.2665 -1.0855 3.0474 C 0 0 0 0 0 0 0 0 0 0 0 0
0.2257 0.2209 1.8604 N 0 0 0 0 0 0 0 0 0 0 0 0
0.4926 -2.0675 1.7475 O 0 0 0 0 0 0 0 0 0 0 0 0
0.2965 0.3696 -0.9979 C 0 0 0 0 0 0 0 0 0 0 0 0
1.7420 0.3458 -1.0050 C 0 0 0 0 0 0 0 0 0 0 0 0
2.4925 1.5605 -0.9582 C 0 0 0 0 0 0 0 0 0 0 0 0
1.7742 2.7965 -0.9123 C 0 0 0 0 0 0 0 0 0 0 0 0
0.3458 2.8695 -0.9173 C 0 0 0 0 0 0 0 0 0 0 0 0
-0.3899 1.6472 -0.9601 C 0 0 0 0 0 0 0 0 0 0 0 0

ChemAxon 25dcd765a3

23-06-2005 14:40:41

1)
Quote:
So indication of how this could be achieved would be helpful
Code:



Molecule m = ....
double[][] coordinates = new double[m.getAtomCount()][];
for (int i=0; i<m.getAtomCount(); i++){
    double[] xyz = new double[3];
    MolAtom a = m.getAtom(i);
    xyz[0] = a.getX();
    xyz[1] = a.getY();
    xyz[2] = a.getZ();
    coordinates[i] = xyz;
}






2)


You can use m.getChirality(i) to see whether the i-th atom has chirality or not,
and change the chirality if you want.








All the best


Andras

ChemAxon 25dcd765a3

23-06-2005 16:37:59

So if you want to generate diastereomers you don't even have to clean the molecule.


Just select the atoms whose chirality you would like to change and change their parity.


The main difference between chirality and parity is that parity uses just the local neighbours of the center. You can work with it as a local chirality.


For more information about parity see ctfile.pdf from http://www.mdli.com/downloads/public/ctfile/ctfile.jsp.





Andras

User 65315e6b18

23-06-2005 16:56:04

inhibox wrote:
Can you confirm correlation between the following co-ordinates (generated using Clean 3D) with the SMILES string, which represent 3 separate diastereoisomers?


If so, I think I am a very happy chappy!?


If you use the "S{nofaulty}" or "S{fine}" option string, then you can be sure that if you get a valid structure (no error message and the coordinates differ from 0.000), it corresponds to your input. Otherwise, Clean3D might give the wrong chirality if it cannot build the structure with the requirements.
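One way to check for yourself that two coordinate sets really differ in handedness at a given centre is to compare the sign of the signed volume spanned by three of its neighbours. The following is a minimal sketch in plain Java, not part of the Marvin API (ChiralVolume, signedVolume, set1 and set2 are hypothetical names). It uses the coordinates of the C(C)(N)O centre (atom 5) and its neighbours atom 3 (C), atom 7 (N) and atom 8 (O) from coordinate sets 1) and 2) posted earlier in this thread.

```java
// Hypothetical helper: plain geometry, not a ChemAxon class.
final class ChiralVolume {

    /** Triple product (a-centre) . ((b-centre) x (c-centre)); its sign flips under mirroring. */
    static double signedVolume(double[] centre, double[] a, double[] b, double[] c) {
        double[] u = sub(a, centre);
        double[] v = sub(b, centre);
        double[] w = sub(c, centre);
        return u[0] * (v[1] * w[2] - v[2] * w[1])
             + u[1] * (v[2] * w[0] - v[0] * w[2])
             + u[2] * (v[0] * w[1] - v[1] * w[0]);
    }

    private static double[] sub(double[] p, double[] q) {
        return new double[] {p[0] - q[0], p[1] - q[1], p[2] - q[2]};
    }

    /** Atom 5 with neighbours C (atom 3), N, O from set 1), SMILES ...[C@](C)(N)O... */
    static double set1() {
        return signedVolume(
            new double[] {-1.6905, -2.0766, 0.7916},
            new double[] {-0.8683, -0.7205, 0.7230},
            new double[] {-0.8107, -3.2517, 0.7478},
            new double[] {-2.5834, -2.2011, -0.3084});
    }

    /** The same atoms taken from set 2), SMILES ...[C@@](C)(N)O... */
    static double set2() {
        return signedVolume(
            new double[] {-1.6942, -2.0773, 0.7919},
            new double[] {-0.8706, -0.7200, 0.7179},
            new double[] {-2.6315, -2.2262, -0.3244},
            new double[] {-0.8226, -3.2037, 0.7450});
    }

    public static void main(String[] args) {
        System.out.println("set 1: " + set1());
        System.out.println("set 2: " + set2());
    }
}
```

For these two sets the values come out with opposite signs, which is consistent with the [C@] versus [C@@] difference in the two SMILES strings. The sign alone does not give an R/S label; it only shows whether the same three neighbours are arranged in mirror-image fashion.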

ChemAxon 60ee1f1328

24-06-2005 10:05:29

Thanks for all your comments so far:





OK, so I take the explicit SMILES string:


C[C@H]([C@H](C)[C@](C)(N)O)c1:c:c:c:c:c1





I copy it into MarvinSketch, Clean in 3D (fine) and then save the result as an sdf file and in that file I get the following 3D co-ordinate information for each atom:





-0.6933 -0.6718 -1.9398 C 0 0 0 0 0 0 0 0 0 0 0 0
0.0367 -0.5071 -0.5759 C 0 0 2 0 0 0 0 0 0 0 0 0
-0.8683 -0.7205 0.7230 C 0 0 2 0 0 0 0 0 0 0 0 0
0.0031 -0.5233 2.0094 C 0 0 0 0 0 0 0 0 0 0 0 0
-1.6905 -2.0766 0.7916 C 0 0 2 0 0 0 0 0 0 0 0 0
-2.6541 -2.2719 1.9884 C 0 0 0 0 0 0 0 0 0 0 0 0
-0.8107 -3.2517 0.7478 N 0 0 0 0 0 0 0 0 0 0 0 0
-2.5834 -2.2011 -0.3084 O 0 0 0 0 0 0 0 0 0 0 0 0
0.8000 0.8066 -0.6028 C 0 0 0 0 0 0 0 0 0 0 0 0
2.2456 0.7868 -0.6133 C 0 0 0 0 0 0 0 0 0 0 0 0
2.9923 2.0044 -0.5827 C 0 0 0 0 0 0 0 0 0 0 0 0
2.2705 3.2384 -0.5431 C 0 0 0 0 0 0 0 0 0 0 0 0
0.8419 3.3067 -0.5336 C 0 0 0 0 0 0 0 0 0 0 0 0
0.1101 2.0812 -0.5607 C 0 0 0 0 0 0 0 0 0 0 0 0





Alternatively, I have implemented your suggested code as follows
(plus added the Molecule instantiation and clean):





public static double[][] Smile3DCoOrds(String pSMILE)
{
    double[][] coordinates = null;
    try
    {
        MolHandler molhandler = new MolHandler(pSMILE);
        Molecule molecule = molhandler.getMolecule();

        // Clean in 3D for current molecule to generate 3D co-ordinates
        molecule.clean(3, null, null);

        coordinates = new double[molecule.getAtomCount()][];
        for (int i = 0; i < molecule.getAtomCount(); i++)
        {
            double[] xyz = new double[3];
            MolAtom atom = molecule.getAtom(i);
            xyz[0] = atom.getX();
            xyz[1] = atom.getY();
            xyz[2] = atom.getZ();
            // Store this atom's x, y, z in row i
            coordinates[i] = xyz;
        }
    }
    catch (chemaxon.formats.MolFormatException e) {System.err.println("Exception: " + e.getMessage());}
    return coordinates;
}





and find the returned co-ordinates to be as follows:





-0.9190037757024326 -0.44544233581618564 1.8336252040002332
-0.5882866631818193 -0.1056277493820057 0.35355616309437354
0.05325462385459692 -1.2915906824510948 -0.4910048931547616
0.36344030494780966 -0.809423302912829 -1.942964120074158
-0.6824272124090076 -2.693637402058817 -0.499846240626573
0.01819382424771933 -3.9147889410413095 -1.1405807491111726
-2.0251390327300767 -2.633741166274911 -1.0895042098970817
-0.8722677299781303 -3.224572493181758 0.8058402243053578
0.11237130376868945 1.2403708195316459 0.3670464420642852
1.5457386170232952 1.3296811647373046 0.5468423227823658
2.218558519234817 2.5895950423418514 0.5393711634610115
1.4187353616260425 3.7610802271994013 0.35666663689312583
-1.982767673340824E-4 3.737210555845434 0.1795729695882757
-0.6429698639341699 2.4608862634632724 0.18137908667471772




I had expected them to be the same, but they are quite different?





So I am not sure what each set actually represents and which set


most accurately reflects the input SMILES string.





I would tend to assume that the first set is more correct, so how can I reproduce in code what happens when I save my Molecule object to an sdf file? Note: if I don't use the clean step I get all 0 for xyz. Also, could you please expand on "S{nofaulty}" and "S{fine}"? Are these clean options?





Cheers!


Daniel.

User 65315e6b18

24-06-2005 15:48:21

inhibox wrote:



Alternatively I have implemented your suggested code as follows


(plus added the Molecule instantiation and clean):










...


Hi Daniel,





I suggest using





...


molecule.clean (3,"S{fine}",null);


...





in your code to reproduce the results of Cleaning via "Fine" in the GUI.





Best wishes,





Ödön

ChemAxon 60ee1f1328

27-06-2005 09:35:55

Thanks for the tip.


As a result of including the fine option, you can see the data is now sufficiently close to the Marvin output to be considered effectively the same, which is great news. Any comments as to why there is a slight apparent difference in the data generated by Marvin and chemaxon.struc.* molecule clean 3D?


Thanks to all for the helpful comments.


Daniel.





-0.6711475282672117 -0.7035128661350979 -1.9544691800018694
0.028984026241118843 -0.4971720670219365 -0.5816078577574647
-0.8463660585071333 -0.721134845150119 0.7263298257309404
0.011418058265422637 -0.5043199333585086 2.015256573280843
-1.681042908494619 -2.0594487832488513 0.8146911948731156
-2.673882930563363 -2.2694346637924756 1.981310352980771
-0.8278284617501732 -3.252767515290087 0.7628508431534876
-2.595551522779221 -2.1923497630914417 -0.2658636839887707
0.7953573606063009 0.8077718674923933 -0.6229688092392194
2.2397287114632096 0.7821595263506397 -0.6228615720845174
2.995960485529115 1.9923118673991196 -0.5808792468981122
2.270439369761874 3.224934884568398 -0.5411525429370995
0.8423574792157545 3.308849022488354 -0.5455733251425579
0.11157391927892613 2.084113268789612 -0.5850625719695478

User 65315e6b18

27-06-2005 10:55:21

inhibox wrote:
Thanks for the tip.


As a result of including the fine option, you can see the data is now sufficiently close to the Marvin output to be considered effectively the same, which is great news. Any comments as to why there is a slight apparent difference in the data generated by Marvin and chemaxon.struc.* molecule clean 3D?


Thanks to all for the helpful comments.


Daniel.





Hi Daniel,





Please try pasting your SMILES string into the MSketch GUI, apply Clean/3D and then use Edit/Source (Format/MDL mol or another structure format), and you will see a practically exact match with your latest coordinate set. Exporting/Saving from the GUI might alter the orientation but should keep the internal structure of the molecule intact.





Best wishes,





Ödön
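Because saving from the GUI may change the orientation, comparing raw x, y, z values line by line can be misleading. A comparison that ignores orientation can be made on interatomic distances instead, since these do not change when the whole molecule is rotated or shifted. The following is a minimal sketch in plain Java, not part of the Marvin API (DistanceCompare and maxDistanceDeviation are hypothetical names). It assumes both coordinate sets list the atoms in the same order, and for brevity it uses only the first four atoms of the SDF set and of the S{fine} set posted in this thread.

```java
// Hypothetical helper: plain geometry, not a ChemAxon class.
final class DistanceCompare {

    // First four atoms of the set saved from MarvinSketch (sdf file).
    static final double[][] SDF_FIRST4 = {
        {-0.6933, -0.6718, -1.9398},
        {0.0367, -0.5071, -0.5759},
        {-0.8683, -0.7205, 0.7230},
        {0.0031, -0.5233, 2.0094}
    };

    // First four atoms of the set obtained in code with the "S{fine}" option.
    static final double[][] API_FIRST4 = {
        {-0.6711475282672117, -0.7035128661350979, -1.9544691800018694},
        {0.028984026241118843, -0.4971720670219365, -0.5816078577574647},
        {-0.8463660585071333, -0.721134845150119, 0.7263298257309404},
        {0.011418058265422637, -0.5043199333585086, 2.015256573280843}
    };

    static double distance(double[] p, double[] q) {
        double dx = p[0] - q[0], dy = p[1] - q[1], dz = p[2] - q[2];
        return Math.sqrt(dx * dx + dy * dy + dz * dz);
    }

    /** Largest absolute difference between corresponding interatomic distances. */
    static double maxDistanceDeviation(double[][] a, double[][] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("coordinate sets differ in atom count");
        }
        double max = 0.0;
        for (int i = 0; i < a.length; i++) {
            for (int j = i + 1; j < a.length; j++) {
                double diff = Math.abs(distance(a[i], a[j]) - distance(b[i], b[j]));
                max = Math.max(max, diff);
            }
        }
        return max;
    }

    public static void main(String[] args) {
        System.out.println("max deviation: " + maxDistanceDeviation(SDF_FIRST4, API_FIRST4));
    }
}
```

Note that mirror images have identical interatomic distances, so this check says nothing about chirality; it only confirms that the internal geometry matches regardless of how the molecule is oriented.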