tautomer canonicalization

User 538416f930

16-02-2006 21:38:15

Is there currently a way to generate a single canonical (standardized) tautomeric form for an input molecule, ideally the most stable one, independent of the tautomeric form of the input structure, i.e. to canonicalize tautomers?





Thanks,





Stephan

ChemAxon d76e6e95eb

17-02-2006 06:55:03

The most stable tautomer calculation is currently under development, so canonical tautomer generation will be available soon.

User 538416f930

17-02-2006 18:48:46

Thanks. This will be very valuable! -S.

User d83ec9d6e4

15-11-2006 16:13:21

I tried to find the canonical tautomer of abacavir so that I can collapse all of my abacavir records into one compound. I am using the latest 3.2 release. I notice that several forms of this molecule do not canonicalize to a single structure. I have found a workaround for this particular example, but I worry that it does not generalize.





Initially, I just wanted to use cxcalc on the command line:





C:\>cxcalc "NC1=NC(NC2CC2)=C3N=CN([C@@H]4C[C@H](CO)C=C4)C3=N1" canonicaltautomer
id structure
1 OC[C@H]1C[C@H](C=C1)n2cnc3c(NC4CC4)[nH]c(=N)nc23

C:\>cxcalc "OC[C@H]1C[C@H](C=C1)n2cnc3c(NC4CC4)[nH]c(=N)nc23" canonicaltautomer
id structure
1 Nc1nc2n(cnc2c(=NC3CC3)[nH]1)[C@@H]4C[C@H](CO)C=C4

C:\>cxcalc "Nc1nc2n(cnc2c(=NC3CC3)[nH]1)[C@@H]4C[C@H](CO)C=C4" canonicaltautomer
id structure
1 OC[C@H]1C[C@H](C=C1)n2cnc3c2[nH]c(=N)[nH]c3=NC4CC4





You'll notice in particular that #2 and #3 take the previous run's output as their input, and each returns yet another tautomer instead of a common structure.
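The symptom in these runs can be phrased as two checks: a canonicalizer should be idempotent (running it on its own output changes nothing), and all forms of one compound should map to the same output. Below is a minimal Java sketch of both checks. The class and method names are made up for illustration, and observedRuns() is only a stand-in that replays the three outputs listed above; it performs no chemistry and does not call cxcalc.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.UnaryOperator;

class CanonicalCheck {
    // Inputs and outputs copied from the three cxcalc runs in this post.
    static final String IN1 = "NC1=NC(NC2CC2)=C3N=CN([C@@H]4C[C@H](CO)C=C4)C3=N1";
    static final String OUT1 = "OC[C@H]1C[C@H](C=C1)n2cnc3c(NC4CC4)[nH]c(=N)nc23";
    static final String OUT2 = "Nc1nc2n(cnc2c(=NC3CC3)[nH]1)[C@@H]4C[C@H](CO)C=C4";
    static final String OUT3 = "OC[C@H]1C[C@H](C=C1)n2cnc3c2[nH]c(=N)[nH]c3=NC4CC4";

    // Hypothetical stand-in for the real tool: a lookup table of the observed results.
    static UnaryOperator<String> observedRuns() {
        Map<String, String> seen = Map.of(IN1, OUT1, OUT1, OUT2, OUT2, OUT3);
        return s -> seen.getOrDefault(s, s);
    }

    // True if canonicalizing the canonical form returns the same string again.
    static boolean isIdempotent(UnaryOperator<String> canon, String input) {
        String once = canon.apply(input);
        return once.equals(canon.apply(once));
    }

    // Collects the distinct outputs for inputs that should all be one compound.
    static Set<String> distinctOutputs(UnaryOperator<String> canon, List<String> inputs) {
        Set<String> outputs = new LinkedHashSet<>();
        for (String s : inputs) {
            outputs.add(canon.apply(s));
        }
        return outputs;
    }

    public static void main(String[] args) {
        UnaryOperator<String> canon = observedRuns();
        // prints "idempotent: false"
        System.out.println("idempotent: " + isIdempotent(canon, IN1));
        // prints "distinct outputs: 3"; a working canonicalizer would give 1
        System.out.println("distinct outputs: "
                + distinctOutputs(canon, List.of(IN1, OUT1, OUT2)).size());
    }
}
```

Swapping observedRuns() for a wrapper around the real canonicalization call would turn this into a regression check for a set of records known to be the same compound.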





So instead, I tried using the tautomerPlugin directly and then picking the lexicographically lowest SMILES string among all the tautomer forms of each molecule. This seems to give me one unique structure for all of the tautomer forms above. I didn't set any of the parameters for the plugin. I thought setting 'canonical' or 'dominant' would help, but I get the same problem as above. So I went the other route of enumerating all tautomer forms and comparing strings.





if (tautomerPlugin == null) {
    tautomerPlugin = new TautomerizationPlugin();

    // set plugin params (all commented out, so the defaults apply)
    Properties params = new Properties();
    //params.setProperty("single", "true");
    //params.setProperty("pH", "7.4");
    //params.setProperty("type", "structure");
    //params.setProperty("max", "1");
    //params.setProperty("canonical", "true");
    //params.setProperty("dominants", "true");
    tautomerPlugin.setParameters(params);
}

tautomerPlugin.setMolecule(mol);
tautomerPlugin.run();

if (tautomerPlugin.getStructureCount() > 0) {
    // start with the first tautomer as the current best
    Molecule taut = tautomerPlugin.getStructure(0);
    taut.aromatize();
    String canonical = taut.toFormat("smiles:u");
    int best = 0;

    // keep the tautomer whose "smiles:u" string sorts lowest
    for (int i = 1; i < tautomerPlugin.getStructureCount(); i++) {
        taut = tautomerPlugin.getStructure(i);
        taut.aromatize();
        String smiles = taut.toFormat("smiles:u");
        if (smiles.compareTo(canonical) < 0) {
            canonical = smiles;
            best = i;
        }
    }
    return tautomerPlugin.getStructure(best);
}

// no tautomers were generated: fall back to the input molecule
return mol;
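Stripped of the plugin calls, the selection step above is just "take the entry that sorts lowest under String.compareTo". The sketch below isolates that step with plain strings so it runs without any chemistry library. The class and method names are made up for illustration, and the two SMILES (2-hydroxypyridine and 2-pyridone) are hand-written examples, not output of the plugin. The result is independent of the order of the list, but it is only a canonical form if every input tautomer enumerates to the same set of strings, which this thread has not verified in general.

```java
import java.util.List;

class SmallestForm {
    // Returns the index of the lexicographically smallest string;
    // on ties the first occurrence wins.
    static int indexOfSmallest(List<String> forms) {
        if (forms.isEmpty()) {
            throw new IllegalArgumentException("no forms given");
        }
        int best = 0;
        for (int i = 1; i < forms.size(); i++) {
            if (forms.get(i).compareTo(forms.get(best)) < 0) {
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Same two forms, listed in opposite order.
        List<String> a = List.of("Oc1ccccn1", "O=c1cccc[nH]1");
        List<String> b = List.of("O=c1cccc[nH]1", "Oc1ccccn1");
        // Both lines print O=c1cccc[nH]1 because '=' sorts before 'c'.
        System.out.println(a.get(indexOfSmallest(a)));
        System.out.println(b.get(indexOfSmallest(b)));
    }
}
```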

User 851ac690a0

16-11-2006 13:49:37

Hi,





I see you use JChem 3.2. A new version of JChem (3.2.1) can be expected soon.








The Marvin 4.1.3 release (which will be included with JChem 3.2.1) is available in the test section:


http://www.chemaxon.com/test/marvin


This new test version was released two days ago.








At this site you can try the canonicalization of abacavir. (I tried, and it works well.)











Your method for finding the canonical form looks interesting! I will check it more thoroughly and inform you of the results.





Thanks.


Jozsi

User d83ec9d6e4

16-11-2006 15:17:57

Thanks for the update. I'll wait for the newest release.