User 677b9c22ff
04-07-2007 08:48:58
Hi,
just some comments for people who import data from different molecular databases and vendor sources such as Daylight, MDL ISIS, PubChem, ZINC, Accelrys, Tripos and so on. [deprecated - see my comments below]
I just saw a very interesting PPT showing that MDL and JChem handle aromatic systems differently. I did not know that. Basically it says that
* MDL® treats 5-membered heterocycles as non-aromatic (Kekulé structure)
* ChemAxon aromatizes any 4n+2 ring system
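To make the 4n+2 (Hückel) electron count concrete, here is a minimal sketch. It only checks the electron count; real aromaticity perception also looks at ring conjugation and differs between tools, and the function name `satisfies_huckel` is made up for this illustration, not part of any ChemAxon or MDL API.

```python
def satisfies_huckel(pi_electrons: int) -> bool:
    """Return True if the pi electron count equals 4n + 2 for some n >= 0."""
    return pi_electrons >= 2 and (pi_electrons - 2) % 4 == 0


# Pi electron counts for a few ring systems (counted by hand, not computed).
# Furan: 4 electrons from the two C=C bonds plus 2 from one oxygen lone pair.
examples = {
    "benzene": 6,
    "furan": 6,
    "cyclobutadiene": 4,
    "cyclooctatetraene": 8,
}

for name, count in examples.items():
    print(name, count, satisfies_huckel(count))
```

Furan passes the count, so a tool that aromatizes any 4n+2 ring will store it in aromatic form, while a tool that keeps 5-membered heterocycles as Kekulé structures will not. The same compound then ends up in two different representations.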
So it is good practice (GLP) to use the Standardizer when importing SDF or SMILES files into Instant-JChem and JChem Base. For my taste it is currently too well hidden within Instant-JChem. I think people should be forced to use it, even at the cost of some usability and annoyance. I only found out that I need the Standardizer after importing several hundred thousand structures and running some expensive calculations on them.
I thought (simple minded as I am) that the "remove duplicate structures" feature always magically works. It does not (of course). If the substances are not canonicalized via InChI, FICuS, uuuuu or any other canonicalizer, the system cannot detect that the compounds are actually the same.
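A minimal sketch of why this happens, assuming duplicates are found by comparing a key per structure. The lookup table `TOY_STANDARD_FORMS` and the functions `toy_standardize` and `remove_duplicates` are hypothetical stand-ins written for this example; they are not how JChem or any real canonicalizer works internally.

```python
from typing import Callable, Iterable, List

# Hypothetical stand-in for a real standardizer: maps each input notation
# to one agreed form. A real tool computes this instead of looking it up.
TOY_STANDARD_FORMS = {
    "C1=CC=CC=C1": "c1ccccc1",  # benzene, Kekulé notation
    "c1ccccc1": "c1ccccc1",     # benzene, aromatic notation
    "C1=COC=C1": "c1ccoc1",     # furan, Kekulé notation
    "c1ccoc1": "c1ccoc1",       # furan, aromatic notation
}


def toy_standardize(smiles: str) -> str:
    """Return the agreed form if known, otherwise the input unchanged."""
    return TOY_STANDARD_FORMS.get(smiles, smiles)


def remove_duplicates(
    structures: Iterable[str],
    key: Callable[[str], str] = lambda s: s,
) -> List[str]:
    """Keep the first structure seen for each distinct key."""
    seen = set()
    unique = []
    for structure in structures:
        k = key(structure)
        if k not in seen:
            seen.add(k)
            unique.append(structure)
    return unique


imported = ["C1=CC=CC=C1", "c1ccccc1", "C1=COC=C1", "c1ccoc1", "CCO"]

# Comparing the raw strings misses both duplicate pairs.
naive = remove_duplicates(imported)

# Comparing standardized forms finds them.
standardized = remove_duplicates(imported, key=toy_standardize)

print(len(naive), len(standardized))
```

Without a standardization step all five records survive; with it only three remain, because the Kekulé and aromatic notations of benzene and furan collapse to the same key.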
So it is very important to right-click, choose New JChem database table, define a Standardizer and then tick the remove duplicate box. It is *not* good practice to just import the SDF or SMILES file if you want to use the duplicate or overlap filter. (I am not quite sure whether the duplicate filter performs an internal canonicalization including aromatization and mesomerization, but this does not matter if the Standardizer is used anyway.)
For aromatic systems this would include:
* Clean
* Dearomatize
* Aromatize
* Mesomerize
Kind regards
Tobias Kind