Non-Standardized Versions of Molecules Stored

User 7910dcb734

01-09-2014 12:13:17

Hi all,


We've encountered some strange behaviour. We have a ChemAxon table set up with standardization set to remove all but the largest fragment (full standardization settings attached). 


When I insert a molecule + salt, followed by the same molecule without its salt, the second molecule is correctly identified as a duplicate and not inserted. I assume this is because the standardization rules are applied and the salt is removed before the two molecules are compared.


However, if I export the molecule in the database, what comes out is the molecule + salt. I assumed that the standardized version of the imported molecules would be stored, but it appears to be whatever version of the molecule arrives first.


(I notice this behaviour also happens with tautomers/isomers, but it makes sense there - is this related?)


I've attached a salt/salt-free pair if you want to confirm.


Cheers,


Brendan

ChemAxon d4fff15f08

02-09-2014 09:37:01

Hi Brendan,




What you describe is the expected behaviour in JChem. We store your original structure in its original format and use it for export and hit visualization. The standardized structure is stored as well, in the cd_smiles/cd_smarts field of the database. If you want to export the standardized structures too, add the cd_smiles/cd_smarts fields when you start the export, and they will be included in the output file (in SMILES/SMARTS format). Note that when the original structure contains features that cannot be represented in SMILES/SMARTS, the cd_smiles field will contain NULL.
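
To illustrate the point about original versus standardized storage, here is a minimal sketch in Python. It does not use JChem itself: an in-memory SQLite table stands in for the structure table, and the table name "structures", the columns cd_id and cd_structure, the helper export_rows, and the sample values are all assumptions made for the example. Only cd_smiles is taken from the description above. The sketch reads both forms side by side and flags rows where cd_smiles is NULL.

```python
import sqlite3

# Stand-in for a JChem structure table (NOT a real JChem database).
# Assumed for illustration: table name "structures", columns cd_id and
# cd_structure. Only cd_smiles is named in the reply above.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE structures ("
    "cd_id INTEGER PRIMARY KEY, cd_structure TEXT, cd_smiles TEXT)"
)
conn.executemany(
    "INSERT INTO structures (cd_structure, cd_smiles) VALUES (?, ?)",
    [
        # Illustrative row: the salt as submitted, plus a standardized form
        # with the counter-ion removed (largest fragment kept).
        ("CC(=O)[O-].[Na+]", "CC([O-])=O"),
        # Illustrative row: a structure with no SMILES representation,
        # so the standardized column is NULL.
        ("<structure with features SMILES cannot express>", None),
    ],
)


def export_rows(connection):
    """Return each row's original and standardized form (hypothetical helper)."""
    cursor = connection.execute(
        "SELECT cd_id, cd_structure, cd_smiles FROM structures ORDER BY cd_id"
    )
    return [
        {
            "id": cd_id,
            "original": original,
            "standardized": smiles,           # None when cd_smiles is NULL
            "has_standardized": smiles is not None,
        }
        for cd_id, original, smiles in cursor
    ]


for row in export_rows(conn):
    print(row)
```

In a real export the same idea applies: request the cd_smiles/cd_smarts fields alongside the structure, and decide up front how to handle rows where the standardized field is NULL.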




Best regards,


Norbert

User 7910dcb734

02-09-2014 09:38:43

Hi Norbert,


That's exactly the information I was looking for. Glad it's expected behaviour.


Many thanks,


Brendan