Incorrect Standardizer Configuration Script?

User bc9a7e94b9

30-04-2009 21:29:55

Hi,


 


I wanted to standardize a set of molecules such that salts were considered duplicates and that all (de)protonated species were converted to their neutral form. I also wanted to aromatize the molecules.


In addition, as I eventually intend to compute descriptors for my data set (including those based upon substructures), and for a PubChem data set, I decided to incorporate the transformations recommended here:


http://www.chemaxon.com/forum/viewpost16681.html&highlight=neutralize+transform#16681

This was achieved as follows:

standardize -c final-stand.xml 2008LiData.sdf -f sdf -o 2008LiData_stand.sdf

On viewing the molecules, I notice that this does not appear to have handled the N+ radicals, for example, that were present in my original SDF file.

Should this have happened, or have I simply misunderstood the function of Neutralize?

Is there anything obviously wrong with my script? I tried to remove all transformations which I thought were inconsistent with those carried out previously.

Following this, I tried to calculate the following set of descriptors (N.B. the input file is the same as the output file above, save for the addition of some more fields using a Python script):

cxcalc -S -o 2008LiData_stand_IC50_ID_act_CAdesc.sdf pka -t basic pka -t acidic logp mass wienerindex acceptorcount donorcount 2008LiData_stand_IC50_ID_act.sdf >> CxcalcErrors.txt 2>&1

This failed to calculate the pka, logp, acceptorcount or donorcount descriptors for entries 195 and 405.

When I used an identical configuration script, save for the Neutralize statement, I was able to calculate all relevant descriptors without any exceptions being reported.

I'm using JChem 5.1.3, on a Windows XP machine (32 bit, Intel Pentium 4).

I attach all relevant files.

Thanks in advance for any assistance.

ChemAxon e08c317633

04-05-2009 14:50:22










RichardMR wrote:

I notice (upon viewing the molecules) that this does not appear to have taken care of N+ radicals, for example, that were present in my original SDF file.

Should this have happened, or have I simply misunderstood the function of Neutralize?



The "neutralize" action does not neutralize quaternary amines, and a tertiary amine with a radical, for example, is treated as a quaternary amine. I think that is what is happening here.
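To find such records before standardizing, the input SDF can be scanned for nitrogen atoms that carry a positive charge or a radical flag. The following is only a minimal sketch in plain Python, not a ChemAxon tool. It assumes V2000 molfile records with charges and radicals stored in "M  CHG" and "M  RAD" property lines (charges written only in the atom block are not detected), and the function names are made up for illustration.

```python
def split_sdf(text):
    """Split SDF text into records, each a list of lines without the '$$$$' line."""
    records, current = [], []
    for line in text.splitlines():
        if line.strip() == "$$$$":
            records.append(current)
            current = []
        else:
            current.append(line)
    if any(l.strip() for l in current):  # tolerate a missing final delimiter
        records.append(current)
    return records


def flagged_nitrogens(record):
    """Return sorted 1-based indices of N atoms with a positive charge or a radical.

    Assumption: V2000 layout, i.e. counts line is the 4th line and the
    element symbol occupies columns 32-34 of each atom line.
    """
    n_atoms = int(record[3][0:3])
    symbols = [record[4 + i][31:34].strip() for i in range(n_atoms)]
    flagged = set()
    for line in record:
        is_charge = line.startswith("M  CHG")
        is_radical = line.startswith("M  RAD")
        if not (is_charge or is_radical):
            continue
        fields = line[6:].split()
        pairs = fields[1:]  # fields[0] is the number of (atom, value) pairs
        for atom, value in zip(pairs[0::2], pairs[1::2]):
            hit = int(value) > 0 if is_charge else int(value) != 0
            if hit and symbols[int(atom) - 1] == "N":
                flagged.add(int(atom))
    return sorted(flagged)


def report(text):
    """Return (1-based record number, title line, flagged atom indices) per hit."""
    hits = []
    for number, record in enumerate(split_sdf(text), start=1):
        atoms = flagged_nitrogens(record)
        if atoms:
            hits.append((number, record[0].strip(), atoms))
    return hits
```

Calling report() on the contents of 2008LiData.sdf would list the record numbers worth inspecting by hand. Whether a flagged nitrogen is really treated as quaternary by Neutralize still has to be checked in the structure itself.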











RichardMR wrote:


Is there anything obviously wrong with my script? I tried to remove all transformations which I thought were inconsistent with those carried out previously.

Following this, I tried to calculate the following set of descriptors (N.B. the input file is the same as the output file above, save for the addition of some more fields using a Python script):

cxcalc -S -o 2008LiData_stand_IC50_ID_act_CAdesc.sdf pka -t basic pka -t acidic logp mass wienerindex acceptorcount donorcount 2008LiData_stand_IC50_ID_act.sdf >> CxcalcErrors.txt 2>&1

This failed to calculate the pka, logp, acceptorcount or donorcount descriptors for entries 195 and 405.

When I used an identical configuration script, save for the Neutralize statement, I was able to calculate all relevant descriptors without any exceptions being reported.




The 195th and 405th molecules in 2008LiData_stand_IC50_ID_act.sdf have valence errors. Transforms can easily cause valence errors, so please check the transformations in your Standardizer configuration file.
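To look at just those two structures, the records can be pulled out of the SDF file by position. This is a rough sketch in plain Python, assuming that "entry 195" means the 195th record counted from 1 and that records end with the usual "$$$$" line. The function names are illustrative only.

```python
def sdf_records(text):
    """Yield (1-based index, record text) for each '$$$$'-terminated record."""
    index = 0
    current = []
    for line in text.splitlines():
        current.append(line)
        if line.strip() == "$$$$":
            index += 1
            yield index, "\n".join(current) + "\n"
            current = []


def pick_records(text, wanted):
    """Return {index: record text} for the requested 1-based record numbers."""
    wanted = set(wanted)
    return {i: rec for i, rec in sdf_records(text) if i in wanted}
```

For example, pick_records(open("2008LiData_stand_IC50_ID_act.sdf").read(), [195, 405]) returns the two records as text, which can then be written to a small file and opened in a structure viewer.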


I think the file 2008LiData_stand_IC50_ID_act.sdf was created using a different Standardizer configuration than the attached final-stand.xml, because I don't get any errors if I standardize 2008LiData.sdf using the final-stand.xml configuration and then calculate the listed properties with cxcalc:


 


standardize -c final-stand.xml 2008LiData.sdf -f sdf | cxcalc pka -t basic pka -t acidic logp mass wienerindex acceptorcount donorcount >result.txt


I attached result.txt, which includes results for molecules 195 and 405.
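If the check needs repeating for several configuration files, the same pipe could be driven from a script. The sketch below only wraps the exact command line shown above with Python's subprocess module. It assumes standardize and cxcalc are on the PATH; on Windows, where these are batch wrappers, passing shell=True to Popen may be necessary. The function names are made up for illustration.

```python
import subprocess

# cxcalc arguments copied from the command line in this thread.
CXCALC_ARGS = ["pka", "-t", "basic", "pka", "-t", "acidic", "logp",
               "mass", "wienerindex", "acceptorcount", "donorcount"]


def build_commands(config, sdf_in):
    """Return the argument lists for the standardize and cxcalc steps."""
    standardize_cmd = ["standardize", "-c", config, sdf_in, "-f", "sdf"]
    cxcalc_cmd = ["cxcalc"] + CXCALC_ARGS
    return standardize_cmd, cxcalc_cmd


def run_pipeline(config, sdf_in, result_path):
    """Pipe standardize output into cxcalc and write the table to result_path.

    Returns the exit codes of (standardize, cxcalc).
    """
    standardize_cmd, cxcalc_cmd = build_commands(config, sdf_in)
    with open(result_path, "w") as out:
        p1 = subprocess.Popen(standardize_cmd, stdout=subprocess.PIPE)
        p2 = subprocess.Popen(cxcalc_cmd, stdin=p1.stdout, stdout=out)
        p1.stdout.close()  # let standardize see a closed pipe if cxcalc exits early
        rc2 = p2.wait()
        rc1 = p1.wait()
    return rc1, rc2
```

Calling run_pipeline("final-stand.xml", "2008LiData.sdf", "result.txt") corresponds to the command above. Keeping the configuration file name as an explicit argument makes it harder to mix up which configuration produced which output file.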


Zsolt

User bc9a7e94b9

05-05-2009 09:53:17

Dear Zsolt,


 


Thank you for taking the time to look at this and clarify the expected behaviour of Neutralize vis-a-vis N+ radicals.


 


I believe I made a typo at some point, which meant that the configuration file I sent you was not the same as the one I originally used. After I manually corrected some of the original SMILES strings (to remove N+ radicals), running the posted configuration script and cxcalc as specified did not generate any exceptions.


 


Sorry for inadvertently wasting your time.


Regards,


 


Richard