Regenerate chemical database with 3.0.10

User dfeb81947d

31-03-2005 13:39:37

Hi everybody,





I'm regenerating a chemical database that was created with an older version, using JChem 3.0.10 (with classes12.jar from Oracle 9.2).


I repeatedly get the following warning:
jchem3.0.10 wrote:
WARNING: smarts import: ambiguous meaning of "H" in [H;v1].
"H" is now interpreted as Hydrogen atom, not total H count!
To avoid this warning message, please use "H1" for hydrogen count and
"#1" for H atom. Or to force H count in this situation, please use import option
"d".
How can I use the import option "d" with JChemManager?

And how can I use it through the API, when importing with UpdateHandler?





Thank you in advance for your answer.


Best Regards





Jacques

ChemAxon 9c0afc9aaf

01-04-2005 12:56:52

Hi,








You have imported structures that contained "[H;v1]".


This notation is not valid in SMILES.





(It is valid in SMARTS, but SMARTS is for queries, so such structures cannot be stored in the table.)





Do you know where these structures came from, or do you have the original input file?





Best regards,





Szilard

User dfeb81947d

01-04-2005 15:42:42

Hi,





Yes, for sure: the molfile contains the string [H,v1]


(see mol1.mol)


How can I force the conversion of [H,v1] into H using the API (UpdateHandler)?





Thank you for your help


Best regards,

ChemAxon 9c0afc9aaf

01-04-2005 20:01:19

Hi Jacques,





You can use Standardizer to convert the files before importing them again into an empty table. For example:





standardize -c "[H;v1]>>[H]" < input.mol > output.mol





Of course, this requires that you still have all the structure files that were imported into the database.

If you don't, you can still export the structures to .sdf or .mrv format, and re-import that file after running the standardizer.





Best regards,





Szilard

User dfeb81947d

04-04-2005 09:28:28

Hi,





Thank you very much for your reply.


What if I use the Standardizer class in the API?


Code:
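protected static String readMolecule(byte[] mol) throws MolFormatException, SearchException {
   // Load the Standardizer actions from the XML configuration file.
   Standardizer standardizer = new Standardizer(new File("standardizer.xml"));
   // Parse the input bytes, standardize the molecule, and write it back out as a molfile.
   return standardizer.standardize(new MolHandler(mol).getMolecule()).toFormat("mol");
}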






I wrote the XML configuration file standardizer.xml as follows:


Code:
<?xml version="1.0" encoding="UTF-8"?>
<!-- Standardizer configuration file -->
<StandardizerConfiguration Version="0.1" schemaLocation="standardize.xsd">
    <Actions>
        <Action ID="aromatize" Act="aromatize"/>
        <Action ID="dehydrogenize" Act="dehydrogenize" Optional="true"/>
        <Clean ID="cleanIfNeeded"/>
    </Actions>
</StandardizerConfiguration>



Is that enough?


What do I need to add to convert the [H;v1] into H?





Do I need a licence key for that operation?





Kind Regards,


Jacques

ChemAxon fb166edcbd

04-04-2005 17:55:00

You can use Standardizer for this purpose but it is not necessary: if you simply call


Code:
MoleculeGraph.aromatize()



http://www.chemaxon.com/marvin/doc/api/chemaxon/struc/MoleculeGraph.html#aromatize()


and
Code:
MoleculeGraph.hydrogenize(false)



http://www.chemaxon.com/marvin/doc/api/chemaxon/struc/MoleculeGraph.html#hydrogenize(boolean)


then that does the same thing.


If your original molecules are cleaned then there is no need for another clean() since you do not add any atoms or change bonds between existing atoms, but you can also call


Code:
MoleculeGraph.clean(2, null)
to clean your molecules in 2D:


http://www.chemaxon.com/marvin/doc/api/chemaxon/struc/MoleculeGraph.html#clean(int,java.lang.String)





If you use Standardizer, you do not need a license for the aromatize and dehydrogenize actions, but you do need one if you also want to run clean through Standardizer. Without a license key you can process 2000 clean actions.





You can also use a simple action string instead of a configuration XML if you only have a couple of simple tasks with default settings. For example:


Code:



standardize -c "aromatize..dehydrogenize" mol1.mol -f mol -o mol2.mol





will aromatize and dehydrogenize your molecule.


You can also use this string format in the API:


http://www.chemaxon.com/jchem/doc/api/chemaxon/reaction/Standardizer.html#Standardizer(java.lang.String)

ChemAxon fb166edcbd

04-04-2005 21:04:49

One more remark on our license policy:


without a valid license key Standardizer can only be used for evaluation and not for production.

User dfeb81947d

06-04-2005 10:49:48

Thank you very much for all your explanations.

I prefer using the API to standardize the molecules, so that I can include this step in the workflow of a Java application.


Thanks again.

ChemAxon 9c0afc9aaf

07-04-2005 12:14:25

Hi,





Some additional thoughts:





- Aromatization is not necessary to remove these hydrogens.

Please ask your chemists; they usually prefer the original Kekule form.





- You can control the implicitization of hydrogens more precisely with MoleculeGraph.implicitizeHydrogens:


http://www.chemaxon.com/marvin/doc/api/chemaxon/struc/MoleculeGraph.html#implicitizeHydrogens(int)





- If you do not want to remove any hydrogens, you can just clear the valence property without changing anything else (recommended solution):


Code:



        // Walk over all atoms and pick out the hydrogens whose valence is 1.
        int count = mol.getAtomCount();
        for (int x = 0; x < count; x++) {
            MolAtom atom = mol.getAtom(x);
            if (atom.getAtno() == 1 && atom.getValence() == 1) {
                // clearing the valence property:
                atom.setValenceProp(-1);
            }
        }





- As we have probably mentioned, these molecules do not cause any trouble apart from flooding the console with these warnings.

(The search works properly even without altering these structures.)





Best regards,





Szilard

User d6c1b7eb8c

18-08-2009 09:12:16

As Jacques asked, could you please explain how to use the import option "d", and which class in the API it refers to? I have the same problem, but I do not wish to modify or standardise the original database; I simply wish to get rid of the console warnings.

Thank you

ChemAxon a3d59b832c

18-08-2009 20:56:38

Hi Benjamin,


The warning comes from the SMARTS import module, and the mentioned "d" option relates to the same module. See more at the SMILES/SMARTS format documentation:


http://www.chemaxon.com/marvin/help/formats/smiles-doc.html#ioptions


The parameter belongs to the following class:


http://www.chemaxon.com/marvin/help/developer/beans/api/chemaxon/formats/MolImporter.html


(The input format can be specified together with the format-specific options, like "smarts:d" )


 


However, the database import or regeneration modules do not expose this option externally.


 


To understand the situation better, I have a few questions:

1. In what context exactly do you get the warning?

2. What is the atom expression that contains the ambiguous H symbol? (It should be quoted in the warning.)

3. Does your database really contain SMARTS strings (that is, is this a "query" or "any" type table/index), or is the warning caused by some oddly formatted SMILES strings?



Thanks,


Szabolcs

User d6c1b7eb8c

19-08-2009 09:35:31

Thanks very much, that's sorted my problem.

I was getting the warning when reading a SMARTS database line by line as strings and passing each line as a SMARTS query to MolHandler(smarts, True). It was occurring on [O,Sv2;H] because of its Daylight-style usage of H.
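For anyone who wants to find out in advance which lines of a SMARTS file are likely to trigger this warning, here is a rough sketch in plain Java (no ChemAxon classes). It is only a heuristic and not the rule the SMARTS importer itself applies: the class name AmbiguousHScan and the method findAmbiguous are made up for this illustration. It flags bracket atoms where "H" stands alone at the start of the expression or directly after a logical operator, without a following digit. It will also report a plain [H], and it does not handle recursive SMARTS with nested brackets.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of any ChemAxon API.
final class AmbiguousHScan {
    // One bracket atom expression without nested brackets, e.g. [H;v1] or [O,Sv2;H].
    private static final Pattern BRACKET = Pattern.compile("\\[([^\\[\\]]*)\\]");

    // Heuristic (an assumption, not the importer's real rule): an "H" at the start
    // of the expression or right after a logical operator (; , & !), not followed
    // by a digit (explicit H count such as H1) or a lowercase letter (Hg, He, ...).
    private static final Pattern BARE_H = Pattern.compile("(?:^|[;,&!])H(?![0-9a-z])");

    // Returns every bracket atom in the SMARTS string that contains a bare "H".
    static List<String> findAmbiguous(String smarts) {
        List<String> hits = new ArrayList<>();
        Matcher m = BRACKET.matcher(smarts);
        while (m.find()) {
            if (BARE_H.matcher(m.group(1)).find()) {
                hits.add(m.group());
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        String[] samples = {"[H;v1]", "[O,Sv2;H]", "C[H1]", "[#1]C", "[CH3][Hg]"};
        for (String s : samples) {
            System.out.println(s + " -> " + findAmbiguous(s));
        }
    }
}
```

With these samples, the two expressions quoted in this thread ([H;v1] and [O,Sv2;H]) are reported, while the forms the warning recommends ("H1" and "#1") and the ordinary [CH3] and [Hg] atoms are not.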


I've now fixed it by importing with Daylight compatibility, using:


MI = MolImporter()

mol = MI.importMol(smarts, 'smarts:d')  # SMARTS string first, then the format with the "d" option

smarts = MolHandler(mol).toFormat('smarts')


Thanks,
Ben

ChemAxon a3d59b832c

19-08-2009 11:30:54

Hi Ben,


Thanks for getting back. I am glad that the problem is resolved now.


Best regards,
Szabolcs