H loss during the Daylight aromatization

User d68ef9d5a9

18-08-2005 19:05:10

Hi,

I am inserting this molecule into my database. We need to store the Daylight-aromatized structure as a mol file in the cd_structure column of the structure table. The import file is an SD file containing the structure in Kekulé form. This is the basic code for the insertion.

MolHandler mh = new MolHandler(structureString); // SD file attached
// mh.addHydrogensToAromaticHeteroAtoms(); // this would not help at all
Molecule mole = mh.getMolecule();
mole.dearomatize();
System.out.print(mole.toFormat("smiles") + "; "); // correctly "N1C=CC2=CC=CC=C12"
mole.aromatize(MoleculeGraph.AROM_DAYLIGHT);
System.out.println(mole.toFormat("smiles")); // correctly "c1ccc2[nH]ccc2c1"
// this.uh.setInputMolecule(mole); // see my comments below
this.uh.setValuesForFixColumns(101, mole.toFormat("mol"));
this.uh.setValueForAdditionalColumn(1, new Integer(6001), Types.INTEGER);
this.uh.execute();

If you now look at the cd_smiles field in the structure table, it has become “c1ccc2nccc2c1”. This SMILES is incorrect according to our discussion in one of my previous posts. The difference between “c1ccc2[nH]ccc2c1” and “c1ccc2nccc2c1” is one hydrogen atom, i.e. about 1 unit of molecular weight.
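As a quick check of that figure, here is a minimal sketch in plain Java (no JChem calls). It assumes the composition C8H7N for “c1ccc2[nH]ccc2c1” and one hydrogen fewer for the stored SMILES, and it uses standard average atomic weights rounded to three decimals. The class and method names are made up for this illustration.

```java
import java.util.Map;

class HydrogenMassDifference {
    // Standard average atomic weights in g/mol, rounded to three decimals.
    static final Map<String, Double> ATOMIC_WEIGHT =
            Map.of("C", 12.011, "H", 1.008, "N", 14.007);

    // Sum the atomic weights for a composition given as element -> atom count.
    static double weight(Map<String, Integer> composition) {
        double total = 0.0;
        for (Map.Entry<String, Integer> e : composition.entrySet()) {
            total += ATOMIC_WEIGHT.get(e.getKey()) * e.getValue();
        }
        return total;
    }

    public static void main(String[] args) {
        // Assumed compositions (hydrogens counted by hand from the SMILES).
        Map<String, Integer> withH = Map.of("C", 8, "H", 7, "N", 1);    // c1ccc2[nH]ccc2c1
        Map<String, Integer> withoutH = Map.of("C", 8, "H", 6, "N", 1); // c1ccc2nccc2c1
        System.out.printf("difference: %.3f%n", weight(withH) - weight(withoutH));
    }
}
```

The difference comes out at roughly 1.008, the weight of the single hydrogen that is missing from the aromatic N.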

Yes, this problem can be resolved by passing the molecule object to the UpdateHandler (see the commented-out code). That way the SMILES (“c1ccc2[nH]ccc2c1”) is generated correctly, but cd_structure then contains a mol file that is not consistent with cd_smiles. It looks as if the fingerprints in the table are created from cd_smiles, not cd_structure, when the molecule is set in the UpdateHandler (I can tell this from certain searches I have tried).

The real problem is the inconsistency between cd_structure and cd_smiles. Although this situation is better than losing one H, it still creates a problem in structure display. To preserve the structure orientation our chemists require, we display structures in Marvin or MarvinSketch from the mol file instead of the SMILES. Therefore, if a user copies a Daylight-aromatized structure in Marvin, pastes it into another interface, and searches for it, the search will not hit the right structure unless the user deliberately adds an H to the aromatic N. The direct interpretation of the Daylight-aromatized structure as SMILES is “c1ccc2nccc2c1”, which does not equal the cd_smiles.

I think the real problem here is Daylight aromatization at the mol file level. What is needed is a function that restores the H on Daylight-aromatic heteroatoms at the mol file level whenever aromatization is called, and this information has to be retained regardless of how the user wants the structure displayed.

I am using JChem base 3.0.14 in testing this. Let me know if I did not explain the problem clearly.

Ben Li

Neurogen Corporation

ChemAxon a3d59b832c

19-08-2005 07:52:12

Hi Ben,

We have found a solution to your problem. We will store the implicit hydrogen count in the molfile as data attached to the atom. (This is also called a data sgroup; support for it has just been released in Marvin 4.0.) When importing the molfile, Marvin/JChem will recognize these special data attachments and convert them back to implicit H. I will let you know when this implementation is ready.
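The round trip described here can be pictured with a toy model. The sketch below is plain Java and does not use any Marvin or JChem class: the Atom type, the method names, and the map used as "attached data" are all assumptions made for illustration, and the real implementation in the molfile IO code may look quite different.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class ImplicitHRoundTrip {
    /** Toy atom record; not a Marvin/JChem class. */
    static final class Atom {
        final String symbol;
        final boolean aromatic;
        final int implicitH;

        Atom(String symbol, boolean aromatic, int implicitH) {
            this.symbol = symbol;
            this.aromatic = aromatic;
            this.implicitH = implicitH;
        }

        boolean isAromaticHeteroatom() {
            return aromatic && !symbol.equals("C");
        }
    }

    /** Export step: remember the H count of every aromatic heteroatom that carries one. */
    static Map<Integer, Integer> exportAttachedData(List<Atom> atoms) {
        Map<Integer, Integer> data = new TreeMap<>();
        for (int i = 0; i < atoms.size(); i++) {
            Atom a = atoms.get(i);
            if (a.isAromaticHeteroatom() && a.implicitH > 0) {
                data.put(i, a.implicitH);
            }
        }
        return data;
    }

    /** Import step: aromatic heteroatoms get 0 H unless the attached data restores a count. */
    static List<Atom> importAtoms(List<Atom> stored, Map<Integer, Integer> data) {
        List<Atom> result = new ArrayList<>();
        for (int i = 0; i < stored.size(); i++) {
            Atom a = stored.get(i);
            int h = a.isAromaticHeteroatom() ? data.getOrDefault(i, 0) : a.implicitH;
            result.add(new Atom(a.symbol, a.aromatic, h));
        }
        return result;
    }

    public static void main(String[] args) {
        // Only two atoms of the ring are modelled: an aromatic CH and the [nH] nitrogen.
        List<Atom> atoms = List.of(new Atom("C", true, 1), new Atom("N", true, 1));
        Map<Integer, Integer> data = exportAttachedData(atoms);
        System.out.println(importAtoms(atoms, Map.of()).get(1).implicitH); // H lost without the data
        System.out.println(importAtoms(atoms, data).get(1).implicitH);     // H restored with it
    }
}
```

The point of the sketch is only that the hydrogen count travels with the atom as extra data, so a reader that understands the data can restore it, while a reader that ignores the data sees the structure as before.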

Best regards,

Szabolcs

User d68ef9d5a9

23-08-2005 13:12:17

Hi Szabolcs,

In what time frame do you think this problem can be fixed? We are anxiously waiting for the solution you have proposed.

I appreciate your help and time.

Ben Li

ChemAxon a3d59b832c

23-08-2005 13:25:40

Hi Ben,

I have already implemented the above-mentioned change, but it introduced some undesired side effects. Once these are resolved, we will issue a new release for you. (We are currently in the final phases of testing JChem 3.1, but because of these problems we could not include the new feature in that version.) I expect you will get a working version in a matter of weeks.

Best regards,

Szabolcs

User d68ef9d5a9

25-08-2005 15:28:26

Hi Szabolcs,

I appreciate your help and effort. This is critical to our company because our management decided to use the Daylight aromatization representation for our compounds. The current implementation in our company leaves certain holes in the database, which may cause undesired behavior. This is quite worrisome to our group, so I hope your solution can be available as early as possible so that we can fill these holes soon.

Best regards,

Ben Li

ChemAxon a3d59b832c

26-08-2005 12:55:15

Ben,

I fixed the remaining issues, so the new feature is ready to be released. It will appear in JChem version 3.1.1.

If you would like to get an alpha release before that, we can prepare it for you.

Best regards,

Szabolcs

User d68ef9d5a9

26-08-2005 13:24:29

Thank you, Szabolcs.

While I am waiting for 3.1.1, I would definitely like to have the alpha version to test in my development environment.

Do you know when 3.1.1 will be available?

Anyway, I appreciate your help.

Ben

ChemAxon 9c0afc9aaf

26-08-2005 16:28:14

Hi Ben,

We expect 3.1.1 late next week, or the week after.

I will prepare a test version sooner though, probably Monday.

Best regards,

Szilard

ChemAxon 9c0afc9aaf

29-08-2005 18:09:26

Hi,

The latest test version is out, and available for download:

http://www.chemaxon.com/download.php?d=/data/download/jchem/test

Best regards,

Szilard

ChemAxon a3d59b832c

12-09-2005 12:08:43

Ben,

JChem 3.1.1 is out, but it is still using Marvin 4.0.1, which does not include the solution for the implicit H problem. We are currently working on stabilizing the next Marvin release, and a JChem version using it will follow soon.

Did you have a chance to check the solution in the test version?

Please do not hesitate to contact us if you need any help.

Best regards,

Szabolcs

User d68ef9d5a9

12-09-2005 14:31:33

Hi Szabolcs,

Thank you for the information. I wonder whether jchem.jar already contains the implicit H function. We absolutely need both working, but if Marvin is not available, I may be able to work around it if jchem.jar has this function.

I have tested some implicit H scenarios with the test version, and I have seen that this information is inserted into the molfile. However, I am still checking whether ISIS accepts this kind of notation.

Best regards,

Ben Li

ChemAxon a3d59b832c

12-09-2005 20:56:04

Hi Ben,

benli wrote:

Thank you for the information. I wonder whether jchem.jar already contains the implicit H function. We absolutely need both working, but if Marvin is not available, I may be able to work around it if jchem.jar has this function.

Unfortunately, this functionality is missing from jchem.jar as well. (jchem.jar always contains the classes of a released version of Marvin. The new feature is implemented in the IO classes, which belong to Marvin. We will release Marvin and then JChem ASAP.)

benli wrote:

I have tested some implicit H scenarios with the test version, and I have seen that this information is inserted into the molfile. However, I am still checking whether ISIS accepts this kind of notation.

ISIS should be able to read the data sgroups, but they will not be converted to hydrogens. They will be seen as labels IMPL_H<n>, where n is the number of implicit hydrogens on the particular nitrogen atom.
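For tools that only see these labels as text, a small helper could recover the hydrogen count from a label string. The following is a sketch in plain Java under that assumption. It parses only the IMPL_H<n> label text mentioned above and knows nothing about how the data sgroup is laid out in the molfile; the class and method names are hypothetical.

```java
import java.util.OptionalInt;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ImplicitHLabel {
    // Label text as described above: "IMPL_H" followed by the hydrogen count.
    // The count is limited to two digits here, which is an arbitrary choice for this sketch.
    private static final Pattern LABEL = Pattern.compile("IMPL_H(\\d{1,2})");

    // Returns the hydrogen count, or an empty value if the text is not such a label.
    static OptionalInt parse(String label) {
        Matcher m = LABEL.matcher(label.trim());
        return m.matches()
                ? OptionalInt.of(Integer.parseInt(m.group(1)))
                : OptionalInt.empty();
    }

    public static void main(String[] args) {
        System.out.println(parse("IMPL_H1"));          // a label carrying one hydrogen
        System.out.println(parse("some other label")); // not an implicit H label
    }
}
```

Returning an empty value for anything that does not match keeps unrelated data labels from being mistaken for hydrogen counts.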

Best regards,

Szabolcs

User d68ef9d5a9

15-09-2005 13:19:33

Hi Szabolcs,

I understand, but we really need the Daylight aromatization issue resolved in the new version as soon as possible. The molfiles in our database are causing trouble in our research activities.

I appreciate your help and hard work.

Ben Li

ChemAxon a3d59b832c

05-11-2005 15:33:13

JChem 3.1.2 is out and contains the above-mentioned fix. Did it work OK in the debug version we sent you a few weeks ago?

All the best,

Szabolcs