Unique SMILES and atom map numbers

User 6ef33138f9

04-01-2007 22:40:28

Hello,





In some cases, generating a unique SMILES for a molecule with atom map numbers produces a different result than generating it for the same molecule without atom map numbers. I'm using Marvin 4.1.2.





Code:

      // This is the original SMILES that Marvin thinks is unique
      String originalSmiles =
            "[CH3:1][S:2][C:25]1=[C:24]([NH2:15])[C:23]2=[C:22]([N:21]([CH2:13][CH3:14])[CH:20]=[C:19]"
            + "([C:17]([OH:16])=[O:18])[C:27]2=[O:28])[C:10]([CH:11]=[CH2:12])=[C:26]1[N:6]3[CH2:5][CH:4]([CH3:3])[NH:9][CH2:8][CH2:7]3";
      Molecule mol = MolImporter.importMol(originalSmiles);
      String uniqueSmilesWithAtomMapNumbers = mol.toFormat("smiles:u-a");
      assertEquals(originalSmiles, uniqueSmilesWithAtomMapNumbers);  // OK

      // Take the SMILES above and remove the atom map numbers (and brackets) by hand
      String uniqueSmilesWithAtomMapsRemovedByHand = "CSC1=C(N)C2=C(N(CC)C=C(C(O)=O)C2=O)C(C=C)=C1N3CC(C)NCC3";

      // Remove atom map numbers and generate another unique SMILES
      for (MolAtom atom : mol.getAtomArray())
         atom.setAtomMap(0);
      String uniqueSmilesFromMoleculeWithoutAtomMaps = mol.toFormat("smiles:u-a");

      assertEquals(uniqueSmilesWithAtomMapsRemovedByHand, uniqueSmilesFromMoleculeWithoutAtomMaps);  // FAILS!
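
The "remove by hand" step in the test above can also be done at string level. The following is only a minimal sketch and not part of Marvin: `AtomMapStripper` and `stripAtomMaps` are hypothetical names. It handles only simple bracket atoms of the form `[C:25]` or `[CH3:1]`, and it assumes that the hydrogen count inside the brackets equals the default implicit count (as it does in the SMILES above); otherwise dropping the brackets would change the structure. Charged, isotopic, chiral, and aromatic (lowercase) atoms are left untouched.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class AtomMapStripper {
    // Elements that SMILES allows to be written without brackets (the "organic subset").
    private static final Set<String> ORGANIC = new HashSet<>(
            Arrays.asList("B", "C", "N", "O", "P", "S", "F", "Cl", "Br", "I"));

    // Matches simple mapped bracket atoms only: element symbol, optional H count,
    // mandatory atom map number. Anything with a charge, isotope or chirality
    // mark does not match and is therefore left as it is.
    private static final Pattern MAPPED_ATOM =
            Pattern.compile("\\[([A-Z][a-z]?)(?:H\\d*)?:\\d+\\]");

    // ASSUMPTION: the bracket H count equals the default implicit H count,
    // so writing the bare element symbol describes the same atom.
    static String stripAtomMaps(String smiles) {
        Matcher m = MAPPED_ATOM.matcher(smiles);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String element = m.group(1);
            // Atoms outside the organic subset must stay bracketed; keep them unchanged.
            String replacement = ORGANIC.contains(element) ? element : m.group();
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

Applied to `originalSmiles`, this should reproduce the hand-edited string `uniqueSmilesWithAtomMapsRemovedByHand`. It only rewrites the text and does not re-canonicalize anything, so it preserves the atom order of the mapped SMILES.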








This seems similar to a problem my colleague reported last year:


http://chemaxon.com/forum/ftopic732.html#2730





Thanks,


Chris

User f359e526a1

05-01-2007 07:37:05

Hello, adding maps can indeed change the atom order in the SMILES. Unique SMILES relies on graph invariants, and atom maps are one of the atomic parameters we consider when calculating them. In this sense, a map is treated as extra information, similar to charge or bond order. Why exactly do you want to make sure the two strings are equal?
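
To illustrate the effect described here, the toy sketch below ranks atoms by an initial invariant with and without the map number. It is not Marvin's algorithm: `InvariantDemo`, `Atom`, `rank`, and `example` are hypothetical names, and real canonicalization refines such invariants iteratively. The only point is that two atoms that tie without maps can be ordered differently once the map number becomes part of the invariant.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class InvariantDemo {
    /** Toy atom: input index, element symbol, neighbour count, atom map number. */
    static final class Atom {
        final int index;
        final String element;
        final int degree;
        final int map;

        Atom(int index, String element, int degree, int map) {
            this.index = index;
            this.element = element;
            this.degree = degree;
            this.map = map;
        }
    }

    /** Returns atom indices sorted by the invariant; tied atoms keep their input order. */
    static List<Integer> rank(List<Atom> atoms, boolean useMaps) {
        Comparator<Atom> byInvariant = Comparator
                .comparing((Atom a) -> a.element)
                .thenComparingInt(a -> a.degree);
        if (useMaps) {
            // The map number becomes one more component of the invariant.
            byInvariant = byInvariant.thenComparingInt(a -> a.map);
        }
        List<Atom> sorted = new ArrayList<>(atoms);
        sorted.sort(byInvariant); // stable sort: ties are not reordered
        List<Integer> order = new ArrayList<>();
        for (Atom a : sorted) {
            order.add(a.index);
        }
        return order;
    }

    /** Chain C-C-C whose two terminal carbons differ only in their map numbers. */
    static List<Atom> example() {
        List<Atom> atoms = new ArrayList<>();
        atoms.add(new Atom(0, "C", 1, 3));
        atoms.add(new Atom(1, "C", 2, 1));
        atoms.add(new Atom(2, "C", 1, 2));
        return atoms;
    }
}
```

In this example the two terminal carbons tie when maps are ignored, so atom 0 is ranked first; with maps included, atom 2 (the lower map number) moves ahead of it. A traversal that starts from the first-ranked atom would therefore write the atoms in a different order.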

User 6ef33138f9

05-01-2007 17:58:05

Really? I'm surprised that atom maps are included in the invariants, since they're just extra tags or metadata and not part of the chemical structure. Also, it's not documented on your page or in the original Daylight paper:


http://www.chemaxon.com/jchem/marvin/doc/user/smiles-doc.html





So, the short answer to your question of why I want the two strings to be equal is that I think they should be equal. (-:





The long answer to your question is: We have code that creates a molecule (with atom map numbers) from various fragments/ligands. We need the atom map numbers to keep track of where the atoms in the final molecule came from (i.e. from which fragment). However, what we want to save is the unique SMILES without atom map numbers, so it's really unique (since the same molecule could be created in different ways from different fragments). So, we first get the unique SMILES with atom map numbers. Then we record information about the fragments' atoms, in terms of indices into the unique SMILES. Then we save the "real" unique SMILES without atom map numbers. This worked in Marvin 3.5.7, which is what we've been using so far.





Thanks,


Chris

User f359e526a1

05-01-2007 21:04:10

OK, I was afraid you wanted to check for duplicates in a database or something like that.





Meanwhile, I checked the code. Apparently there is a workaround to get rid of the mappings, and it will likely be in the next minor release.

User 6ef33138f9

08-01-2007 15:22:09

Thank you. Can you tell me more about the workaround and when it will be available? We'd like to upgrade from Marvin 3.5.7 to the latest 4.x version, but this issue is blocking us currently.





Thanks,


Chris

User f359e526a1

08-01-2007 16:15:14

Unfortunately, the 4.1.5 release is imminent, so the fix can only be included in 4.1.6, which is about a month away. If you need it badly, we can make an alpha pre-release of 4.1.6 during the week. It will not have the other bug fixes that are due in 4.1.6, but this fix will be there.

User 6ef33138f9

08-01-2007 20:34:11

It would be great if we could get an alpha version with the fix. That way we can continue our testing with that version to make sure there are no other issues for us, and then upgrade to 4.1.6 when it's released later. Please let me know where I can download the alpha when it's available.





Thank you,


Chris

User 6ef33138f9

17-01-2007 15:56:42

Hello,





Were you able to create an alpha version with this fix?





Thanks,


Chris

ChemAxon 7c2d26e5cf

17-01-2007 17:12:41

The pre-release of Marvin 4.1.6 is available here:


http://www.chemaxon.com/test/marvin

User f359e526a1

17-01-2007 19:31:27

JChem is also available with this Marvin version at http://www.chemaxon.com/download.php?d=/data/download/jchem/test