User 73531e86ff
02-12-2010 10:50:21
We are having an issue with molfiles stored in our database that contain protein labels in the connection table instead of explicit atoms.
This is possibly a non-standard molfile since the specification doesn't mention that this can be done. However, Marvin and other tools seem happy to import and display these files.
Our database also stores a SMILES string for each compound, and that is the column indexed for searching. Therefore, when we need to search for a structure, we use JChem to convert the molfile into SMILES. Unfortunately, JChem doesn't automatically expand the groups, so the protein labels from the molfile are represented in the SMILES by a star (*) and the structure is not found.
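To illustrate the symptom, here is a minimal sketch of how a converted SMILES could be checked for leftover star atoms before it is used as a search query. This is plain Python, not JChem code; the function names and the example strings are hypothetical and only assume that an unexpanded label shows up as a `*` in the SMILES, as described above.

```python
def count_wildcards(smiles: str) -> int:
    """Count '*' wildcard atoms in a SMILES string.

    Assumption: '*' only ever appears in SMILES as the wildcard atom
    (bare or inside brackets), so a plain character count is enough.
    """
    return smiles.count("*")


def has_unexpanded_label(smiles: str) -> bool:
    """Return True if the SMILES still contains at least one wildcard atom."""
    return count_wildcards(smiles) > 0


# Hypothetical examples: a fragment with a label left as '*',
# and the same fragment with every atom written out explicitly.
converted = "NCC(=O)*"
explicit = "NCC(=O)O"

for s in (converted, explicit):
    status = "contains unexpanded label" if has_unexpanded_label(s) else "fully explicit"
    print(f"{s}: {status}")
```

A check like this only detects the problem so that the query can be flagged; it does not expand the label into atoms.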
If we retrieve the SMILES from the database and then search with it, the search works fine because the SMILES explicitly specifies all the atoms in the protein. However, users mostly retrieve the molfile since it contains the preferred coordinates for rendering.
Without changing *all* the protein molfiles in our database, is there a way for JChem or the Standardizer to convert these molfiles into the corresponding SMILES? I did try the "Alias to Atom" and "Expand Group" Standardizer functions, but they didn't work for the example molfile I have.