discovering cd_id after importMols

User fa1369adab

07-12-2005 19:08:44

Is there a recommended way to discover the cd_id of the molecule I have just imported using Importer.importMols()? I am only importing a single molecule. I need the cd_id so I can add more information to the database record that JChem has stored the Molecule in.





Raphael

ChemAxon 9c0afc9aaf

07-12-2005 22:36:50

Raphael,





For importing individual molecules I suggest UpdateHandler:





http://www.chemaxon.com/jchem/doc/api/chemaxon/jchem/db/UpdateHandler.html





You can get the cd_id value of the inserted structure by calling execute(true).


Alternatively, you can set values for additional columns before insertion by using setValueForAdditionalColumn.





Please see the usage example in the API documentation.
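
As a rough illustration of the pattern described above (set any extra column values, then call execute(true) to get the new cd_id back), here is a minimal, self-contained sketch. It does not use JChem: StubHandler is a hypothetical in-memory stand-in that only mimics the two calls named in this post. Its constructor, setStructure, the 1-based column index, the "supplier" column and the sample values are all assumptions made for illustration, so check the API documentation linked above for the real signatures.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class CdIdSketch {

    /** Hypothetical stand-in for an insert handler; NOT the real JChem class. */
    static class StubHandler {
        private final List<Map<String, Object>> table = new ArrayList<>();
        private final String[] extraColumns;
        private final Map<String, Object> pending = new HashMap<>();
        private String structure;

        StubHandler(String... extraColumns) {
            this.extraColumns = extraColumns;
        }

        void setStructure(String structure) {
            this.structure = structure;
        }

        /** Column index is 1-based here (an assumption of this sketch). */
        void setValueForAdditionalColumn(int index, Object value) {
            pending.put(extraColumns[index - 1], value);
        }

        /** Stores the pending row; returns its cd_id if requested, else -1. */
        int execute(boolean returnCdId) {
            int cdId = table.size() + 1; // sequential ids starting at 1
            Map<String, Object> row = new HashMap<>(pending);
            row.put("cd_id", cdId);
            row.put("cd_structure", structure);
            table.add(row);
            pending.clear();
            structure = null;
            return returnCdId ? cdId : -1;
        }

        /** Looks up a stored row by cd_id. */
        Map<String, Object> row(int cdId) {
            return table.get(cdId - 1);
        }
    }

    /** Inserts one structure with one extra column value and returns its cd_id. */
    static int insertOne(StubHandler handler, String structure, String supplier) {
        handler.setStructure(structure);
        handler.setValueForAdditionalColumn(1, supplier);
        return handler.execute(true);
    }

    public static void main(String[] args) {
        StubHandler handler = new StubHandler("supplier");
        int cdId = insertOne(handler, "c1ccccc1", "ACME");
        System.out.println("inserted cd_id = " + cdId);
    }
}
```

Setting the extra column values before the insert means the record is complete as soon as it is stored, so a follow-up update keyed on the returned cd_id is only needed for data that is not yet available at insert time.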





Best regards,





Szilard

User 2082812c83

24-04-2006 19:44:15

Szilard,





Using UpdateHandler as you suggest is fine for one molecule, but if you are importing a large number of molecules, it would be good to have a way to get their identifiers back after the import.





This is especially true if, over time, you are importing more data into an existing table, as in such a case you don't have an easy way to distinguish between the molecules that were existing in the table and those that were just added.





Is there a way to get the IDs of the molecules just imported? I could not see anything obvious in the Importer class's interface. Right now, it seems the only way to do this is to re-import all the molecules and then use the getDuplicateIDs call.





When the number of molecules is large, this is going to lead to poor performance.





Thanks,





James

ChemAxon 9c0afc9aaf

24-04-2006 20:24:23

Quote:
"Using UpdateHandler as you suggest is fine for one molecule"

It is also fine for a large number of molecules.


In fact Importer also uses UpdateHandler.


The only trick is that you should not create a new UpdateHandler for each structure.

Quote:
"This is especially true if, over time, you are importing more data into an existing table, as in such a case you don't have an easy way to distinguish between the molecules that were existing in the table and those that were just added."

A simple way to distinguish the structures of different import processes is to add a column that stores an import ID.


Because the import ID is stored in the table with each row, this works even if the import is interrupted by an unexpected problem (e.g. a power failure).


You need UpdateHandler for this too.
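
The two suggestions above (reuse one handler for the whole batch, and tag every row with an import ID) could be combined roughly as in the sketch below. It is self-contained and does not call JChem: StubTable and StubHandler are hypothetical in-memory stand-ins, and the import_id column, the 1-based column index and the "imp-1"/"imp-2" labels are assumptions chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class BatchImportSketch {

    /** One stored row of the hypothetical structure table. */
    static class Row {
        final int cdId;
        final String structure;
        final String importId;

        Row(int cdId, String structure, String importId) {
            this.cdId = cdId;
            this.structure = structure;
            this.importId = importId;
        }
    }

    /** Hypothetical in-memory stand-in for a structure table; NOT JChem. */
    static class StubTable {
        final List<Row> rows = new ArrayList<>();
    }

    /** Hypothetical stand-in for an insert handler; NOT the real JChem class. */
    static class StubHandler implements AutoCloseable {
        private final StubTable table;
        private String structure;
        private String importId;
        private boolean closed;

        StubHandler(StubTable table) {
            this.table = table;
        }

        void setStructure(String structure) {
            this.structure = structure;
        }

        /** This sketch has a single extra column (index 1): import_id. */
        void setValueForAdditionalColumn(int index, Object value) {
            if (index != 1) throw new IllegalArgumentException("only column 1 exists");
            this.importId = String.valueOf(value);
        }

        /** Stores the pending row; returns its cd_id if requested, else -1. */
        int execute(boolean returnCdId) {
            if (closed) throw new IllegalStateException("handler is closed");
            int cdId = table.rows.size() + 1; // sequential ids starting at 1
            table.rows.add(new Row(cdId, structure, importId));
            return returnCdId ? cdId : -1;
        }

        @Override
        public void close() {
            closed = true;
        }
    }

    /** Inserts all structures through ONE handler and collects the new cd_ids. */
    static List<Integer> importBatch(StubTable table, String importId, List<String> structures) {
        List<Integer> ids = new ArrayList<>();
        try (StubHandler handler = new StubHandler(table)) { // created once, reused below
            for (String structure : structures) {
                handler.setStructure(structure);
                handler.setValueForAdditionalColumn(1, importId);
                ids.add(handler.execute(true));
            }
        }
        return ids;
    }

    /** Recovers the cd_ids of one import from the table itself. */
    static List<Integer> idsForImport(StubTable table, String importId) {
        List<Integer> ids = new ArrayList<>();
        for (Row row : table.rows) {
            if (importId.equals(row.importId)) ids.add(row.cdId);
        }
        return ids;
    }

    public static void main(String[] args) {
        StubTable table = new StubTable();
        importBatch(table, "imp-1", Arrays.asList("CCO", "CCN"));
        List<Integer> added = importBatch(table, "imp-2", Arrays.asList("c1ccccc1"));
        System.out.println("collected in memory: " + added);
        System.out.println("recovered from table: " + idsForImport(table, "imp-2"));
    }
}
```

The list returned by importBatch only lives in memory, whereas idsForImport reads the stored column, which is why the import ID still identifies the new rows if the importing process dies partway through.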





If you still prefer to use Importer instead, I can easily implement an option to collect imported IDs in memory for the next JChem release.





Best regards,





Szilard

User 2082812c83

27-04-2006 05:34:40

OK, then I will just use UpdateHandler, as the processing I am performing needs to know whether a molecule is a duplicate, and needs the cd_id of any non-duplicate molecule added to the structure table.





Thanks,





James

ChemAxon 9c0afc9aaf

27-04-2006 09:10:55

OK.





Regardless, I will implement the option in Importer for collecting the imported ID numbers for the next release, as it may be a useful feature anyway.





Szilard