Registration of duplicate structures as new lots

User 0f1e393145

29-11-2012 01:40:56

Hey All,

I'm having a bit of trouble registering new lots/batches of duplicate compounds, and I'm not quite sure how to approach the problem. I need to import an SDF file of about 12k compounds, many of which are duplicates. The import needs to be done through the API. When a duplicate structure is encountered, it should not be registered in the structure table but in its related table, which holds data on the lots of that structure.

Say JChem encounters the following structures, top to bottom, while processing the SDF:

Structure X: Registered as CDID=1

Structure Y: Registered as CDID=2

Structure X: Registered as CDID=1-2 in lot table

Structure W: Registered as CDID=3

Structure Z: Registered as CDID=4

Structure X: Registered as CDID=1-3 in lot table
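
To make the numbering I'm after concrete, here is a rough sketch in plain Java. It makes no JChem or IJC calls; the class name LotNumbering and the string keys standing in for structures are made up purely for illustration, and real duplicate detection would of course be done by JChem rather than by string comparison.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only: no JChem/IJC API is used here.
class LotNumbering {
    // Maps a structure key (assumed to identify a unique structure) to its CDID.
    private final Map<String, Integer> cdIdByStructure = new HashMap<>();
    // Lots seen so far per CDID; the first registration counts as lot 1.
    private final Map<Integer, Integer> lotCountByCdId = new HashMap<>();
    private int nextCdId = 1;

    // Returns "N" for a new structure, or "N-k" for the k-th lot of an existing one.
    String register(String structureKey) {
        Integer cdId = cdIdByStructure.get(structureKey);
        if (cdId == null) {
            cdId = nextCdId++;
            cdIdByStructure.put(structureKey, cdId);
            lotCountByCdId.put(cdId, 1);
            return String.valueOf(cdId);
        }
        // Duplicate: bump the lot counter for the existing CDID.
        int lot = lotCountByCdId.merge(cdId, 1, Integer::sum);
        return cdId + "-" + lot;
    }

    // Registers the keys in file order and collects the assigned labels.
    static List<String> registerAll(List<String> structureKeys) {
        LotNumbering numbering = new LotNumbering();
        List<String> labels = new ArrayList<>();
        for (String key : structureKeys) {
            labels.add(numbering.register(key));
        }
        return labels;
    }

    public static void main(String[] args) {
        // Same order as the example above: X, Y, X, W, Z, X
        System.out.println(registerAll(Arrays.asList("X", "Y", "X", "W", "Z", "X")));
    }
}
```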

If anyone can assist, or point me in the right direction it would be greatly appreciated! Thanks!
FYI: I would not be opposed to performing this in JChem Cartridge, but IJC would be preferred.


ChemAxon e189db4705

29-11-2012 02:09:54

Hi Adam,

So I suppose you are writing your own script that does the whole import, right?

I think you can use a mol importer to parse and read the SDF and then insert each molecule into the main structure table. If duplicates are not allowed, you'll get an error return code from the insert operation. You can use DFEntityDataProvider for inserting rows. The insert method returns a DFUpdateInfo object. You can call isDuplicate on this object, and if the molecule is a duplicate, you can get the cd_id of the structure that is already in the table (use the getId method for this).

If this happens, you can insert the molecule into the second JChem table. You can also connect these two tables through a 1:N relationship, so that for each registered molecule you'll see its duplicates from the detail table. Not sure if this is needed. If so, just create a relationship and insert an FK in the detail table whose value is the master cd_id of the duplicate.
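
Roughly, the flow would look like the sketch below. It is plain Java with made-up stand-in classes (MasterTable, InsertResult, LotRow), not the real IJC API, and structures are represented by simple strings. The stand-ins only mirror the idea of "insert, ask whether it was a duplicate, get the existing id"; please check the actual DFEntityDataProvider and DFUpdateInfo signatures in the API docs before relying on any of it.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: none of these classes are part of IJC or JChem.
class DuplicateRouting {

    // Stand-in for the insert result (only the isDuplicate/getId idea is mirrored).
    static final class InsertResult {
        final boolean duplicate;
        final int id; // id of the new row, or of the existing row for a duplicate
        InsertResult(boolean duplicate, int id) {
            this.duplicate = duplicate;
            this.id = id;
        }
    }

    // Stand-in for the master structure table with duplicates disallowed.
    static final class MasterTable {
        private final Map<String, Integer> idByStructure = new HashMap<>();

        InsertResult insert(String structure) {
            Integer existing = idByStructure.get(structure);
            if (existing != null) {
                return new InsertResult(true, existing); // nothing is inserted
            }
            int id = idByStructure.size() + 1;
            idByStructure.put(structure, id);
            return new InsertResult(false, id);
        }

        int size() {
            return idByStructure.size();
        }
    }

    // One row of the detail (lot) table; masterCdId plays the role of the FK.
    static final class LotRow {
        final int masterCdId;
        final String structure;
        LotRow(int masterCdId, String structure) {
            this.masterCdId = masterCdId;
            this.structure = structure;
        }
    }

    // Tries the master table first; duplicates go to the detail table with the FK set.
    static List<LotRow> importAll(List<String> structures, MasterTable master) {
        List<LotRow> lots = new ArrayList<>();
        for (String structure : structures) {
            InsertResult result = master.insert(structure);
            if (result.duplicate) {
                lots.add(new LotRow(result.id, structure));
            }
        }
        return lots;
    }
}
```

With a 1:N relationship defined on that FK, each master row would then show its lots from the detail table.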

Let us know if anything is not clear.