How to bulk download properties from chemicalize.org

User b85af45de7

04-03-2013 04:46:52

Hi,


I am looking for a way to bulk download the calculated properties for all chemicals on the site, so that I can query them from a Perl program that I have written. Is there a way to do this?


Thanks!

ChemAxon 6c76bc6409

04-03-2013 09:39:44

Hi Sujaya,


We don't store the calculation results, so I cannot give this data.


I could, however, give you the molecule database itself (~340k structures) as a SMILES, MRV, or SDF file, and you could perform the calculations and analysis yourself.


Would that help you?


Andras

User b85af45de7

04-03-2013 10:21:56

Hi Andras,


Thanks for the prompt response. Yes, it would help to have the database of molecules in SMILES format, and then I could perform the calculations. Is there a way you could provide this?


Thanks!

ChemAxon 6c76bc6409

05-03-2013 19:17:44

Hi Sujaya,


http://www.chemicalize.org/downloads/chemicalizeorg-structures.smi.zip (2.5MB)


If you come to any interesting conclusions from analysing this dataset, we would really love to hear about it. Also I should mention that if you publish any scientific articles/create presentations based on this data, we would love it if you could cite ChemAxon and chemicalize.org.


https://www.chemaxon.com/forum/ftopic977.html


Enjoy!
Andras 

User b85af45de7

06-03-2013 05:55:00

Thanks for the download link. I see that the downloaded file contains only SMILES strings. Is there an easy way to get the PubChem ID (if it exists) for each of these strings, since that is what I would like to index by?


I will certainly acknowledge chemicalize.org and ChemAxon if we use this data and publish anything.


Thanks again!

ChemAxon 6c76bc6409

06-03-2013 07:50:12

If you're interested in the PubChem IDs, I recommend downloading the chemicalize.org structures from PubChem. We have been a data source on PubChem for a few months now.


http://www.ncbi.nlm.nih.gov/pcsubstance?term=%22954%22%5Bsourcename%5D&cmd=search&db=pcsubstance


BR,
Andras 
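
For readers who already have the SMILES file and only need IDs for a handful of structures, a per-structure lookup is another option. The following is a minimal Python sketch, not something suggested in this thread: it assumes PubChem's PUG REST service and its SMILES-to-CID endpoint, and the function names (cid_lookup_url, fetch_cids) are made up for illustration. Treating a returned CID of 0 as "no match" is also an assumption. For the whole ~340k set, the PubChem download recommended above is likely the better route.

```python
import urllib.parse
import urllib.request

# Assumed PUG REST endpoint: SMILES in, PubChem compound IDs (CIDs) out as plain text.
PUG_REST = "https://pubchem.ncbi.nlm.nih.gov/rest/pug/compound/smiles/cids/TXT"


def cid_lookup_url(smiles):
    """Build the lookup URL, passing the SMILES as a URL-encoded query argument.

    Encoding matters because SMILES may contain characters such as '#', '/'
    and '=' that have a special meaning inside a URL.
    """
    return PUG_REST + "?" + urllib.parse.urlencode({"smiles": smiles})


def fetch_cids(smiles, timeout=30):
    """Return the list of CIDs reported for one SMILES string (may be empty)."""
    with urllib.request.urlopen(cid_lookup_url(smiles), timeout=timeout) as response:
        text = response.read().decode("ascii", errors="replace")
    # Assumption: a CID of 0 signals that no matching compound was found.
    return [int(token) for token in text.split() if token.isdigit() and int(token) > 0]


if __name__ == "__main__":
    # Requires network access; please keep the request rate low.
    print(fetch_cids("CCO"))
```

The resulting IDs can then be written next to each SMILES string and used as the index key.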

User 8e7155957e

04-02-2014 21:18:29

That's wonderful that there's a database (?) with ~340,000 compounds listed.  That .smi file extension has me baffled, though: what is it and how do I open such a file, please?  Thanks.
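
On the .smi question: such a file is plain text and opens in any text editor. By the usual convention each line holds one SMILES string, optionally followed by whitespace and a name or ID. Below is a minimal Python sketch for reading it straight from the downloaded zip. The function names (read_smi, read_smi_zip) are illustrative, and the assumptions that the archive contains a member ending in .smi and that it is UTF-8 encoded have not been checked against the actual file.

```python
import io
import zipfile


def read_smi(lines):
    """Yield (smiles, title) pairs from the lines of a .smi file.

    The title is whatever follows the first run of whitespace,
    or an empty string when the line holds only a SMILES string.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        parts = line.split(None, 1)
        smiles = parts[0]
        title = parts[1] if len(parts) > 1 else ""
        yield smiles, title


def read_smi_zip(path):
    """Yield (smiles, title) pairs from every .smi member of a zip archive."""
    with zipfile.ZipFile(path) as archive:
        for name in archive.namelist():
            if name.endswith(".smi"):
                with archive.open(name) as handle:
                    # Assumption: the file is UTF-8 (plain ASCII also works).
                    yield from read_smi(io.TextIOWrapper(handle, encoding="utf-8"))


if __name__ == "__main__":
    for smiles, title in read_smi_zip("chemicalizeorg-structures.smi.zip"):
        print(smiles, title)
```

Cheminformatics toolkits can also load .smi files directly if you want to draw the structures or calculate properties.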

User 3610fe66e1

26-02-2015 16:32:28

Thanks for the question.