Get a bulk download of the properties for a huge amount of data

User 3d832cf8c7

06-11-2013 04:07:12

I have a text file with around one million molecules in SMILES format, and for each molecule I need its structure and properties. When I run a script to download these files for each molecule, the connection is refused after a while and I can't download more than 5000 files. It's for a machine learning project. If you could either give me an SDF file containing the structure information for all the molecules or not block my IP, I would be very grateful.


Regards


Spoorthi Ravi

ChemAxon 6c76bc6409

07-11-2013 08:42:17

Hi Spoorthi,


chemicalize.org is designed to handle load from our regular users through the browser; it is not well equipped to handle batch processing of hundreds of thousands of structures. In fact, the batch requests destabilized the servers, so we had to take measures to prevent denial of service to other users.


Since you are from Georgia Tech, I think you are eligible for an academic research license. Under this license you get access to the exact same developer toolkit we used to build chemicalize.org. You could run these calculations locally, which is much more efficient, and you could probably fine-tune selected calculations as well.


http://www.chemaxon.com/my-chemaxon/my-academic-license/


On this page you can find all the relevant information, including how to sign up for such a license, so you should be able to calculate again within a day or so.


The toolkit is available in Java, .NET and REST web service forms. If you let me know exactly what information you need, I can give you pointers on how to start coding.


Andras

User 3d832cf8c7

08-11-2013 15:50:53

Hello, thank you for the fast reply. My academic license just got approved. I have a list of 2 million molecules as SMILES strings. I want to be able to download their SDF files or in some way get their properties using a script. If I paste a SMILES string into the search bar I can download the file, but when I run the script to do this for all the molecules it breaks at random points, sometimes after downloading 20000 files, sometimes after 1000, because of connection timed out / connection refused errors. Now that I have the academic license, does it mean I won't get blocked? If that is not the case, I would really appreciate it if you could tell me how to proceed to get these properties.


Thank you again for all the help

ChemAxon 6c76bc6409

09-11-2013 08:52:16

Hi,


As I said, you cannot use chemicalize.org for such a project. 


With your academic license, you can take our developer tools and write a Java / .NET / web service based application that does the same calculations on your computer.
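
As a rough illustration of what local batch processing could look like, here is a minimal Python sketch. It assumes one SMILES string per line in the input file and assumes the `molconvert` command-line tool from a local ChemAxon installation is on the PATH, with its `-3` option generating 3D coordinates. The function names, file names and batch size are hypothetical; check the command against the documentation of your installed version before relying on it.

```python
import itertools
from pathlib import Path


def split_smiles_file(src, out_dir, batch_size=10000):
    """Split a one-SMILES-per-line file into numbered batch files."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    batches = []
    with open(src) as handle:
        # Skip blank lines so they do not become empty records.
        lines = (line for line in handle if line.strip())
        for index in itertools.count():
            chunk = list(itertools.islice(lines, batch_size))
            if not chunk:
                break
            path = out_dir / f"batch_{index:05d}.smi"
            path.write_text("".join(chunk))
            batches.append(path)
    return batches


def molconvert_3d_command(batch_path):
    """Build (but do not run) an assumed molconvert call that writes 3D SDF."""
    batch_path = Path(batch_path)
    # Assumption: "molconvert -3 sdf in.smi -o out.sdf" is valid for your version.
    return ["molconvert", "-3", "sdf", str(batch_path),
            "-o", str(batch_path.with_suffix(".sdf"))]
```

Each command returned by `molconvert_3d_command` could then be passed to `subprocess.run(..., check=True)`. Working in batches means a failure only costs one batch, and batches that already have an `.sdf` file next to them can be skipped on a rerun.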


If you can let us know what type of calculations you want to do, we can help you with code examples.


Andras

ChemAxon b124dd5f17

09-11-2013 08:56:30

As a GUI-based alternative, you can download Instant JChem, populate a project with your molecules, predict properties and then export the project, or just do whatever you want to be doing in Instant JChem.

User 3d832cf8c7

09-11-2013 18:26:16

Hello again


I need to get the 3D coordinates of all the molecules for which I have the SMILES strings. The coordinates were available in the SDF format before. How do I proceed with Instant JChem, or is there anything I can do with the REST API?

User 3d832cf8c7

09-11-2013 21:09:16

I've installed Tomcat exactly like in the installation guide, created a .shemaxon directory, downloaded my license and put the license.cxl file there, and I still get this error: {"errorCode":0,"errorMessage":"No valid license found. Contact administrator!","rootCause":"chemaxon.license.LicenseException"}



Please help me fix this. What do I have to do next?
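
For a script that calls the web service repeatedly, it may help to recognize this error payload and stop early instead of sending every molecule. The following is a minimal Python sketch that only inspects a response body shaped like the one quoted above. The function name is hypothetical, and it assumes the server always reports the problem through the `rootCause` field, which this thread shows for one case only.

```python
import json


def is_license_error(body):
    """Return True if a response body looks like the license error payload."""
    try:
        payload = json.loads(body)
    except ValueError:
        # Not JSON at all, so it is not the error payload quoted above.
        return False
    if not isinstance(payload, dict):
        return False
    return payload.get("rootCause") == "chemaxon.license.LicenseException"
```

A batch script could call this on the first response and abort with a clear message if it returns True.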

User 3d832cf8c7

09-11-2013 21:16:17

Correction: I meant .chemaxon, not .shemaxon.

User 3d832cf8c7

10-11-2013 23:23:52

There are a bunch of licenses that I have:


Marvin Applets
Marvin Beans
Instant JChem 
JChem Base
JChem Cartridge
Standardizer
Screen
Reactor
Fragmenter
JKlustor
Metabolizer
Markush Search
Protonation Plugin Group
Partitioning Plugin Group
Charge Plugin Group
Isomers Plugin Group
Conformation Plugin Group
Geometry Plugin Group
Huckel Analysis Plugin
Refractivity Plugin
HBDA Plugin
Markush Enumeration Plugin
Structure to Name Plugin
Name to Structure
JChem for Excel
Structure Search
IUPAC naming plugin
Calculations Pack
Web Services Server
Structural Frameworks Plugin
Structure Checker
Predictor Plugin
MCES
3D Alignment
3D Screen
Molecular Descriptors
Instant JChem VIZ
ECFP/FCFP
Document to Structure
NMR Predictor
JChem for Office


Is Web Services included in that? The only thing I need for each of these molecules is its 3D coordinates (SDF files), and it says no valid license found on installation of Web Services. Can you please tell me how I can proceed?

ChemAxon e07e2a364b

19-11-2013 16:38:17

Dear Spoorthi Ravi,


Your problem may be that the .chemaxon directory was created in the wrong user home (Tomcat typically runs under a different user than your account). You can verify this by checking whether further files have been created next to your license file (e.g. ws-config.xml). If so, the directory location is fine.
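
The check described above can be sketched in Python as follows. The names `.chemaxon`, `license.cxl` and `ws-config.xml` come from this thread; the function names are hypothetical, and the home directories you pass in (your own and that of the account running Tomcat) depend on your system.

```python
from pathlib import Path


def check_chemaxon_dir(home):
    """Report what is present in <home>/.chemaxon for one user home."""
    config_dir = Path(home) / ".chemaxon"
    return {
        "home": str(home),
        "dir_exists": config_dir.is_dir(),
        "has_license": (config_dir / "license.cxl").is_file(),
        # ws-config.xml appearing here suggests the server uses this directory.
        "has_ws_config": (config_dir / "ws-config.xml").is_file(),
    }


def likely_server_home(homes):
    """Return the homes whose .chemaxon directory holds ws-config.xml."""
    return [h for h in homes if check_chemaxon_dir(h)["has_ws_config"]]
```

If a home reports `has_ws_config` but not `has_license`, the license file is probably sitting under a different account than the one the server runs as.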


Can you share some more details about your installation: OS type/version (please include the name of the distribution if it is a Linux installation), JVM vendor and version, and JChem WS version? Hope this helps.