I am trying to import ChEMBL database from ftp://ftp.ebi.ac.uk/pub/databases/chembl/ into IJC, however for some tables I get "New entity creation failed: For input String: NULL" error when I try to promote the table "COMPOUND_PROPERTIES" to IJC entity, see attached image. The table on which I get the error is created using DDL:
MOLREGNO INTEGER DEFAULT NULL,
MW_FREEBASE DOUBLE DEFAULT NULL,
ALOGP DOUBLE DEFAULT NULL,
HBA INTEGER DEFAULT NULL,
HBD INTEGER DEFAULT NULL,
PSA DOUBLE DEFAULT NULL,
RTB INTEGER DEFAULT NULL,
RO3_PASS VARCHAR(3) DEFAULT NULL,
NUM_RO5_VIOLATIONS INTEGER DEFAULT NULL,
MED_CHEM_FRIENDLY VARCHAR(3) DEFAULT NULL,
CONSTRAINT COMPOUND_PROPERTIES_FK_MOLREGNO FOREIGN KEY (MOLREGNO) REFERENCES COMPOUNDS (
CONSTRAINT UK_CP_MOLREGNO UNIQUE (MOLREGNO)
I created all tables and imported data outside of IJC using a SQL script, except for COMPOUNDS tables which was created and imported using IJC. The table COMPOUND_PROPERTIES has 32,658 (out of 517,261) rows where all columns have NULL values except MOLREGNO, however I was able to promote to entities tables with null values rows (except ID column) to IJC before without any errors.
I am using IJC 3.0.4 on Mac OS X 10.6
I think the basic problem is that the table does not have a primary key column. IJC must have such a column to use to identify the rows in the table. If you create such a column (e.g. as the first column) then the table can be promoted OK:
ID INTEGER GENERATED BY DEFAULT AS IDENTITY(START WITH 1) PRIMARY KEY
Values for this column will be generated automatically for you when you insert rows into the table.
I tried to add a new column with GENERATED BY DEFAULT AS IDENTITY but Derby won't allow me to do it on existing table, [Error Code: 30000, SQL State: 42601] ALTER TABLE statement cannot add an IDENTITY column to a table. I was able however to do: ALTER TABLE COMPOUND_PROPERTIES ADD PRIMARY KEY (MOLREGNO) which worked partially. I was able to promote the table to entity in IJC but not all fields were promoted, ALOGP and several other columns were not promoted to fields. If I try to promote them to fields I get an error similar to the previous one (see image).
It seems that this is caused by the DOUBLE column types which are not handled correctly by IJC. This is probably the actual cause of the original error you reported, not the lack of a primary key constraint.
The good news is that this is already fixed in the next release which should be out soon. In the meantime I would suggest reverting to a REAL column type (assuming you are working in Derby database). REAL would probably be a better choice anyway as the IJC field handles the values as Float datatypes (32 bits) not Double (64 bits) so it avoids any risk of data conversion, and I don't think the values for LogP or PSA etc. really merit double precision data types :-)