Parsing diff in evaluator and API

User 677b9c22ff

21-07-2008 19:08:46

Hi,


I have a large XML file with SMARTS patterns. Running

Code:
evaluate -e 'matchcount("*/[D2]=[D2]/*")' "CCC"

gives no error, but with

Code:
evaluate -e 'matchcount("*/[D2]=[D2]\*")' "CCC"

I get





Code:
Exception in thread "main" chemaxon.nfunk.jep.TokenMgrError: Lexical error at line 1, column 25.  Encountered: "*" (42), after : "\'matchcount(*/[D2]=[D2]\\"
        at chemaxon.nfunk.jep.ParserTokenManager.getNextToken(ParserTokenManager.java:641)
        at chemaxon.nfunk.jep.Parser.getToken(Parser.java:1671)
        at chemaxon.nfunk.jep.Parser.jj_3_4(Parser.java:1496)
        at chemaxon.nfunk.jep.Parser.jj_3R_23(Parser.java:1507)
        at chemaxon.nfunk.jep.Parser.jj_3R_11(Parser.java:1241)
        at chemaxon.nfunk.jep.Parser.jj_3_1(Parser.java:1300)
        at chemaxon.nfunk.jep.Parser.jj_2_1(Parser.java:1172)
        at chemaxon.nfunk.jep.Parser.Start(Parser.java:34)
        at chemaxon.nfunk.jep.Parser.parseStream(Parser.java:18)
        at chemaxon.nfunk.jep.JEP.parseExpression(JEP.java:370)
        at chemaxon.jep.ChemJEP.compile(ChemJEP.java:116)
        at chemaxon.jep.Evaluator.compile(Evaluator.java:565)
        at chemaxon.jep.Evaluator.compile(Evaluator.java:494)
        at chemaxon.jep.Evaluator.main(Evaluator.java:810)








Using the SMARTS matching example for substructure mining from the forum with the Java API code, I get no errors; the problem occurs only in evaluator.





I also saw that MVIEW rewrites some of the SMARTS:
*/[D2]=[D2]\* becomes *\[*;D2]=[*;D2]/* and
*/[D2]=[D2]/* becomes *\[*;D2]=[*;D2]\*





I am working with jchem.version=3.2.11
jchem.veruln=3_2_11
# include only numeric characters (and dots)
# give this property if version id incluses alphabetic characters
jchem.vernum=3.2.11
jchem.date=2007.09.18
jchem.table.version=41
marvin.package=marvin-all-4_1_13.zip





I guess this is not the latest version :-)





Cheers


Tobias

ChemAxon e08c317633

22-07-2008 15:31:26

Hi,





The backslashes ("\") are interpreted as escape characters by your shell, I think. Try this:
Code:
evaluate -e 'matchcount("*/[D2]=[D2]\\*")' "CCC"



Regards,


Zsolt

User 677b9c22ff

23-07-2008 00:57:30

Hi,


The error came from an XML file; I only used the command line to show the error (if it is one).





evaluate -f eval-error.xml "CCC"





eval-error.xml:

Code:
array(
matchcount("*/[D2]=[D2]\*"),
matchcount("*/[D2]=[D2]/*"),
matchcount("*/[D2]=[D2]\\*"))








Error:

Code:
C:\chemistry\qsar\datasets\drug>evaluate -f eval-error.xml "CCC"
Exception in thread "main" chemaxon.nfunk.jep.TokenMgrError: Lexical error at line 2, column 25.  Encountered: "*" (42), after : "\"*/[D2]=[D2]\\"
        at chemaxon.nfunk.jep.ParserTokenManager.getNextToken(ParserTokenManager.java:641)
        at chemaxon.nfunk.jep.Parser.getToken(Parser.java:1671)
        at chemaxon.nfunk.jep.Parser.jj_3_4(Parser.java:1496)
        at chemaxon.nfunk.jep.Parser.jj_3R_23(Parser.java:1507)
        at chemaxon.nfunk.jep.Parser.jj_3R_11(Parser.java:1241)
        at chemaxon.nfunk.jep.Parser.jj_3_10(Parser.java:1430)
        at chemaxon.nfunk.jep.Parser.jj_2_10(Parser.java:1235)
        at chemaxon.nfunk.jep.Parser.ArgumentList(Parser.java:1029)
        at chemaxon.nfunk.jep.Parser.Function(Parser.java:998)
        at chemaxon.nfunk.jep.Parser.UnaryExpressionNotPlusMinus(Parser.java:825)
        at chemaxon.nfunk.jep.Parser.PowerExpression(Parser.java:770)
        at chemaxon.nfunk.jep.Parser.UnaryExpression(Parser.java:757)
        at chemaxon.nfunk.jep.Parser.MultiplicativeExpression(Parser.java:545)
        at chemaxon.nfunk.jep.Parser.AdditiveExpression(Parser.java:463)
        at chemaxon.nfunk.jep.Parser.RelationalExpression(Parser.java:319)
        at chemaxon.nfunk.jep.Parser.EqualExpression(Parser.java:237)
        at chemaxon.nfunk.jep.Parser.AndExpression(Parser.java:194)
        at chemaxon.nfunk.jep.Parser.OrExpression(Parser.java:151)
        at chemaxon.nfunk.jep.Parser.MultipleExpression(Parser.java:101)
        at chemaxon.nfunk.jep.Parser.Expression(Parser.java:85)
        at chemaxon.nfunk.jep.Parser.ArgumentList(Parser.java:1030)
        at chemaxon.nfunk.jep.Parser.Function(Parser.java:998)
        at chemaxon.nfunk.jep.Parser.UnaryExpressionNotPlusMinus(Parser.java:825)
        at chemaxon.nfunk.jep.Parser.PowerExpression(Parser.java:770)
        at chemaxon.nfunk.jep.Parser.UnaryExpression(Parser.java:757)
        at chemaxon.nfunk.jep.Parser.MultiplicativeExpression(Parser.java:545)
        at chemaxon.nfunk.jep.Parser.AdditiveExpression(Parser.java:463)
        at chemaxon.nfunk.jep.Parser.RelationalExpression(Parser.java:319)
        at chemaxon.nfunk.jep.Parser.EqualExpression(Parser.java:237)
        at chemaxon.nfunk.jep.Parser.AndExpression(Parser.java:194)
        at chemaxon.nfunk.jep.Parser.OrExpression(Parser.java:151)
        at chemaxon.nfunk.jep.Parser.MultipleExpression(Parser.java:101)
        at chemaxon.nfunk.jep.Parser.Expression(Parser.java:85)
        at chemaxon.nfunk.jep.Parser.Start(Parser.java:35)
        at chemaxon.nfunk.jep.Parser.parseStream(Parser.java:18)
        at chemaxon.nfunk.jep.JEP.parseExpression(JEP.java:370)
        at chemaxon.jep.ChemJEP.compile(ChemJEP.java:116)
        at chemaxon.jep.Evaluator.compile(Evaluator.java:565)
        at chemaxon.jep.Evaluator.compile(Evaluator.java:494)
        at chemaxon.jep.Evaluator.main(Evaluator.java:810)








Version is Marvin 4.1.11


Cheers


Tobias

ChemAxon e08c317633

23-07-2008 21:08:43

Hi Tobias,





You are right: the escape characters are handled by the JEP engine (the Chemical Terms engine is based on it), not by the shell. Sorry for the mistake.
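Since your SMARTS come from a large file, doubling the backslashes by hand may be error-prone. Here is a minimal sketch of doing it programmatically. It assumes the backslash is the only character in the SMARTS that needs escaping inside the double-quoted string; the function names `escape_for_expression` and `matchcount_expression` are hypothetical helpers, not part of any ChemAxon API.

```python
def escape_for_expression(smarts: str) -> str:
    """Double every backslash so it survives string-literal parsing.

    Assumption: the backslash is the only character in the SMARTS that
    needs escaping; a double quote inside the SMARTS is not handled here.
    """
    return smarts.replace("\\", "\\\\")


def matchcount_expression(smarts: str) -> str:
    """Wrap an escaped SMARTS in a matchcount("...") expression string."""
    return 'matchcount("' + escape_for_expression(smarts) + '")'


if __name__ == "__main__":
    # Raw strings keep the backslash exactly as it appears in the SMARTS.
    for smarts in (r"*/[D2]=[D2]\*", r"*/[D2]=[D2]/*"):
        print(matchcount_expression(smarts))
```

The generated strings can then be joined into an array(...) expression and written to the file passed to evaluate -f.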





There are two solutions:

1. Escape each backslash. With the escape characters in place, your expression looks like this:
Code:
array(
matchcount("*/[D2]=[D2]\\*"),
matchcount("*/[D2]=[D2]/*"),
matchcount("*/[D2]=[D2]\\*"))






2. Use a mols.smarts file to define the molecules. See the Predefined Molecules and Molecule Sets section of the Chemical Terms manual for more.

Example mols.smarts content:
Code:
*/[D2]=[D2]\*   cis
*/[D2]=[D2]/*   trans

After putting the example mols.smarts file into the .chemaxon (Linux) or chemaxon (Windows) directory in your user home, you can refer to the molecules in the file as "cis" or "trans".





Examples:
Code:
evaluate -e "matchcount(cis)" "C/C=C\C"
1

Code:
evaluate -e "matchcount(trans)" "C/C=C/C"
1

Code:
evaluate -e "matchcount(cis)" "C/C=C/C"
0



Note: the way mols.smarts is handled changed in JChem 5.0. For details, see the manual.
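If the named patterns already exist in some other form, the mols.smarts content could be generated rather than typed. This is only a sketch: it mirrors the "SMARTS, spaces, name" layout of the example file above, and the exact separator the parser accepts is an assumption. `mols_smarts_lines` and `write_mols_smarts` are hypothetical helper names, and the target path is left to the caller.

```python
from __future__ import annotations

from pathlib import Path


def mols_smarts_lines(named_smarts: dict[str, str]) -> list[str]:
    """Return one 'SMARTS   name' line per entry, in insertion order.

    No backslash doubling is applied: the example file above stores the
    SMARTS as-is.
    """
    return [f"{smarts}   {name}" for name, smarts in named_smarts.items()]


def write_mols_smarts(named_smarts: dict[str, str], target: Path) -> None:
    """Write the lines to the given file, one entry per line."""
    target.write_text("\n".join(mols_smarts_lines(named_smarts)) + "\n")


if __name__ == "__main__":
    patterns = {"cis": r"*/[D2]=[D2]\*", "trans": r"*/[D2]=[D2]/*"}
    for line in mols_smarts_lines(patterns):
        print(line)
```

Keeping the line formatting separate from the file writing makes it easy to inspect the output before copying it into the .chemaxon or chemaxon directory.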





I hope this helps.





Zsolt