Predictor plugin training

The Predictor can be applied to molecular property prediction when the property can be expressed as the sum of atomic contributions. To train it, create a training file that contains the structures and the experimental values of the property you want to predict.
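As an illustration of what "sum of atomic contributions" means, an additive model can be sketched as follows. The notation is introduced here for illustration only and is not taken from the cxtrain documentation: M is a molecule, t(a) is an assumed atom type assigned to atom a, and c_k is the contribution fitted for atom type k from the experimental values in the training file. How the Predictor actually types atoms is not described in this section.

```latex
\hat{P}(M) \;=\; \sum_{a \,\in\, \mathrm{atoms}(M)} c_{t(a)}
```

Under this reading, training amounts to choosing the contributions c_k so that the predicted values are as close as possible to the experimental ones across the training set.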

Training set creation via cxtrain

The command line program cxtrain is available for logP, pKa and custom prediction training.

  1. Create a structure file in any molecule file format from your experimental data (easily done with Instant JChem). The file must contain the structures and the experimental values of the property. In the example below this file is my_data_mp.sdf.
  2. Execute the following command from command line:
    cxtrain prediction -t MP -i meltingpoint my_data_mp.sdf
    The data tagged MP is used as the experimental property values, and meltingpoint is set as the training ID.
  3. Use the resulting training via cxcalc, Chemical Terms or Marvin's Predictor Plugin.
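Before running step 2, it can help to confirm that every record in the SD file carries the tag named with -t. The sketch below is a precaution only: the helper names sdf_tag_report and sdf_tag_complete are hypothetical, and how cxtrain treats records without the tag is not stated here. It assumes an SDF file in which records end with a "$$$$" line and data headers look like "> <MP>".

```shell
#!/bin/sh
# Hypothetical helper: prints "<records> <tagged>" for an SD file.
# records = lines starting with "$$$$" (record terminators)
# tagged  = data header lines naming the given tag, e.g. "> <MP>"
# The tag is used inside a grep pattern, so keep it to plain letters/digits.
sdf_tag_report() {
    file=$1
    tag=$2
    records=$(grep -c '^\$\$\$\$' "$file" || true)
    tagged=$(grep -c "^>.*<${tag}>" "$file" || true)
    echo "$records $tagged"
}

# Hypothetical helper: succeeds only if the file has at least one record
# and as many tagged data headers as records.
sdf_tag_complete() {
    report=$(sdf_tag_report "$1" "$2")
    records=${report% *}
    tagged=${report#* }
    [ "$records" -gt 0 ] && [ "$records" -eq "$tagged" ]
}

# Train only when the example file exists and passes the check.
if [ -f my_data_mp.sdf ]; then
    if sdf_tag_complete my_data_mp.sdf MP; then
        cxtrain prediction -t MP -i meltingpoint my_data_mp.sdf
    else
        echo "my_data_mp.sdf: some records lack an MP value" >&2
    fi
fi
```

The cxtrain call is the one from step 2; everything around it is illustrative scaffolding.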

cxtrain options list

Molecular Property Prediction Trainer, (C) 1998-2011 ChemAxon Ltd.
version 5.4.0
Trains molecular property predictions: pKa, logP, etc.
 
Usage:
  cxtrain  [options] [input file (training set)]

Prediction:
  pka                              train pKa prediction
  logp                             train logP prediction
  prediction                       train custom prediction

General options:
  cxtrain -h, --help               this help message
  -i, --training-id   sets the training ID
  -l, --list                       list available training ID's
  -g, --ignore-error               continue with next molecule on error

pKa options:
  -V, --validation       validation results file path

logP options:
  -t, --tag              name of the SDFile tag that stores the
                                   experimental logP values
  -a, --add-built-in-training-set  add built-in logP training set

Custom prediction options:
  -t, --tag              name of the SDFile tag that stores the
                                   experimental property values

Examples:
  cxtrain pka -i mypka pKa_trainingset.sdf
  cxtrain logp -t LOGP -i mylogp -a logP_trainingset.sdf
  cxtrain logp --list
  cxtrain prediction -t PAMPA -i mypampa pampa_trainingset.sdf
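When one SD file holds several measured properties, the custom prediction example above can be repeated once per tag. The following is a minimal sketch under stated assumptions: train_all, the TAG:ID argument form, the RUN variable and the file name assay_data.sdf are all hypothetical, and it assumes the general option -g is accepted together with prediction, as the help text suggests. By default it only prints the commands it would run.

```shell
#!/bin/sh
# Hypothetical helper: train_all <sdf file> <TAG:ID> [<TAG:ID> ...]
# Prints one cxtrain command per TAG:ID pair; set RUN=1 to execute them instead.
train_all() {
    sdf=$1
    shift
    for pair in "$@"; do
        tag=${pair%%:*}   # text before the first colon: SDFile tag
        id=${pair#*:}     # text after the first colon: training ID
        if [ "${RUN:-0}" = "1" ]; then
            cxtrain prediction -g -t "$tag" -i "$id" "$sdf"
        else
            echo "cxtrain prediction -g -t $tag -i $id $sdf"
        fi
    done
}

# Dry run: shows the commands without invoking cxtrain.
train_all assay_data.sdf PAMPA:mypampa MP:meltingpoint
```

Printing first makes it easy to review tag names and training IDs before any training is written.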

Known issues

The MarvinSketch and MarvinView applets cannot access pKa correction library files or logP/predictor training parameter files stored on a server. The applets can only use trainings stored on the local computer.

Predictor plugin training in Instant JChem

Training the Predictor plugin is simplest through the graphical interface of Instant JChem, where the logP and general property trainings are available. See the IJC documentation for details.