Usage

In /scripts you may find some scripts prepared to run the default values with the only input being the dataset to be used, through the argument –dataset.

NOTE: Configure the paths to the datasets by editting the file in qdf/settings.py:

DATASET_PATH = ".../QuantumDeepField_molecule/dataset"
SAVE_PATH = ".../QuantumDeepField_molecule/output"

1. Preprocessing (for training):

python preprocess_train.py --dataset=$dataset_trained

e.g _python preprocess_train.py –dataset=QM9under7atoms_homolumoeV

Options:

dataset [required]: [string] dataset to be used in pre-training. From those that can be installed directly from the cloned repository the options are:
- “QM9under14atoms_atomizationenergy_eV”
- “QM9full_atomizationenergy_eV”
- “QM9full_homolumo_eV” Note: Two properties (homo and lumo)
- ““

2. Training:

python train.py --dataset=$dataset_trained --num_workers=$num_workers --seed=$seed --device=$device

e.g _python train.py –dataset=QM9under7atoms_homolumoeV

Options:

dataset [required]: [string] dataset to be used in pre-training. From those that can be installed directly from the cloned repository the options are:
- “QM9under14atoms_atomizationenergy_eV”
- “QM9full_atomizationenergy_eV”
- “QM9full_homolumo_eV” Note: Two properties (homo and lumo)
- ““
num_workers: [int] number of workers to use for the dataloader. Defaults to 1.
seed: [int] integer used to specify the seed for the model initialization. Defaults to 1729.
device: [string] device to use for training and inference in the model, options are [“cuda”, “cpu”], if None is specified it will use “cuda” if available in your system, else will use “cpu” (slower).

3. Preprocessing inference (predict):

python preprocess_predict.py --dataset_train=$dataset_trained --dataset_predict=$dataset_predict

e.g python preprocess_predict.py –dataset_train=QM9under7atoms_homolumo_eV –dataset_predict=QM9full_homolumo_eV

Options:

dataset_train [required]: [string] dataset that was used in pre-training. It is use to look for and load the appropriate orbital dictionaries so that the preprocessing done in the prediction dataset is coherent to what was done in pre-processing the original dataset trained on.
dataset_predict [required]: [string] dataset to be used in prediction.

4. Prediction (Inference):

python predict.py --dataset_train=$dataset_trained --dataset_predict=$dataset_predict --model_path=$model_path --num_workers=$num_workers --seed=$seed --device=$device

e.g python predict.py –dataset_train=QM9under7atoms_homolumo_eV –dataset_predict=QM9full_homolumo_eV –model_path=”../pretrained/model”

Options:

dataset_train [required]: [string] dataset that was used in pre-training. It is use to look for and load the appropriate orbital dictionaries so that the preprocessing done in the prediction dataset is coherent to what was done in pre-processing the original dataset trained on.
dataset_predict [required]: [string] dataset to be used in prediction.
model_path [required]: [string] path to file where the pre-trained model is saved.
num_workers: [int] number of workers to use for the dataloader. Defaults to 1.
seed: [int] integer used to specify the seed for the model initialization. Defaults to 1729.
device: [string] device to use for training and inference in the model, options are [“cuda”, “cpu”], if None is specified it will use “cuda” if available in your system, else will use “cpu” (slower).