MORLD Tutorial

Demo: Optimizing Ponatinib for DDR1 Kinase

In this tutorial, we will demonstrate how to optimize the compound ponatinib to discover novel inhibitors against DDR1 kinase. You can download the prepared PDB file of DDR1 kinase here.

According to AutoDock Vina, ponatinib exhibits a binding energy of approximately -13.1 kcal/mol against DDR1. Using MORLD, we aim to improve on this score (a more negative value indicates stronger predicted binding), creating molecules with a stronger affinity for DDR1.

Step	Method
Step 1	Navigate to the "Job Submission" page to start a new job.
Step 2	Enter a descriptive job name, e.g., `demo_pona`. Optional: To receive an email notification when the job completes, enter your email address in the "Email" field. Optional: To restrict access to your results, set a password in the "Password" field. The result file will be compressed and protected with this password.
Step 3	Input the canonical SMILES string of ponatinib into the "Initial molecule" text field: `CC1=C(C=C(C=C1)C(=O)NC2=CC(=C(C=C2)CN3CCN(CC3)C)C(F)(F)F)C#CC4=CN=C5N4N=CC=C5`
Step 4	Upload the prepared PDB file of DDR1 kinase (`3zosA_prepared.pdb`). Note: The server now automatically preprocesses uploaded PDB files into a format suitable for docking. This process includes removing water molecules and ligands. As a result, you can directly upload PDB files obtained from sources such as RCSB without any manual preparation.
Step 5	Number of Modification: This parameter controls the extent of molecular modification in each episode. A larger value allows for more significant changes from the initial molecule, while a smaller value generates structures more similar to the starting molecule. The allowed range is 1 to 10. Number of Generation: This determines the number of molecules MORLD will generate, one per training episode. A higher number of episodes allows MORLD to explore more chemical space and potentially achieve better results. However, if docking errors occur, the actual number of generated molecules may be lower. The allowed range is 1 to 1000.
Step 6	Enter the binding site information. These values are derived from PDB ID 3ZOS using AutoDockTools (ADT): Center (x, y, z): `-7.5, 2.5, -40` Size (x, y, z): `24, 20, 20`
Step 7	Click the "Submit" button to submit your job. You can monitor its progress on the Queue page. Note: With default settings, the job may take approximately two days to complete. For a quick test run, reduce the "Max num. of modification" or "Num. of training episode" parameters (the two settings described in Step 5).
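
The automatic preprocessing mentioned in Step 4 (removing water molecules and ligands) can be pictured with the minimal Python sketch below. It is an illustration only, not the server's actual procedure: the function name `clean_pdb_lines`, the sample records, and the residue name `LIG` are all made up for this example. The sketch assumes that waters and bound ligands appear as `HETATM` records, which is the usual PDB convention.

```python
def clean_pdb_lines(lines):
    """Return only protein ATOM records plus TER/END markers.

    Assumption: waters (HOH) and bound ligands are stored as HETATM
    records. Modified residues (e.g. MSE) are also HETATM and would be
    dropped by this simple filter, so real tools treat them separately.
    """
    kept = []
    for line in lines:
        record = line[:6].strip()  # PDB record name occupies columns 1-6
        if record in ("ATOM", "TER", "END"):
            kept.append(line)
        # Everything else (HETATM, REMARK, CONECT, ...) is skipped.
    return kept


# Hypothetical records for demonstration; coordinates are invented.
SAMPLE = [
    "ATOM      1  N   MET A 601     -10.123   4.567 -38.901  1.00 20.00           N",
    "HETATM 2001  O   HOH A 901      -5.000   1.000 -41.000  1.00 30.00           O",
    "HETATM 2002  C1  LIG A 801      -7.500   2.500 -40.000  1.00 25.00           C",
    "TER",
    "END",
]

cleaned = clean_pdb_lines(SAMPLE)
for line in cleaned:
    print(line)
```

Because the server performs this step for you, a sketch like this is only useful for inspecting what a cleaned file is expected to contain before you upload a structure.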

Understanding the Queue Page

Track the status of your submitted jobs on the "Queue" page.

Category	Description
Job ID	Unique identification number assigned to your submitted job.
Status	Not started yet: Your job is queued on the server and waiting to be processed. If your job doesn't start within a few minutes, please re-check your input parameters or contact the server administrator at jjs092@kaist.ac.kr. Job is running: Your job is currently being processed on the server. Job is finished: Your job has completed successfully, and the result file is ready for download. Error occurred: Your job was terminated due to an error. Please verify your input parameters for correctness. For detailed error information, download the `error_log.txt` file via the "Error" link on the Queue page.
Job Name	The name you assigned to your job during submission.
Submitted Time	Timestamp indicating when you submitted the job.
Download Link	Success: A "Download" link will appear upon successful job completion, allowing you to download the results.

Understanding Result Files

The downloaded result file contains the following files and folders:

Category	Contents
`optimized_total.csv`	This CSV file contains all generated molecules from the optimization process. The number of molecules corresponds to the "Num. of training episode" you set. For example, if you set N episodes, this file will contain N optimized molecules. The columns represent: Generation Number, SMILES string, Binding Energy (Qvina score), SA score, and QED score. You can sort this data by Binding Energy to identify molecules with the highest binding affinity (the most negative values) or filter by SA score to exclude molecules with unrealistic structures (a higher SA score indicates a molecule that is harder to synthesize).
`(PDB_ID)_cleand.pdb`	The preprocessed PDB file used for docking.
`pose_*.pdbqt`	These PDBQT files represent the binding poses of each optimized molecule. Files are named as `pose_##.pdbqt`, where `##` corresponds to the molecule's generation number. Use the generation number from `optimized_total.csv` to find the docking pose of a specific molecule, especially those with the best docking scores. You can visualize these `.pdbqt` files with molecular visualization tools like PyMOL to examine the binding interactions with the target protein.
`(job_id)_config.json`	This file contains a detailed record of all input parameters you specified.
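
The sorting and filtering described for `optimized_total.csv` can be sketched in Python with only the standard library. Several points here are assumptions, not documented behavior: the columns are read by position in the order listed above, any header row is skipped because its first field is not an integer, the SA cutoff of 4.0 is an arbitrary example value, and `pose_filename` assumes the generation number is written without zero padding. The demo rows are invented and are not MORLD output.

```python
import csv
import io


def load_molecules(handle):
    """Read rows as: generation, SMILES, binding energy, SA, QED (by position)."""
    rows = []
    for fields in csv.reader(handle):
        if len(fields) < 5:
            continue
        try:
            rows.append({
                "generation": int(fields[0]),
                "smiles": fields[1],
                "binding_energy": float(fields[2]),
                "sa": float(fields[3]),
                "qed": float(fields[4]),
            })
        except ValueError:
            continue  # header line or malformed row
    return rows


def best_candidates(rows, max_sa=4.0, top_n=10):
    """Keep molecules with SA <= max_sa, most negative binding energy first."""
    kept = [r for r in rows if r["sa"] <= max_sa]
    return sorted(kept, key=lambda r: r["binding_energy"])[:top_n]


def pose_filename(generation):
    """Assumed naming: pose_<generation>.pdbqt with no zero padding."""
    return f"pose_{generation}.pdbqt"


# Invented demo data; replace with open("optimized_total.csv", newline="").
DEMO = (
    "generation,smiles,binding_energy,sa,qed\n"
    "1,CCO,-13.4,2.1,0.45\n"
    "2,c1ccccc1,-14.2,6.5,0.30\n"
    "3,CC(=O)O,-13.9,3.0,0.50\n"
)

rows = load_molecules(io.StringIO(DEMO))
best = best_candidates(rows, max_sa=4.0, top_n=2)
for r in best:
    print(r["generation"], r["binding_energy"], pose_filename(r["generation"]))
```

Filtering before sorting keeps a hard-to-synthesize molecule from occupying a top slot just because it docks well; adjust `max_sa` and `top_n` to suit your own screening criteria.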