Using AlphaFold-Multimer to predict the structure of protein complexes
Background
As soon as the effectiveness of AlphaFold2 for protein structure prediction became evident, researchers began to adapt it to predicting the structures of protein complexes. This effort led to AlphaFold-Multimer. While the best place to start a search for the predicted structure of a single protein sequence is probably an online database, you will likely have to compute predicted structures for multimeric protein complexes yourself.
Running AlphaFold-Multimer using ChimeraX
As with structure prediction for monomeric proteins, ChimeraX is a good starting point due to its intuitive graphical user interface and convenient visualization tools. You will need to install ChimeraX on a desktop or laptop computer, but the AlphaFold predictions will be made using computing resources in the cloud.
Running AlphaFold using computing resources at Whitehead
If the freely available computing resources accessed via ChimeraX are too limited to complete your AlphaFold predictions, you can run the predictions locally using a command like the following:
sbatch --export=ALL,FASTA_NAME=example.fa,USERNAME='user',FASTA_PATH=/path/to/fasta/file,AF2_WORK_DIR=/path/to/working/directory ./RunAlphaFold_2.3.2_slurm.sh
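In the command above, substitute your own user ID, FASTA file name, and the paths to the FASTA file and the working directory. Note that the script below is written with --model_preset=monomer; to predict a complex, AlphaFold expects --model_preset=multimer, with --pdb_seqres_database_path and --uniprot_database_path given in place of --pdb70_database_path. In this example, the job that is submitted to the SLURM scheduler might look like: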
#!/bin/bash
#SBATCH --job-name=AF2            # friendly name for job.
#SBATCH --nodes=1                 # ensure cores are on one node
#SBATCH --ntasks=1                # run a single task
#SBATCH --cpus-per-task=8         # number of cores/threads requested.
#SBATCH --mem=64gb                # memory requested.
#SBATCH --partition=nvidia-t4-20  # partition (queue) to use
#SBATCH --output output-%j.out    # %j inserts jobid to STDOUT
#SBATCH --gres=gpu:1              # Required for GPU access

export TF_FORCE_UNIFIED_MEMORY=1
export XLA_PYTHON_CLIENT_MEM_FRACTION=4
export OUTPUT_NAME='model_1'
export ALPHAFOLD_DATA_PATH='/alphafold/data.2023b'  # Specify ALPHAFOLD_DATA_PATH

cd $AF2_WORK_DIR

singularity run -B $AF2_WORK_DIR:/af2 -B $ALPHAFOLD_DATA_PATH:/data -B .:/etc \
    --pwd /app/alphafold --nv /alphafold/alphafold_2.3.2.sif \
    --data_dir=/data/ \
    --output_dir=/af2/$FASTA_PATH \
    --fasta_paths=/af2/$FASTA_PATH/$FASTA_NAME \
    --max_template_date=2050-01-01 \
    --db_preset=full_dbs \
    --bfd_database_path=/data/bfd/bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt \
    --uniref30_database_path=/data/uniref30/UniRef30_2023_02 \
    --uniref90_database_path=/data/uniref90/uniref90.fasta \
    --mgnify_database_path=/data/mgnify/mgy_clusters_2022_05.fa \
    --template_mmcif_dir=/data/pdb_mmcif/mmcif_files \
    --obsolete_pdbs_path=/data/pdb_mmcif/obsolete.dat \
    --use_gpu_relax=True \
    --model_preset=monomer \
    --pdb70_database_path=/data/pdb70/pdb70

# Email the STDOUT output file to specified address.
/usr/bin/mail -s "$SLURM_JOB_NAME $SLURM_JOB_ID" $USERNAME@wi.mit.edu < $AF2_WORK_DIR/output-${SLURM_JOB_ID}.out
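Before submitting, it can help to check the input file and assemble the sbatch command in one place. The following is a minimal sketch, not part of the Whitehead setup: the helper functions (count_chains, choose_preset, build_export), the file name complex.fa, the input subdirectory, the user id and both sequences are hypothetical placeholders. It assumes the complex is described as a multi-sequence FASTA file with one record per chain, and that FASTA_PATH is given relative to the working directory, since the job script prefixes it with /af2 (the mount point of AF2_WORK_DIR). The command is only printed, never submitted.

```shell
#!/bin/bash
# Sketch only: every name, path and sequence here is a hypothetical placeholder.

# Count FASTA records (one '>' header line per chain).
count_chains() {
    grep -c '^>' "$1"
}

# Suggest an AlphaFold model preset from the number of chains.
choose_preset() {
    if [ "$1" -gt 1 ]; then
        echo multimer
    else
        echo monomer
    fi
}

# Assemble the value passed to sbatch --export.
# $1 = FASTA file name, $2 = user id, $3 = FASTA path, $4 = working directory
build_export() {
    echo "ALL,FASTA_NAME=$1,USERNAME=$2,FASTA_PATH=$3,AF2_WORK_DIR=$4"
}

# Create a throwaway working directory with a two-chain FASTA file.
workdir=$(mktemp -d)
mkdir -p "$workdir/input"
cat > "$workdir/input/complex.fa" <<'EOF'
>chainA hypothetical sequence
MKTAYIAKQR
>chainB hypothetical sequence
GSHMLEDPVA
EOF

n=$(count_chains "$workdir/input/complex.fa")
echo "chains: $n, suggested preset: $(choose_preset "$n")"

# Dry run: print the submission command instead of running it.
echo sbatch --export="$(build_export complex.fa user input "$workdir")" ./RunAlphaFold_2.3.2_slurm.sh
```

The job script shown above hard-codes its model preset, so the preset printed here is only a reminder of which value the script would need; it is not passed to the job. To submit for real, remove the leading echo from the last line.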