Changes between Version 4 and Version 5 of SOPs/AlphaFoldMultimer


Timestamp: 04/23/24 07:44:20 (9 months ago)
Author: twhitfie

Legend: lines prefixed with "-" were removed in v5; lines prefixed with "+" were added in v5; unprefixed lines are unchanged context; "..." marks runs of unchanged lines collapsed from the comparison. A modified line appears as a removed/added pair.
  • SOPs/AlphaFoldMultimer (v4 → v5)


{{{
-sbatch
+sbatch RunColabFold_multimer_1.5.5.slurm
}}}

-In the command above, substitute your own user id, fasta file and the paths to both the fasta file and the working directory.  In this example, the job that is submitted to the SLURM scheduler might look like:
+In the command above, the job (i.e. RunColabFold_multimer_1.5.5.slurm) that is submitted to the SLURM scheduler might look like:

{{{
...
#SBATCH --cpus-per-task=8 # number of cores/threads requested.
#SBATCH --mem=64gb # memory requested.
-#SBATCH --partition=nvidia-A6000-20 # partition (queue) to use
-#SBATCH --output AFbatch1.5.5.out # write output to file.
+#SBATCH --partition=nvidia-t4-20 # partition (queue) to use
+#SBATCH --output AFbatch.out # write output to file.
#SBATCH --gres=gpu:1 # Required for GPU access

...
cd ${workpath}

-colabfold_batch --msa-mode mmseqs2_uniref_env --model-type alphafold2_multimer_v3 --rank multimer fasta/RALF23_FERONIA_complex.fa RALF23_FERONIA_CF_complex
+colabfold_batch --msa-mode mmseqs2_uniref_env --model-type alphafold2_multimer_v3 --rank multimer fasta/proteins.fa output
+}}}
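
For reference, once the job above has been submitted with sbatch it can be followed with the standard SLURM tools; a minimal sketch, assuming the AFbatch.out log name set by the --output directive above and that you are in the directory from which the job was submitted:

{{{
# Follow the submitted ColabFold job (standard SLURM commands).
squeue -u $USER       # is the job still pending or running?
tail -f AFbatch.out   # stream the job log named by --output above
}}}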

+In the commands above, you will need to substitute the path to your working directory along with paths to your fasta file and output directory.  In the example above, the fasta file (i.e. proteins.fa) is within a subdirectory, called "fasta", of the working directory.  Likewise, the output will be written in a subdirectory, called "output", of the working directory.  When using ColabFold, be sure to separate the amino acid sequences for individual proteins with a colon, as in this example:
+
+{{{
+>proteins
+RMKQLEDKVEELLSKNYHLENEVARLKKLVGER:
+RMKQLEDKVEELLSKNYHLENEVARLKKLVGER
+}}}
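
For reference, a colon-separated multimer FASTA like the one above can be assembled from individual per-chain FASTA files; a minimal sketch, assuming hypothetical single-sequence inputs chainA.fa and chainB.fa and the "fasta" subdirectory described above:

{{{
# Hypothetical helper: join two single-sequence FASTA files into the
# colon-separated record that ColabFold expects for a multimeric complex.
seqA=$(grep -v '^>' chainA.fa | tr -d '\n')   # drop the header, unwrap the sequence
seqB=$(grep -v '^>' chainB.fa | tr -d '\n')
printf '>proteins\n%s:\n%s\n' "$seqA" "$seqB" > fasta/proteins.fa
}}}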
+
+The following instructions allow you to run AlphaFold-Multimer locally without using ColabFold:
+
+{{{
+sbatch --export=ALL,FASTA_NAME=example.fa,USERNAME='user',FASTA_PATH=/path/to/fasta/file,AF2_WORK_DIR=/path/to/working/directory ./RunAlphaFold_multimer_2.3.2_slurm.sh
+}}}
+
+In the command above, substitute your own user id, fasta file and the paths to both the fasta file and the working directory.  In this example, the job (i.e. RunAlphaFold_multimer_2.3.2_slurm.sh) that is submitted to the SLURM scheduler might look like:
+
+{{{
#!/bin/bash

-#SBATCH --job-name=AF2                  # friendly name for job.
+#SBATCH --job-name=AF2M                 # friendly name for job.
#SBATCH --nodes=1                       # ensure cores are on one node
#SBATCH --ntasks=1                      # run a single task
...

cd $AF2_WORK_DIR
-singularity run -B $AF2_WORK_DIR:/af2 -B $ALPHAFOLD_DATA_PATH:/data -B .:/etc --pwd /app/alphafold --nv /alphafold/alphafold_2.3.2.sif --data_dir=/data/ --output_dir=/af2/$FASTA_PATH --fasta_paths=/af2/$FASTA_PATH/$FASTA_NAME --max_template_date=2050-01-01 --db_preset=full_dbs --bfd_database_path=/data/bfd/bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt --uniref30_database_path=/data/uniref30/UniRef30_2023_02 --uniref90_database_path=/data/uniref90/uniref90.fasta --mgnify_database_path=/data/mgnify/mgy_clusters_2022_05.fa --template_mmcif_dir=/data/pdb_mmcif/mmcif_files --obsolete_pdbs_path=/data/pdb_mmcif/obsolete.dat --use_gpu_relax=True --model_preset=monomer --pdb70_database_path=/data/pdb70/pdb70
+singularity run -B $AF2_WORK_DIR:/af2 -B $ALPHAFOLD_DATA_PATH:/data -B .:/etc --pwd /app/alphafold --nv /alphafold/alphafold_2.3.2.sif --data_dir=/data/ --output_dir=/af2/$FASTA_PATH --fasta_paths=/af2/$FASTA_PATH/$FASTA_NAME --max_template_date=2050-01-01 --db_preset=full_dbs --bfd_database_path=/data/bfd/bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt --uniref30_database_path=/data/uniref30/UniRef30_2023_02 --uniref90_database_path=/data/uniref90/uniref90.fasta --mgnify_database_path=/data/mgnify/mgy_clusters_2022_05.fa --template_mmcif_dir=/data/pdb_mmcif/mmcif_files --obsolete_pdbs_path=/data/pdb_mmcif/obsolete.dat --use_gpu_relax=True --model_preset=multimer --pdb_seqres_database_path=/data/pdb_seqres/pdb_seqres.txt --uniprot_database_path=/data/uniprot/uniprot.fasta --num_multimer_predictions_per_model=1

# Email the STDOUT output file to specified address.
...
}}}

-The following instructions allow you to run AlphaFold-Multimer locally without using ColabFold:
+Unlike when using ColabFold, when running AlphaFold as above, the input fasta file "example.fa" should be a list of fasta entries, one per amino acid sequence within the multimeric complex.  For example:

{{{
-sbatch --export=ALL,FASTA_NAME=example.fa,USERNAME='user',FASTA_PATH=/path/to/fasta/file,AF2_WORK_DIR=/path/to/working/directory ./RunAlphaFold_2.3.2_slurm.sh
+>proteinA
+RMKQLEDKVEELLSKNYHLENEVARLKKLVGER
+
+>proteinB
+RMKQLEDKVEELLSKNYHLENEVARLKKLVGER
}}}
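
For reference, when a local AlphaFold-Multimer run like the one above finishes, the predictions are written under the directory passed as --output_dir, in a subdirectory named after the FASTA file; a minimal sketch of where to look, assuming AlphaFold's default output layout and the example.fa input used above:

{{{
# Inspect a finished run from inside the output directory (default AlphaFold layout assumed).
ls  example/ranked_0.pdb         # top-ranked model of the complex
cat example/ranking_debug.json   # per-model confidence scores used for ranking
}}}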
-
-In the command above, substitute your own user id, fasta file and the paths to both the fasta file and the working directory.  In this example, the job that is submitted to the SLURM scheduler might look like:
-
-{{{
-#!/bin/bash
-
-#SBATCH --job-name=AF2                  # friendly name for job.
-#SBATCH --nodes=1                       # ensure cores are on one node
-#SBATCH --ntasks=1                      # run a single task
-#SBATCH --cpus-per-task=8               # number of cores/threads requested.
-#SBATCH --mem=64gb                      # memory requested.
-#SBATCH --partition=nvidia-t4-20        # partition (queue) to use
-#SBATCH --output output-%j.out          # %j inserts jobid to STDOUT
-#SBATCH --gres=gpu:1                    # Required for GPU access
-
-export TF_FORCE_UNIFIED_MEMORY=1
-export XLA_PYTHON_CLIENT_MEM_FRACTION=4
-
-export OUTPUT_NAME='model_1'
-export ALPHAFOLD_DATA_PATH='/alphafold/data.2023b' # Specify ALPHAFOLD_DATA_PATH
-
-cd $AF2_WORK_DIR
-singularity run -B $AF2_WORK_DIR:/af2 -B $ALPHAFOLD_DATA_PATH:/data -B .:/etc --pwd /app/alphafold --nv /alphafold/alphafold_2.3.2.sif --data_dir=/data/ --output_dir=/af2/$FASTA_PATH --fasta_paths=/af2/$FASTA_PATH/$FASTA_NAME --max_template_date=2050-01-01 --db_preset=full_dbs --bfd_database_path=/data/bfd/bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt --uniref30_database_path=/data/uniref30/UniRef30_2023_02 --uniref90_database_path=/data/uniref90/uniref90.fasta --mgnify_database_path=/data/mgnify/mgy_clusters_2022_05.fa --template_mmcif_dir=/data/pdb_mmcif/mmcif_files --obsolete_pdbs_path=/data/pdb_mmcif/obsolete.dat --use_gpu_relax=True --model_preset=monomer --pdb70_database_path=/data/pdb70/pdb70
-
-# Email the STDOUT output file to specified address.
-/usr/bin/mail -s "$SLURM_JOB_NAME $SLURM_JOB_ID" $USERNAME@wi.mit.edu < $AF2_WORK_DIR/output-${SLURM_JOB_ID}.out
-}}}