Changes between Version 4 and Version 5 of SOPs/AlphaFoldMultimer


Timestamp: 04/23/24 07:44:20 (9 months ago)
Author: twhitfie

Legend: lines prefixed with "-" were removed in v5; lines prefixed with "+" were added in v5; unprefixed lines are unchanged context; "..." marks runs of unchanged lines collapsed from the comparison. A modified line appears as a removed/added pair.
  • SOPs/AlphaFoldMultimer (v4 → v5)


{{{
-sbatch
+sbatch RunColabFold_multimer_1.5.5.slurm
}}}

-In the command above, substitute your own user id, fasta file and the paths to both the fasta file and the working directory.  In this example, the job that is submitted to the SLURM scheduler might look like:
+In the command above, the job (i.e. RunColabFold_multimer_1.5.5.slurm) that is submitted to the SLURM scheduler might look like:

{{{
...
#SBATCH --cpus-per-task=8 # number of cores/threads requested.
#SBATCH --mem=64gb # memory requested.
-#SBATCH --partition=nvidia-A6000-20 # partition (queue) to use
-#SBATCH --output AFbatch1.5.5.out # write output to file.
+#SBATCH --partition=nvidia-t4-20 # partition (queue) to use
+#SBATCH --output AFbatch.out # write output to file.
#SBATCH --gres=gpu:1 # Required for GPU access

...
cd ${workpath}

-colabfold_batch --msa-mode mmseqs2_uniref_env --model-type alphafold2_multimer_v3 --rank multimer fasta/RALF23_FERONIA_complex.fa RALF23_FERONIA_CF_complex
+colabfold_batch --msa-mode mmseqs2_uniref_env --model-type alphafold2_multimer_v3 --rank multimer fasta/proteins.fa output
+}}}
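
For reference, once the job above has been submitted with sbatch it can be followed with the standard SLURM tools; a minimal sketch, assuming the AFbatch.out log name set by the --output directive above and that you are in the directory from which the job was submitted:

{{{
# Follow the submitted ColabFold job (standard SLURM commands).
squeue -u $USER       # is the job still pending or running?
tail -f AFbatch.out   # stream the job log named by --output above
}}}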

+In the commands above, you will need to substitute the path to your working directory along with paths to your fasta file and output directory.  In the example above, the fasta file (i.e. proteins.fa) is within a subdirectory, called "fasta", of the working directory.  Likewise, the output will be written in a subdirectory, called "output", of the working directory.  When using ColabFold, be sure to separate the amino acid sequences for individual proteins with a colon, as in this example:
+
+{{{
+>proteins
+RMKQLEDKVEELLSKNYHLENEVARLKKLVGER:
+RMKQLEDKVEELLSKNYHLENEVARLKKLVGER
+}}}
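
For reference, a colon-separated multimer FASTA like the one above can be assembled from individual per-chain FASTA files; a minimal sketch, assuming hypothetical single-sequence inputs chainA.fa and chainB.fa and the "fasta" subdirectory described above:

{{{
# Hypothetical helper: join two single-sequence FASTA files into the
# colon-separated record that ColabFold expects for a multimeric complex.
seqA=$(grep -v '^>' chainA.fa | tr -d '\n')   # drop the header, unwrap the sequence
seqB=$(grep -v '^>' chainB.fa | tr -d '\n')
printf '>proteins\n%s:\n%s\n' "$seqA" "$seqB" > fasta/proteins.fa
}}}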
+
+The following instructions allow you to run AlphaFold-Multimer locally without using ColabFold:
+
+{{{
+sbatch --export=ALL,FASTA_NAME=example.fa,USERNAME='user',FASTA_PATH=/path/to/fasta/file,AF2_WORK_DIR=/path/to/working/directory ./RunAlphaFold_multimer_2.3.2_slurm.sh
+}}}
+
+In the command above, substitute your own user id, fasta file and the paths to both the fasta file and the working directory.  In this example, the job (i.e. RunAlphaFold_multimer_2.3.2_slurm.sh) that is submitted to the SLURM scheduler might look like:
+
+{{{
#!/bin/bash

-#SBATCH --job-name=AF2                  # friendly name for job.
+#SBATCH --job-name=AF2M                 # friendly name for job.
#SBATCH --nodes=1                       # ensure cores are on one node
#SBATCH --ntasks=1                      # run a single task
...

cd $AF2_WORK_DIR
-singularity run -B $AF2_WORK_DIR:/af2 -B $ALPHAFOLD_DATA_PATH:/data -B .:/etc --pwd /app/alphafold --nv /alphafold/alphafold_2.3.2.sif --data_dir=/data/ --output_dir=/af2/$FASTA_PATH --fasta_paths=/af2/$FASTA_PATH/$FASTA_NAME --max_template_date=2050-01-01 --db_preset=full_dbs --bfd_database_path=/data/bfd/bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt --uniref30_database_path=/data/uniref30/UniRef30_2023_02 --uniref90_database_path=/data/uniref90/uniref90.fasta --mgnify_database_path=/data/mgnify/mgy_clusters_2022_05.fa --template_mmcif_dir=/data/pdb_mmcif/mmcif_files --obsolete_pdbs_path=/data/pdb_mmcif/obsolete.dat --use_gpu_relax=True --model_preset=monomer --pdb70_database_path=/data/pdb70/pdb70
+singularity run -B $AF2_WORK_DIR:/af2 -B $ALPHAFOLD_DATA_PATH:/data -B .:/etc --pwd /app/alphafold --nv /alphafold/alphafold_2.3.2.sif --data_dir=/data/ --output_dir=/af2/$FASTA_PATH --fasta_paths=/af2/$FASTA_PATH/$FASTA_NAME --max_template_date=2050-01-01 --db_preset=full_dbs --bfd_database_path=/data/bfd/bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt --uniref30_database_path=/data/uniref30/UniRef30_2023_02 --uniref90_database_path=/data/uniref90/uniref90.fasta --mgnify_database_path=/data/mgnify/mgy_clusters_2022_05.fa --template_mmcif_dir=/data/pdb_mmcif/mmcif_files --obsolete_pdbs_path=/data/pdb_mmcif/obsolete.dat --use_gpu_relax=True --model_preset=multimer --pdb_seqres_database_path=/data/pdb_seqres/pdb_seqres.txt --uniprot_database_path=/data/uniprot/uniprot.fasta --num_multimer_predictions_per_model=1

# Email the STDOUT output file to specified address.
...
}}}

-The following instructions allow you to run AlphaFold-Multimer locally without using ColabFold:
+Unlike when using ColabFold, when running AlphaFold as above, the input fasta file "example.fa" should be a list of fasta entries, one per amino acid sequence within the multimeric complex.  For example:

{{{
-sbatch --export=ALL,FASTA_NAME=example.fa,USERNAME='user',FASTA_PATH=/path/to/fasta/file,AF2_WORK_DIR=/path/to/working/directory ./RunAlphaFold_2.3.2_slurm.sh
+>proteinA
+RMKQLEDKVEELLSKNYHLENEVARLKKLVGER
+
+>proteinB
+RMKQLEDKVEELLSKNYHLENEVARLKKLVGER
}}}
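
For reference, when a local AlphaFold-Multimer run like the one above finishes, the predictions are written under the directory passed as --output_dir, in a subdirectory named after the FASTA file; a minimal sketch of where to look, assuming AlphaFold's default output layout and the example.fa input used above:

{{{
# Inspect a finished run from inside the output directory (default AlphaFold layout assumed).
ls  example/ranked_0.pdb         # top-ranked model of the complex
cat example/ranking_debug.json   # per-model confidence scores used for ranking
}}}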
-
-In the command above, substitute your own user id, fasta file and the paths to both the fasta file and the working directory.  In this example, the job that is submitted to the SLURM scheduler might look like:
-
-{{{
-#!/bin/bash
-
-#SBATCH --job-name=AF2                  # friendly name for job.
-#SBATCH --nodes=1                       # ensure cores are on one node
-#SBATCH --ntasks=1                      # run a single task
-#SBATCH --cpus-per-task=8               # number of cores/threads requested.
-#SBATCH --mem=64gb                      # memory requested.
-#SBATCH --partition=nvidia-t4-20        # partition (queue) to use
-#SBATCH --output output-%j.out          # %j inserts jobid to STDOUT
-#SBATCH --gres=gpu:1                    # Required for GPU access
-
-export TF_FORCE_UNIFIED_MEMORY=1
-export XLA_PYTHON_CLIENT_MEM_FRACTION=4
-
-export OUTPUT_NAME='model_1'
-export ALPHAFOLD_DATA_PATH='/alphafold/data.2023b' # Specify ALPHAFOLD_DATA_PATH
-
-cd $AF2_WORK_DIR
-singularity run -B $AF2_WORK_DIR:/af2 -B $ALPHAFOLD_DATA_PATH:/data -B .:/etc --pwd /app/alphafold --nv /alphafold/alphafold_2.3.2.sif --data_dir=/data/ --output_dir=/af2/$FASTA_PATH --fasta_paths=/af2/$FASTA_PATH/$FASTA_NAME --max_template_date=2050-01-01 --db_preset=full_dbs --bfd_database_path=/data/bfd/bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt --uniref30_database_path=/data/uniref30/UniRef30_2023_02 --uniref90_database_path=/data/uniref90/uniref90.fasta --mgnify_database_path=/data/mgnify/mgy_clusters_2022_05.fa --template_mmcif_dir=/data/pdb_mmcif/mmcif_files --obsolete_pdbs_path=/data/pdb_mmcif/obsolete.dat --use_gpu_relax=True --model_preset=monomer --pdb70_database_path=/data/pdb70/pdb70
-
-# Email the STDOUT output file to specified address.
-/usr/bin/mail -s "$SLURM_JOB_NAME $SLURM_JOB_ID" $USERNAME@wi.mit.edu < $AF2_WORK_DIR/output-${SLURM_JOB_ID}.out
-}}}