The success of [https://www.nature.com/articles/s41586-021-03819-2 DeepMind's AlphaFold protein folding algorithm] in the CASP14 structure prediction assessment has been widely celebrated and has profoundly invigorated the structural biology community. Today, if you have a protein sequence for which you'd like a high-quality predicted structure, an excellent place to start is the [https://alphafold.ebi.ac.uk/ AlphaFold Protein Structure Database]. An alternative is the [https://esmatlas.com/resources?action=fold ESM Metagenomic Atlas], where you may find predicted structures for orphan proteins with few sequence homologs.

=== Running AlphaFold using ChimeraX ===

If you cannot find a predicted structure for your protein in the databases listed above, perhaps because your sequence carries amino acid substitutions relative to the reference sequence, [https://www.cgl.ucsf.edu/chimerax/ ChimeraX] is an [https://www.youtube.com/watch?v=gIbCAcMDM7E easy place to start thanks to its graphical user interface] and convenient visualization tools.

=== Running AlphaFold locally ===

The freely available computational resources accessed through ChimeraX may be too limited to complete your AlphaFold predictions. In that case, you can run the predictions locally by submitting a job to the SLURM scheduler with a command like the following:

{{{
sbatch --export=ALL,FASTA_NAME=example.fa,USERNAME='user',FASTA_PATH=proteins,AF2_WORK_DIR=/path/to/working/directory ./RunAlphaFold_2.3.2_slurm.sh
}}}
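
Before submitting, it can help to confirm that the FASTA file sits where the job script will look for it. The job script shown on this page binds AF2_WORK_DIR to /af2 inside the container and reads /af2/$FASTA_PATH/$FASTA_NAME, so on the host the file should be at $AF2_WORK_DIR/$FASTA_PATH/$FASTA_NAME. The following is a minimal sketch of such a check; the function names af2_fasta_host_path and af2_check_fasta are hypothetical helpers introduced here for illustration, not part of AlphaFold or SLURM:

```shell
#!/bin/bash
# Pre-flight check (illustrative sketch; helper names are hypothetical).

# Print the host path that the job script maps to /af2/$FASTA_PATH/$FASTA_NAME.
af2_fasta_host_path() {
    local work_dir=$1 fasta_path=$2 fasta_name=$3
    # ${work_dir%/} drops a single trailing slash, if present.
    printf '%s/%s/%s\n' "${work_dir%/}" "$fasta_path" "$fasta_name"
}

# Return 0 if the file exists, is non-empty, and its first line is a FASTA header (">").
af2_check_fasta() {
    local fasta=$1
    [ -s "$fasta" ] || { echo "missing or empty: $fasta" >&2; return 1; }
    head -n 1 "$fasta" | grep -q '^>' || { echo "no FASTA header in: $fasta" >&2; return 1; }
}

# Example usage with the values from the sbatch command above:
#   fasta=$(af2_fasta_host_path /path/to/working/directory proteins example.fa)
#   af2_check_fasta "$fasta" && echo "ready to submit"
```

The check only confirms that the file is present and starts with a header line; it does not validate the sequence itself.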

The job script submitted to the SLURM scheduler by this sbatch command, RunAlphaFold_2.3.2_slurm.sh, might look like:

{{{
#!/bin/bash

#SBATCH --job-name=AF2              # friendly name for the job
#SBATCH --nodes=1                   # keep all cores on one node
#SBATCH --ntasks=1                  # run a single task
#SBATCH --cpus-per-task=8           # number of cores/threads requested
#SBATCH --mem=64gb                  # memory requested
#SBATCH --partition=nvidia-t4-20    # partition (queue) to use
#SBATCH --output output-%j.out      # STDOUT file; %j expands to the job ID
#SBATCH --gres=gpu:1                # required for GPU access

# Enable unified memory so JAX can allocate beyond the GPU's own memory.
export TF_FORCE_UNIFIED_MEMORY=1
export XLA_PYTHON_CLIENT_MEM_FRACTION=4

export OUTPUT_NAME='model_1'
export ALPHAFOLD_DATA_PATH='/alphafold/data.2023b'   # location of the AlphaFold databases

# FASTA_NAME, FASTA_PATH, USERNAME, and AF2_WORK_DIR are passed in via sbatch --export.
cd "$AF2_WORK_DIR" || exit 1
singularity run -B "$AF2_WORK_DIR":/af2 -B "$ALPHAFOLD_DATA_PATH":/data -B .:/etc \
    --pwd /app/alphafold --nv /alphafold/alphafold_2.3.2.sif \
    --data_dir=/data/ --output_dir=/af2/"$FASTA_PATH" \
    --fasta_paths=/af2/"$FASTA_PATH"/"$FASTA_NAME" \
    --max_template_date=2050-01-01 --db_preset=full_dbs \
    --bfd_database_path=/data/bfd/bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt \
    --uniref30_database_path=/data/uniref30/UniRef30_2023_02 \
    --uniref90_database_path=/data/uniref90/uniref90.fasta \
    --mgnify_database_path=/data/mgnify/mgy_clusters_2022_05.fa \
    --template_mmcif_dir=/data/pdb_mmcif/mmcif_files \
    --obsolete_pdbs_path=/data/pdb_mmcif/obsolete.dat \
    --use_gpu_relax=True --model_preset=monomer \
    --pdb70_database_path=/data/pdb70/pdb70

# Email the STDOUT output file to the specified address.
# The --output path is relative to the directory the job was submitted from,
# so this assumes sbatch was run from $AF2_WORK_DIR.
/usr/bin/mail -s "$SLURM_JOB_NAME $SLURM_JOB_ID" "$USERNAME@wi.mit.edu" < "$AF2_WORK_DIR/output-${SLURM_JOB_ID}.out"
}}}
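
Once the job finishes, AlphaFold writes its results to a subdirectory of the output directory named after the FASTA file without its extension, and the highest-ranked model is saved there as ranked_0.pdb. The following is a minimal sketch for locating that file, assuming the directory layout used in the job script above; af2_result_dir and af2_top_model are hypothetical helper names, not AlphaFold commands:

```shell
#!/bin/bash
# Locate AlphaFold results for one job (illustrative sketch; helper names are hypothetical).

# Print the host directory that holds the results for a given FASTA file.
# The per-target subdirectory is the FASTA file name with its last extension removed.
af2_result_dir() {
    local work_dir=$1 fasta_path=$2 fasta_name=$3
    printf '%s/%s/%s\n' "${work_dir%/}" "$fasta_path" "${fasta_name%.*}"
}

# Print the path of the top-ranked model and return 0 if it exists and is non-empty;
# otherwise print a hint to STDERR and return 1.
af2_top_model() {
    local top="$1/ranked_0.pdb"
    if [ -s "$top" ]; then
        printf '%s\n' "$top"
    else
        echo "no ranked_0.pdb in $1; check the output-<jobid>.out log" >&2
        return 1
    fi
}

# Example usage with the values from the sbatch command on this page:
#   dir=$(af2_result_dir /path/to/working/directory proteins example.fa)
#   af2_top_model "$dir"
```

With the example values, the results directory would be /path/to/working/directory/proteins/example.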