== Creating and using virtual environments ==

=== Conda environments ===

Start by downloading and installing conda somewhere with enough room to hold lots of applications (so not your home directory).

{{{
# Get the Miniforge installer
wget https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-Linux-x86_64.sh
bash Miniforge3-Linux-x86_64.sh
# When the installer prompts "Miniforge3 will now be installed into this location:",
# enter your preferred location, for example:
# /nfs/BaRC/USER/conda
}}}
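
Before installing, it may help to confirm that the target filesystem has room. The sketch below is one way to check; the 'free_gb' helper and the 20 GB threshold are illustrative assumptions, not site policy. The commented install line uses the installer's '-b' (batch) and '-p' (prefix) options to skip the interactive prompts.

```shell
# Hypothetical helper: print free space (whole GB) on the filesystem holding a directory
free_gb() {
    df -Pk "$1" | awk 'NR==2 { printf "%d\n", $4 / 1024 / 1024 }'
}

# Illustrative threshold (an assumption); adjust to your needs
MIN_GB=20
TARGET_PARENT=/nfs/BaRC/USER

if [ -d "$TARGET_PARENT" ] && [ "$(free_gb "$TARGET_PARENT")" -ge "$MIN_GB" ]; then
    echo "Enough room under $TARGET_PARENT"
    # Non-interactive install: -b = batch mode, -p = install prefix
    # bash Miniforge3-Linux-x86_64.sh -b -p "$TARGET_PARENT/conda"
else
    echo "Check $TARGET_PARENT before installing"
fi
```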

Create your desired environment

{{{
# Initialize conda in the current shell (pointing to where you installed conda)
eval "$(/nfs/BaRC/USER/conda/bin/conda shell.bash hook)"
# Create a new environment
# Without '--no-default-packages', conda also installs any packages listed under 'create_default_packages' in your .condarc
/nfs/BaRC/USER/conda/bin/conda create --name RNAseq_2024a --no-default-packages
}}}

Activate the environment

'''conda activate RNAseq_2024a'''

Add applications to your environment, specifying versions (if you want the install commands to be reproducible). These will be installed under your original conda location.
The newest version of some software can cause problems, either with existing data (such as STAR: "Genome version: 2.7.1a is INCOMPATIBLE with running STAR version: 2.7.11b") or with other conda packages.

{{{
# STAR is pinned to 2.7.1 because 2.7.11b cannot read a genome index built with 2.7.1a
conda install -c bioconda STAR=2.7.1
conda install -c bioconda multiqc=1.25.2
conda install -c bioconda fastqc=0.12.1
conda install -c bioconda subread=2.0.8
}}}
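
As an alternative to separate install commands, the same pins can be collected in one spec file so that conda resolves them together. This is only a sketch: the file name 'RNAseq_2024a.pinned.yml' is made up for illustration, the channel order is an assumption, and 'star' is assumed to be the lowercase bioconda package name for STAR.

```shell
# Write a spec file listing the pinned packages from this page
cat > RNAseq_2024a.pinned.yml <<'EOF'
name: RNAseq_2024a
channels:
  - conda-forge
  - bioconda
dependencies:
  - star=2.7.1
  - multiqc=1.25.2
  - fastqc=0.12.1
  - subread=2.0.8
EOF

# Show what was written
cat RNAseq_2024a.pinned.yml

# To build the environment from the file (requires conda):
# conda env create -f RNAseq_2024a.pinned.yml
```

Keeping the pins in a file also leaves a record of what was requested, separate from the full export of what was installed.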

Get a list of packages in your environment

'''conda list -n RNAseq_2024a'''

Leave the environment

'''conda deactivate'''

Go back to the environment

'''conda activate RNAseq_2024a'''

The name of your current environment should appear at the start of your command prompt.

'''(RNAseq_2024a) gbell@sparky ~$'''

Save the environment

'''conda env export > RNAseq_BaRC_20241219.environment.yml'''

Someone else should be able to create a new environment from this YAML file

'''conda env create -f RNAseq_BaRC_20241219.environment.yml'''
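
An exported YAML file ends with a 'prefix:' line that records the install path on the exporting machine, and it includes build strings that may not exist on another platform. A minimal sketch of making the file more portable follows; the 'strip_prefix' helper and the example file names are hypothetical, and the small YAML file is a stand-in, not a real export.

```shell
# Hypothetical helper: drop the machine-specific 'prefix:' line from an exported YAML
strip_prefix() {
    grep -v '^prefix:' "$1"
}

# Example with a small stand-in file (not a real export)
cat > example.environment.yml <<'EOF'
name: RNAseq_2024a
channels:
  - conda-forge
  - bioconda
dependencies:
  - fastqc=0.12.1
prefix: /nfs/BaRC/USER/conda/envs/RNAseq_2024a
EOF

strip_prefix example.environment.yml > example.portable.yml
cat example.portable.yml

# With a real environment, '--no-builds' omits build strings from the export:
# conda env export -n RNAseq_2024a --no-builds | grep -v '^prefix:' > RNAseq_2024a.portable.yml
```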

Remove a problematic piece of software from the environment

'''conda remove STAR'''

If you no longer want the environment

'''conda remove -n ENV_NAME --all'''

If we want to use Slurm, we need to add the path to the Slurm commands.
Is there a better way to do this?

'''export PATH=$PATH:/opt/slurm/bin'''
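
One possible answer to the question above: conda sources any '*.sh' scripts under '$CONDA_PREFIX/etc/conda/activate.d' each time an environment is activated, so the PATH change can live inside the environment instead of being typed by hand. The sketch below assumes that approach suits your setup; the 'add_slurm_hook' helper and the 'slurm_path.sh' file name are hypothetical.

```shell
# Hypothetical helper: install an activation hook into a conda environment prefix
add_slurm_hook() {
    local env_prefix="$1"
    mkdir -p "$env_prefix/etc/conda/activate.d"
    cat > "$env_prefix/etc/conda/activate.d/slurm_path.sh" <<'EOF'
# Append the Slurm commands to PATH once, if not already present
case ":$PATH:" in
    *:/opt/slurm/bin:*) ;;
    *) export PATH="$PATH:/opt/slurm/bin" ;;
esac
EOF
}

# Usage, from inside the activated environment (CONDA_PREFIX points at it):
# add_slurm_hook "$CONDA_PREFIX"
```

The hook only appends to PATH; 'conda deactivate' will not remove the entry unless a matching script is added under 'deactivate.d'.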

To test the environment, run the RNA-seq Hot Topics exercises; they should work.

See also the Whitehead IT '''conda''' page: https://clusterguide.wi.mit.edu/software/conda/

=== Singularity environments ===

[[https://docs.sylabs.io/guides/3.5/user-guide/introduction.html | Singularity containers]] package up pieces of software in a way that is portable and reproducible. Some software is now distributed this way, so that "installation" is simply downloading a SIF file.

One example is [[https://github.com/NBISweden/AGAT | AGAT]] (Another Gtf/Gff Analysis Toolkit), which provides [[https://github.com/NBISweden/AGAT?tab=readme-ov-file#using-singularity | instructions]] on how to download and run the AGAT container that includes a series of applications:

{{{
# Download
singularity pull docker://quay.io/biocontainers/agat:1.0.0--pl5321hdfd78af_0
# Run
singularity run agat_1.0.0--pl5321hdfd78af_0.sif
# When finished
exit
}}}

Then one can run commands such as 'agat_convert_sp_gff2gtf.pl'. The trouble is that the container doesn't include our usual filesystem, making it not very useful. The 'singularity' command needs to be modified to bind the required folder(s) as well, as in the following one-line command:

{{{
singularity run -B /lab/BaRC_projects:/lab/BaRC_projects --cleanenv --pwd /lab/BaRC_projects /nfs/BaRC_Public/apps/AGAT/agat_1.0.0--pl5321hdfd78af_0.sif
# Go where we want
cd /lab/BaRC_projects
# Check that the environment includes our desired files/folders
ls
}}}
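
For scripts or batch jobs, an interactive session is awkward. A sketch of a wrapper that runs a single command inside the container with 'singularity exec' and the same bind options follows. The 'agat_exec' function and the 'DRY_RUN' switch are hypothetical names introduced here; the paths are the ones used above.

```shell
# Hypothetical wrapper: run one command inside the AGAT container with our project folder bound
SIF=/nfs/BaRC_Public/apps/AGAT/agat_1.0.0--pl5321hdfd78af_0.sif
PROJECTS=/lab/BaRC_projects

agat_exec() {
    local cmd=(singularity exec -B "$PROJECTS:$PROJECTS" --cleanenv --pwd "$PROJECTS" "$SIF" "$@")
    if [ "${DRY_RUN:-0}" = "1" ]; then
        # Print the command instead of running it
        echo "${cmd[*]}"
    else
        "${cmd[@]}"
    fi
}

# Preview the full command without needing singularity on this machine
DRY_RUN=1 agat_exec agat_convert_sp_gff2gtf.pl --help
```

Without 'DRY_RUN=1' the same call runs the command in the container and returns to the host shell when it finishes, so no 'exit' is needed.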

One problem is that this is an older version of AGAT (v1.0.0). Another problem is that some of the commands require samtools, which is not present in the container. What can we do about this?

One way to build a customized singularity container is to use the [[https://seqera.io/containers/ | Seqera Container Builder]]. Search for and add AGAT (not the first hit) and samtools, then specify that you want a singularity container and click on "Get Container". When it's ready, run 'singularity pull' on the oras link, like

'''singularity pull oras://community.wave.seqera.io/library/agat_samtools:d30ed34317069fe6'''

We end up downloading a file like 'agat_samtools_d30ed34317069fe6.sif'. Then we can run the 'singularity run' command as above and use both AGAT and samtools.
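
When scripting a pull, it can be handy to know the SIF file name in advance. The sketch below infers the naming pattern from the two examples on this page (the last path component, with the ':' before the tag replaced by '_'); this is an assumption and may not hold for every URI. The 'sif_name' helper is hypothetical.

```shell
# Hypothetical helper: predict the SIF file name that 'singularity pull' writes for a URI
sif_name() {
    local ref="${1##*/}"      # keep the part after the last '/', e.g. agat_samtools:d30ed...
    echo "${ref/:/_}.sif"     # the ':' before the tag becomes '_'
}

sif_name oras://community.wave.seqera.io/library/agat_samtools:d30ed34317069fe6
sif_name docker://quay.io/biocontainers/agat:1.0.0--pl5321hdfd78af_0

# After pulling, one way to confirm samtools is present in the new container:
# singularity exec agat_samtools_d30ed34317069fe6.sif samtools --version
```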

By the way, '--cleanenv' in the 'singularity run' command keeps the container from inheriting the environment of your host shell, such as variables set in your .bashrc file. If you want to include those settings, remove '--cleanenv'.

See also the Whitehead IT '''singularity''' page: https://clusterguide.wi.mit.edu/software/singularity/