Friday, June 3, 2016

bcl2fastq v2.15 on Ubuntu 14.04

Demultiplexing 

Demultiplexing consists of reorganizing the FASTQ files and generating the statistics and reporting files.

Reorganizing FASTQ Files

The first step of demultiplexing is reorganizing the base call files based on the index sequence. This step is done the following way for each cluster:
1. Get the raw index for each Index Read from the BCL file.
2. Identify the appropriate sample for the index based on the sample sheet.
3. Optional: Detect and correct up to two errors on the barcode, and identify the appropriate sample. If there are multiple Index Reads, detect and correct up to two errors in each Index Read.
4. Optional: Detect the presence of adapter sequence at the end of the read. If adapter sequence is detected, trim or mask (with N) the corresponding base calls.
5. Append the read to the appropriate new FASTQ file for each read.
6. If the index cannot be identified, the data are written into an Undetermined sample file, unless the sample sheet specifies a sample for reads without an index.
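Steps 2, 3 and 6 can be illustrated with a small shell sketch. This is not how bcl2fastq is implemented; it only shows the idea of matching a raw index against sample sheet barcodes with a mismatch tolerance and falling back to Undetermined. The sample names, barcodes, and the functions `hamming` and `assign_sample` are hypothetical, and treating an ambiguous match as Undetermined is an assumption of this sketch.

```shell
#!/bin/sh
# Hypothetical sample sheet entries, one "name:index" pair per word.
SAMPLE_SHEET="sampleA:ACGTAC sampleB:TGCATG"

# Print the number of positions at which two barcodes differ.
# Barcodes of different length are reported as 999 (never a match).
hamming() {
    a=$1; b=$2; d=0
    [ "${#a}" -eq "${#b}" ] || { echo 999; return; }
    while [ -n "$a" ]; do
        ca=${a%"${a#?}"}   # first character of a
        cb=${b%"${b#?}"}   # first character of b
        [ "$ca" = "$cb" ] || d=$((d + 1))
        a=${a#?}; b=${b#?}
    done
    echo "$d"
}

# Print the sample whose index is within $2 mismatches of raw index $1.
# Prints "Undetermined" when no sample, or more than one sample, matches.
assign_sample() {
    raw=$1; max=$2; hit=""; hits=0
    for entry in $SAMPLE_SHEET; do
        name=${entry%%:*}; index=${entry#*:}
        if [ "$(hamming "$raw" "$index")" -le "$max" ]; then
            hit=$name; hits=$((hits + 1))
        fi
    done
    if [ "$hits" -eq 1 ]; then echo "$hit"; else echo "Undetermined"; fi
}

assign_sample ACGTAA 1   # one mismatch against sampleA's index
assign_sample ACGTAA 0   # no exact match in the sample sheet
```

With a tolerance of one mismatch the read is assigned to sampleA; with a tolerance of zero it goes to Undetermined.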

Compiling bcl2fastq v2.15 on Ubuntu 12.04 and 14.04

Wed 27 August 2014 — Filed under notes; tags: linux
Illumina provides a program for demultiplexing sequencing output called bcl2fastq. They get a gold star for releasing the source; the downside is that they release binaries only for RHEL/CentOS and provide no build instructions for Ubuntu. So how hard could it be?

Ubuntu 14.04 (Trusty Tahr)

I thought I'd start here since the packages are more up to date (it turns out that was a good thing, given the morass on 12.04). Illumina provides some documentation for compiling from source, but there's not a lot to go on other than a list of dependencies, which boils down to:
  • zlib
  • librt
  • libpthread
  • gcc 4.1.2 (with c++)
  • boost 1.54 (with its dependencies)
  • cmake 2.8.9
Really the only tricky part was figuring out the required packages, which didn't correspond particularly well to the list of dependencies above. I didn't bother trying to install specific versions of any of the dependencies other than boost 1.54.
On an Amazon AWS EC2 instance (m3.medium, ubuntu-trusty-14.04-amd64-server-20140607.1 ami-e7b8c0d7):
sudo apt-get update
sudo apt-get upgrade
sudo apt-get install zlibc
sudo apt-get install libc6 # provides librt and libpthread
sudo apt-get install gcc
sudo apt-get install g++
sudo apt-get install libboost1.54-all-dev
sudo apt-get install cmake
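Since no specific versions were pinned apart from boost, a quick check of the installed tools against the minimums in the dependency list may be reassuring. The following is only a sketch: `version_ge` is a hypothetical helper, and it relies on `sort -V` (version sort) being available, as it is in GNU coreutils.

```shell
#!/bin/sh
# Succeeds when version $1 is greater than or equal to version $2.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Compare the installed cmake (if any) against the listed minimum, 2.8.9.
if command -v cmake >/dev/null 2>&1; then
    cmake_version=$(cmake --version | head -n 1 | awk '{print $3}')
    if version_ge "$cmake_version" 2.8.9; then
        echo "cmake $cmake_version: OK"
    else
        echo "cmake $cmake_version: older than 2.8.9"
    fi
fi

# Compare the installed gcc (if any) against the listed minimum, 4.1.2.
if command -v gcc >/dev/null 2>&1; then
    gcc_version=$(gcc -dumpversion)
    if version_ge "$gcc_version" 4.1.2; then
        echo "gcc $gcc_version: OK"
    else
        echo "gcc $gcc_version: older than 4.1.2"
    fi
fi
```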
From there, compilation more or less works as advertised:
wget ftp://webdata2:webdata2@ussd-ftp.illumina.com/downloads/Software/bcl2fastq/bcl2fastq2-v2.15.0.4.tar.gz
tar -xf bcl2fastq2-v2.15.0.4.tar.gz
cd bcl2fastq
mkdir build
cd build
PREFIX=/usr/local
sudo mkdir -p ${PREFIX:?}
../src/configure --prefix=${PREFIX:?}
make
sudo make install
We wanted this version to coexist with an older one, so I renamed the executable:
sudo mv $PREFIX/bin/bcl2fastq $PREFIX/bin/bcl2fastq2
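After the rename it can be worth confirming which executables actually sit under the prefix, so the old and new versions are not confused. A minimal sketch follows; `has_binary` is a hypothetical helper, and the default of /usr/local simply mirrors the PREFIX used above.

```shell
#!/bin/sh
# Succeeds when an executable named $2 exists in the bin directory of prefix $1.
has_binary() {
    [ -x "$1/bin/$2" ]
}

PREFIX=${PREFIX:-/usr/local}
for name in bcl2fastq bcl2fastq2; do
    if has_binary "$PREFIX" "$name"; then
        echo "$name: found in $PREFIX/bin"
    else
        echo "$name: not found in $PREFIX/bin"
    fi
done
```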
Running the older, Perl-based bcl2fastq 1.8.4 on Ubuntu 14.04 fails with errors like this:
syntax error at /usr/local/lib/bcl2fastq-1.8.4/perl/Casava/Alignment/Config.pm line 761, near "}"
/usr/local/lib/bcl2fastq-1.8.4/perl/Casava/Alignment/Config.pm has too many errors.
Compilation failed in require at /usr/local/lib/bcl2fastq-1.8.4/perl/Casava/Alignment.pm line 61.
sudo apt-get install libexpat1-dev
sudo apt-get install xsltproc
The reason for the errors is that bcl2fastq 1.8.4 is not compatible with Perl 5.18, the default on Ubuntu 14.04. You need an older Perl to run its scripts. The following commands use perlbrew to install one (5.14.4 here) under path/perlbrew/:
cd path/perlbrew/
wget http://install.perlbrew.pl -O install_perlbrew.sh
export PERLBREW_ROOT=path/perlbrew/ && bash install_perlbrew.sh
source ./etc/bashrc
perlbrew install perl-5.14.4
perlbrew switch perl-5.14.4
perlbrew install-cpanm
cpanm XML/Simple.pm
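To confirm that the switch took effect, the active Perl version and the presence of XML::Simple can be checked from the shell. This is a sketch only: `perl_is_pre_518` is a hypothetical helper, and treating every release before 5.18 as acceptable is an assumption based on the advice above to use "an older perl version".

```shell
#!/bin/sh
# Succeeds when a Perl version string such as "5.14.4" is older than 5.18.
perl_is_pre_518() {
    major=${1%%.*}
    rest=${1#*.}
    minor=${rest%%.*}
    [ "$major" -lt 5 ] || { [ "$major" -eq 5 ] && [ "$minor" -lt 18 ]; }
}

if command -v perl >/dev/null 2>&1; then
    # $^V formatted with %vd gives a dotted version such as 5.14.4.
    current=$(perl -e 'printf "%vd", $^V')
    if perl_is_pre_518 "$current"; then
        echo "perl $current: older than 5.18"
    else
        echo "perl $current: 5.18 or newer, bcl2fastq 1.8.4 may fail"
    fi
    # Loading the module with -M fails if XML::Simple is not installed.
    if perl -MXML::Simple -e 1 2>/dev/null; then
        echo "XML::Simple: installed"
    else
        echo "XML::Simple: missing"
    fi
fi
```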
Source for the compilation notes: http://nhoffman.github.io/borborygmi/compiling-bcl2fastq-on-ubuntu.html
