Demultiplexing
Demultiplexing consists of reorganizing the FASTQ files and generating the statistics and reporting files.
Reorganizing FASTQ Files
The first step of demultiplexing is reorganizing the base call files, based on the index
sequence. This step is done the following way for each cluster:
1 Get the raw index for each Index Read from the BCL file.
2 Identify the appropriate sample for the index based on the sample sheet.
3 Optional: Detect and correct up to two errors on the barcode, and identify the
appropriate sample. If there are multiple Index Reads, detect and correct up to two
errors in each Index Read.
4 Optional: Detect the presence of adapter sequence at the end of read. If adapter
sequence is detected, trim or mask (with N) the corresponding base calls.
5 Append the read to the appropriate new FASTQ file for each read.
6 If the index cannot be identified, the data are written into an Undetermined sample
file, unless the sample sheet specifies a sample for reads without an index.
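Steps 2, 3, and 6 can be sketched in a few lines of shell. This is a toy illustration under stated assumptions, not how bcl2fastq is implemented: the helper names hamming and assign_sample, the NAME=BARCODE argument format, and the choice to report an ambiguous index as Undetermined are all hypothetical.

```shell
# Illustrative sketch only; hamming and assign_sample are hypothetical names.

# hamming A B: print the number of mismatching positions between two
# equal-length strings; fail when the lengths differ.
hamming() {
    local a="$1" b="$2" ra rb d=0
    [ "${#a}" -eq "${#b}" ] || return 1
    while [ -n "$a" ]; do
        ra=${a#?}                      # a without its first character
        rb=${b#?}
        # ${a%"$ra"} is the first character of a
        [ "${a%"$ra"}" = "${b%"$rb"}" ] || d=$((d + 1))
        a=$ra
        b=$rb
    done
    echo "$d"
}

# assign_sample RAW_INDEX MAX_MISMATCHES NAME=BARCODE...
# Print the one sample whose barcode is within MAX_MISMATCHES of RAW_INDEX.
# Print "Undetermined" when no barcode, or more than one, is that close
# (treating ambiguity this way is an assumption of this sketch).
assign_sample() {
    local raw="$1" max="$2" entry name barcode d hit="" hits=0
    shift 2
    for entry in "$@"; do
        name=${entry%%=*}
        barcode=${entry#*=}
        d=$(hamming "$raw" "$barcode") || continue
        if [ "$d" -le "$max" ]; then
            hit=$name
            hits=$((hits + 1))
        fi
    done
    if [ "$hits" -eq 1 ]; then
        echo "$hit"
    else
        echo "Undetermined"
    fi
}
```

For example, `assign_sample ACGTAA 1 sampleA=ACGTAC sampleB=TTGCAA` prints sampleA, because the raw index differs from the first barcode at one position and from the second at three.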
Compiling bcl2fastq v2.15 on Ubuntu 12.04 and 14.04
Wed 27 August 2014. Filed under notes; tags: linux
Illumina provides a program for demultiplexing sequencing output called bcl2fastq. They get a gold star for releasing the source - the downside is that they release binaries only for RHEL/CentOS, and no build instructions for Ubuntu. So how hard could it be?
Ubuntu 14.04 (Trusty Tahr)
I thought I'd start here since the packages are more up to date (turns out it's a good thing I did, see the morass below). There's some documentation from Illumina for compiling from source here. There's not a lot to go on, other than a list of dependencies, which boils down to:
- zlib
- librt
- libpthread
- gcc 4.1.2 (with c++)
- boost 1.54 (with its dependencies)
- cmake 2.8.9
Really the only tricky part was figuring out the required packages, which didn't correspond particularly well to the list of dependencies above. I didn't bother trying to install specific versions of any of the dependencies other than boost 1.54.
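To see whether the tools already on a machine meet the minimums in the dependency list, something like the following could be used. It is a rough sketch: version_ge and check_tool are hypothetical helper names, it relies on GNU sort -V, and it assumes the first dotted number printed by `--version` is the tool's version.

```shell
# version_ge INSTALLED REQUIRED: succeed when INSTALLED >= REQUIRED.
# Uses GNU sort -V (version sort): REQUIRED must sort first or be equal.
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# check_tool TOOL REQUIRED: report how the installed TOOL compares to REQUIRED.
# Assumption: the first dotted number in `TOOL --version` is its version.
check_tool() {
    local tool="$1" required="$2" installed
    if ! command -v "$tool" >/dev/null 2>&1; then
        echo "$tool: not installed"
        return 0
    fi
    installed=$("$tool" --version 2>/dev/null | grep -oE '[0-9]+(\.[0-9]+)+' | head -n 1) || true
    if [ -z "$installed" ]; then
        echo "$tool: version not detected"
    elif version_ge "$installed" "$required"; then
        echo "$tool $installed: meets minimum $required"
    else
        echo "$tool $installed: older than minimum $required"
    fi
}

# Minimums taken from the dependency list above.
check_tool gcc 4.1.2
check_tool cmake 2.8.9
```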
On an Amazon AWS EC2 instance (m3.medium, ubuntu-trusty-14.04-amd64-server-20140607.1 ami-e7b8c0d7):
sudo apt-get update
sudo apt-get upgrade
sudo apt-get install zlibc
sudo apt-get install libc6 # provides librt and libpthread
sudo apt-get install gcc
sudo apt-get install g++
sudo apt-get install libboost1.54-all-dev
sudo apt-get install cmake
From there, compilation more or less works as advertised:
wget ftp://webdata2:webdata2@ussd-ftp.illumina.com/downloads/Software/bcl2fastq/bcl2fastq2-v2.15.0.4.tar.gz
tar -xf bcl2fastq2-v2.15.0.4.tar.gz
cd bcl2fastq
mkdir build
cd build
PREFIX=/usr/local
sudo mkdir -p ${PREFIX:?}
../src/configure --prefix=${PREFIX:?}
make
sudo make install
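The `${PREFIX:?}` expansions in the build commands are a safety guard: the shell reports an error instead of expanding an unset or empty variable, so `mkdir` and `configure` never receive an empty path. A minimal sketch of that behaviour follows; require_prefix is a hypothetical helper name, and the check runs in a subshell so that a failure does not end the calling shell.

```shell
# require_prefix: succeed only when PREFIX is set and non-empty.
# ${PREFIX:?message} makes a non-interactive shell exit with an error when
# PREFIX is unset or empty; the subshell confines that exit to the check.
require_prefix() {
    ( : "${PREFIX:?PREFIX is unset or empty}" ) 2>/dev/null
}

PREFIX=/usr/local
if require_prefix; then
    echo "PREFIX is usable: $PREFIX"
fi

PREFIX=""
if ! require_prefix; then
    echo "empty PREFIX rejected"
fi
```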
We wanted this version to coexist with an older one, so I renamed the executable:
sudo mv $PREFIX/bin/bcl2fastq $PREFIX/bin/bcl2fastq2
syntax error at /usr/local/lib/bcl2fastq-1.8.4/perl/Casava/Alignment/Config.pm line 761, near "}"
/usr/local/lib/bcl2fastq-1.8.4/perl/Casava/Alignment/Config.pm has too many errors.
Compilation failed in require at /usr/local/lib/bcl2fastq-1.8.4/perl/Casava/Alignment.pm line 61.
sudo apt-get install libexpat1-dev
sudo apt-get install xsltproc
The reason for the errors is that bcl2fastq 1.8.4 is not compatible with Perl 5.18, the default on Ubuntu 14.04. You need to install an older Perl version to run the script. Use the following commands to install one, e.g. 5.14, under path/perlbrew/:
cd path/perlbrew/
wget http://install.perlbrew.pl -O install_perlbrew.sh
export PERLBREW_ROOT=path/perlbrew/ && bash install_perlbrew.sh
source ./etc/bashrc
perlbrew install perl-5.14.4
perlbrew switch perl-5.14.4
perlbrew install-cpanm
cpanm XML/Simple.pm
http://nhoffman.github.io/borborygmi/compiling-bcl2fastq-on-ubuntu.html