A tool for breaking up a reference into intervals of alternating regions of N and ACGT bases.
Summary
Creates an interval list that can be used to scatter a variant-calling pipeline without causing problems at the
edges of the intervals. Provided the N blocks are large enough that the tools cannot anchor on both sides,
scattering and gathering the variants with the resulting interval list gives the same results as calling over one
large region.
Input
A reference file to use for creating the intervals.
The type of intervals to emit in the output (Ns only, ACGT only, or both).
An integer giving the largest number of Ns in a contiguous block that will be "tolerated" and not
converted into an N block.
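The role of the tolerance value can be illustrated with a minimal Python sketch. This is not Picard's implementation: the function name split_by_ns and the parameter name max_to_merge are hypothetical, and the sketch assumes that a run of Ns no longer than the tolerance is simply absorbed into the surrounding ACGT block, including at the ends of the sequence.

```python
from itertools import groupby


def split_by_ns(sequence, max_to_merge=1):
    """Split a sequence into alternating N and ACGT blocks.

    Returns a list of (start, end, kind) tuples with 1-based inclusive
    coordinates, where kind is "N" or "ACGT". Hypothetical helper, not
    part of Picard.
    """
    # First pass: find maximal runs of N bases versus non-N bases.
    runs = []
    pos = 1
    for is_n, group in groupby(sequence.upper(), key=lambda base: base == "N"):
        length = sum(1 for _ in group)
        runs.append((pos, pos + length - 1, is_n))
        pos += length

    # Second pass: an N run no longer than max_to_merge is "tolerated",
    # so it is relabeled ACGT and merged with its neighbours (assumption).
    blocks = []
    for start, end, is_n in runs:
        too_long = (end - start + 1) > max_to_merge
        kind = "N" if is_n and too_long else "ACGT"
        if blocks and blocks[-1][2] == kind:
            blocks[-1] = (blocks[-1][0], end, kind)
        else:
            blocks.append((start, end, kind))
    return blocks


# The 4-base N run becomes its own block; the single N is tolerated.
print(split_by_ns("ACGTNNNNACNGT", max_to_merge=1))
```

With a tolerance of 1, the four-base N run is reported as an N block while the isolated N stays inside the second ACGT block. Raising the tolerance to 4 would merge the whole sequence into a single ACGT block.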
Output
An interval list (with a SAM header) where the names of the intervals are labeled (either N-block or ACGT-block) to
indicate what type of block they define.
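As a rough illustration of the output layout, the Python sketch below formats blocks as interval list records. It assumes the usual interval list conventions: a SAM-style header followed by tab-separated fields for sequence name, 1-based inclusive start and end, strand, and interval name. The function names, the output_type argument, and the exact interval names written here ("N-block", "ACGT-block") are illustrative assumptions taken from the description above; the names the tool actually writes may differ.

```python
def header_lines(contig, length):
    """Minimal SAM-style header for one contig (illustrative only)."""
    return ["@HD\tVN:1.6", f"@SQ\tSN:{contig}\tLN:{length}"]


def interval_lines(contig, blocks, output_type="BOTH"):
    """Format (start, end, kind) blocks as interval list records.

    output_type is "N", "ACGT", or "BOTH" and selects which kinds of
    block are emitted. Hypothetical helper, not part of Picard.
    """
    wanted = {"N": {"N"}, "ACGT": {"ACGT"}, "BOTH": {"N", "ACGT"}}[output_type]
    return [
        # Fields: sequence name, start, end, strand, interval name.
        f"{contig}\t{start}\t{end}\t+\t{kind}-block"
        for start, end, kind in blocks
        if kind in wanted
    ]


# Example blocks for a 13-base contig; coordinates are 1-based inclusive.
blocks = [(1, 4, "ACGT"), (5, 8, "N"), (9, 13, "ACGT")]
for line in header_lines("chr1", 13) + interval_lines("chr1", blocks, "ACGT"):
    print(line)
```

Selecting only the ACGT blocks, as in this call, yields records that can be scattered across jobs without any job having to span a long N run.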
Usage example
Create an interval list of intervals that do not contain any N blocks, for use with HaplotypeCaller on short reads:
java -jar picard.jar ScatterIntervalsByNs \
R=reference_sequence.fasta \
OT=ACGT \
O=output.interval_list