Computes a number of metrics that are useful for evaluating coverage and performance of whole genome sequencing
experiments. It shares its implementation with
CollectWgsMetrics but uses different defaults: its baseQ and mappingQ filters are minimal
and its coverage cap is much higher.
This tool computes metrics that are useful for evaluating coverage and performance
of whole genome sequencing experiments. These metrics include the percentages of reads that pass
minimal base- and mapping-quality filters, as well as coverage (read-depth) levels.
The histogram output is optional. For a given run, it reports two separate series (y-axes) against a single set
of x-axis values. The first column in the histogram table (x-axis) is labeled 'coverage' and
represents the possible coverage depths. However, the same column also serves as the range of base quality scores,
so a more accurate label would be 'sequence depth and base quality scores'. The second and third columns (y-axes)
are 'count', the number of bases at a given sequence depth, and 'baseq_count', the number of bases with a given base
quality score.
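The shared x-axis can be confusing when reading the table programmatically. The following minimal Python sketch splits the table into two separate histograms. It assumes a tab-separated section introduced by a line starting with '## HISTOGRAM' and the column names described above; the sample text and its values are made up for illustration and are not real tool output.

```python
# Illustrative sample only: layout and values are assumptions, not real output.
SAMPLE = (
    "## HISTOGRAM\tjava.lang.Integer\n"
    "coverage\tcount\tbaseq_count\n"
    "0\t100\t0\n"
    "1\t250\t0\n"
    "2\t400\t30\n"
    "3\t50\t770\n"
)


def read_histogram(text):
    """Return (depth_hist, baseq_hist), each keyed by the shared x-axis value."""
    depth_hist, baseq_hist = {}, {}
    lines = iter(text.splitlines())
    # Skip ahead to the histogram section marker.
    for line in lines:
        if line.startswith("## HISTOGRAM"):
            break
    else:
        return depth_hist, baseq_hist  # no histogram section found
    header = next(lines, "").split("\t")
    for line in lines:
        if not line.strip():
            break  # a blank line ends the section
        row = dict(zip(header, line.split("\t")))
        x = int(row["coverage"])
        # 'count': number of bases covered at depth x.
        depth_hist[x] = int(row["count"])
        # 'baseq_count': number of bases with base quality x (optional column).
        if "baseq_count" in row:
            baseq_hist[x] = int(row["baseq_count"])
    return depth_hist, baseq_hist


depth_hist, baseq_hist = read_histogram(SAMPLE)
```

Here depth_hist[2] is the number of bases covered at depth 2, while baseq_hist[2] is the number of bases with base quality 2, even though both come from the same row.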
Although similar to the
CollectWgsMetrics tool, CollectRawWgsMetrics uses less stringent default thresholds.
For example, CollectRawWgsMetrics sets its base and mapping quality score thresholds to '3' and '0' respectively,
while the
CollectWgsMetrics tool sets both default thresholds to '20' (at time of writing). Nevertheless, both
tools let the user supply specific threshold values.
Note: Metrics labeled as percentages are actually expressed as fractions!
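As a small illustration of this note, the Python sketch below converts fraction-valued metrics to true percentages. The metric names are used as examples of percentage-labeled fields and the values are hypothetical.

```python
def as_percent(fraction, digits=2):
    """Convert a fraction in [0, 1] to a percentage rounded to `digits` places."""
    return round(fraction * 100.0, digits)


# Hypothetical values, chosen only to show the conversion.
metrics = {"PCT_EXC_MAPQ": 0.0123, "PCT_10X": 0.95}
percent = {name: as_percent(value) for name, value in metrics.items()}
```

A reported value of 0.95 therefore means 95 percent, not 0.95 percent.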
Usage example:
java -jar picard.jar CollectRawWgsMetrics \
I=input.bam \
O=output_raw_wgs_metrics.txt \
R=reference.fasta \
INCLUDE_BQ_HISTOGRAM=true
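Because both tools accept user-specified thresholds, a wrapper script may want to build the command line with explicit values. The Python sketch below only assembles the argument list and does not run anything. The helper name build_command is hypothetical, and the argument names MINIMUM_MAPPING_QUALITY and MINIMUM_BASE_QUALITY are assumptions that should be checked against the tool's argument list for the Picard version in use.

```python
import shlex


def build_command(input_bam, output_txt, reference,
                  min_mapq=None, min_baseq=None, jar="picard.jar"):
    """Assemble a CollectRawWgsMetrics command line as a list of arguments."""
    cmd = [
        "java", "-jar", jar, "CollectRawWgsMetrics",
        f"I={input_bam}",
        f"O={output_txt}",
        f"R={reference}",
    ]
    # Only override a threshold when the caller supplies one;
    # otherwise the tool's own defaults apply.
    if min_mapq is not None:
        cmd.append(f"MINIMUM_MAPPING_QUALITY={min_mapq}")  # assumed argument name
    if min_baseq is not None:
        cmd.append(f"MINIMUM_BASE_QUALITY={min_baseq}")  # assumed argument name
    return cmd


cmd = build_command("input.bam", "output_raw_wgs_metrics.txt", "reference.fasta",
                    min_mapq=20, min_baseq=20)
print(shlex.join(cmd))
```

Returning a list rather than a single string keeps the arguments safe to pass to subprocess.run without shell quoting issues.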
Please see the WgsMetrics documentation for detailed explanations of the output metrics.