Production of K0S, Λ (Λ̄), Ξ± and Ω± in jets and in the underlying event in pp and p–Pb collisions
(2022)
The production of strange hadrons (K0S, Λ, Ξ±, and Ω±), baryon-to-meson ratios (Λ/K0S, Ξ/K0S, and Ω/K0S), and baryon-to-baryon ratios (Ξ/Λ, Ω/Λ, and Ω/Ξ) associated with jets and the underlying event were measured as a function of transverse momentum (pT) in pp collisions at √s = 13 TeV and p–Pb collisions at √sNN = 5.02 TeV with the ALICE detector at the LHC. The inclusive production of the same particle species and the corresponding ratios are also reported. The production of multi-strange hadrons, Ξ± and Ω±, and their associated particle ratios in jets and in the underlying event are measured for the first time. In both pp and p–Pb collisions, the baryon-to-meson and baryon-to-baryon yield ratios measured in jets differ from the inclusive particle production for low and intermediate hadron pT (0.6–6 GeV/c). Ratios measured in the underlying event are in turn similar to those measured for inclusive particle production. In pp collisions, the particle production in jets is compared with PYTHIA 8 predictions with three colour-reconnection implementation modes. None of them fully reproduces the data in the measured hadron pT region. The maximum deviation is observed for Ξ± and Ω±, which reaches a factor of about six. In p–Pb collisions, there is no significant event-multiplicity dependence for particle production in jets, in contrast to what is observed in the underlying event. The presented measurements provide novel constraints on hadronisation and its Monte Carlo description. In particular, they demonstrate that the fragmentation of jets alone is insufficient to describe the strange and multi-strange particle production in hadronic collisions at LHC energies.
Background: Long sequencing reads allow increasing contiguity and completeness of fragmented, short-read–based genome assemblies by closing assembly gaps, ideally at high accuracy. While several gap-closing methods have been developed, these methods often close an assembly gap with sequence that does not accurately represent the true sequence.
Findings: Here, we present DENTIST, a sensitive, highly accurate, and automated pipeline method to close gaps in short-read assemblies with long error-prone reads. DENTIST comprehensively determines repetitive assembly regions to identify reliable and unambiguous alignments of long reads to the correct loci, integrates a consensus sequence computation step to obtain a high base accuracy for the inserted sequence, and validates the accuracy of closed gaps. Unlike previous benchmarks, we generated test assemblies that have gaps at the exact positions where real short-read assemblies have gaps. Generating such realistic benchmarks for Drosophila (134 Mb genome), Arabidopsis (119 Mb), hummingbird (1 Gb), and human (3 Gb) and using simulated or real PacBio continuous long reads, we show that DENTIST consistently achieves a substantially higher accuracy compared to previous methods, while having a similar sensitivity.
Conclusion: DENTIST provides an accurate approach to improve the contiguity and completeness of fragmented assemblies with long reads. DENTIST's source code including a Snakemake workflow, conda package, and Docker container is available at https://github.com/a-ludi/dentist. All test assemblies as a resource for future benchmarking are at https://bds.mpi-cbg.de/hillerlab/DENTIST/.
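As an illustration of the benchmarking idea described in the Findings (test assemblies whose gaps sit at the positions where real short-read assemblies have gaps), the following is a minimal sketch, not DENTIST's actual benchmark code. It assumes that gap positions are already known as half-open coordinate intervals on a contiguous reference sequence and that gaps are represented by runs of `N`. The function name `introduce_gaps`, the toy sequence, and the coordinates are hypothetical.

```python
def introduce_gaps(sequence, gap_intervals, gap_char="N"):
    """Return a copy of `sequence` with each half-open interval [start, end) masked.

    Hypothetical helper for illustration only: the intervals stand in for
    gap coordinates taken from a real short-read assembly.
    """
    masked = list(sequence)
    for start, end in gap_intervals:
        if not 0 <= start < end <= len(sequence):
            raise ValueError(f"invalid gap interval: ({start}, {end})")
        # Overwrite the interval with gap characters, keeping the length unchanged.
        masked[start:end] = gap_char * (end - start)
    return "".join(masked)


# Toy reference sequence and assumed gap coordinates (not real data).
reference = "ACGTACGTACGTACGT"
gaps = [(2, 5), (10, 12)]

test_assembly = introduce_gaps(reference, gaps)
print(test_assembly)  # ACNNNCGTACNNACGT
```

Because the masked sequence is known, a sequence inserted by a gap closer can be compared directly against the original bases at the same coordinates, which is what makes this kind of benchmark suitable for measuring accuracy.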