Throughput and robustness of bioinformatics pipelines for genome-scale data analysis

Author(s)
Sztromwasser, Paweł

URI
http://hdl.handle.net/20.500.12424/2581062
Online Access
http://hdl.handle.net/1956/7906
Abstract
<p>The post-genomic era has been heavily influenced by the rapid development of high-throughput
 molecular-screening technologies, which has enabled genome-wide analysis
 approaches on an unprecedented scale. The constantly decreasing cost of producing
 experimental data resulted in a data deluge, which has led to technical challenges
 in distributed bioinformatics infrastructure and computational biology methods. At the
 same time, the advances in deep-sequencing allowed intensified interrogation of human
 genomes, leading to prominent discoveries linking our genetic makeup with numerous
 medical conditions. The fast and cost-effective sequencing technology is expected to
 soon become instrumental in personalized medicine. The transition of the methodology
 related to genome sequencing and high-throughput data analysis from the research
 domain to a clinical service is challenging in many aspects. One of them is providing
 medical personnel with accessible, robust, and accurate methods for analysis of
 sequencing data.</p><p>The computational protocols used for analysis of the sequencing data are complex,
 parameterized, and in continuous development, making results of data analysis sensitive
 to factors such as the software used and the parameter values selected. However,
 the influence of parameters on results of computational pipelines has not been systematically
 studied. To fill this gap, we investigated the robustness of a genetic variant
 discovery pipeline against changes of its parameter settings. Using two sensitivity
 screening methods, we evaluated parameter influence on the identified genetic variants,
 and found that the parameters have irregular effects and are inter-dependent. Only a
 fraction of parameters were identified to have considerable impact on the results, suggesting
 that screening parameter sensitivity can lead to simpler pipeline configuration.
 Our results showed that, although a simple metric can be used to examine parameter
 influence, more informative results are obtained using a criterion related to the accuracy
 of pipeline results. Using the results of sensitivity screening, we have shown that
 the influential pipeline parameters can be adjusted to effectively increase the accuracy
 of variant discovery. Such information is invaluable for researchers tuning pipeline parameters,
 and can guide the search for optimal settings for computational pipelines in
 a clinical setting. Contrasting the two applied screening methods, we learned more
 about specific requirements of robustness analysis of computational methods, and were
 able to suggest a more tailored strategy for parameter screening. Our contributions
 demonstrate the importance and the benefits of systematic robustness analysis of bioinformatics
 pipelines, and indicate that more efforts are needed to advance research in
 this area.</p><p>Web services are commonly used to provide interoperable, programmatic access to bioinformatics resources, and consequently, they are natural building blocks of bioinformatics
 analysis workflows. However, in the light of the data deluge, their usability
 for data-intensive applications has been questioned. We investigated applicability of
 standard Web services to high-throughput pipelines, and showed how throughput and
 performance of such pipelines can be improved. By developing two complementary approaches
 that take advantage of established and proven optimization mechanisms, we
 were able to enhance Web service communication in a non-intrusive manner. The first
 strategy increases throughput of Web service interfaces by a stream-like invocation pattern.
 This additionally allows for data-pipelining between consecutive steps of a workflow.
 The second approach facilitates peer-to-peer data transfer between Web services
 to increase the capacity of the workflow engine. We evaluated the impact of the enhancements
 on genome-scale pipelines, and showed that high-throughput data analysis
 using standard Web service pipelines is possible when the technology is used sensibly.
 However, considering the contemporary data volumes and their expected growth,
 methods capable of handling even larger data should be sought.</p><p>Systematic analysis of pipeline robustness requires intensive computations, which are
 particularly demanding for high-throughput pipelines. Providing more efficient methods
 of pipeline execution is fundamental for enabling such examinations on a large scale.
 Furthermore, the standardized interfaces of Web services facilitate automated
 executions, and are perfectly suited for coordinating large computational experiments.
 I speculate that, given wide adoption of Web service technology in bioinformatics
 pipelines, large-scale quality control studies, such as robustness analysis, could be
 automated and performed routinely on newly published computational methods. This
 work contributes to realizing such a vision, providing a technical basis for building
 the necessary infrastructure and suggesting methodology for robustness analysis.</p>
Date
2014-04-11
Type
Doctoral thesis
Identifier
oai:bora.uib.no:1956/7906
978-82-308-2967-7
http://hdl.handle.net/1956/7906
Copyright/License
Copyright the author. All rights reserved
Collections
OAI Harvested Content
