Author: Maciejewski H.

ISBN: 978-83-7493-794-8

Number of pages: 148

Predictive modelling in high-dimensional data: prior domain knowledge-based approaches

27,00 

Availability: In stock

This book is devoted to the problem of predictive modelling based on high-dimensional data, focusing mainly on cases where the number of training samples is substantially smaller than the number of features. Analysis of such data is becoming increasingly important in many areas of science and technology, including bioinformatics, image analysis, and text mining.

The major challenge in the analysis of such data is the selection of stable, relevant features for class prediction. As a remedy, we develop methods that incorporate a priori domain knowledge about relationships among features. This approach can stabilize feature selection and improve classification.

We provide a comprehensive overview of data-driven methods of feature selection, including univariate, multivariate and shrinkage-based methods. A theoretical analysis shows that, for a small number of samples, these methods are virtually unable to select stable, relevant features. Next, we provide a comprehensive theoretical analysis of the feature set analysis methods developed in bioinformatics, which are employed in prior domain knowledge-based feature selection. Algorithms for sample classification based on the activation of feature sets are proposed. This approach is evaluated in comparison with data-driven methods in a comprehensive numerical study focusing on low signal-to-noise data with correlated features.

The results presented in this book may be of interest to (i) researchers in machine learning or statistics developing methods of high-dimensional data analysis, (ii) analysts who tackle feature selection or classification problems based on real-life high-throughput data, or (iii) bioinformaticians interested in the development or application of feature set (gene set) analysis methods for signalling pathway or gene ontology analysis.

Weight: 0,280 kg

Category: Books

Author: Maciejewski H.

Year of publication: 2013

Number of pages: 148

ISBN: 978-83-7493-794-8

Section: Computer science

Format: 170 × 240 mm