NORMA eResearch @NCI Library

Optimizing Feature Engineering through Induced Hybrid Wrappers, Automatic Thresholds and Filter Ensembling with Rank Aggregation

Karande, Ketan Navanath (2018) Optimizing Feature Engineering through Induced Hybrid Wrappers, Automatic Thresholds and Filter Ensembling with Rank Aggregation. Masters thesis, Dublin, National College of Ireland.

Abstract

With the growth of big data, data dimensionality has risen sharply. As a result, researchers are working to improve the process of feature selection. The core methods include filters, which assign ranks to features; wrappers, which create feature subsets; and hybrids, which combine the core concepts of relevancy and redundancy from filters and wrappers respectively. Feature ranking is not the only problem: thresholds are also needed to limit the number of top-ranked features used in training models. These thresholds are either dependent on the dataset, in which case they are termed fixed, or determined automatically. This study is geared towards: (1) finding the best filters for given thresholds; (2) finding the conditions in which ensembles of filters are required; (3) determining whether the novel approach of creating automatic thresholds is superior to fixed thresholds; (4) finding the best wrappers; (5) testing a novel approach of induced hybrids to achieve relevancy in wrappers; and (6) finding the thresholds that cause overfitting. The results show that the novel approaches introduced in the study improve the process of feature selection through automatic thresholds that can handle overfitting, ensembles that boost performance, and induced hybrids that boost relevancy in wrappers.
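
As a rough illustration of the general idea of filter ensembling with rank aggregation and a fixed threshold, the sketch below averages the ranks that several filters assign to each feature and keeps the top k. It is not taken from the thesis: the mean-rank aggregation rule, the function names (scores_to_ranks, aggregate_ranks, select_top_k), the feature names and the scores are all assumptions made for the example, and the thesis's automatic thresholds are not modelled here.

```python
from statistics import mean


def scores_to_ranks(scores):
    """Turn feature -> score (higher is better) into feature -> rank (1 = best)."""
    # Ties are broken alphabetically so the result is deterministic.
    ordered = sorted(scores, key=lambda f: (-scores[f], f))
    return {feature: pos for pos, feature in enumerate(ordered, start=1)}


def aggregate_ranks(filter_scores):
    """Average each feature's rank across all filters (mean-rank aggregation).

    Assumes every filter scored the same set of features.
    """
    rankings = [scores_to_ranks(s) for s in filter_scores]
    return {f: mean(r[f] for r in rankings) for f in rankings[0]}


def select_top_k(mean_ranks, k):
    """Apply a fixed threshold: keep the k features with the lowest mean rank."""
    ordered = sorted(mean_ranks, key=lambda f: (mean_ranks[f], f))
    return ordered[:k]


# Hypothetical scores from three filters; all values are made up.
filter_a = {"age": 0.9, "income": 0.4, "height": 0.1, "zip": 0.3}
filter_b = {"age": 0.5, "income": 0.8, "height": 0.2, "zip": 0.1}
filter_c = {"age": 0.7, "income": 0.6, "height": 0.05, "zip": 0.2}

mean_ranks = aggregate_ranks([filter_a, filter_b, filter_c])
selected = select_top_k(mean_ranks, k=2)
print(selected)
```

Mean rank is only one of several possible aggregation rules; a median or a weighted scheme could be swapped in at aggregate_ranks without touching the rest of the sketch.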

Item Type: Thesis (Masters)
Subjects: Q Science > QA Mathematics > Electronic computers. Computer science
T Technology > T Technology (General) > Information Technology > Electronic computers. Computer science
Q Science > QA Mathematics > Computer software
T Technology > T Technology (General) > Information Technology > Computer software
Divisions: School of Computing > Master of Science in Data Analytics
Depositing User: Caoimhe Ní Mhaicín
Date Deposited: 06 Nov 2018 12:48
Last Modified: 06 Nov 2018 12:48
URI: https://norma.ncirl.ie/id/eprint/3448
