Multi-target regression is a challenging task that consists of creating predictive models for problems with multiple continuous target outputs. Despite the increasing attention on multi-label classification, there are fewer studies concerning multi-target (MT) regression. The current leading MT models are based on ensembles of regressor chains, where random, differently ordered chains of the target variables are created and used to build separate regression models, each using the predictions for the previous targets in the chain. The challenges of building MT models stem from trying to capture and exploit possible correlations among the target variables during training. This paper presents three multi-target support vector regression models. The first builds independent, single-target Support Vector Regression (SVR) models for each output variable. The second builds an ensemble of random chains using the first method as a base model. The third calculates the targets' correlations and forms a maximum correlation chain, which is used to build a single chained support vector regression model, improving the model's prediction performance while reducing computational complexity. The experimental study evaluates and compares the performance of the three approaches with seven other state-of-the-art multi-target regressors on 24 multi-target datasets. The experimental results are then analyzed using non-parametric statistical tests. The results show that the maximum correlation SVR approach improves on the performance of ensembles of random chains.
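To make the idea of a maximum correlation chain more concrete, the following is a minimal sketch of how a chain order might be derived from the targets' correlations. The abstract does not state the exact ranking criterion, so the sketch assumes one plausible choice: targets are ordered by decreasing sum of absolute Pearson correlations with all other targets. The function names `pearson` and `max_correlation_chain` and the toy data are hypothetical and are not taken from the paper's code.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant target carries no correlation information
    return cov / sqrt(var_x * var_y)


def max_correlation_chain(targets):
    """Return target indices ordered by decreasing summed correlation.

    `targets` holds one list of values per target variable.
    ASSUMPTION: the ranking score is the sum of absolute pairwise
    correlations; the source text does not specify the criterion.
    """
    m = len(targets)
    scores = [
        sum(abs(pearson(targets[i], targets[j])) for j in range(m) if j != i)
        for i in range(m)
    ]
    # sorted() is stable, so tied targets keep their original order.
    return sorted(range(m), key=lambda i: -scores[i])


# Toy data (hypothetical): three targets observed on five samples.
weak = [5.0, 1.0, 4.0, 2.0, 3.0]       # only loosely related to the others
base = [1.0, 2.0, 3.0, 4.0, 5.0]
scaled = [2.1, 3.9, 6.2, 8.0, 9.9]     # almost linear in `base`

chain = max_correlation_chain([weak, base, scaled])
print(chain)
```

In a chained model, the resulting index list would determine the order in which the single-target regressors are trained, with each regressor receiving the predictions for the targets that precede it. Outside the paper's own implementation, a library such as scikit-learn exposes a comparable hook through the `order` argument of `RegressorChain`, which could accept such a list.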
These datasets were collected from the MULAN, MEKA, and LABIC repository websites, and they vary widely in complexity, number of targets, number of attributes, and number of examples. The datasets are available to download.
Dataset | Samples | Attributes | Targets |
EDM | 145 | 16 | 2 |
Enb | 768 | 8 | 2 |
Jura | 359 | 11 | 7 |
Osales | 639 | 413 | 12 |
Scpf | 1,137 | 23 | 3 |
Slump | 103 | 7 | 3 |
Solar Flare 1 | 323 | 10 | 3 |
Solar Flare 2 | 1,066 | 10 | 3 |
Water Quality | 1,060 | 16 | 14 |
OES97 | 323 | 263 | 16 |
OES10 | 403 | 298 | 16 |
ATP1d | 201 | 411 | 6 |
ATP7d | 188 | 411 | 6 |
Andro | 49 | 30 | 6 |
Wisconsin Cancer | 198 | 34 | 2 |
Stock | 950 | 10 | 3 |
California Housing | 20,640 | 7 | 2 |
Puma8NH | 8,192 | 8 | 3 |
Puma32H | 8,192 | 32 | 6 |
Friedman | 500 | 25 | 6 |
Polymer | 41 | 10 | 4 |
M5SPEC | 80 | 700 | 3 |
MP5SPEC | 80 | 700 | 3 |
MP6SPEC | 80 | 700 | 4 |
The implementation of these algorithms is available from the MULAN library. The code for the SVR, SVRRC, and SVRCC models proposed in this paper is available to download.