Algorithm Selection
For a quick overview, have a look at the scikit-learn estimator cheat-sheet. You can also use a validation set to screen candidate algorithms quickly before fitting on the full training set, as sketched below.
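A minimal sketch of that quick screening, assuming a synthetic dataset as a stand-in for your own:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2, random_state=0)

# fit each candidate on the training split, score on the held-out validation set
candidates = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "svm_rbf": SVC(kernel="rbf"),
}
for name, model in candidates.items():
    model.fit(X_train, y_train)
    print(f"{name}: validation accuracy = {model.score(X_val, y_val):.3f}")
```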
1. Factors to consider
1.1. Interpretability
- does the model need to be intellectually accessible to a non-technical audience (predictions in medical imaging, for instance)?
- if only predictive accuracy matters, a black-box model may be acceptable; otherwise prefer a model whose decisions can be read off directly (see the sketch below)
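For illustration, a minimal sketch of why a linear model is easy to explain: each coefficient maps directly to one feature's contribution. The feature names below are hypothetical:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=500, n_features=4, random_state=0)
feature_names = ["age", "blood_pressure", "bmi", "glucose"]  # hypothetical names

model = LogisticRegression(max_iter=1000).fit(X, y)
for name, coef in zip(feature_names, model.coef_[0]):
    # sign and magnitude of each coefficient are directly readable
    print(f"{name}: {coef:+.3f}")
```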
1.2. In-memory vs Out-of-memory
- can the dataset be loaded into RAM all at once? if so, a much wider choice of algorithms is available
- otherwise, use an incremental (out-of-core) learning algorithm that improves the model gradually as batches of data arrive (see the sketch below)
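A minimal sketch of out-of-core learning with scikit-learn's `partial_fit`; `load_batches` is a hypothetical generator standing in for reading chunks from disk:

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

def load_batches(n_batches=10, batch_size=1000, n_features=20, seed=0):
    # hypothetical stand-in: yields (X_chunk, y_chunk) pairs that each fit in RAM
    rng = np.random.default_rng(seed)
    for _ in range(n_batches):
        X = rng.normal(size=(batch_size, n_features))
        y = (X[:, 0] + rng.normal(scale=0.1, size=batch_size) > 0).astype(int)
        yield X, y

model = SGDClassifier()
classes = np.array([0, 1])  # all classes must be declared for partial_fit
for X_chunk, y_chunk in load_batches():
    model.partial_fit(X_chunk, y_chunk, classes=classes)
```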
1.3. Number of features and examples
- neural nets and suitable ensembles when dealing with a large number of features and examples
- traditional models when features are few and speed matters -> see Occam's Razor (a rough dispatch heuristic is sketched below)
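A rough illustration of letting dataset shape pick the first model to try; the thresholds below are arbitrary assumptions, not established rules:

```python
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.neural_network import MLPClassifier

def first_model_to_try(n_samples: int, n_features: int):
    # thresholds are illustrative assumptions only
    if n_features < 100 and n_samples < 100_000:
        return LogisticRegression(max_iter=1000)   # simple and fast: Occam's Razor
    if n_samples < 1_000_000:
        return GradientBoostingClassifier()        # ensemble for richer data
    return MLPClassifier(hidden_layer_sizes=(128, 64))  # neural net at large scale

print(type(first_model_to_try(5_000, 20)).__name__)  # LogisticRegression
```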
1.4. Categorical vs numerical features
- what kind of data do we have: only one of the two, or a mix of both?
- the algorithm and the accompanying feature engineering techniques must handle each type appropriately (see the pipeline sketch below)
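A minimal sketch of handling mixed feature types in one scikit-learn pipeline; the column names are illustrative:

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from sklearn.pipeline import Pipeline
from sklearn.linear_model import LogisticRegression

df = pd.DataFrame({
    "income": [40_000, 85_000, 60_000, 32_000],   # numerical
    "age": [25, 47, 36, 52],                      # numerical
    "city": ["paris", "lyon", "paris", "nice"],   # categorical
})
y = [0, 1, 1, 0]

# scale numerical columns, one-hot encode categorical ones
preprocess = ColumnTransformer([
    ("numeric", StandardScaler(), ["income", "age"]),
    ("categorical", OneHotEncoder(handle_unknown="ignore"), ["city"]),
])
model = Pipeline([("prep", preprocess), ("clf", LogisticRegression())])
model.fit(df, y)
```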
1.5. Nonlinearity of the data
- again, Occam's Razor applies: prefer the simplest model that fits
- if the data is linearly separable, or implicitly so in some transformed space, a Support Vector Machine with the kernel trick may be preferable
- if no such simple mathematical separation can be identified, fall back on neural nets or ensemble algorithms (see the comparison sketch below)
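A minimal sketch contrasting a linear model with an RBF-kernel SVM on data that is not linearly separable (concentric circles):

```python
from sklearn.datasets import make_circles
from sklearn.svm import SVC
from sklearn.linear_model import LogisticRegression

# two concentric rings: no straight line separates the classes
X, y = make_circles(n_samples=500, noise=0.1, factor=0.4, random_state=0)

linear = LogisticRegression().fit(X, y)
rbf_svm = SVC(kernel="rbf").fit(X, y)

print(f"linear model accuracy:   {linear.score(X, y):.3f}")   # near chance level
print(f"RBF-kernel SVM accuracy: {rbf_svm.score(X, y):.3f}")  # near perfect
```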
1.6. Training speed
- neural nets will typically be slower to train than a traditional solution (a rough timing comparison is sketched below)
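A minimal sketch of timing training for a linear model against a small neural net; absolute numbers depend on data and hardware:

```python
import time
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=10_000, n_features=50, random_state=0)

for model in (LogisticRegression(max_iter=1000),
              MLPClassifier(hidden_layer_sizes=(100,), max_iter=50)):
    start = time.perf_counter()
    model.fit(X, y)
    print(f"{type(model).__name__}: {time.perf_counter() - start:.2f}s to train")
```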
1.7. Prediction speed
- live high-throughput serving and daily batch-processing jobs call for different trade-offs: the former is constrained by per-request latency, the latter by total throughput (see the sketch below)
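A minimal sketch of measuring both regimes: single-request latency for live serving versus total batch throughput:

```python
import time
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=5_000, n_features=20, random_state=0)
model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# live-serving regime: one sample per call
start = time.perf_counter()
for row in X[:100]:
    model.predict(row.reshape(1, -1))
per_request = (time.perf_counter() - start) / 100

# batch regime: everything at once
start = time.perf_counter()
model.predict(X)
batch_total = time.perf_counter() - start

print(f"single-sample latency: {per_request * 1e3:.2f} ms")
print(f"batch of {len(X)}: {batch_total:.2f} s total")
```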