KNN-DWS-I

KNN-DWS-I is a variation of KNN-DWS that takes neighbor distance into account. It is designed to be consistent and flexible across both classification and regression tasks: it uses soft blending between the top experts in a competence region to compute a set of weights for the models.


When to use

  • KNN-DWS-I is currently the general recommendation for regression tasks. Because it works best with soft metrics, it also suits classification with confidence scores, but performs less well with hard predictions
  • It performs best when the competence regions and pool are smooth and heterogeneous
  • It performs worst for homogeneous datasets and for classification with hard predictions

How it works

When fit is called, KNN-DWS-I fits a KNN algorithm on the validation data and builds a criterion score matrix. When predict is called, it finds the K nearest neighbors of the test point and uses the score matrix to combine every model's scores over the K neighbors with inverse-distance weights. It then normalizes the averaged scores using min-max normalization and removes the models below a threshold. Finally, it converts the remaining models' scores into weights using softmax with temperature.
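
The prediction steps above can be sketched as follows. This is an illustrative re-implementation, not the library's code; the function name, argument shapes, and the `1e-12` stabilizer are assumptions.

```python
import numpy as np

def blend_weights(distances, scores, threshold=0.5, temperature=0.1):
    """Sketch of the KNN-DWS-I weighting pipeline (illustrative only).

    distances: (k,) distances from the test point to its k nearest
               validation neighbors.
    scores:    (k, n_models) criterion scores of each model on those
               neighbors (assumes higher is better, i.e. mode="max").
    """
    # 1. Combine each model's scores over the k neighbors with
    #    inverse-distance weights.
    inv = 1.0 / (distances + 1e-12)            # guard against zero distance
    inv /= inv.sum()
    avg = inv @ scores                         # (n_models,)

    # 2. Min-max normalize the averaged scores to [0, 1].
    rng = avg.max() - avg.min()
    norm = (avg - avg.min()) / rng if rng > 0 else np.ones_like(avg)

    # 3. Remove models below the competence threshold.
    keep = norm >= threshold

    # 4. Softmax with temperature over the surviving models.
    logits = norm[keep] / temperature
    logits -= logits.max()                     # numerical stability
    expd = np.exp(logits)
    weights = np.zeros_like(norm)
    weights[keep] = expd / expd.sum()
    return weights
```

The returned weights sum to 1 over the surviving models, with pruned models pinned at exactly zero.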


Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| `task` | str | | `"classification"` or `"regression"` |
| `metric` | str or callable | | Scoring function per sample. Built-ins: `accuracy`, `mae`, `mse`, `rmse`, `log_loss`, `prob_correct`. Custom callables `(y_true, y_pred) -> float` are accepted |
| `mode` | str | | `"max"` if higher is better, `"min"` if lower |
| `k` | int | `10` | Number of neighbours |
| `threshold` | float | `0.5` | Competence cutoff |
| `temperature` | float | `0.1` for regression / `1.0` for classification | Defines how smooth the model blend is |
| `preset` | str | `"balanced"` | ANN backend preset |
| `finder` | str, optional | — | Only if the preset is `"custom"`; options: `"knn"`, `"faiss"`, `"annoy"`, `"hnsw"` |
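
Per the table, `metric` also accepts a custom per-sample callable with the signature `(y_true, y_pred) -> float`. A minimal sketch of such a callable (the `mape` name and the `1e-12` guard are my own, not part of the library):

```python
def mape(y_true, y_pred):
    # Per-sample absolute percentage error; lower is better,
    # so pair this metric with mode="min".
    return abs(y_true - y_pred) / max(abs(y_true), 1e-12)
```

It would then be passed in place of a built-in name, e.g. `metric=mape` together with `mode="min"`.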

Example

```python
# Regression
from deskit.des.knndswi import KNNDWSI

router = KNNDWSI(task="regression", metric="mae", mode="min", k=20)
router.fit(X_val, y_val, val_preds)
weights = router.predict(x)
```

```python
# Classification
from deskit.des.knndswi import KNNDWSI

router = KNNDWSI(task="classification", metric="log_loss", mode="min", k=20)
router.fit(X_val, y_val, val_preds)
weights = router.predict(x)
```
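
Since `predict` returns per-model weights rather than a final prediction, one plausible way to use them for regression is a weighted average of the pool's point predictions. This combination step is an assumption on my part, not documented library behavior:

```python
import numpy as np

def blend_regression(weights, model_preds):
    # weights: (n_models,) output of router.predict(x)
    # model_preds: (n_models,) each pool model's prediction for x
    return float(np.dot(weights, model_preds))
```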

Notes

A lower temperature is recommended for regression because regression metrics tend to produce scores on a continuous scale where differences can be large, so a low temperature sharpens the softmax to reflect that. In contrast, classification metrics tend to produce scores that are closer together, so a higher temperature keeps the blend soft.
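
The effect of temperature on the blend can be seen with a standalone softmax (the function below is a generic sketch, not the library's internals):

```python
import numpy as np

def softmax(scores, temperature):
    # Temperature-scaled softmax; lower temperature -> sharper weights.
    z = np.asarray(scores, dtype=float) / temperature
    z -= z.max()                      # numerical stability
    e = np.exp(z)
    return e / e.sum()

sharp = softmax([0.9, 0.6], temperature=0.1)  # regression-style default
soft = softmax([0.9, 0.6], temperature=1.0)   # classification-style default
```

With the same score gap of 0.3, the low temperature concentrates nearly all weight on the best model, while the high temperature keeps the blend close to uniform.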