instancelib.machinelearning.base module
- class instancelib.machinelearning.base.AbstractClassifier(*args, **kwds)[source]
Bases:
ABC, Generic[IT, KT, DT, VT, RT, LT, LMT, PMT]
This class provides an interface that can be used to connect your model to InstanceProvider, LabelProvider, and Environment objects.
The main methods of this class are listed below:
fit_provider(): Fit a classifier on training instances
predict(): Predict the class labels for (unseen) instances
predict_proba(): Predict the class labels and corresponding probabilities
predict_proba_raw(): Predict the class probabilities and return them in matrix form
Examples
Fit a classifier on train data:
>>> model.fit_provider(train, env.labels)
Predict the class labels for a list of instances:
>>> model.predict([ins])
[(20, frozenset({"Games"}))]
Return the class labels and probabilities:
>>> model.predict_proba(test)
[(20, frozenset({("Games", 0.66), ("Bedrijfsnieuws", 0.22), ("Smartphones", 0.12)})), ... ]
Return the raw prediction matrix:
>>> preds = model.predict_proba_raw(test, batch_size=512)
>>> next(preds)
([3, 4, 5, ...], array([[0.143, 0.622, 0.233], [0.278, 0.546, 0.175], [0.726, 0.126, 0.146], ...]))
- abstract fit_instances(instances, labels)[source]
Fit the classifier with the instances and accompanying labels found in the arguments.
- abstract fit_provider(provider, labels, batch_size=200)[source]
Fit the classifier with the instances found in the InstanceProvider, based on the labels in the LabelProvider.
- Parameters:
provider (InstanceProvider[IT, KT, DT, VT, RT]) – The provider that contains the training data
labels (LabelProvider[KT, LT]) – The provider that contains the labels of the training data
batch_size (int, optional) – A batch size for the training process, by default 200
- Return type:
- fit_val_provider(provider, labels, validation=None, batch_size=200)[source]
- Parameters:
provider (InstanceProvider[TypeVar(IT, bound= Instance[Any,Any,Any,Any], covariant=True), TypeVar(KT), TypeVar(DT), TypeVar(VT), TypeVar(RT)])
labels (LabelProvider[TypeVar(KT), TypeVar(LT)])
validation (Optional[InstanceProvider[TypeVar(IT, bound= Instance[Any,Any,Any,Any], covariant=True), TypeVar(KT), TypeVar(DT), TypeVar(VT), TypeVar(RT)]])
batch_size (int)
- Return type:
- abstract property fitted: bool
Return true if the classifier has been fitted
- Returns:
True if the classifier has been fitted
- Return type:
bool
- abstract get_label_column_index(label)[source]
Return the index of the column in which the given label is stored in the label and prediction matrices
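As a rough illustration of what such a column lookup involves, the sketch below keeps a fixed label ordering and maps each label to its column. The `LabelColumnIndex` class and its sorted ordering are assumptions made for this example; they are not instancelib's implementation, and a concrete classifier may order its columns differently.

```python
from typing import Dict, Hashable, Iterable, List


class LabelColumnIndex:
    """Hypothetical helper: maps each label to a fixed matrix column."""

    def __init__(self, labels: Iterable[Hashable]) -> None:
        # Sort by string form so the column order is deterministic.
        self.columns: List[Hashable] = sorted(set(labels), key=str)
        self._index: Dict[Hashable, int] = {
            label: i for i, label in enumerate(self.columns)
        }

    def get_label_column_index(self, label: Hashable) -> int:
        # Raises KeyError for a label that was never registered.
        return self._index[label]


index = LabelColumnIndex(["Games", "Smartphones", "Bedrijfsnieuws"])
row = [0.22, 0.66, 0.12]  # one row of a probability matrix
games_probability = row[index.get_label_column_index("Games")]
```

With the sorted ordering assumed here, "Games" lands in column 1, so the lookup selects the second entry of the row.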
- property name: str
The name of the classifier
- Returns:
A name that can be used to identify the classifier
- Return type:
str
- predict(instances, batch_size=200)[source]
Predict the labels on input instances.
- Parameters:
- Returns:
A sequence of tuples, each pairing an instance key with its predicted labels
- Return type:
- Raises:
ValueError – If you supply incorrectly formatted arguments
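The relation between the `predict_proba()` and `predict()` output formats shown in the class examples can be sketched as follows. The `top_labels` function is a hypothetical helper, and picking the single most probable label is an assumption; a multi-label classifier might instead keep every label above a threshold.

```python
from typing import FrozenSet, Hashable, List, Sequence, Tuple

ProbaResult = Tuple[Hashable, FrozenSet[Tuple[str, float]]]


def top_labels(
    predictions: Sequence[ProbaResult],
) -> List[Tuple[Hashable, FrozenSet[str]]]:
    """Keep, per key, only the label with the highest probability."""
    results = []
    for key, label_probs in predictions:
        # Compare the (label, probability) pairs by their probability.
        best_label, _ = max(label_probs, key=lambda pair: pair[1])
        results.append((key, frozenset({best_label})))
    return results


proba = [
    (20, frozenset({("Games", 0.66), ("Bedrijfsnieuws", 0.22), ("Smartphones", 0.12)})),
]
labels = top_labels(proba)
```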
- abstract predict_instances(instances, batch_size=200)[source]
Predict the labels for an iterable of instances
- predict_proba(instances, batch_size=200)[source]
Predict the labels and corresponding probabilities on input instances.
- Parameters:
- Returns:
A sequence of tuples, each pairing an instance key with its labels and their probabilities
- Return type:
- Raises:
ValueError – If you supply incorrectly formatted arguments
- abstract predict_proba_instances(instances, batch_size=200)[source]
Predict the labels for each of the given instances and return the probability for each label.
- Parameters:
- Returns:
A sequence of tuples consisting of:
The instance identifier
The class labels and their probabilities
- Return type:
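The two-part result described above (an instance identifier plus its labels and probabilities) can be derived from one raw batch of keys and a probability matrix. The sketch below is a minimal illustration; `decode_batch` and the column order in `columns` are assumptions for the example, not part of the documented API.

```python
from typing import FrozenSet, Hashable, List, Sequence, Tuple


def decode_batch(
    keys: Sequence[Hashable],
    matrix: Sequence[Sequence[float]],
    columns: Sequence[str],
) -> List[Tuple[Hashable, FrozenSet[Tuple[str, float]]]]:
    """Pair every key with the (label, probability) set from its matrix row."""
    return [
        (key, frozenset(zip(columns, row)))
        for key, row in zip(keys, matrix)
    ]


columns = ["Bedrijfsnieuws", "Games", "Smartphones"]  # assumed column order
keys = [3, 4]
matrix = [[0.143, 0.622, 0.233], [0.278, 0.546, 0.175]]
decoded = decode_batch(keys, matrix, columns)
```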
- abstract predict_proba_instances_raw(instances, batch_size=200)[source]
Generator function that predicts the labels for each instance. The generator lazily evaluates the prediction function on batches of instances and yields class probabilities in matrix form.
- Parameters:
- Yields:
Tuple[Sequence[KT], PMT] –
An iterator yielding tuples consisting of:
A sequence of keys that match the rows of the probability matrix
The probability matrix with shape (batch_size, n_labels)
- Return type:
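The lazy, batch-wise evaluation described above can be sketched with a plain generator. Everything named here (`predict_batches`, `uniform_score`, and the `(key, data)` pair format) is hypothetical and stands in for a real model and real instances; it only illustrates the pattern.

```python
from itertools import islice
from typing import Callable, Hashable, Iterable, Iterator, List, Sequence, Tuple

Matrix = List[List[float]]


def predict_batches(
    items: Iterable[Tuple[Hashable, str]],
    score: Callable[[Sequence[str]], Matrix],
    batch_size: int = 200,
) -> Iterator[Tuple[List[Hashable], Matrix]]:
    """Lazily yield (keys, probability matrix) for one batch at a time."""
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        keys = [key for key, _ in batch]
        data = [value for _, value in batch]
        # The model is only called when the consumer asks for the next batch.
        yield keys, score(data)


def uniform_score(data: Sequence[str]) -> Matrix:
    # Stand-in for a real model: three labels, equal probability.
    return [[1 / 3, 1 / 3, 1 / 3] for _ in data]


items = ((i, f"document {i}") for i in range(5))
batches = predict_batches(items, uniform_score, batch_size=2)
first_keys, first_matrix = next(batches)
```

In this sketch the final batch can hold fewer than `batch_size` rows, so the number of rows in each yielded matrix always equals the number of keys yielded with it.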
- abstract predict_proba_provider(provider, batch_size=200)[source]
Predict the labels for each instance in the provider and return the probability for each label.
- Parameters:
- Returns:
A sequence of tuples consisting of:
The instance identifier
The class labels and their probabilities
- Return type:
- abstract predict_proba_provider_raw(provider, batch_size=200)[source]
Generator function that predicts the labels for each instance in the provider. The generator lazily evaluates the prediction function on batches of instances and yields class probabilities in matrix form.
- Parameters:
provider (InstanceProvider[TypeVar(IT, bound= Instance[Any,Any,Any,Any], covariant=True), TypeVar(KT), TypeVar(DT), TypeVar(VT), TypeVar(RT)]) – The input InstanceProvider
batch_size (int, optional) – The batch size in which instances are processed, by default 200. This also influences the shape of the resulting probability matrix.
- Yields:
Iterator[Tuple[Sequence[KT], PMT]] –
An iterator yielding tuples consisting of:
A sequence of keys that match the rows of the probability matrix
The probability matrix with shape (len(keys), batch_size)
- Return type:
- predict_proba_raw(instances, batch_size=200)[source]
Generator function that predicts the labels for each instance. The generator lazily evaluates the prediction function on batches of instances and yields class probabilities in matrix form.
- Parameters:
instances (Union[InstanceProvider[TypeVar(IT, bound= Instance[Any,Any,Any,Any], covariant=True), TypeVar(KT), TypeVar(DT), TypeVar(VT), TypeVar(RT)], Iterable[Instance[TypeVar(KT), TypeVar(DT), TypeVar(VT), TypeVar(RT)]]]) – Input instances
batch_size (int, optional) – The batch size in which instances are processed, by default 200. This also influences the shape of the resulting probability matrix.
- Yields:
Tuple[Sequence[KT], PMT] –
An iterator yielding tuples consisting of:
A sequence of keys that match the rows of the probability matrix
The probability matrix with shape (batch_size, n_labels)
- Return type: