instancelib.machinelearning.base module
- class instancelib.machinelearning.base.AbstractClassifier(*args, **kwds)[source]
Bases: ABC, Generic[IT, KT, DT, VT, RT, LT, LMT, PMT]
This class provides an interface that can be used to connect your model to InstanceProvider, LabelProvider, and Environment objects.
The main methods of this class are listed below:
fit_provider(): Fit a classifier on training instances
predict(): Predict the class labels for (unseen) instances
predict_proba(): Predict the class labels and corresponding probabilities
predict_proba_raw(): Predict the class probabilities and return them in matrix form
Examples
Fit a classifier on train data:
>>> model.fit_provider(train, env.labels)
Predict the class labels for a list of instances:
>>> model.predict([ins])
[(20, frozenset({"Games"}))]
Return the class labels and probabilities:
>>> model.predict_proba(test)
[(20, frozenset({("Games", 0.66), ("Bedrijfsnieuws", 0.22), ("Smartphones", 0.12)})), ... ]
Return the raw prediction matrix:
>>> preds = model.predict_proba_raw(test, batch_size=512)
>>> next(preds)
([3, 4, 5, ...], array([[0.143, 0.622, 0.233], [0.278, 0.546, 0.175], [0.726, 0.126, 0.146], ...]))
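Because predict_proba_raw() yields one (keys, matrix) tuple per batch, the batches usually need to be gathered before further processing. The following is a minimal sketch of that step, not part of instancelib: fake_predict_proba_raw is a hypothetical stand-in for a fitted model, and plain nested lists stand in for the PMT matrix type.

```python
from typing import Iterable, Iterator, List, Sequence, Tuple

Matrix = List[List[float]]  # assumption: nested lists stand in for PMT


def fake_predict_proba_raw(batch_size: int = 2) -> Iterator[Tuple[Sequence[int], Matrix]]:
    """Hypothetical stand-in for model.predict_proba_raw(test, batch_size=...)."""
    keys = [3, 4, 5]
    rows = [[0.143, 0.622, 0.233], [0.278, 0.546, 0.175], [0.726, 0.126, 0.146]]
    for start in range(0, len(keys), batch_size):
        # Each batch pairs a slice of keys with the matching matrix rows.
        yield keys[start:start + batch_size], rows[start:start + batch_size]


def collect(batches: Iterable[Tuple[Sequence[int], Matrix]]) -> Tuple[List[int], Matrix]:
    """Concatenate all batches into one key list and one row-aligned matrix."""
    all_keys: List[int] = []
    all_rows: Matrix = []
    for keys, matrix in batches:
        all_keys.extend(keys)
        all_rows.extend(matrix)
    return all_keys, all_rows


keys, matrix = collect(fake_predict_proba_raw(batch_size=2))
```

After collecting, row i of matrix belongs to keys[i], which mirrors the row alignment described for the raw prediction methods.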
- abstract fit_instances(instances, labels)[source]
Fit the classifier with the instances and accompanying labels supplied as arguments.
- abstract fit_provider(provider, labels, batch_size=200)[source]
Fit the classifier with the instances found in the InstanceProvider based on the labels in the LabelProvider.
- Parameters:
provider (InstanceProvider[IT, KT, DT, VT, RT]) – The provider that contains the training data
labels (LabelProvider[KT, LT]) – The provider that contains the labels of the training data
batch_size (int, optional) – A batch size for the training process, by default 200
- Return type:
- fit_val_provider(provider, labels, validation=None, batch_size=200)[source]
- Parameters:
provider (InstanceProvider[IT, KT, DT, VT, RT]) –
labels (LabelProvider[KT, LT]) –
validation (Optional[InstanceProvider[IT, KT, DT, VT, RT]]) –
batch_size (int) –
- Return type:
- abstract property fitted: bool
Return true if the classifier has been fitted
- Returns:
True if the classifier has been fitted
- Return type:
bool
- abstract get_label_column_index(label)[source]
Return the column in which the labels are stored in the label and prediction matrices
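To illustrate what a label column index is used for, here is a minimal sketch. It assumes a fixed label order decided at fit time; LabelColumns is a hypothetical helper, not a class from instancelib, and the label names are borrowed from the examples above.

```python
from typing import Dict, List, Sequence


class LabelColumns:
    """Hypothetical helper that fixes a column order for a set of labels."""

    def __init__(self, labels: Sequence[str]) -> None:
        self.labels: List[str] = list(labels)
        # Map each label to the column it occupies in a prediction matrix.
        self._index: Dict[str, int] = {
            label: column for column, label in enumerate(self.labels)
        }

    def get_label_column_index(self, label: str) -> int:
        """Return the matrix column that stores values for `label`."""
        return self._index[label]


columns = LabelColumns(["Bedrijfsnieuws", "Games", "Smartphones"])
row = [0.22, 0.66, 0.12]  # one row of a probability matrix
games_probability = row[columns.get_label_column_index("Games")]
```

Looking up the column once and indexing each row with it avoids relying on an implicit label order when reading raw matrices.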
- property name: str
The name of the classifier
- Returns:
A name that can be used to identify the classifier
- Return type:
str
- predict(instances, batch_size=200)[source]
Predict the labels on input instances.
- Parameters:
- Returns:
Tuples pairing each instance key with its predicted labels
- Return type:
- Raises:
ValueError – If you supply incorrectly formatted arguments
- abstract predict_instances(instances, batch_size=200)[source]
Predict the labels for an iterable of instances
- predict_proba(instances, batch_size=200)[source]
Predict the labels and corresponding probabilities on input instances.
- Parameters:
- Returns:
Tuples pairing each instance key with its labels and their probabilities
- Return type:
- Raises:
ValueError – If you supply incorrectly formatted arguments
- abstract predict_proba_instances(instances, batch_size=200)[source]
Predict the labels for each of the given instances and return the probability for each label.
- Parameters:
- Returns:
A sequence of tuples consisting of:
The instance identifier
The class labels and their probabilities
- Return type:
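The return value described above (instance identifier plus labels with probabilities) can be derived from raw matrix batches. The sketch below shows one way to do that under stated assumptions: label_batches is a hypothetical helper, the label order is assumed to match the matrix columns, and nested lists stand in for the matrix type.

```python
from typing import FrozenSet, Iterable, List, Sequence, Tuple

Labeled = Tuple[int, FrozenSet[Tuple[str, float]]]


def label_batches(
    batches: Iterable[Tuple[Sequence[int], Sequence[Sequence[float]]]],
    labels: Sequence[str],
) -> List[Labeled]:
    """Pair every key with a frozenset of (label, probability) tuples."""
    results: List[Labeled] = []
    for keys, matrix in batches:
        for key, row in zip(keys, matrix):
            # Column i of the row is assumed to hold the probability of labels[i].
            results.append((key, frozenset(zip(labels, row))))
    return results


labels = ["Games", "Bedrijfsnieuws", "Smartphones"]
batches = [([20], [[0.66, 0.22, 0.12]])]
predictions = label_batches(batches, labels)
```

The resulting structure has the same shape as the predict_proba() example output shown in the class description.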
- abstract predict_proba_instances_raw(instances, batch_size=200)[source]
Generator function that predicts the labels for each instance. The generator lazily evaluates the prediction function on batches of instances and yields class probabilities in matrix form.
- Parameters:
- Yields:
Tuple[Sequence[KT], PMT] –
An iterator yielding tuples consisting of:
A sequence of keys that match the rows of the probability matrix
The probability matrix with shape (batch_size, n_labels)
- Return type:
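The lazy, batched evaluation described above can be sketched with itertools.islice. This is an illustration under assumptions rather than the library's implementation: instances are modelled as (key, data) tuples, and toy_score is a hypothetical scoring function that returns one probability row per instance.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator, List, Tuple


def predict_in_batches(
    instances: Iterable[Tuple[int, str]],
    score: Callable[[str], List[float]],
    batch_size: int = 200,
) -> Iterator[Tuple[List[int], List[List[float]]]]:
    """Yield (keys, matrix) per batch; nothing is scored until requested."""
    iterator = iter(instances)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        keys = [key for key, _ in batch]
        matrix = [score(data) for _, data in batch]
        yield keys, matrix


calls: List[str] = []


def toy_score(text: str) -> List[float]:
    """Hypothetical two-label scorer that records each call."""
    calls.append(text)
    p = min(len(text) / 10, 1.0)
    return [p, 1.0 - p]


data = [(1, "ab"), (2, "abcd"), (3, "abcdef")]
generator = predict_in_batches(data, toy_score, batch_size=2)
calls_before = len(calls)            # no instance has been scored yet
first_keys, first_matrix = next(generator)
calls_after_first = len(calls)       # only the first batch has been scored
```

Note that the final batch can contain fewer rows than batch_size when the number of instances is not a multiple of it.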
- abstract predict_proba_provider(provider, batch_size=200)[source]
Predict the labels for each instance in the provider and return the probability for each label.
- Parameters:
- Returns:
A sequence of tuples consisting of:
The instance identifier
The class labels and their probabilities
- Return type:
- abstract predict_proba_provider_raw(provider, batch_size=200)[source]
Generator function that predicts the labels for each instance in the provider. The generator lazily evaluates the prediction function on batches of instances and yields class probabilities in matrix form.
- Parameters:
provider (InstanceProvider[IT, KT, DT, VT, RT]) – The input InstanceProvider
batch_size (int, optional) – The batch size in which instances are processed, by default 200. This also influences the shape of the resulting probability matrix.
- Yields:
Iterator[Tuple[Sequence[KT], PMT]] –
An iterator yielding tuples consisting of:
A sequence of keys that match the rows of the probability matrix
The probability matrix with shape (len(keys), n_labels)
- Return type:
- predict_proba_raw(instances, batch_size=200)[source]
Generator function that predicts the labels for each instance. The generator lazily evaluates the prediction function on batches of instances and yields class probabilities in matrix form.
- Parameters:
instances (Union[InstanceProvider[IT, KT, DT, VT, RT], Iterable[Instance[KT, DT, VT, RT]]]) – Input instances
batch_size (int, optional) – The batch size in which instances are processed, by default 200. This also influences the shape of the resulting probability matrix.
- Yields:
Tuple[Sequence[KT], PMT] –
An iterator yielding tuples consisting of:
A sequence of keys that match the rows of the probability matrix
The probability matrix with shape (batch_size, n_labels)
- Return type:
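Since predict_proba_raw() accepts either a provider or an iterable of instances, one plausible design is to dispatch on the argument type and reject anything else with a ValueError. The sketch below illustrates that idea only; it is an assumption about the design, not the library's code. ToyProvider, predict_proba_raw_sketch, and the uniform probabilities are all hypothetical.

```python
from collections.abc import Iterable as IterableABC
from typing import Any, Dict, Iterator, List, Tuple


class ToyProvider:
    """Hypothetical stand-in for an InstanceProvider: maps keys to data."""

    def __init__(self, data: Dict[int, str]) -> None:
        self.data = data


def _uniform_batches(
    keys: List[int], n_labels: int, batch_size: int
) -> Iterator[Tuple[List[int], List[List[float]]]]:
    """Yield placeholder batches with a uniform probability per label."""
    for start in range(0, len(keys), batch_size):
        chunk = keys[start:start + batch_size]
        yield chunk, [[1.0 / n_labels] * n_labels for _ in chunk]


def predict_proba_raw_sketch(
    instances: Any, batch_size: int = 200
) -> Iterator[Tuple[List[int], List[List[float]]]]:
    """Dispatch on the input type, then yield (keys, matrix) batches."""
    if isinstance(instances, ToyProvider):
        keys = list(instances.data)
    elif isinstance(instances, IterableABC):
        # Assumption: plain instances are modelled as (key, data) tuples.
        keys = [key for key, _ in instances]
    else:
        raise ValueError("Expected a provider or an iterable of instances")
    yield from _uniform_batches(keys, n_labels=3, batch_size=batch_size)


provider = ToyProvider({1: "a", 2: "b", 3: "c"})
batches = list(predict_proba_raw_sketch(provider, batch_size=2))
```

Because the sketch is itself a generator function, the ValueError surfaces when the first batch is requested, not when the function is called.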