instancelib.feature_extraction package
Submodules
Module contents
- class instancelib.feature_extraction.BaseVectorizer[source]
This ABC specifies a generic vectorizer. Vectorizers transform raw data examples into feature vectors. Given a data type DT, it specifies the method fit(), which initializes or fits the vectorizer, and the method transform(), which transforms the data into vector form.
- abstract fit(x_data, **kwargs)[source]
Fit the vectorizer according to the data in the given Sequence.
- Parameters:
x_data (Sequence[DT]) – A Sequence of examples with type DT.
- Returns:
A fitted vectorizer for data with type DT
- Return type:
BaseVectorizer[DT]
Examples
Assume the creation of a vectorizer and a sequence of data examples in the variable data_list
>>> vectorizer = BaseVectorizer[DT]()
>>> vectorizer = vectorizer.fit(data_list)
- Parameters:
kwargs (Any) –
- abstract fit_transform(x_data, **kwargs)[source]
Transform a list of data to a feature matrix. The transformation is based on the data contained in the parameter x_data. Subsequent transformations with transform() will be based on the fit of the data provided in this call.
- Parameters:
x_data (Sequence[DT]) – A sequence of raw data examples with length n_examples
- Returns:
A feature matrix with shape (n_examples, n_features)
- Return type:
npt.NDArray[Any]
Examples
Assume a vectorizer has been created; it does not need to be fitted beforehand, because this call fits it
>>> x_mat = vectorizer.fit_transform(x_data)
- Parameters:
kwargs (Any) –
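The relationship between fit(), transform() and fit_transform() described above can be illustrated with a small sketch. ToyCountVectorizer is a hypothetical class written only for this illustration; it is not part of instancelib, and it returns nested lists as a stand-in for the npt.NDArray[Any] feature matrix so that the snippet needs only the standard library.

```python
from typing import Dict, List, Sequence


class ToyCountVectorizer:
    """Hypothetical bag-of-words vectorizer (illustration only, not instancelib)."""

    def __init__(self) -> None:
        self.vocabulary: Dict[str, int] = {}

    def fit(self, x_data: Sequence[str], **kwargs) -> "ToyCountVectorizer":
        # Learn one column index per distinct token, in sorted order.
        tokens = sorted({tok for doc in x_data for tok in doc.split()})
        self.vocabulary = {tok: idx for idx, tok in enumerate(tokens)}
        return self

    def transform(self, x_data: Sequence[str], **kwargs) -> List[List[int]]:
        # Count known tokens; tokens not seen during fitting are ignored.
        matrix = []
        for doc in x_data:
            row = [0] * len(self.vocabulary)
            for tok in doc.split():
                if tok in self.vocabulary:
                    row[self.vocabulary[tok]] += 1
            matrix.append(row)
        return matrix

    def fit_transform(self, x_data: Sequence[str], **kwargs) -> List[List[int]]:
        # Fit on x_data, then transform the same x_data.
        return self.fit(x_data, **kwargs).transform(x_data, **kwargs)


train = ["red apple", "green apple apple"]
vectorizer = ToyCountVectorizer()
x_mat = vectorizer.fit_transform(train)        # shape (n_examples, n_features)
x_new = vectorizer.transform(["red pear"])     # reuses the vocabulary learned above
```

In this sketch, fit_transform() gives the same matrix as calling fit() and then transform() on the same data, and later transform() calls keep the column layout learned in that call.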
- property fitted: bool
Check if the vectorizer has been fitted
- Returns:
True if the vectorizer has been fitted
- Return type:
bool
- abstract transform(x_data, **kwargs)[source]
Transform a list of raw data points to a feature matrix according to the fitted vectorizer.
- Parameters:
x_data (Sequence[DT]) – A sequence of raw data examples with length n_examples
- Returns:
A feature matrix with shape (n_examples, n_features)
- Return type:
npt.NDArray[Any]
Examples
Assume the vectorizer is fitted
>>> x_mat = vectorizer.transform(x_data)
- Parameters:
kwargs (Any) –
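Because fit(), fit_transform() and transform() are abstract, the base class is meant to be subclassed rather than instantiated directly. The following is a minimal sketch of that pattern under stated assumptions: VectorizerSketch is a stand-in that mirrors the interface documented above (it is not instancelib's BaseVectorizer), the _fitted attribute and LengthVectorizer are hypothetical, and nested lists stand in for the NumPy feature matrix.

```python
from abc import ABC, abstractmethod
from typing import Generic, List, Sequence, TypeVar

DT = TypeVar("DT")


class VectorizerSketch(ABC, Generic[DT]):
    """Stand-in mirroring the documented interface (assumption, not instancelib code)."""

    def __init__(self) -> None:
        self._fitted = False  # hypothetical backing field for `fitted`

    @property
    def fitted(self) -> bool:
        return self._fitted

    @abstractmethod
    def fit(self, x_data: Sequence[DT], **kwargs) -> "VectorizerSketch[DT]":
        raise NotImplementedError

    @abstractmethod
    def transform(self, x_data: Sequence[DT], **kwargs) -> List[List[float]]:
        raise NotImplementedError

    @abstractmethod
    def fit_transform(self, x_data: Sequence[DT], **kwargs) -> List[List[float]]:
        raise NotImplementedError


class LengthVectorizer(VectorizerSketch[str]):
    """Hypothetical subclass: one feature, length relative to the longest training string."""

    def __init__(self) -> None:
        super().__init__()
        self.max_len = 1

    def fit(self, x_data, **kwargs):
        self.max_len = max((len(x) for x in x_data), default=1) or 1
        self._fitted = True
        return self

    def transform(self, x_data, **kwargs):
        if not self.fitted:
            raise RuntimeError("call fit() before transform()")
        return [[len(x) / self.max_len] for x in x_data]

    def fit_transform(self, x_data, **kwargs):
        return self.fit(x_data, **kwargs).transform(x_data, **kwargs)


# Instantiating the abstract stand-in itself fails with a TypeError.
try:
    VectorizerSketch[str]()
    abstract_instantiable = True
except TypeError:
    abstract_instantiable = False

vectorizer = LengthVectorizer()
fitted_before = vectorizer.fitted
x_mat = vectorizer.fit(["ab", "abcd"]).transform(["ab", "abcdefgh"])
```

The fitted check inside transform() is a design choice of this sketch; whether the library's own implementations raise in that situation is not stated in this reference.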
- class instancelib.feature_extraction.SklearnVectorizer(vectorizer, storage_location=None, filename=None)[source]
Bases: BaseVectorizer[str], SaveableInnerModel
- Parameters:
vectorizer (BaseEstimator) –
- fit(x_data, **kwargs)[source]
Fit the vectorizer according to the data in the given Sequence.
- Parameters:
x_data (Sequence[DT]) – A Sequence of examples with type DT.
- Returns:
A fitted vectorizer for data with type DT
- Return type:
Examples
Assume the creation of a vectorizer and a sequence of data examples in the variable data_list
>>> vectorizer = BaseVectorizer[DT]()
>>> vectorizer = vectorizer.fit(data_list)
- Parameters:
kwargs (Any) –
- fit_transform(x_data, **kwargs)[source]
Transform a list of data to a feature matrix. The transformation is based on the data contained in the parameter x_data. Subsequent transformations with transform() will be based on the fit of the data provided in this call.
- Parameters:
x_data (Sequence[DT]) – A sequence of raw data examples with length n_examples
- Returns:
A feature matrix with shape (n_examples, n_features)
- Return type:
npt.NDArray[Any]
Examples
Assume a vectorizer has been created; it does not need to be fitted beforehand, because this call fits it
>>> x_mat = vectorizer.fit_transform(x_data)
- Parameters:
kwargs (Any) –
- innermodel: BaseEstimator
- transform(x_data, **kwargs)[source]
Transform a list of raw data points to a feature matrix according to the fitted vectorizer.
- Parameters:
x_data (Sequence[DT]) – A sequence of raw data examples with length n_examples
- Returns:
A feature matrix with shape (n_examples, n_features)
- Return type:
npt.NDArray[Any]
Examples
Assume the vectorizer is fitted
>>> x_mat = vectorizer.transform(x_data)
- Parameters:
kwargs (Any) –
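The wrapper idea behind this class, where an inner estimator is stored in innermodel and does the actual work, can be sketched as follows. This is an illustration of the delegation pattern only, not the library's implementation: InnerModelAdapter and ToyEstimator are hypothetical names, the inner object is merely assumed to expose scikit-learn-like fit() and transform() methods, and the storage_location and filename parameters are left out because their behavior is not described here.

```python
from typing import Any, List, Sequence


class ToyEstimator:
    """Hypothetical inner model with a scikit-learn-like fit/transform API."""

    def fit(self, X: Sequence[str]) -> "ToyEstimator":
        # Learn the set of letters seen in the training documents.
        self.alphabet_ = sorted({ch for doc in X for ch in doc if ch.isalpha()})
        return self

    def transform(self, X: Sequence[str]) -> List[List[int]]:
        # One column per learned letter, holding its count in each document.
        return [[doc.count(ch) for ch in self.alphabet_] for doc in X]


class InnerModelAdapter:
    """Sketch of a vectorizer that delegates to a wrapped estimator (assumption)."""

    def __init__(self, vectorizer: Any) -> None:
        self.innermodel = vectorizer
        self._fitted = False

    @property
    def fitted(self) -> bool:
        return self._fitted

    def fit(self, x_data: Sequence[str], **kwargs) -> "InnerModelAdapter":
        self.innermodel = self.innermodel.fit(x_data)
        self._fitted = True
        return self

    def transform(self, x_data: Sequence[str], **kwargs) -> List[List[int]]:
        return self.innermodel.transform(x_data)

    def fit_transform(self, x_data: Sequence[str], **kwargs) -> List[List[int]]:
        return self.fit(x_data, **kwargs).transform(x_data, **kwargs)


adapter = InnerModelAdapter(ToyEstimator())
x_mat = adapter.fit_transform(["abba", "cab"])
```

With the real class, the constructor signature above suggests passing a scikit-learn estimator as the vectorizer argument; which estimators are supported and how their output is converted is not covered by this page.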
- class instancelib.feature_extraction.TextInstanceVectorizer(vectorizer)[source]
Bases: BaseVectorizer[Instance[Any, str, ndarray, Any]]
- Parameters:
vectorizer (BaseVectorizer[str]) –
- fit(x_data, **kwargs)[source]
Fit the vectorizer according to the data in the given Sequence.
- Parameters:
x_data (Sequence[DT]) – A Sequence of examples with type DT.
- Returns:
A fitted vectorizer for data with type DT
- Return type:
Examples
Assume the creation of a vectorizer and a sequence of data examples in the variable data_list
>>> vectorizer = BaseVectorizer[DT]()
>>> vectorizer = vectorizer.fit(data_list)
- Parameters:
kwargs (Any) –
- fit_transform(x_data, **kwargs)[source]
Transform a list of data to a feature matrix. The transformation is based on the data contained in the parameter x_data. Subsequent transformations with transform() will be based on the fit of the data provided in this call.
- Parameters:
x_data (Sequence[DT]) – A sequence of raw data examples with length n_examples
- Returns:
A feature matrix with shape (n_examples, n_features)
- Return type:
npt.NDArray[Any]
Examples
Assume a vectorizer has been created; it does not need to be fitted beforehand, because this call fits it
>>> x_mat = vectorizer.fit_transform(x_data)
- Parameters:
kwargs (Any) –
- property fitted: bool
Check if the vectorizer has been fitted
- Returns:
True if the vectorizer has been fitted
- Return type:
bool
- transform(x_data, **kwargs)[source]
Transform a list of raw data points to a feature matrix according to the fitted vectorizer.
- Parameters:
x_data (Sequence[DT]) – A sequence of raw data examples with length n_examples
- Returns:
A feature matrix with shape (n_examples, n_features)
- Return type:
npt.NDArray[Any]
Examples
Assume the vectorizer is fitted
>>> x_mat = vectorizer.transform(x_data)
- Parameters:
kwargs (Any) –
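Judging from its base class, this vectorizer accepts instances whose data is a string and wraps a vectorizer for plain strings. One plausible way to realize that, shown here as a sketch rather than the library's implementation, is to extract the text from each instance and hand it to the wrapped vectorizer. ToyInstance, its identifier, data and vector fields, WordLengthVectorizer, and InstanceVectorizerSketch are all hypothetical names introduced for this illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass
class ToyInstance:
    """Hypothetical stand-in for a text instance (not instancelib's Instance)."""
    identifier: int
    data: str
    vector: Optional[List[int]] = None


class WordLengthVectorizer:
    """Hypothetical string vectorizer with features [word count, character count]."""

    def fit(self, x_data: Sequence[str], **kwargs) -> "WordLengthVectorizer":
        return self  # nothing to learn in this toy example

    def transform(self, x_data: Sequence[str], **kwargs) -> List[List[int]]:
        return [[len(text.split()), len(text)] for text in x_data]


class InstanceVectorizerSketch:
    """Sketch: adapt a string vectorizer so that it accepts instances."""

    def __init__(self, vectorizer: WordLengthVectorizer) -> None:
        self.innermodel = vectorizer

    @staticmethod
    def _texts(instances: Sequence[ToyInstance]) -> List[str]:
        # Pull the raw text out of every instance.
        return [ins.data for ins in instances]

    def fit(self, x_data: Sequence[ToyInstance], **kwargs) -> "InstanceVectorizerSketch":
        self.innermodel.fit(self._texts(x_data), **kwargs)
        return self

    def transform(self, x_data: Sequence[ToyInstance], **kwargs) -> List[List[int]]:
        return self.innermodel.transform(self._texts(x_data), **kwargs)

    def fit_transform(self, x_data: Sequence[ToyInstance], **kwargs) -> List[List[int]]:
        return self.fit(x_data, **kwargs).transform(x_data, **kwargs)


instances = [ToyInstance(1, "hello world"), ToyInstance(2, "hi")]
x_mat = InstanceVectorizerSketch(WordLengthVectorizer()).fit_transform(instances)
```

Each row of x_mat corresponds to the instance at the same position in the input sequence, matching the (n_examples, n_features) shape described above.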