instancelib.environment.memory module
- class instancelib.environment.memory.AbstractMemoryEnvironment(*args, **kwds)[source]
Bases:
AbstractEnvironment[InstanceType, KT, DT, VT, RT, LT], ABC, Generic[InstanceType, KT, DT, VT, RT, LT]
Environments provide an interface that enables you to access all data stored in the datasets. If there are labels stored in the environment, you can access these as well from here.
There are two important properties in every Environment:
- dataset(): Contains all Instances of the original dataset
- labels(): Contains an object that allows you to access labels easily
Besides these properties, this object also provides methods to create new InstanceProvider objects that contain a subset of the set of all instances stored in this environment.
- Variables:
_public_dataset – An InstanceProvider that contains all original Instances
_dataset – An InstanceProvider that contains all instances
_labelprovider – This object contains all labels
_named_provider – All user generated providers that were given a name
Examples
Access the dataset:
>>> dataset = env.dataset
>>> instance = next(iter(dataset.values()))
Access the labels:
>>> labels = env.labels
>>> ins_lbls = labels.get_labels(instance)
Create a train-test split on the dataset (70 % train, 30 % test):
>>> train, test = env.train_test_split(dataset, 0.70)
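The idea behind a 70/30 split can be sketched without the library. The snippet below is only an illustration, not instancelib's implementation: `split_keys`, its `seed` parameter, and the dict-based `toy_dataset` are hypothetical names introduced here, and whether `train_test_split` shuffles or rounds in the same way is an assumption.

```python
import random
from typing import Dict, List, Sequence, Tuple, TypeVar

KT = TypeVar("KT")


def split_keys(
    keys: Sequence[KT], train_fraction: float, seed: int = 42
) -> Tuple[List[KT], List[KT]]:
    """Shuffle the keys and cut them into a train part and a test part.

    Hypothetical helper for illustration; not part of instancelib.
    """
    if not 0.0 <= train_fraction <= 1.0:
        raise ValueError("train_fraction must lie in [0, 1]")
    shuffled = list(keys)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Toy stand-in for a dataset: a plain mapping from key to data.
toy_dataset: Dict[int, str] = {i: f"document {i}" for i in range(10)}
train_keys, test_keys = split_keys(list(toy_dataset), 0.70)
```

Splitting on keys rather than on the instances themselves keeps both parts cheap to build, since each part can then be turned into its own provider over the same stored instances.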
- property all_instances: InstanceProvider[InstanceType, KT, DT, VT, RT]
This provider should include all instances in all providers. If any synthetic data points have been constructed, they should be included here as well.
- Returns:
The all_instances InstanceProvider
- Return type:
InstanceProvider[InstanceType, KT, DT, VT, RT]
- create_bucket(keys)[source]
Create an InstanceProvider that contains certain keys found in this environment.
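Conceptually, a bucket is a view of the environment restricted to a given set of keys. The sketch below shows that idea on a plain dictionary; `make_bucket` and `store` are hypothetical names, and the choice to raise `KeyError` for unknown keys is an assumption of this sketch, not documented behaviour of `create_bucket`.

```python
from typing import Dict, Iterable, TypeVar

KT = TypeVar("KT")
VT = TypeVar("VT")


def make_bucket(store: Dict[KT, VT], keys: Iterable[KT]) -> Dict[KT, VT]:
    """Return a new mapping that holds only the requested keys.

    Hypothetical helper for illustration. A key that is missing from
    the store raises KeyError (an assumption made for this sketch).
    """
    return {key: store[key] for key in keys}


# Toy stand-in for the instances stored in an environment.
store = {1: "first text", 2: "second text", 3: "third text"}
bucket = make_bucket(store, [1, 3])
```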
- create_empty_provider()[source]
Use this method to create an empty InstanceProvider
- Returns:
The newly created provider
- Return type:
InstanceProvider[InstanceType, KT, DT, VT, RT]
- property dataset: InstanceProvider[InstanceType, KT, DT, VT, RT]
This property contains the InstanceProvider that contains the original dataset. This provider should include all original instances.
- Returns:
The dataset InstanceProvider
- Return type:
InstanceProvider[InstanceType, KT, DT, VT, RT]
- property labels: LabelProvider[KT, LT]
This property contains the provider that maps instances to labels and vice versa.
- Returns:
The label provider
- Return type:
LabelProvider[KT, LT]
- class instancelib.environment.memory.MemoryEnvironment(dataset, labelprovider)[source]
Bases:
AbstractMemoryEnvironment[InstanceType, KT, DT, VT, RT, LT], Generic[InstanceType, KT, DT, VT, RT, LT]
This class implements the ABC Environment. In this implementation, all data is loaded and stored in RAM and is not preserved on disk.
There are two important properties in every Environment:
- dataset(): Contains all Instances of the original dataset
- labels(): Contains an object that allows you to access labels easily
Besides these properties, this object also provides methods to create new InstanceProvider objects that contain a subset of the set of all instances stored in this environment.
- Parameters:
dataset (InstanceProvider[InstanceType, KT, DT, VT, RT]) – An InstanceProvider that contains all Instances
labelprovider (MemoryLabelProvider[KT, LT]) – The label provider that contains the labels associated with the instances from the dataset variable
- Variables:
_public_dataset – An InstanceProvider that contains all original Instances
_dataset – An InstanceProvider that contains all instances
_labelprovider – This object contains all labels
_named_provider – All user generated providers that were given a name
Examples
Access the dataset:
>>> dataset = env.dataset
>>> instance = next(iter(dataset.values()))
Access the labels:
>>> labels = env.labels
>>> ins_lbls = labels.get_labels(instance)
Create a train-test split on the dataset (70 % train, 30 % test):
>>> train, test = env.train_test_split(dataset, 0.70)
Store the environment to disk:
>>> import pickle
>>> with open("file.pkl", "wb") as fh:
...     pickle.dump(env, fh)
>>> print("The file is saved to file.pkl")
Load the environment from disk:
>>> import pickle
>>> with open("file.pkl", "rb") as fh:
...     env = pickle.load(fh)
>>> dataset = env.dataset
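Because the data lives only in RAM, pickling is what carries an environment across sessions. The round trip below is a minimal, self-contained sketch that uses a plain dictionary (`toy_env`, a hypothetical stand-in) instead of a real MemoryEnvironment, and writes to a temporary directory so that nothing is left behind.

```python
import os
import pickle
import tempfile

# Hypothetical stand-in for an environment: a dataset and its labels.
toy_env = {
    "dataset": {1: "first text", 2: "second text"},
    "labels": {1: {"positive"}, 2: {"negative"}},
}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "file.pkl")

    # Serialise the object to disk in binary mode.
    with open(path, "wb") as fh:
        pickle.dump(toy_env, fh)

    # Read it back; the result is an equal but independent copy.
    with open(path, "rb") as fh:
        restored = pickle.load(fh)
```

Note that `pickle.load` can execute arbitrary code while deserialising, so only load pickle files that come from a trusted source.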