instancelib.instances.dataset module

class instancelib.instances.dataset.PandasDataset(df, data_col)[source]

Bases: ReadOnlyDataset[int, Any]

Parameters:
  • df –

  • data_col –
get_bulk(keys)[source]
Parameters:

keys (Sequence[int]) –

Return type:

Sequence[Any]

property identifiers: FrozenSet[int]
class instancelib.instances.dataset.ReadOnlyDataset(*args, **kwds)[source]

Bases: Mapping[KT, DT], ABC, Generic[KT, DT]

get_bulk(keys)[source]
Parameters:

keys (Sequence[TypeVar(KT)]) –

Return type:

Sequence[TypeVar(DT)]

abstract property identifiers: FrozenSet[KT]
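
The interface above (a read-only mapping with an identifiers property and a get_bulk method) can be illustrated with a minimal sketch. DictBackedDataset is a hypothetical stand-in, not part of instancelib. It subclasses collections.abc.Mapping directly so the example runs without the library, and it assumes that get_bulk returns values in the same order as the requested keys, which the documentation above does not state.

```python
from collections.abc import Mapping
from typing import Any, Dict, FrozenSet, Iterator, Sequence


class DictBackedDataset(Mapping):
    """Hypothetical read-only dataset; a stand-in, not instancelib code."""

    def __init__(self, records: Dict[int, Any]) -> None:
        # Copy the input so later changes by the caller do not leak in.
        self._records = dict(records)

    def __getitem__(self, key: int) -> Any:
        return self._records[key]

    def __iter__(self) -> Iterator[int]:
        return iter(self._records)

    def __len__(self) -> int:
        return len(self._records)

    @property
    def identifiers(self) -> FrozenSet[int]:
        # All keys that can be looked up in this dataset.
        return frozenset(self._records)

    def get_bulk(self, keys: Sequence[int]) -> Sequence[Any]:
        # Assumption: results follow the order of the requested keys.
        return [self[k] for k in keys]


dataset = DictBackedDataset({0: "first", 1: "second", 2: "third"})
bulk = dataset.get_bulk([2, 0])
```

Because only the Mapping read methods are defined, item assignment such as dataset[3] = "x" raises a TypeError, which is what makes the sketch read-only.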
class instancelib.instances.dataset.ReadOnlyProvider(dataset, from_data_builder, local_data)[source]

Bases: InstanceProvider[IT, KT, DT, ndarray, RT], Generic[IT, KT, DT, RT]

Parameters:
  • dataset –

  • from_data_builder –

  • local_data –
build_from_external(k)[source]
Parameters:

k (TypeVar(KT)) –

Return type:

TypeVar(IT, bound= Instance[Any, Any, Any, Any])

construct(*args, **kwargs)[source]
Parameters:
  • args (Any) –

  • kwargs (Any) –

Return type:

TypeVar(IT, bound= Instance[Any, Any, Any, Any])

create(*args, **kwargs)[source]

Create a new instance of type InstanceType. The created instance is subsequently added to the provider.

Note: The positional and keyword arguments depend on the concrete implementation, so no standard arguments are defined.

Returns:

The new instance

Return type:

InstanceType

Parameters:
  • args (Any) –

  • kwargs (Any) –

data_chunker(batch_size=200)[source]

Iterate over the data parts of all instances in this provider, in batches.

Parameters:

batch_size (int) – The batch size. The generator yields sequences of at most batch_size items.

Yields:

Sequence[Tuple[KT,DT]] – A sequence of (key, data) tuples with length batch_size. The last sequence may be shorter.

Return type:

Iterator[Sequence[Tuple[TypeVar(KT), TypeVar(DT)]]]
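
The batching behaviour described for data_chunker can be sketched independently of the library. chunk_items is a hypothetical helper, not an instancelib function, and it assumes the data is available as a plain dictionary from key to data. It yields lists of (key, data) tuples of length batch_size, with a shorter final list when the total is not a multiple of the batch size.

```python
from typing import Dict, Iterator, List, Sequence, Tuple, TypeVar

KT = TypeVar("KT")
DT = TypeVar("DT")


def chunk_items(
    data: Dict[KT, DT], batch_size: int = 200
) -> Iterator[Sequence[Tuple[KT, DT]]]:
    """Hypothetical helper: yield (key, data) tuples in fixed-size batches."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    batch: List[Tuple[KT, DT]] = []
    for key, value in data.items():
        batch.append((key, value))
        if len(batch) == batch_size:
            yield batch
            batch = []  # start a fresh list so yielded batches stay intact
    if batch:
        # The remainder forms a final, shorter batch.
        yield batch


records = {i: f"doc {i}" for i in range(5)}
chunks = list(chunk_items(records, batch_size=2))
sizes = [len(c) for c in chunks]
```

With five records and a batch size of two, the generator produces two full batches followed by one batch holding the single remaining item.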

data_chunker_selector(keys, batch_size=200)[source]
Parameters:
  • keys –

  • batch_size (int) –

Return type:

Iterator[Sequence[Tuple[TypeVar(KT), TypeVar(DT)]]]

property key_list: Sequence[KT]

Return a list of all instance keys in this provider

Returns:

A list of instance keys

Return type:

List[KT]

local_data: InstanceProvider[TypeVar(IT, bound= Instance[Any, Any, Any, Any]), TypeVar(KT), TypeVar(DT), ndarray[Any, dtype[Any]], TypeVar(RT)]
update_external(ins)[source]
Parameters:

ins (Instance[TypeVar(KT), TypeVar(DT), ndarray[Any, dtype[Any]], TypeVar(RT)]) –

Return type:

None