Generators

CorpusGenerator

class kashgari.generators.CorpusGenerator(x_data: List[T], y_data: List[T], *, buffer_size: int = 2000)

Bases: kashgari.generators.ABCGenerator

__init__(x_data: List[T], y_data: List[T], *, buffer_size: int = 2000) → None

Initialize self. See help(type(self)) for accurate signature.

sample() → Iterator[Tuple[Any, Any]]
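
The reference gives no description for sample() beyond its return type. The sketch below shows one plausible reading, assuming that buffer_size bounds a shuffle buffer from which (x, y) pairs are drawn at random. SketchCorpusGenerator is a hypothetical stand-in that only mirrors the documented constructor signature; it is not Kashgari's implementation.

```python
import random
from typing import Any, Iterator, List, Tuple


class SketchCorpusGenerator:
    """Hypothetical stand-in mirroring the documented constructor signature."""

    def __init__(self, x_data: List[Any], y_data: List[Any], *, buffer_size: int = 2000) -> None:
        if len(x_data) != len(y_data):
            raise ValueError("x_data and y_data must have the same length")
        self.x_data = x_data
        self.y_data = y_data
        self.buffer_size = buffer_size

    def __iter__(self) -> Iterator[Tuple[Any, Any]]:
        # Yield aligned (x, y) pairs in their original order.
        return iter(zip(self.x_data, self.y_data))

    def __len__(self) -> int:
        return len(self.x_data)

    def sample(self) -> Iterator[Tuple[Any, Any]]:
        # Assumed behaviour: shuffle through a bounded buffer, so only about
        # buffer_size pairs are held in memory at any time.
        buffer: List[Tuple[Any, Any]] = []
        for pair in self:
            buffer.append(pair)
            if len(buffer) >= self.buffer_size:
                # Emit a randomly chosen pair to make room for the next one.
                yield buffer.pop(random.randrange(len(buffer)))
        # Drain the remaining pairs in random order once the input is exhausted.
        random.shuffle(buffer)
        yield from buffer


x = [["hello", "world"], ["good", "morning"], ["see", "you"]]
y = ["greeting", "greeting", "farewell"]
pairs = list(SketchCorpusGenerator(x, y, buffer_size=2).sample())
```

Under this assumption every pair is yielded exactly once per pass, in a randomized order, and each x stays aligned with its y.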

BatchDataSet

class kashgari.generators.BatchDataSet(corpus: kashgari.generators.CorpusGenerator, *, text_processor: ABCProcessor, label_processor: ABCProcessor, seq_length: int = None, max_position: int = None, segment: bool = False, batch_size: int = 64)

Bases: collections.abc.Iterable, typing.Generic

__init__(corpus: kashgari.generators.CorpusGenerator, *, text_processor: ABCProcessor, label_processor: ABCProcessor, seq_length: int = None, max_position: int = None, segment: bool = False, batch_size: int = 64) → None

Initialize self. See help(type(self)) for accurate signature.
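
To illustrate the role of batch_size, here is a minimal sketch of grouping (x, y) pairs into fixed-size batches. iter_batches is a hypothetical helper, not part of Kashgari. It leaves out the text_processor and label_processor transformation step, and it yields a trailing partial batch, which the reference does not confirm for BatchDataSet.

```python
from typing import Any, Iterable, Iterator, List, Tuple


def iter_batches(
    pairs: Iterable[Tuple[Any, Any]], batch_size: int = 64
) -> Iterator[Tuple[List[Any], List[Any]]]:
    """Group (x, y) pairs into (batch_x, batch_y) lists of up to batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    batch_x: List[Any] = []
    batch_y: List[Any] = []
    for x, y in pairs:
        batch_x.append(x)
        batch_y.append(y)
        if len(batch_x) == batch_size:
            yield batch_x, batch_y
            # Start fresh lists so batches already yielded are not mutated.
            batch_x, batch_y = [], []
    # Assumption: keep the final, smaller batch rather than dropping it.
    if batch_x:
        yield batch_x, batch_y


batches = list(iter_batches(zip(range(5), "abcde"), batch_size=2))
```

With five pairs and batch_size=2, the sketch produces two full batches followed by one batch holding the single remaining pair.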

take(batch_count: int = None) → Any

Take batches from the dataset.

Parameters: batch_count – number of batches to take; iterates forever when batch_count is None.
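
The batch_count semantics can be sketched as follows. take_batches is a hypothetical free function, not the library method. It assumes the underlying batches can be iterated more than once, and that a batch_count larger than one pass over the data wraps around to the start, which is consistent with "iterate forever" but not stated explicitly in the reference.

```python
from itertools import islice
from typing import Any, Iterable, Iterator, List, Optional


def take_batches(
    batches: Iterable[List[Any]], batch_count: Optional[int] = None
) -> Iterator[List[Any]]:
    """Yield batch_count batches, cycling over `batches`; never stop if batch_count is None."""
    taken = 0
    while True:
        saw_any = False
        for batch in batches:
            saw_any = True
            if batch_count is not None and taken >= batch_count:
                return
            taken += 1
            yield batch
        if not saw_any:
            # Empty input: stop instead of looping forever without yielding.
            return


data = [[1, 2], [3, 4]]
five = list(take_batches(data, batch_count=5))
# five == [[1, 2], [3, 4], [1, 2], [3, 4], [1, 2]]

# With batch_count=None the generator is endless, so the caller bounds it.
first_three = list(islice(take_batches(data), 3))
# first_three == [[1, 2], [3, 4], [1, 2]]
```

An endless generator of this kind suits training loops that decide for themselves how many steps to draw per epoch; a caller that passes batch_count=None has to stop consuming on its own.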