Generators

CorpusGenerator

class kashgari.generators.CorpusGenerator(*args, **kwds)

Bases: kashgari.generators.ABCGenerator

__init__(x_data, y_data, *, buffer_size=2000)

Initialize self. See help(type(self)) for accurate signature.

Parameters
  • x_data (List) –

  • y_data (List) –

  • buffer_size (int) –

Return type

None

sample()

Return type

Iterator[Tuple[Any, Any]]
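
The entry above gives only the signature, so the following is a minimal sketch of how an object with this interface could behave, not Kashgari's implementation. ToyCorpusGenerator is a hypothetical stand-in that copies the documented constructor and the Iterator[Tuple[Any, Any]] return type of sample(). Treating buffer_size as the size of a shuffle buffer is an assumption; the documentation does not describe the parameter.

```python
import random
from typing import Any, Iterator, List, Tuple


class ToyCorpusGenerator:
    """Hypothetical stand-in with the documented signature; not Kashgari's code."""

    def __init__(self, x_data: List, y_data: List, *, buffer_size: int = 2000) -> None:
        self.x_data = x_data
        self.y_data = y_data
        self.buffer_size = buffer_size

    def sample(self) -> Iterator[Tuple[Any, Any]]:
        # Assumption: buffer_size bounds a shuffle buffer (not stated in the docs).
        buffer: List[Tuple[Any, Any]] = []
        for pair in zip(self.x_data, self.y_data):
            buffer.append(pair)
            if len(buffer) >= self.buffer_size:
                # Buffer is full: emit one randomly chosen (x, y) pair.
                yield buffer.pop(random.randrange(len(buffer)))
        # Input exhausted: emit the remaining pairs in random order.
        random.shuffle(buffer)
        yield from buffer


x_data = [["hello", "world"], ["good", "morning"], ["see", "you"]]
y_data = ["greeting", "greeting", "farewell"]

generator = ToyCorpusGenerator(x_data, y_data, buffer_size=2)
pairs = list(generator.sample())  # every (x, y) pair exactly once, order may vary
```

Each element of pairs is one (x, y) tuple, which matches the documented return type of sample().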

BatchDataSet

class kashgari.generators.BatchDataSet(*args, **kwds)

Bases: collections.abc.Iterable, typing.Generic

__init__(corpus, *, text_processor, label_processor, seq_length=None, max_position=None, segment=False, batch_size=64)

Initialize self. See help(type(self)) for accurate signature.

Return type

None

take(batch_count=None)

Take batches from the dataset.

Parameters

batch_count (int) – number of batches to take; iterates forever when batch_count is None.

Return type

Any
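
The batch_count behavior described above can be illustrated with a small sketch. ToyBatchDataSet is a hypothetical stand-in, not the library class: it leaves out text_processor, label_processor, seq_length, max_position and segment, and simply groups raw (x, y) pairs into lists of batch_size items. Restarting from the beginning of the corpus once it is exhausted is an assumption about how "iterates forever" is realized.

```python
from itertools import islice
from typing import Any, Iterable, Iterator, List, Optional, Tuple


class ToyBatchDataSet:
    """Hypothetical stand-in for the take() contract; not Kashgari's code."""

    def __init__(self, pairs: Iterable[Tuple[Any, Any]], *, batch_size: int = 64) -> None:
        self.pairs = list(pairs)
        self.batch_size = batch_size

    def __iter__(self) -> Iterator[List[Tuple[Any, Any]]]:
        # One pass over the data, cut into consecutive batches.
        for start in range(0, len(self.pairs), self.batch_size):
            yield self.pairs[start:start + self.batch_size]

    def take(self, batch_count: Optional[int] = None) -> Iterator[List[Tuple[Any, Any]]]:
        if not self.pairs:
            return  # nothing to yield; avoids spinning forever on empty data
        produced = 0
        # Assumption: after the last batch, start over from the first one.
        while batch_count is None or produced < batch_count:
            for batch in self:
                if batch_count is not None and produced >= batch_count:
                    return
                yield batch
                produced += 1


dataset = ToyBatchDataSet([(i, i % 2) for i in range(5)], batch_size=2)

three = list(dataset.take(3))             # stops after exactly three batches
endless = list(islice(dataset.take(), 4)) # batch_count=None never stops; cap it
```

Because take() with batch_count=None never terminates on its own, the caller has to bound it, here with itertools.islice.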