Data Processors
SequenceProcessor
- class kashgari.processors.SequenceProcessor(build_in_vocab='text', min_count=3, build_vocab_from_labels=False, **kwargs)
Bases: kashgari.processors.abc_processor.ABCProcessor
Generic processor for sequence samples.
- build_vocab(x_data, y_data)
- build_vocab_generator(generators)
- Parameters
generators (List[kashgari.generators.CorpusGenerator])
- get_tensor_shape(batch_size, seq_length)
- inverse_transform(labels, *, lengths=None, threshold=0.5, **kwargs)
- Parameters
labels (Union[List[List[int]], numpy.ndarray])
lengths (Optional[List[int]])
threshold (float)
kwargs (Any)
- Return type
List[List[str]]
- transform(samples, *, seq_length=None, max_position=None, segment=False)
- property is_vocab_build: bool
- property vocab_size: int
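The members listed above suggest a build, transform, inverse-transform cycle. The sketch below is a standalone toy, not the Kashgari implementation: the class name `ToySequenceProcessor`, the `[PAD]` and `[UNK]` tokens, and the index layout are all hypothetical. It also assumes that `min_count` is a frequency cutoff, that `build_vocab_from_labels` selects `y_data` as the vocabulary source, and that `lengths` trims padding on the way back. The `max_position` and `segment` arguments of `transform` are left out.

```python
from collections import Counter
from typing import Dict, List, Optional

PAD, UNK = "[PAD]", "[UNK]"  # hypothetical special tokens


class ToySequenceProcessor:
    """Illustrative stand-in only; not the Kashgari implementation."""

    def __init__(self, min_count: int = 3,
                 build_vocab_from_labels: bool = False) -> None:
        self.min_count = min_count
        self.build_vocab_from_labels = build_vocab_from_labels
        self.vocab2idx: Dict[str, int] = {}
        self.idx2vocab: Dict[int, str] = {}

    @property
    def is_vocab_build(self) -> bool:
        return len(self.vocab2idx) > 0

    @property
    def vocab_size(self) -> int:
        return len(self.vocab2idx)

    def build_vocab(self, x_data: List[List[str]],
                    y_data: List[List[str]]) -> None:
        # Assumption: the flag picks which side of the corpus is counted.
        source = y_data if self.build_vocab_from_labels else x_data
        counts = Counter(token for sample in source for token in sample)
        self.vocab2idx = {PAD: 0, UNK: 1}
        # Most frequent tokens get the lowest indices; rare ones are dropped.
        for token, count in counts.most_common():
            if count >= self.min_count:
                self.vocab2idx[token] = len(self.vocab2idx)
        self.idx2vocab = {i: t for t, i in self.vocab2idx.items()}

    def transform(self, samples: List[List[str]], *,
                  seq_length: Optional[int] = None) -> List[List[int]]:
        if seq_length is None:
            seq_length = max(len(s) for s in samples)
        pad, unk = self.vocab2idx[PAD], self.vocab2idx[UNK]
        rows = []
        for sample in samples:
            # Truncate to seq_length, map unknown tokens to UNK, then pad.
            ids = [self.vocab2idx.get(t, unk) for t in sample[:seq_length]]
            rows.append(ids + [pad] * (seq_length - len(ids)))
        return rows

    def inverse_transform(self, labels: List[List[int]], *,
                          lengths: Optional[List[int]] = None
                          ) -> List[List[str]]:
        result = []
        for i, row in enumerate(labels):
            if lengths is not None:
                row = row[:lengths[i]]  # drop the padded tail
            result.append([self.idx2vocab[idx] for idx in row])
        return result


corpus = [["a", "b", "a"], ["a", "b", "c"]]
proc = ToySequenceProcessor(min_count=2)
proc.build_vocab(corpus, [])
ids = proc.transform([["a", "c"], ["b", "a", "b"]])
tokens = proc.inverse_transform(ids, lengths=[2, 3])
```

In this toy, `"c"` occurs once and falls below `min_count=2`, so it maps to the unknown index and comes back as `[UNK]`.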
ClassificationProcessor
- class kashgari.processors.ClassificationProcessor(multi_label=False, **kwargs)
Bases: kashgari.processors.abc_processor.ABCProcessor
- __init__(multi_label=False, **kwargs)
Initialize self. See help(type(self)) for accurate signature.
- build_vocab(x_data, y_data)
- build_vocab_generator(generators)
- Parameters
generators (List[kashgari.generators.CorpusGenerator])
- transform(samples, *, seq_length=None, max_position=None, segment=False)
- property is_vocab_build: bool
- property vocab_size: int
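The `multi_label` flag is not described on this page. A plausible reading, and only an assumption here, is that it switches label encoding from one class index per sample to one multi-hot vector per sample. The toy below illustrates that distinction; `ToyLabelEncoder` and its alphabetical index order are hypothetical, and the class does not mirror Kashgari internals.

```python
from typing import Dict, List, Sequence, Union


class ToyLabelEncoder:
    """Illustrative stand-in only; not the Kashgari implementation."""

    def __init__(self, multi_label: bool = False) -> None:
        self.multi_label = multi_label
        self.label2idx: Dict[str, int] = {}

    @property
    def vocab_size(self) -> int:
        return len(self.label2idx)

    def build_vocab(self, y_data: Sequence) -> None:
        if self.multi_label:
            # Each sample carries a list of labels.
            labels = sorted({label for sample in y_data for label in sample})
        else:
            # Each sample carries exactly one label.
            labels = sorted(set(y_data))
        self.label2idx = {label: i for i, label in enumerate(labels)}

    def transform(self, samples: Sequence) -> List[Union[int, List[int]]]:
        if not self.multi_label:
            return [self.label2idx[label] for label in samples]
        rows = []
        for sample in samples:
            row = [0] * self.vocab_size  # one slot per known label
            for label in sample:
                row[self.label2idx[label]] = 1
            rows.append(row)
        return rows


single = ToyLabelEncoder()
single.build_vocab(["spam", "ham", "spam"])
single_ids = single.transform(["ham", "spam"])

multi = ToyLabelEncoder(multi_label=True)
multi.build_vocab([["news", "sport"], ["tech"]])
multi_ids = multi.transform([["tech", "news"], []])
```

Under this reading, a sample with no labels encodes as an all-zero row in multi-label mode, which has no equivalent in the single-label case.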