Data Processors
SequenceProcessor
class kashgari.processors.SequenceProcessor(build_in_vocab='text', min_count=3, build_vocab_from_labels=False, **kwargs)
Bases: kashgari.processors.abc_processor.ABCProcessor
Generic processor for sequence samples.
- build_vocab(x_data, y_data)
- build_vocab_generator(generators)
  - Parameters
    - generators (List[kashgari.generators.CorpusGenerator])
- get_tensor_shape(batch_size, seq_length)
- inverse_transform(labels, *, lengths=None, threshold=0.5, **kwargs)
  - Parameters
    - labels (Union[List[List[int]], numpy.ndarray])
    - lengths (List[int])
    - threshold (float)
    - kwargs (Any)
  - Return type
    - List[List[str]]
- transform(samples, *, seq_length=None, max_position=None, segment=False)
- property is_vocab_build
- property vocab_size
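The method names above suggest the usual vocabulary workflow: count tokens, keep the frequent ones, map samples to padded id sequences, and map ids back to tokens. The following is a minimal standalone sketch of that workflow, not Kashgari's implementation. The free functions reuse the method names for readability only; the `[PAD]` and `[UNK]` tokens, the id ordering, and the padding behavior are assumptions made for this illustration.

```python
from collections import Counter
from typing import Dict, List, Optional

PAD, UNK = "[PAD]", "[UNK]"  # assumed special tokens, chosen for this sketch


def build_vocab(samples: List[List[str]], min_count: int = 3) -> Dict[str, int]:
    """Map every token seen at least `min_count` times to an integer id."""
    counts = Counter(token for sample in samples for token in sample)
    vocab = {PAD: 0, UNK: 1}
    # Sort by descending frequency, then alphabetically, so ids are deterministic.
    for token, count in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        if count >= min_count:
            vocab[token] = len(vocab)
    return vocab


def transform(samples: List[List[str]],
              vocab: Dict[str, int],
              seq_length: Optional[int] = None) -> List[List[int]]:
    """Convert token lists to id lists of equal length (truncate or pad)."""
    if seq_length is None:
        seq_length = max(len(sample) for sample in samples)
    rows = []
    for sample in samples:
        # Tokens missing from the vocabulary fall back to the UNK id.
        ids = [vocab.get(token, vocab[UNK]) for token in sample[:seq_length]]
        rows.append(ids + [vocab[PAD]] * (seq_length - len(ids)))
    return rows


def inverse_transform(rows: List[List[int]],
                      vocab: Dict[str, int],
                      lengths: Optional[List[int]] = None) -> List[List[str]]:
    """Convert id lists back to tokens, cutting each row to its true length."""
    idx2token = {idx: token for token, idx in vocab.items()}
    result = []
    for i, row in enumerate(rows):
        n = lengths[i] if lengths is not None else len(row)
        result.append([idx2token[idx] for idx in row[:n]])
    return result
```

Passing `lengths` to the inverse step strips the padding that `transform` added, which is presumably why the documented `inverse_transform` accepts a `lengths` argument.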
ClassificationProcessor
class kashgari.processors.ClassificationProcessor(multi_label=False, **kwargs)
Bases: kashgari.processors.abc_processor.ABCProcessor
- __init__(multi_label=False, **kwargs)
  Initialize self. See help(type(self)) for accurate signature.
- build_vocab(x_data, y_data)
- build_vocab_generator(generators)
  - Parameters
    - generators (List[kashgari.generators.CorpusGenerator])
- transform(samples, *, seq_length=None, max_position=None, segment=False)
- property is_vocab_build
- property vocab_size
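The `multi_label` flag implies two label encodings. The sketch below shows the common convention for each: one integer class index per sample in the single-label case, and a multi-hot vector per sample in the multi-label case. It is a standalone illustration and does not call Kashgari; the helper names `build_label_vocab` and `encode_labels` are hypothetical, and whether the library encodes labels exactly this way is an assumption.

```python
from typing import Dict, List


def build_label_vocab(labels: list, multi_label: bool = False) -> Dict[str, int]:
    """Assign an index to every distinct label, in sorted order."""
    if multi_label:
        # Each sample carries a list of labels, so flatten before deduplicating.
        unique = {label for sample in labels for label in sample}
    else:
        unique = set(labels)
    return {label: idx for idx, label in enumerate(sorted(unique))}


def encode_labels(labels: list, vocab: Dict[str, int], multi_label: bool = False) -> list:
    """Encode labels as class indices (single-label) or multi-hot rows."""
    if not multi_label:
        return [vocab[label] for label in labels]
    rows: List[List[int]] = []
    for sample in labels:
        row = [0] * len(vocab)
        for label in sample:
            row[vocab[label]] = 1  # mark every label present in this sample
        rows.append(row)
    return rows
```

With multi-hot targets a model usually emits one probability per label, so decoding needs a cutoff such as the `threshold=0.5` default that appears in the `inverse_transform` signature documented earlier.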