gec_metrics.analysis.attributor package

Submodules

Module contents

class gec_metrics.analysis.attributor.AttributorAdd(config)

Bases: AttributorBase

generate(src: str, edits: list[Edit] | list[list[Edit]]) → list[dict]

Generate edited sentences by applying each edit to src, one at a time.

Parameters:

src (str) – source sentence.
edits (list[errant.edit.Edit]) – Edits to be applied to the source.

Returns:

Each element has two keys:

"sentence": An edited sentence.
"indices": Indices of the edits that were applied to the source sentence.

Return type:

list[Dict]
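The one-by-one application described above can be sketched as follows. This is a minimal illustration, not the library's implementation: `ToyEdit`, `apply_edits`, and `generate_add` are hypothetical names, the edit is a simplified stand-in for `errant.edit.Edit` (it only borrows the `o_start`, `o_end`, and `c_str` attribute names), and representing "indices" as a one-element tuple is an assumption.

```python
from dataclasses import dataclass


@dataclass
class ToyEdit:
    """Hypothetical, simplified stand-in for errant.edit.Edit."""
    o_start: int  # start token index in the source (inclusive)
    o_end: int    # end token index in the source (exclusive)
    c_str: str    # replacement text ("" deletes the span)


def apply_edits(src: str, edits: list[ToyEdit]) -> str:
    """Apply non-overlapping edits to a whitespace-tokenized sentence."""
    tokens = src.split()
    # Apply from right to left so that earlier token offsets stay valid.
    for e in sorted(edits, key=lambda e: (e.o_start, e.o_end), reverse=True):
        tokens[e.o_start:e.o_end] = e.c_str.split()
    return " ".join(tokens)


def generate_add(src: str, edits: list[ToyEdit]) -> list[dict]:
    """Build one sentence per edit, each with exactly that edit applied."""
    return [
        {"sentence": apply_edits(src, [e]), "indices": (i,)}
        for i, e in enumerate(edits)
    ]


src = "He go to school yesterday ."
edits = [ToyEdit(1, 2, "went"), ToyEdit(3, 3, "the")]
outputs = generate_add(src, edits)
for out in outputs:
    print(out)
```

Each generated sentence differs from the source by a single edit, so scoring it isolates that edit's individual effect on the metric.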

post_process(scores: list[float], sent_level_score: float | None = None, indices: list[tuple] | None = None) → list[float]

Normalize each score by the sum of the scores.

Parameters:

scores (list[float]) – delta M() scores.
sent_level_score (Optional[float]) – Sentence-level score, used for normalization.
indices (Optional[list[Tuple]]) – Which edits were applied to the source.

Returns:

Post-processed scores.

Return type:

list[float]

class gec_metrics.analysis.attributor.AttributorBase(config: Config)

Bases: ABC

class AttributionOutput(sent_score: float = None, src_score: float = None, attribution_scores: list[float] = None, edits: list[Edit] = None, src: str = None)

Bases: object

Attribution output.

sent_score (float) – The overall impact of the edits: delta M(S, H) = M(S, H) - M(S, S).
src_score (float) – Source score: M(S, S).
attribution_scores (list[float]) – Attribution score for each edit.
edits (list[errant.edit.Edit]) – Edits extracted by ERRANT.
src (str) – Source sentence.

attribution_scores: list[float] = None

edits: list[Edit] = None

sent_score: float = None

src: str = None

src_score: float = None

class Config(metric: MetricBase = None, max_num_edits: int = inf, errant_language: str = 'en', quiet: bool = True)

Bases: object

Attribution configuration.

metric (MetricBase) – Metric instance based on gec_metrics.metrics.MetricBase.
max_num_edits (int) – Ignore a hypothesis when its number of edits exceeds this value.
errant_language (str) – spaCy language for ERRANT.
quiet (bool) – If False, some logs are shown.

errant_language: str = 'en'

max_num_edits: int = inf

metric: MetricBase = None

quiet: bool = True

attribute(src: str, hyp: str | None = None, inputs_edits: list[Edit] | None = None) → AttributionOutput

Calculate attribution scores.

Parameters:

src (str) – A source sentence.
hyp (Optional[str]) – An edited sentence.
inputs_edits (Optional[list[errant.edit.Edit]]) – An alternative to hyp: the hypothesis given as edit objects.

Returns:

Attribution scores and related information.

Return type:

AttributionOutput

abstractmethod generate(src: str, edits: list[Edit]) → list[dict]

Generate edited sentences. How the edits are applied depends on the attribution method.

Parameters:

src (str) – source sentence.
edits (list[errant.edit.Edit]) – Edits to be applied to the source.

Returns:

Each element has two keys:

"sentence": An edited sentence.
"indices": Indices of the edits that affect editing according to the setting.

Return type:

list[Dict]

abstractmethod post_process(scores: list[float], sent_level_score: float | None = None, indices: list[tuple] | None = None) → list[float]

Post-process the scores according to the attribution method, e.g. normalize them for the one-by-one methods or sum them up into Shapley values.

Parameters:

scores (list[float]) – delta M() scores.
sent_level_score (Optional[float]) – Sentence-level score, used for normalization.
indices (Optional[list[Tuple]]) – Which edits were applied to the source.

Returns:

Post-processed scores.

Return type:

list[float]

class gec_metrics.analysis.attributor.AttributorShapley(config)

Bases: AttributorBase

generate(src: str, edits: list[Edit] | list[list[Edit]]) → list[dict]

Generate edited sentences by applying every combination (subset) of the edits.

Parameters:

src (str) – source sentence.
edits (list[errant.edit.Edit]) – Edits to be applied to the source.

Returns:

Each element has two keys:

"sentence": An edited sentence.
"indices": Indices of the edits that were applied to the source sentence.

Return type:

list[Dict]
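Applying all patterns of edits amounts to enumerating subsets of the edit list, which grows as 2^n in the number of edits. The sketch below shows one way to do this with `itertools.combinations`. The tuple-based edit format and the function names are hypothetical, and whether the library also emits the empty subset (the unedited source) is an assumption.

```python
from itertools import combinations

# Hypothetical toy edit: (start token, end token, replacement text).
ToyEdit = tuple[int, int, str]


def apply_edits(src: str, edits: list[ToyEdit]) -> str:
    """Apply non-overlapping edits to a whitespace-tokenized sentence."""
    tokens = src.split()
    # Apply from right to left so that earlier token offsets stay valid.
    for start, end, repl in sorted(edits, reverse=True):
        tokens[start:end] = repl.split()
    return " ".join(tokens)


def generate_all_subsets(src: str, edits: list[ToyEdit]) -> list[dict]:
    """Build one sentence for every subset of the edits (2**n in total)."""
    outputs = []
    for k in range(len(edits) + 1):
        for indices in combinations(range(len(edits)), k):
            subset = [edits[i] for i in indices]
            outputs.append(
                {"sentence": apply_edits(src, subset), "indices": indices}
            )
    return outputs


src = "He go to school yesterday ."
edits = [(1, 2, "went"), (3, 3, "the")]
outputs = generate_all_subsets(src, edits)
for out in outputs:
    print(out)
```

Because the number of sentences doubles with every additional edit, a cap such as `max_num_edits` in the configuration is a natural safeguard for this method.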

post_process(scores: list[float], sent_level_score: float | None = None, indices: list[tuple] | None = None) → list[float]

Calculate Shapley values.

Parameters:

scores (list[float]) – delta M() scores.
sent_level_score (Optional[float]) – Sentence-level score, used for normalization.
indices (Optional[list[Tuple]]) – Which edits were applied to the source.

Returns:

Post-processed scores.

Return type:

list[float]
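For reference, the standard Shapley value of edit i is the weighted average of its marginal contribution over all subsets of the other edits. The sketch below computes it from a table of subset scores. It illustrates the general formula rather than the library's code: the `value` mapping (sorted index tuple to delta score) and the function name are assumptions made for this example.

```python
from itertools import combinations
from math import factorial


def shapley_values(value: dict[tuple, float], n: int) -> list[float]:
    """Exact Shapley values from scores of every subset of n edits.

    `value` maps a sorted tuple of edit indices to the score obtained
    when exactly those edits are applied (hypothetical input format).
    """
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):
            # Weight of a subset of size k: k! (n - k - 1)! / n!
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                with_i = tuple(sorted(subset + (i,)))
                phi[i] += weight * (value[with_i] - value[subset])
    return phi


# Toy scores: the two edits together help more than the sum of each alone.
value = {(): 0.0, (0,): 0.2, (1,): 0.1, (0, 1): 0.5}
phi = shapley_values(value, 2)
print(phi)
```

In this toy table the values come out to roughly 0.3 and 0.2, which sum to the score of applying both edits (0.5); this efficiency property is what makes the per-edit scores add up to the sentence-level change.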

class gec_metrics.analysis.attributor.AttributorShapleySampling(config)

Bases: AttributorBase

generate(src: str, edits: list[Edit]) → list[dict]

Generate edited sentences by applying sampled combinations of the edits.

Parameters:

src (str) – source sentence.
edits (list[errant.edit.Edit]) – Edits to be applied to the source.

Returns:

Each element has two keys:

"sentence": An edited sentence.
"indices": Indices of the edits that were applied to the source sentence.

Return type:

list[Dict]

post_process(scores: list[float], sent_level_score: float | None = None, indices: list[tuple] | None = None) → list[float]

Calculate Shapley sampling values.

Parameters:

scores (list[float]) – delta M() scores.
sent_level_score (Optional[float]) – Sentence-level score, used for normalization.
indices (Optional[list[Tuple]]) – Which edits were applied to the source.

Returns:

Post-processed scores.

Return type:

list[float]
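A common way to approximate Shapley values without enumerating every subset is permutation sampling: draw random orderings of the edits, add the edits in that order, and average each edit's marginal gain. The sketch below shows this general technique. Whether the library samples permutations in exactly this way is an assumption, and `shapley_sampling`, `toy_value`, `num_samples`, and `seed` are hypothetical names.

```python
import random
from typing import Callable


def shapley_sampling(
    value_fn: Callable[[tuple], float],
    n: int,
    num_samples: int = 200,
    seed: int = 0,
) -> list[float]:
    """Estimate Shapley values by averaging marginal gains over random orders."""
    rng = random.Random(seed)
    phi = [0.0] * n
    for _ in range(num_samples):
        order = list(range(n))
        rng.shuffle(order)
        applied: list[int] = []
        prev = value_fn(tuple(applied))
        for i in order:
            applied.append(i)
            cur = value_fn(tuple(sorted(applied)))
            phi[i] += cur - prev  # marginal gain of edit i in this order
            prev = cur
    return [p / num_samples for p in phi]


# Toy scores for two edits, keyed by the sorted tuple of applied indices.
TOY_SCORES = {(): 0.0, (0,): 0.2, (1,): 0.1, (0, 1): 0.5}


def toy_value(indices: tuple) -> float:
    return TOY_SCORES[indices]


estimates = shapley_sampling(toy_value, n=2)
print(estimates)
```

Each sampled ordering distributes the full score change across the edits, so the estimates always sum to the score of applying all edits; only the split between individual edits is subject to sampling noise.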

class gec_metrics.analysis.attributor.AttributorSub(config)

Bases: AttributorBase

generate(src: str, edits: list[Edit] | list[list[Edit]]) → list[dict]

Generate edited sentences by removing each edit, one at a time, from the reference.

Parameters:

src (str) – source sentence.
edits (list[errant.edit.Edit]) – Edits to be applied to the source.

Returns:

Each element has two keys:

"sentence": An edited sentence.
"indices": Indices of the edits that were removed from the reference sentence.

Return type:

list[Dict]
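The removal-based generation can be pictured as a leave-one-out loop: for each edit, apply all the other edits and record which one was left out. The sketch below assumes that the sentence with every edit applied plays the role of the "reference" mentioned above; the tuple-based edit format and the function names are hypothetical.

```python
# Hypothetical toy edit: (start token, end token, replacement text).
ToyEdit = tuple[int, int, str]


def apply_edits(src: str, edits: list[ToyEdit]) -> str:
    """Apply non-overlapping edits to a whitespace-tokenized sentence."""
    tokens = src.split()
    # Apply from right to left so that earlier token offsets stay valid.
    for start, end, repl in sorted(edits, reverse=True):
        tokens[start:end] = repl.split()
    return " ".join(tokens)


def generate_sub(src: str, edits: list[ToyEdit]) -> list[dict]:
    """Build one sentence per edit, with every edit applied except that one."""
    outputs = []
    for i in range(len(edits)):
        kept = [e for j, e in enumerate(edits) if j != i]
        outputs.append({"sentence": apply_edits(src, kept), "indices": (i,)})
    return outputs


src = "He go to school yesterday ."
edits = [(1, 2, "went"), (3, 3, "the")]
outputs = generate_sub(src, edits)
for out in outputs:
    print(out)
```

Compared with applying each edit alone, this variant measures how much the score drops when an edit is missing while all the others are in place.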

post_process(scores: list[float], sent_level_score: float | None = None, indices: list[tuple] | None = None) → list[float]

Normalize each score by the sum of the scores.

Parameters:

scores (list[float]) – delta M() scores.
sent_level_score (Optional[float]) – Sentence-level score, used for normalization.
indices (Optional[list[Tuple]]) – Which edits were applied to the source.

Returns:

Post-processed scores.

Return type:

list[float]