gec_metrics.analysis.attributor package

Submodules

Module contents

class gec_metrics.analysis.attributor.AttributorAdd(config)[source]

Bases: AttributorBase

generate(src: str, edits: list[Edit] | list[list[Edit]]) list[dict][source]

Generate edited sentences by applying each edit to src individually.

Parameters:
  • src (str) – Source sentence.

  • edits (list[errant.edit.Edit]) – Edits to be applied to the source.

Returns:

Each element has two keys:

  • "sentence": An edited sentence.

  • "indices": Indices of the edits that were applied to the source sentence.

Return type:

list[Dict]
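The one-by-one behaviour described above can be sketched as follows. This is a minimal illustration, not the package's implementation: ToyEdit is a hypothetical stand-in for errant.edit.Edit (it borrows the o_start, o_end and c_str attribute names), the source is assumed to be whitespace-tokenized, and representing "indices" as a tuple is an assumption.

```python
from typing import NamedTuple


class ToyEdit(NamedTuple):
    """Hypothetical stand-in for errant.edit.Edit."""
    o_start: int  # start token index in the source
    o_end: int    # end token index in the source (exclusive)
    c_str: str    # correction string ("" deletes the span)


def apply_edits(src: str, edits: list[ToyEdit]) -> str:
    """Apply non-overlapping edits to a whitespace-tokenized source."""
    tokens = src.split()
    # Work from right to left so earlier token indices stay valid.
    for e in sorted(edits, key=lambda e: e.o_start, reverse=True):
        tokens[e.o_start:e.o_end] = e.c_str.split()
    return " ".join(tokens)


def generate_one_by_one(src: str, edits: list[ToyEdit]) -> list[dict]:
    """One output per edit: the source with only that edit applied."""
    return [
        {"sentence": apply_edits(src, [e]), "indices": (i,)}
        for i, e in enumerate(edits)
    ]


src = "He go to school yesterday ."
edits = [ToyEdit(1, 2, "went"), ToyEdit(3, 3, "the")]
for item in generate_one_by_one(src, edits):
    print(item)
```

Each output sentence differs from the source by exactly one edit, so scoring it against the source isolates that edit's contribution.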

post_process(scores: list[float], sent_level_score: float | None = None, indices: list[tuple] | None = None) list[float][source]

Normalize each score by the sum of the scores.

Parameters:
  • scores (list[float]) – delta M() scores.

  • sent_level_score (Optional[float]) – Sentence-level score used for normalization.

  • indices (Optional[list[Tuple]]) – Which edits were applied to the source.

Returns:

Post-processed scores.

Return type:

list[float]
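A minimal sketch of the normalization step, assuming that "normalize" means rescaling the per-edit scores so that they sum to sent_level_score. The function name and the handling of a zero sum are illustrative choices, not taken from the package.

```python
def normalize_scores(scores: list[float], sent_level_score: float) -> list[float]:
    """Rescale per-edit delta scores so that they sum to sent_level_score."""
    total = sum(scores)
    if total == 0:
        # Assumption: with no signal to distribute, every edit gets zero.
        return [0.0 for _ in scores]
    return [s / total * sent_level_score for s in scores]


# Two edits whose individual deltas are 0.2 and 0.6, with a sentence-level
# delta of 0.4: the edits keep their 1:3 ratio after rescaling.
print(normalize_scores([0.2, 0.6], sent_level_score=0.4))
```

Under this assumption the per-edit scores always add up to the sentence-level score, even when the individual deltas do not.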

class gec_metrics.analysis.attributor.AttributorBase(config: Config)[source]

Bases: ABC

class AttributionOutput(sent_score: float = None, src_score: float = None, attribution_scores: list[float] = None, edits: list[Edit] = None, src: str = None)[source]

Bases: object

Attribution output.

  • sent_score (float) – The overall impact of edits: delta M(S, H) = M(S, H) - M(S, S).

  • src_score (float) – Source score: M(S, S).

  • attribution_scores (list[float]) – Attribution score for each edit.

  • edits (list[errant.edit.Edit]) – Edits extracted by ERRANT.

  • src (str) – Source sentence.

attribution_scores: list[float] = None
edits: list[Edit] = None
sent_score: float = None
src: str = None
src_score: float = None

class Config(metric: MetricBase = None, max_num_edits: int = inf, errant_language: str = 'en', quiet: bool = True)[source]

Bases: object

Attribution configuration.

  • metric (MetricBase) – Metric instance based on gec_metrics.metrics.MetricBase.

  • max_num_edits (int) – Ignore a hypothesis when its number of edits exceeds this value.

  • errant_language (str) – spaCy language for ERRANT.

  • quiet (bool) – If False, some logs are shown.

errant_language: str = 'en'
max_num_edits: int = inf
metric: MetricBase = None
quiet: bool = True

attribute(src: str, hyp: str | None = None, inputs_edits: list[Edit] | None = None) AttributionOutput[source]

Calculate attribution scores.

Parameters:
  • src (str) – A source sentence.

  • hyp (Optional[str]) – An edited sentence.

  • inputs_edits (Optional[list[errant.edit.Edit]]) – An alternative way to pass the hyp, as edit objects.

Returns:

Attribution scores and related information.

Return type:

AttributionOutput
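The two sentence-level quantities returned in the output, the source score M(S, S) and the overall impact delta M(S, H) = M(S, H) - M(S, S), can be illustrated with a toy metric. Everything in this sketch is hypothetical: toy_metric is a made-up reference-free scorer, not a gec_metrics metric, and sentence_level_delta is not a function of the package.

```python
def toy_metric(src: str, hyp: str) -> float:
    """Hypothetical M(S, H): share of hypothesis tokens not in a tiny error lexicon."""
    error_lexicon = {"go", "informations"}
    tokens = hyp.split()
    if not tokens:
        return 0.0
    return sum(t not in error_lexicon for t in tokens) / len(tokens)


def sentence_level_delta(src: str, hyp: str, metric=toy_metric) -> tuple[float, float]:
    """Return (sent_score, src_score) following the definitions above."""
    src_score = metric(src, src)               # M(S, S)
    sent_score = metric(src, hyp) - src_score  # M(S, H) - M(S, S)
    return sent_score, src_score


sent_score, src_score = sentence_level_delta("He go to school .", "He went to school .")
print(sent_score, src_score)
```

The per-edit attribution scores are then meant to explain how this sentence-level delta is shared among the individual edits.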

abstractmethod generate(src: str, edits: list[Edit]) list[dict][source]

Generate edited sentences. How the edits are applied depends on the attribution method.

Parameters:
  • src (str) – Source sentence.

  • edits (list[errant.edit.Edit]) – Edits to be applied to the source.

Returns:

Each element has two keys:

  • "sentence": An edited sentence.

  • "indices": Indices of the edits involved in producing the sentence; their meaning depends on the attribution method.

Return type:

list[Dict]

abstractmethod post_process(scores: list[float], sent_level_score: float | None = None, indices: list[tuple] | None = None) list[float][source]

Post-process the scores according to the attribution method, e.g. normalize them for the one-by-one methods or sum up contributions for the Shapley-based methods.

Parameters:
  • scores (list[float]) – delta M() scores.

  • sent_level_score (Optional[float]) – Sentence-level score used for normalization.

  • indices (Optional[list[Tuple]]) – Which edits were applied to the source.

Returns:

Post-processed scores.

Return type:

list[float]

class gec_metrics.analysis.attributor.AttributorShapley(config)[source]

Bases: AttributorBase

generate(src: str, edits: list[Edit] | list[list[Edit]]) list[dict][source]

Generate edited sentences by applying every combination of edits.

Parameters:
  • src (str) – Source sentence.

  • edits (list[errant.edit.Edit]) – Edits to be applied to the source.

Returns:

Each element has two keys:

  • "sentence": An edited sentence.

  • "indices": Indices of the edits that were applied to the source sentence.

Return type:

list[Dict]
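Applying "all patterns of edits" can be sketched by enumerating every non-empty subset of edit indices. This is an illustration under assumptions: ToyEdit is a hypothetical stand-in for errant.edit.Edit, the source is whitespace-tokenized, and whether the package also emits the empty subset or uses this ordering is not stated in the documentation.

```python
from itertools import combinations
from typing import NamedTuple


class ToyEdit(NamedTuple):
    """Hypothetical stand-in for errant.edit.Edit."""
    o_start: int
    o_end: int
    c_str: str


def apply_edits(src: str, edits: list[ToyEdit]) -> str:
    """Apply non-overlapping edits to a whitespace-tokenized source."""
    tokens = src.split()
    for e in sorted(edits, key=lambda e: e.o_start, reverse=True):
        tokens[e.o_start:e.o_end] = e.c_str.split()
    return " ".join(tokens)


def generate_all_patterns(src: str, edits: list[ToyEdit]) -> list[dict]:
    """One output per non-empty subset of edits (2**n - 1 outputs for n edits)."""
    n = len(edits)
    return [
        {"sentence": apply_edits(src, [edits[i] for i in subset]), "indices": subset}
        for size in range(1, n + 1)
        for subset in combinations(range(n), size)
    ]


src = "He go to school yesterday ."
edits = [ToyEdit(1, 2, "went"), ToyEdit(3, 3, "the")]
for item in generate_all_patterns(src, edits):
    print(item)
```

The number of outputs doubles with each additional edit, which is one plausible reason to cap the number of edits per hypothesis with a setting such as max_num_edits.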

post_process(scores: list[float], sent_level_score: float | None = None, indices: list[tuple] | None = None) list[float][source]

Calculate Shapley values.

Parameters:
  • scores (list[float]) – delta M() scores.

  • sent_level_score (Optional[float]) – Sentence-level score used for normalization.

  • indices (Optional[list[Tuple]]) – Which edits were applied to the source.

Returns:

Post-processed scores.

Return type:

list[float]
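For reference, exact Shapley values can be computed from per-subset scores with the textbook formula. The sketch below assumes that scores[k] is the delta score of the sentence produced by applying the edits in indices[k], that every non-empty subset is present, and that the empty subset scores 0. The function name and the extra num_edits argument are illustrative and not part of the package.

```python
from math import factorial


def shapley_values(scores: list[float], indices: list[tuple], num_edits: int) -> list[float]:
    """Exact Shapley value of each edit from the scores of all edit subsets."""
    value = {frozenset(idx): s for idx, s in zip(indices, scores)}
    value[frozenset()] = 0.0  # applying no edit leaves the source unchanged
    n = num_edits
    phi = [0.0] * n
    for subset, v_with in value.items():
        for i in subset:
            without = subset - {i}
            # Weight |S|! (n - |S| - 1)! / n! for S = subset without edit i.
            weight = factorial(len(without)) * factorial(n - len(without) - 1) / factorial(n)
            phi[i] += weight * (v_with - value[without])
    return phi


# Two edits: alone they gain 0.1 and 0.3, together 0.6.
print(shapley_values([0.1, 0.3, 0.6], [(0,), (1,), (0, 1)], num_edits=2))
```

By the efficiency property of Shapley values, the results sum to the score of the full edit set, so in this formulation no separate normalization is needed.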

class gec_metrics.analysis.attributor.AttributorShapleySampling(config)[source]

Bases: AttributorBase

generate(src: str, edits: list[Edit]) list[dict][source]

Generate edited sentences by applying sampled combinations of edits.

Parameters:
  • src (str) – Source sentence.

  • edits (list[errant.edit.Edit]) – Edits to be applied to the source.

Returns:

Each element has two keys:

  • "sentence": An edited sentence.

  • "indices": Indices of the edits that were applied to the source sentence.

Return type:

list[Dict]
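The documentation does not say how the patterns are sampled. One common scheme, shown here purely as an assumption, is Monte Carlo permutation sampling: draw random orderings of the edits and use every prefix of each ordering as a pattern. The function name, the num_samples and seed parameters, and the choice to keep indices in permutation order are all hypothetical.

```python
import random


def sample_edit_patterns(num_edits: int, num_samples: int, seed: int = 0) -> list[tuple]:
    """Sample permutations of edit indices and return all of their prefixes."""
    rng = random.Random(seed)
    patterns = []
    for _ in range(num_samples):
        order = list(range(num_edits))
        rng.shuffle(order)
        # Each prefix adds exactly one edit to the previous pattern.
        for k in range(1, num_edits + 1):
            patterns.append(tuple(order[:k]))
    return patterns


for pattern in sample_edit_patterns(num_edits=3, num_samples=2):
    print(pattern)
```

This yields num_samples * num_edits patterns instead of the 2**n - 1 needed for exact enumeration, at the cost of an approximate result.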

post_process(scores: list[float], sent_level_score: float | None = None, indices: list[tuple] | None = None) list[float][source]

Calculate Shapley sampling values.

Parameters:
  • scores (list[float]) – delta M() scores.

  • sent_level_score (Optional[float]) – Sentence-level score used for normalization.

  • indices (Optional[list[Tuple]]) – Which edits were applied to the source.

Returns:

Post-processed scores.

Return type:

list[float]
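A sampled estimate of Shapley values averages each edit's marginal contribution over the sampled orderings. The sketch below assumes the patterns are the consecutive prefixes of sampled permutations (so the last index of each pattern is the edit just added) and that scores[k] is the delta score for indices[k]. This input layout and the function name are assumptions, not the package's documented behaviour.

```python
def shapley_sampling_estimate(
    scores: list[float], indices: list[tuple], num_edits: int
) -> list[float]:
    """Average marginal contribution of each edit over sampled permutations."""
    totals = [0.0] * num_edits
    counts = [0] * num_edits
    prev = 0.0
    for pattern, score in zip(indices, scores):
        if len(pattern) == 1:
            prev = 0.0  # a new permutation starts from the unedited source
        added = pattern[-1]  # the edit introduced by this prefix
        totals[added] += score - prev
        counts[added] += 1
        prev = score
    return [t / c if c else 0.0 for t, c in zip(totals, counts)]


# Two sampled orderings of two edits: (0, 1) and (1, 0).
indices = [(0,), (0, 1), (1,), (1, 0)]
scores = [0.1, 0.6, 0.3, 0.6]
print(shapley_sampling_estimate(scores, indices, num_edits=2))
```

When every permutation is sampled equally often, this average coincides with the exact Shapley value; with fewer samples it is an approximation.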

class gec_metrics.analysis.attributor.AttributorSub(config)[source]

Bases: AttributorBase

generate(src: str, edits: list[Edit] | list[list[Edit]]) list[dict][source]

Generate edited sentences by removing each edit, one at a time, from the reference.

Parameters:
  • src (str) – Source sentence.

  • edits (list[errant.edit.Edit]) – Edits to be applied to the source.

Returns:

Each element has two keys:

  • "sentence": An edited sentence.

  • "indices": Indices of the edits that were removed from the reference sentence.

Return type:

list[Dict]
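The subtractive variant can be sketched as a leave-one-out procedure. This reads "the reference" as the sentence with all edits applied, which is an interpretation rather than something the documentation states. ToyEdit is a hypothetical stand-in for errant.edit.Edit, and the source is assumed to be whitespace-tokenized.

```python
from typing import NamedTuple


class ToyEdit(NamedTuple):
    """Hypothetical stand-in for errant.edit.Edit."""
    o_start: int
    o_end: int
    c_str: str


def apply_edits(src: str, edits: list[ToyEdit]) -> str:
    """Apply non-overlapping edits to a whitespace-tokenized source."""
    tokens = src.split()
    for e in sorted(edits, key=lambda e: e.o_start, reverse=True):
        tokens[e.o_start:e.o_end] = e.c_str.split()
    return " ".join(tokens)


def generate_leave_one_out(src: str, edits: list[ToyEdit]) -> list[dict]:
    """One output per edit: all edits applied except the one being removed."""
    return [
        {
            "sentence": apply_edits(src, edits[:i] + edits[i + 1:]),
            "indices": (i,),  # the edit that was left out
        }
        for i in range(len(edits))
    ]


src = "He go to school yesterday ."
edits = [ToyEdit(1, 2, "went"), ToyEdit(3, 3, "the")]
for item in generate_leave_one_out(src, edits):
    print(item)
```

Here an edit's contribution is read off from how much the score drops when that edit alone is withheld, which complements the additive one-by-one view.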

post_process(scores: list[float], sent_level_score: float | None = None, indices: list[tuple] | None = None) list[float][source]

Normalize each score by the sum of the scores.

Parameters:
  • scores (list[float]) – delta M() scores.

  • sent_level_score (Optional[float]) – Sentence-level score used for normalization.

  • indices (Optional[list[Tuple]]) – Which edits were applied to the source.

Returns:

Post-processed scores.

Return type:

list[float]