gec_metrics.metrics.bertscore module

class gec_metrics.metrics.bertscore.BertScore(config: Config = None)[source]

Bases: MetricBaseForSourceFree

class Config(model_type: str = 'bert-base-uncased', num_layers: int = None, batch_size: int = 64, nthreads: int = 4, all_layers: bool = False, idf: bool = False, idf_sents: list[str] = None, lang: str = 'en', rescale_with_baseline: bool = True, baseline_path: str = None, use_fast_tokenizer: bool = False, score_type: str = 'f')[source]

Bases: Config

BERTScore configuration.

  • model_type (str): Embedding model.

  • num_layers (int): Which layer's representations to use.

    If None, the pre-defined one is used. (See bert_score.utils.model2layers.)

  • nthreads (int): Number of threads.

  • idf (bool): Whether to use idf or not.

  • idf_sents (list[str]): Sentences to compute idf weights.

  • rescale_with_baseline (bool): Whether to rescale scores with a baseline.

  • baseline_path (str): Path to the baseline .tsv file used for rescaling.

    If None, the pre-defined one is used. (See bert_score.rescale_baseline.*.tsv)

  • use_fast_tokenizer (bool): Whether to use fast tokenizer.

  • score_type (str): “p” (precision) or “r” (recall) or “f” (F1) score.
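
As a minimal sketch of how these fields fit together, the snippet below collects overrides in a plain dictionary and checks score_type before the metric is built. The helper names bertscore_overrides and build_metric are hypothetical and not part of gec_metrics; only BertScore, BertScore.Config, and the field names come from the reference above. The import is deferred, so the first helper can be used without the package installed.

```python
VALID_SCORE_TYPES = {"p", "r", "f"}  # precision, recall, F1 (per the field list above)


def bertscore_overrides(score_type: str = "f", **fields) -> dict:
    """Return keyword arguments for BertScore.Config (hypothetical helper)."""
    if score_type not in VALID_SCORE_TYPES:
        raise ValueError(f"score_type must be one of {sorted(VALID_SCORE_TYPES)}")
    return {"score_type": score_type, **fields}


def build_metric(**overrides):
    """Build a BertScore metric; assumes gec_metrics is installed."""
    # Deferred import: nothing is loaded until the metric is actually built.
    from gec_metrics.metrics.bertscore import BertScore

    return BertScore(BertScore.Config(**overrides))


# Recall-based scores without baseline rescaling, using a smaller batch.
kwargs = bertscore_overrides("r", rescale_with_baseline=False, batch_size=32)
```

Calling build_metric(**kwargs) would then construct the metric; fields left out keep the defaults shown in the signature.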

all_layers: bool = False
baseline_path: str = None
batch_size: int = 64
idf: bool = False
idf_sents: list[str] = None
lang: str = 'en'
model_type: str = 'bert-base-uncased'
nthreads: int = 4
num_layers: int = None
rescale_with_baseline: bool = True
score_type: str = 'f'
use_fast_tokenizer: bool = False
score_sentence(hypotheses: list[str], references: list[list[str]]) → list[float][source]

Calculate sentence-level scores.

Parameters:
  • hypotheses (list[str]) – Corrected sentences. The shape is (num_sentences, ).

  • references (list[list[str]]) – Reference sentences. The shape is (num_references, num_sentences).

Returns:

The sentence-level scores.

Return type:

list[float]
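
The reference layout is easy to get wrong: the outer list indexes reference sets, not sentences. The following is a minimal sketch, assuming the data is stored per sentence and every sentence has the same number of references. to_reference_major is a hypothetical helper, not part of gec_metrics, and the example sentences are made up for illustration.

```python
def to_reference_major(per_sentence: list[list[str]]) -> list[list[str]]:
    """Convert (num_sentences, num_references) to (num_references, num_sentences)."""
    counts = {len(refs) for refs in per_sentence}
    if len(counts) > 1:
        raise ValueError("every sentence needs the same number of references")
    # zip(*rows) transposes the nested list.
    return [list(column) for column in zip(*per_sentence)]


hypotheses = ["He goes to school .", "I like apples ."]
per_sentence_refs = [
    ["He goes to school .", "He went to school ."],  # references for sentence 0
    ["I like apples .", "I love apples ."],          # references for sentence 1
]
references = to_reference_major(per_sentence_refs)
# references[r][s] is reference r for sentence s, the layout score_sentence expects.

# With gec_metrics installed (not executed here):
# from gec_metrics.metrics.bertscore import BertScore
# scores = BertScore().score_sentence(hypotheses, references)
```

Per the return type above, the result would be a list of floats; the sketch does not assert anything about the values themselves.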