arxiv.org/abs/2010.08240

Preview meta tags from the arxiv.org website.

Linked Hostnames

26

Search Engine Appearance

Google

https://arxiv.org/abs/2010.08240

Augmented SBERT: Data Augmentation Method for Improving Bi-Encoders for Pairwise Sentence Scoring Tasks

There are two approaches for pairwise sentence scoring: Cross-encoders, which perform full-attention over the input pair, and Bi-encoders, which map each input independently to a dense vector space. While cross-encoders often achieve higher performance, they are too slow for many practical use cases. Bi-encoders, on the other hand, require substantial training data and fine-tuning over the target task to achieve competitive performance. We present a simple yet efficient data augmentation strategy called Augmented SBERT, where we use the cross-encoder to label a larger set of input pairs to augment the training data for the bi-encoder. We show that, in this process, selecting the sentence pairs is non-trivial and crucial for the success of the method. We evaluate our approach on multiple tasks (in-domain) as well as on a domain adaptation task. Augmented SBERT achieves an improvement of up to 6 points for in-domain and of up to 37 points for domain adaptation tasks compared to the original bi-encoder performance.
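The labeling step described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: `cross_encoder_score` is a hypothetical toy stand-in (word-overlap similarity) for a trained cross-encoder, and `augment` is a hypothetical helper name. The abstract stresses that selecting the sentence pairs is non-trivial; the sketch simply labels every unseen combination, which is the naive choice and not the paper's selection strategy.

```python
from itertools import combinations


def cross_encoder_score(a: str, b: str) -> float:
    """Toy stand-in for a trained cross-encoder (assumption): Jaccard word overlap."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    union = words_a | words_b
    return len(words_a & words_b) / len(union) if union else 0.0


def augment(gold, sentences, teacher=cross_encoder_score):
    """Return the gold pairs plus teacher-labeled ("silver") pairs.

    gold      -- list of (sentence_a, sentence_b, score) with human labels
    sentences -- pool of sentences from which new pairs are formed
    teacher   -- scoring function playing the role of the cross-encoder
    """
    # Skip pairs that already carry a gold label, in either order.
    seen = {frozenset((a, b)) for a, b, _ in gold}
    silver = [
        (a, b, teacher(a, b))
        for a, b in combinations(sentences, 2)
        if frozenset((a, b)) not in seen
    ]
    return gold + silver


gold_pairs = [("a cat sits", "a cat sat", 0.8)]
pool = ["a cat sits", "a cat sat", "dogs bark"]
training_data = augment(gold_pairs, pool)
```

The resulting `training_data` would then be used to train the bi-encoder; that training step is outside the scope of this sketch.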




  • General Meta Tags

    18
    • title
      [2010.08240] Augmented SBERT: Data Augmentation Method for Improving Bi-Encoders for Pairwise Sentence Scoring Tasks
    • title
      open search
    • title
      open navigation menu
    • title
      contact arXiv
    • title
      subscribe to arXiv mailings
  • Open Graph Meta Tags

    10
    • og:type
      website
    • og:site_name
      arXiv.org
    • og:title
      Augmented SBERT: Data Augmentation Method for Improving Bi-Encoders for Pairwise Sentence Scoring Tasks
    • og:url
      https://arxiv.org/abs/2010.08240v2
    • og:image
      /static/browse/0.3.4/images/arxiv-logo-fb.png
  • Twitter Meta Tags

    6
    • twitter:site
      @arxiv
    • twitter:card
      summary
    • twitter:title
      Augmented SBERT: Data Augmentation Method for Improving...
    • twitter:description
      There are two approaches for pairwise sentence scoring: Cross-encoders, which perform full-attention over the input pair, and Bi-encoders, which map each input independently to a dense vector...
    • twitter:image
      https://static.arxiv.org/icons/twitter/arxiv-logo-twitter-square.png
  • Link Tags

    12
    • apple-touch-icon
      /static/browse/0.3.4/images/icons/apple-touch-icon.png
    • canonical
      /abs/2010.08240
    • icon
      /static/browse/0.3.4/images/icons/favicon-32x32.png
    • icon
      /static/browse/0.3.4/images/icons/favicon-16x16.png
    • manifest
      /static/browse/0.3.4/images/icons/site.webmanifest
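
A listing like the one above can be collected from a page's HTML with Python's standard-library `html.parser`. The following is a minimal sketch: the class name `MetaTagCollector` is hypothetical, and the `sample` markup is an assumed reconstruction that uses three values from the listing, not the actual arXiv page source.

```python
from html.parser import HTMLParser


class MetaTagCollector(HTMLParser):
    """Collect <meta> tags keyed by their `property` or `name` attribute."""

    def __init__(self):
        super().__init__()
        self.tags = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        # Open Graph tags use `property`; Twitter and general tags use `name`.
        key = attr.get("property") or attr.get("name")
        if key and "content" in attr:
            self.tags[key] = attr["content"]


# Assumed sample markup, reconstructed for illustration only.
sample = """<head>
<meta property="og:site_name" content="arXiv.org">
<meta property="og:type" content="website">
<meta name="twitter:card" content="summary">
</head>"""

collector = MetaTagCollector()
collector.feed(sample)
```

After `feed`, `collector.tags` maps each tag key to its `content` value. If a key occurs more than once, this sketch keeps only the last occurrence.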

Links

65