builders.mozilla.org/announcing-common-corpus
Preview meta tags from the builders.mozilla.org website.
Linked Hostnames (11)
- 12 links to builders.mozilla.org
- 5 links to huggingface.co
- 3 links to www.mozilla.org
- 2 links to arxiv.org
- 2 links to twitter.com
- 2 links to www.linkedin.com
- 1 link to discord.gg
- 1 link to github.com
Search Engine Appearance
Announcing Common Corpus
We released Common Corpus, the largest fully open dataset of over 2 trillion tokens. Pleias is committed to training LLMs in the open. This means not only releasing our models but also being open about every aspect, from the training data to the training code. We define “open” strictly: all data must be both accessible […]
Bing
Announcing Common Corpus
We released Common Corpus, the largest fully open dataset of over 2 trillion tokens. Pleias is committed to training LLMs in the open. This means not only releasing our models but also being open about every aspect, from the training data to the training code. We define “open” strictly: all data must be both accessible […]
DuckDuckGo
Announcing Common Corpus
We released Common Corpus, the largest fully open dataset of over 2 trillion tokens. Pleias is committed to training LLMs in the open. This means not only releasing our models but also being open about every aspect, from the training data to the training code. We define “open” strictly: all data must be both accessible […]
General Meta Tags (14)
- title: Announcing Common Corpus - Mozilla Builders
- charset: utf-8
- viewport: width=device-width, initial-scale=1
- application-name:
- msapplication-TileColor: #FFFFFF
Open Graph Meta Tags (10)
- og:locale: en_US
- og:type: article
- og:title: Announcing Common Corpus
- og:description: We released Common Corpus, the largest fully open dataset of over 2 trillion tokens. Pleias is committed to training LLMs in the open. This means not only releasing our models but also being open about every aspect, from the training data to the training code. We define “open” strictly: all data must be both accessible […]
- og:url: https://builders.mozilla.org/announcing-common-corpus/
Twitter Meta Tags (7)
- twitter:card: summary_large_image
- twitter:creator: @mozillabuilders
- twitter:site: @mozillabuilders
- twitter:label1: Written by
- twitter:data1: Anastasia Stasenko, Pierre-Carl Langlais
Link Tags (16)
- apple-touch-icon-precomposed: https://builders.mozilla.org/wp-content/themes/mozilla-builders/static/img/icons/apple-touch-icon-57x57.png
- apple-touch-icon-precomposed: https://builders.mozilla.org/wp-content/themes/mozilla-builders/static/img/icons/apple-touch-icon-114x114.png
- apple-touch-icon-precomposed: https://builders.mozilla.org/wp-content/themes/mozilla-builders/static/img/icons/apple-touch-icon-72x72.png
- apple-touch-icon-precomposed: https://builders.mozilla.org/wp-content/themes/mozilla-builders/static/img/icons/apple-touch-icon-144x144.png
- apple-touch-icon-precomposed: https://builders.mozilla.org/wp-content/themes/mozilla-builders/static/img/icons/apple-touch-icon-60x60.png
Links (31)
- https://arxiv.org/pdf/2101.00027
- https://arxiv.org/pdf/2410.22587
- https://builders.mozilla.org
- https://builders.mozilla.org/ai-support-in-theia-ide
- https://builders.mozilla.org/announcing-localscore