blog.commoncrawl.org/2014/08/web-data-commons-extraction-framework-for-the-distributed-processing-of-cc-data

Preview meta tags from the blog.commoncrawl.org website.

Linked Hostnames

12

Search Engine Appearance

Google

https://blog.commoncrawl.org/2014/08/web-data-commons-extraction-framework-for-the-distributed-processing-of-cc-data

Common Crawl - Blog - Web Data Commons Extraction Framework for the Distributed Processing of CC Data

This is a guest blog post by Robert Meusel, a researcher at the University of Mannheim in the Data and Web Science Research Group and a key member of the Web Data Commons project. The post below describes a new tool produced by Web Data Commons for extracting data from the Common Crawl data.

Bing

Common Crawl - Blog - Web Data Commons Extraction Framework for the Distributed Processing of CC Data

https://blog.commoncrawl.org/2014/08/web-data-commons-extraction-framework-for-the-distributed-processing-of-cc-data

This is a guest blog post by Robert Meusel, a researcher at the University of Mannheim in the Data and Web Science Research Group and a key member of the Web Data Commons project. The post below describes a new tool produced by Web Data Commons for extracting data from the Common Crawl data.

DuckDuckGo

https://blog.commoncrawl.org/2014/08/web-data-commons-extraction-framework-for-the-distributed-processing-of-cc-data

Common Crawl - Blog - Web Data Commons Extraction Framework for the Distributed Processing of CC Data

This is a guest blog post by Robert Meusel, a researcher at the University of Mannheim in the Data and Web Science Research Group and a key member of the Web Data Commons project. The post below describes a new tool produced by Web Data Commons for extracting data from the Common Crawl data.

  • General Meta Tags

    7
    • title
      Common Crawl - Blog - Web Data Commons Extraction Framework for the Distributed Processing of CC Data
    • charset
      utf-8
    • description
      This is a guest blog post by Robert Meusel, a researcher at the University of Mannheim in the Data and Web Science Research Group and a key member of the Web Data Commons project. The post below describes a new tool produced by Web Data Commons for extracting data from the Common Crawl data.
    • twitter:title
      Common Crawl - Blog - Web Data Commons Extraction Framework for the Distributed Processing of CC Data
    • twitter:description
      This is a guest blog post by Robert Meusel, a researcher at the University of Mannheim in the Data and Web Science Research Group and a key member of the Web Data Commons project. The post below describes a new tool produced by Web Data Commons for extracting data from the Common Crawl data.
  • Open Graph Meta Tags

    4
    • og:title
      Common Crawl - Blog - Web Data Commons Extraction Framework for the Distributed Processing of CC Data
    • og:description
      This is a guest blog post by Robert Meusel, a researcher at the University of Mannheim in the Data and Web Science Research Group and a key member of the Web Data Commons project. The post below describes a new tool produced by Web Data Commons for extracting data from the Common Crawl data.
    • og:image
      https://cdn.prod.website-files.com/647b1c7a9990bad2048d3711/64e634cbee339bea89485d24_analysis.webp
    • og:type
      website
  • Twitter Meta Tags

    1
    • twitter:card
      summary_large_image
  • Link Tags

    5
    • alternate
      rss.xml
    • apple-touch-icon
      https://cdn.prod.website-files.com/6479b8d98bf5dcb4a69c4f31/648962c357c8113a871e3378_Common_Crawl_Rev3_LPX_Logo%20Gradient%20BG.png
    • canonical
      https://commoncrawl.org/blog/web-data-commons-extraction-framework-for-the-distributed-processing-of-cc-data
    • shortcut icon
      https://cdn.prod.website-files.com/6479b8d98bf5dcb4a69c4f31/648962712d8394e5aa35ead4_Common_Crawl_Rev3_LPX_White%20Icon%20(1).png
    • stylesheet
      https://cdn.prod.website-files.com/6479b8d98bf5dcb4a69c4f31/css/commoncrawl.webflow.shared.ff529ae98.css

Links

38