Prosper planet pulse
Business News

OpenAI and Anthropic ignore rules meant to stop bots from scraping web content

By prosperplanetpulse.com, June 21, 2024


Two of the world’s top AI startups are ignoring requests from media publishers to stop scraping web content for free model training data, Business Insider has learned.

OpenAI and Anthropic were found to be ignoring or circumventing robots.txt, an established web convention meant to prevent the automated scraping of websites.

TollBit, a startup that aims to broker paid licensing deals between publishers and AI companies, found several AI companies engaging in this behavior and notified some major publishers in a letter on Friday that was previously reported by Reuters. The letter did not name the AI companies that allegedly circumvented the rules.

OpenAI and Anthropic have publicly stated that they respect robots.txt and any blocks placed on their specific web crawlers, GPTBot and ClaudeBot.

However, TollBit’s findings show that such blocks are not being respected as claimed, with AI companies such as OpenAI and Anthropic simply choosing to “bypass” robots.txt in order to retrieve or scrape all content from a given website or page.

An OpenAI spokesperson declined to comment beyond pointing BI to a May company blog post in which the company said it takes web crawler permissions into account each time it trains a new model. An Anthropic spokesperson did not respond to an email seeking comment.

Robots.txt is a single piece of code that has been used since the late 1990s as a way for websites to tell bot crawlers that they don’t want their data scraped or collected. It was widely accepted as one of the informal rules that support the web.
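To make the mechanism concrete, here is a minimal sketch of how a well-behaved crawler could consult robots.txt before fetching a page. It uses Python's standard urllib.robotparser module. The crawler names GPTBot and ClaudeBot come from the article; the file contents, the ExampleBot name, the example.com URLs, and the is_allowed helper are assumptions made up for illustration and do not reflect any publisher's real configuration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for an example publisher. The crawler names
# GPTBot and ClaudeBot come from the article; the rules themselves are
# invented for illustration.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is needed here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    story = "https://example.com/news/story.html"
    for agent in ("GPTBot", "ClaudeBot", "ExampleBot"):
        print(agent, is_allowed(agent, story))
```

The check is purely advisory: nothing in the file technically stops a crawler that skips this step, which is why the article describes robots.txt as an informal rule rather than an enforcement mechanism.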

The rise of generative AI has startups and tech companies racing to build the most powerful AI models. A key ingredient is high-quality data. This thirst for training data is undermining robots.txt and the informal agreements that support the use of this code.

OpenAI develops the popular chatbot ChatGPT. The company’s largest investor is Microsoft. Anthropic develops another relatively popular chatbot, Claude. The company’s largest investor is Amazon.

Both chatbots respond to users’ questions in a human-like manner, and can do so because the AI models they are based on were built using large amounts of text and data collected from the web, much of which is copyrighted or owned by its creators.

Last year, several technology companies argued to the U.S. Copyright Office that nothing on the web should be considered copyrightable when it comes to AI training data.

OpenAI has signed deals with several publishers, including BI-owner Axel Springer, for access to their content, and the U.S. Copyright Office is expected to update its guidelines on AI and copyright later this year.

Are you a tech employee or someone with tips and insights to share? Contact Kali Hays by email or on the secure messaging app Signal at +1-949-280-0267. Please contact us using a non-work device.


