GPT-3 Is an Amazing Research Tool. But OpenAI Isn’t Sharing the Code.

阎阎宝
2023-12-01

openai-gpt

For years, A.I. research lab OpenAI has been chasing the dream of an algorithm that can write like a human.

Its latest iteration on that concept, a language-generation algorithm called GPT-3, has now been used to generate such convincing fake writing that one blog written by it fooled posters on Hacker News and became popular enough to top the site. (A telling excerpt from the post: “In order to get something done, maybe we need to think less. Seems counter-intuitive, but I believe sometimes our thoughts can get in the way of the creative process.”)

OpenAI has been able to achieve such a powerful algorithm because of its access to massive amounts of computing power and data. And the algorithm itself is bigger than any that’s come before it: The largest version of GPT-3 has 175 billion parameters, the adjustable numerical values the algorithm learns during training and uses to make its predictions. GPT-2 had 1.5 billion.
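
To put those parameter counts in perspective, here is a back-of-the-envelope sketch of how much storage the weights alone would need. It assumes each parameter is stored as a 16-bit float (2 bytes); the article does not state the precision, so treat the byte size as an assumption and the result as a rough lower bound that ignores everything else needed to run the model.

```python
# Rough storage needed for the model weights alone.
# Assumption (not from the article): each parameter is a 16-bit float, i.e. 2 bytes.
BYTES_PER_PARAM = 2


def weights_size_gb(num_params, bytes_per_param=BYTES_PER_PARAM):
    """Return the storage needed for the weights, in gigabytes (10**9 bytes)."""
    return num_params * bytes_per_param / 1e9


gpt2_params = 1_500_000_000    # 1.5 billion, per the article
gpt3_params = 175_000_000_000  # 175 billion, per the article

print(f"GPT-2 weights: {weights_size_gb(gpt2_params):.0f} GB")
print(f"GPT-3 weights: {weights_size_gb(gpt3_params):.0f} GB")
print(f"GPT-3 has about {gpt3_params / gpt2_params:.0f}x as many parameters")
```

Under that assumption the largest GPT-3 would need hundreds of gigabytes for its weights, far more than the memory of a single consumer GPU.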

While OpenAI has released its algorithms to the public in the past, it has opted to keep GPT-3 locked away. The research firm says it’s simply too large for most people to run, and putting it behind a paywall allows OpenAI to monetize its research. In the past year, OpenAI has changed its corporate structure to make itself more appealing to investors. It dropped a nonprofit standing in favor of a “capped-profit” model that would allow investors to get returns on their investment if OpenAI becomes profitable. It also entered into a $1 billion deal with Microsoft, opening collaboration between the firms and giving OpenAI priority access to Microsoft’s cloud computing platform.

Researchers who spoke to OneZero questioned OpenAI’s decision not to release the algorithm, saying that it goes against basic scientific principles and makes the company’s claims harder to verify. (A representative for OpenAI declined to comment when reached for this article.)

“I remain unconvinced by any of the arguments provided so far for not sharing the code for AlphaGo, GPT-2/GPT-3,” Joelle Pineau, co-managing director of Facebook AI Research (FAIR) and head of the FAIR lab in Montreal, told OneZero in an email. “And there are many more cases in A.I.”

At its heart, GPT-3 is an incredibly powerful tool for writing in the English language. The most important thing about GPT-3 is its size. GPT-3 learned to produce writing by analyzing 45 terabytes of data, and that training process reportedly cost millions of dollars in cloud computing. It has seen human writing in billions of combinations.

This is a key part of OpenAI’s long-term strategy. The firm has been saying for years that when it comes to deep learning algorithms, the bigger the better. More data and more computing power make a more capable algorithm. For instance, when OpenAI crushed professional esports players at Dota 2, it was due to its ability to train algorithms on hundreds of GPUs at the same time.

It’s something OpenAI leaders have told me previously: Jack Clark, policy director for OpenAI, said that the bigger the algorithm, the “more coherent, more creative, and more reliable” it is. When talking about the amount of training the Dota 2 bots needed, CTO Greg Brockman said, “We just kept waiting for the magic to run out. We kept waiting to hit a wall, and we never seemed to hit a wall.”

A similar approach was taken for GPT-3. OpenAI argues that bigger algorithms, meaning more parameters, allow more general behavior. For instance, GPT-3’s most basic function is to act like an autocomplete. Give it one word or a sentence, and it will generate what it thinks comes next, word by word. But it can also answer questions, or even do translations, without needing any changes to the algorithm. This is different from more specialized, fine-tuned algorithms that can tackle only one task.
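
The "autocomplete" behavior described above can be illustrated with a deliberately tiny sketch. This is not how GPT-3 is implemented: it is a toy model that only counts which word follows which in a small made-up corpus, then extends a prompt one word at a time by picking the most frequent follower. The corpus, function names, and greedy selection rule are all assumptions chosen for illustration.

```python
from collections import Counter, defaultdict


def train_bigrams(text):
    """Count, for every word, which words follow it in the training text."""
    words = text.split()
    followers = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        followers[current][nxt] += 1
    return followers


def autocomplete(followers, prompt, max_new_words=5):
    """Extend a non-empty prompt word by word, always taking the most frequent follower."""
    words = prompt.split()
    for _ in range(max_new_words):
        candidates = followers.get(words[-1])
        if not candidates:
            break  # the last word was never followed by anything in training
        words.append(candidates.most_common(1)[0][0])
    return " ".join(words)


# Made-up training text, purely for illustration.
corpus = "the cat sat on the mat and the cat sat on the sofa"
model = train_bigrams(corpus)
print(autocomplete(model, "the cat", max_new_words=3))
```

The loop has the same shape as the behavior the article describes: predict the next word, append it, and repeat. A model like GPT-3 replaces the lookup table with a very large learned network, which is where the claimed generality comes from.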

Some argue that this is a step toward general intelligence, the holy grail of A.I. that would mean an algorithm could learn and adapt much like a human, while others say the algorithm still doesn’t actually understand the words it’s spitting out.

OpenAI has released a detailed research paper that explains the architecture of the algorithm and the results it achieved, but when it comes to studying how GPT-3 functions, other A.I. researchers have to take OpenAI at its word. The research firm, which has recently pivoted away from its nonprofit roots to raise money and develop commercial products, hasn’t publicly released this algorithm, unlike its earlier ones.

OpenAI infamously claimed in February 2019 that the largest version of its previous GPT-2 algorithm was too dangerous to be released, due to its ability to potentially generate misinformation or fake news. The firm initially released smaller, truncated versions of GPT-2, and seeing no evidence of misuse, ended up releasing the largest version of the algorithm. Now, instead of being too dangerous, GPT-3 seems to be too lucrative to release.

GPT-3 can be accessed only through an API run by OpenAI, similar to how companies like Amazon, Google, and Microsoft have monetized their own algorithms. Coders are able to write programs that send specific commands to GPT-3, which generates a response in OpenAI’s cloud and sends back the result. While the API is free during its private beta testing period, OpenAI is figuring out long-term pricing.

That means researchers can send only specific commands to the algorithm, and OpenAI can revoke access at any time.

OpenAI’s reasoning for this comes down to safety and scale. If the firm catches someone misusing the API to do something like prop up a fake news website, then the company could shut down that developer’s access.

As for scale, the company says that the algorithms are large and expensive to run — let alone train to begin with. “This makes it hard for anyone except larger companies to benefit from the underlying technology,” OpenAI’s website says. “We’re hopeful that the API will make powerful A.I. systems more accessible to smaller businesses and organizations.”
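
The access model described above, where clients only exchange text with a service the provider runs and the provider can cut off a client at will, can be sketched in a few lines. Everything here is hypothetical: the `HostedModel` class, its method names, and the key string are invented for illustration and do not correspond to OpenAI's actual API.

```python
class HostedModel:
    """Hypothetical stand-in for a model served behind a provider-controlled API."""

    def __init__(self):
        self._active_keys = set()

    def issue_key(self, key):
        """The provider grants a client access."""
        self._active_keys.add(key)

    def revoke_key(self, key):
        """The provider can withdraw access at any time."""
        self._active_keys.discard(key)

    def complete(self, key, prompt):
        """Text in, text out: the weights and code never leave the provider."""
        if key not in self._active_keys:
            raise PermissionError("API key is not active")
        return prompt + " ..."  # placeholder for the generated continuation


service = HostedModel()
service.issue_key("demo-key")
print(service.complete("demo-key", "Once upon a time"))

service.revoke_key("demo-key")
try:
    service.complete("demo-key", "Once upon a time")
except PermissionError as err:
    print("Request refused:", err)
```

The point of the sketch is the asymmetry: the client can observe outputs for the prompts it is allowed to send, but cannot inspect or rerun the model itself.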

The exact cost for OpenAI to train and operate the algorithm is difficult to game out because of how the price of cloud computing is calculated. The cost of renting a GPU differs wildly, depending on factors like geographic proximity to certain server regions and negotiated rates based on the scale of projects. OpenAI also likely benefitted from its billion-dollar partnership with Microsoft, as some of that funding was allocated to build OpenAI its own supercomputer for these kinds of tasks.

But these limitations—the size and lack of transparency—make it hard for other scientists to replicate and validate the algorithm’s efficacy.

A.I. research, for all the venture capital and corporate interest, is still an avenue of computer science, and the scientific method still applies. The best-conducted scientific experiments, in this case the building of an algorithm that succeeds at a task and proves a hypothesis, can be repeated by others.

Pineau, an ardent supporter of replicable computer science, says that she thinks of unreleased algorithms like GPT-3 and AlphaGo as “scientific artifacts.”

“A bit like a dinosaur bone you might dig out, which gives you some evidence to support some theories but is not the same as running an actual experiment,” she told OneZero in an email.

Pineau says that these artifacts can be very useful in shaping future hypotheses, but they’re still not a replacement for conclusive knowledge.

Others worry that by limiting access to the code and trained algorithm, OpenAI threatens the “democratization” of artificial intelligence, an idea that access to A.I. should be attainable by anyone.

The phrase “access to A.I.” is multifaceted, meaning access to computing power, datasets, and the algorithms themselves. Open source frameworks like Google’s TensorFlow and Facebook’s PyTorch make algorithms easy to build and share, and many open source datasets exist.

But computing power comes from hardware, a constrained physical resource that’s most accessible to large companies and well-funded research organizations like OpenAI.

If OpenAI’s experiments turn out to be the way forward for artificial intelligence, and bigger algorithms translate to increased performance, then cutting-edge A.I. becomes inaccessible to those who can’t afford it. It also allows big companies with the resources to make the rules as to who has access to certain A.I. algorithms. For example, they could put them behind an API and charge for access to the algorithm.

“If we believe that the road to better A.I. in fact is a function of larger models, then OpenAI becomes a gatekeeper of who can have good A.I. and who cannot,” says Mark Riedl, an A.I. professor at Georgia Institute of Technology who studies natural language processing.

Riedl also questions whether OpenAI, which has gone to great lengths in the past to think about how its algorithms could be misused, would monitor all the uses of its new API to see if it’s being used for malicious ends.

“Will OpenAI look at the outputs and try to make judgement calls about whether their technology is being used appropriately? This seems to be a critical question given OpenAI’s mission statement and how it is at odds with their new for-profit mode. Can they even monitor at scale?” he asked.

And not everyone is sold on the idea that OpenAI’s “bigger is better” method is the way forward.

For example, natural language processing researcher Melanie Mitchell put GPT-3 through its paces on a “copycat” test, where the algorithm was asked to identify patterns in how certain series of letters were changed. If “abc” is changed to “abd,” then what does “efg” change to?

These kinds of tests, which Mitchell developed an algorithm to solve in the 1980s, are a tiny simulation of making the kinds of analogies that humans make all the time. To make an analogy correctly, you have to understand all the components’ relationships with each other. In the alphabet example, the algorithm has to understand that the alphabet is ordered and know the position of each letter.
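
The letter-string puzzle above can be made concrete with a minimal sketch. This is not Mitchell's algorithm, and it sidesteps the hard part of the test: the notion that the alphabet is ordered is hard-coded here, whereas the test asks whether an algorithm can work that out. The function names, the equal-length restriction, and the wrap-around from "z" to "a" are assumptions made only to keep the example short.

```python
import string

ALPHABET = string.ascii_lowercase


def infer_shifts(source, target):
    """For each position, count the alphabet steps from the source letter to the target letter."""
    if len(source) != len(target):
        raise ValueError("this toy only handles equal-length strings")
    return [ALPHABET.index(t) - ALPHABET.index(s) for s, t in zip(source, target)]


def apply_shifts(word, shifts):
    """Apply the same per-position steps to a new word (wrapping past 'z' is an arbitrary choice)."""
    if len(word) != len(shifts):
        raise ValueError("word length must match the inferred rule")
    return "".join(
        ALPHABET[(ALPHABET.index(letter) + step) % len(ALPHABET)]
        for letter, step in zip(word, shifts)
    )


# "abc" -> "abd" means: leave the first two letters alone, advance the last by one.
rule = infer_shifts("abc", "abd")
print(apply_shifts("efg", rule))
```

Because the rule is expressed as positions in an ordered alphabet, it transfers from "abc" to "efg" directly. That ordering knowledge is exactly what the paragraph above says a solver must have.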

While the algorithm performed well in many of the tests, Mitchell found that it was also unable to grasp some simple concepts that other algorithms had mastered decades ago.

“On the research side, I personally think that throwing more compute and parameters at a problem may be a dead-end strategy in A.I.,” Mitchell told OneZero. “I don’t think that’s the way that real progress will be made if our goal is to build machines with robust, general intelligence.”

She does concede that access to large amounts of computing power gives tech giants an edge when it comes to building A.I. products that require deep learning, but conversely, not every modern problem requires a power-hungry deep learning algorithm. In other words: Not every problem needs a GPT-3-sized solution.

“All in all, GPT-3’s performance is often impressive and surprising, but it is also similar to a lot of what we see in today’s state-of-the-art A.I. systems: impressive, intelligent-seeming performance interspersed with unhumanlike errors, plus no transparency as to why it performs well or makes certain errors,” Mitchell wrote when testing the algorithm.

Translated from: https://onezero.medium.com/gpt-3-is-an-amazing-research-tool-openai-isnt-sharing-the-code-d048ba39bbfd
