Topic: OpenAI GPT-3 API error: "This model's maximum context length is 2049 tokens"
Background:
I have two issues relating to the response result from OpenAI completion.
The following request doesn't return the full text when I give it 500 words of content with the prompt "Fix grammar mistakes" (is this a token issue?).
The second issue is that the text sometimes contains double or single quotes, which break the JSON format. So I delete every type of quote from the content (I am not sure this is the best solution, and I would prefer to do it in JavaScript rather than PHP).
curl_setopt($ch, CURLOPT_POSTFIELDS, "{\n \"model\": \"text-davinci-001\",\n \"prompt\": \"" . $open_ai_prompt . ":nn" . $content_text . "\",\n \"temperature\": 0,\n \"top_p\": 1.0,\n \"frequency_penalty\": 0.0,\n \"presence_penalty\": 0.0\n}");
"message": "We could not parse the JSON body of your request. (HINT: This likely means you aren't using your HTTP library correctly. The OpenAI API expects a JSON payload, but what was sent was not valid JSON.
Solution:
Regarding token limits
First of all, I think you are misunderstanding how tokens work: words and tokens are not the same unit, and 500 words is more than 500 tokens. Use the Tokenizer to count the tokens in your text.
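When the Tokenizer is not at hand, a rough estimate can still show why 500 words exceed 500 tokens. The sketch below assumes the common rule of thumb of about 4 characters of English text per token; the function name `estimate_tokens` is hypothetical, and the result is only an approximation, not what the model's tokenizer would actually report.

```php
<?php
// Rough heuristic only: assumes about 4 bytes of English text per token.
// The real count comes from the model's tokenizer and can differ noticeably,
// especially for non-English text or text with a lot of punctuation.
function estimate_tokens(string $text): int
{
    return (int) ceil(strlen($text) / 4);
}

// 500 short words ("word " is 5 bytes, so 2500 bytes in total).
$sample = str_repeat('word ', 500);
echo estimate_tokens($sample), "\n";   // 625 under this heuristic
```

An estimate like this is only good for a sanity check before sending a request; for anything close to the limit, count with the real tokenizer.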
As stated in the official OpenAI article:
Depending on the model used, requests can use up to 4097 tokens shared between prompt and completion. If your prompt is 4000 tokens, your completion can be 97 tokens at most. The limit is currently a technical limitation, but there are often creative ways to solve problems within the limit, e.g. condensing your prompt, breaking the text into smaller pieces, etc.
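The arithmetic in that quote can be written down as a small sketch. The helper names `max_completion_tokens` and `fits_in_context` are hypothetical, the default limit of 4097 is the figure from the quote (it depends on the model), and the prompt size of 700 tokens is made up for illustration.

```php
<?php
// Tokens left for the completion once the prompt has been counted.
// $contextLimit is model-specific; 4097 is the figure from the quoted article.
function max_completion_tokens(int $promptTokens, int $contextLimit = 4097): int
{
    return max(0, $contextLimit - $promptTokens);
}

// True when the prompt and the requested completion fit in the context together.
function fits_in_context(int $promptTokens, int $maxTokens, int $contextLimit = 4097): bool
{
    return $promptTokens + $maxTokens <= $contextLimit;
}

echo max_completion_tokens(4000), "\n";       // 97, as in the quoted example
var_dump(fits_in_context(700, 5000));         // bool(false): 5700 > 4097
var_dump(fits_in_context(700, 1000, 2049));   // bool(true): 1700 <= 2049
```

Checking the budget before the request is sent turns the API's context-length error into something you can handle locally, for instance by splitting the text into smaller pieces.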
Swap text-davinci-001 for a newer GPT-3 model such as text-davinci-003, because its token limit is higher (4097 instead of the 2049 reported in your error).
GPT-3 models:
Regarding double quotes in JSON
You can escape double quotes in JSON by putting a backslash (\) in front of them, like this:
"This is how you can escape \"double quotes\" in JSON."
But... this is more of a quick fix. For a proper solution, see @ADyson's comment above:
Don't build your JSON by hand like that. Make a PHP object / array with the correct structure, and then use json_encode() to turn it into valid JSON. It will automatically handle any escaping etc. which is needed, and you can also use the options to tweak certain things about the output - check the PHP documentation.
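A minimal sketch of what that comment suggests, using the same parameters as the request in the question. The function name `build_completion_body`, the default `max_tokens` of 1000, and the sample text are illustrative assumptions, not part of the original post.

```php
<?php
// Build the request body as a PHP array and let json_encode() do the escaping.
// Function name and the default max_tokens value are illustrative assumptions.
function build_completion_body(string $prompt, string $content, int $maxTokens = 1000): string
{
    $payload = [
        'model'             => 'text-davinci-003',
        'prompt'            => $prompt . ":\n\n" . $content,
        'temperature'       => 0,
        'max_tokens'        => $maxTokens,
        'top_p'             => 1.0,
        'frequency_penalty' => 0.0,
        'presence_penalty'  => 0.0,
    ];

    // JSON_THROW_ON_ERROR (PHP 7.3+) raises JsonException instead of returning false.
    return json_encode($payload, JSON_THROW_ON_ERROR);
}

// Quotes and newlines in the content are escaped automatically.
$body = build_completion_body('Fix grammar mistakes', 'He said "its fine", didn\'t he?');
echo $body, "\n";
```

The resulting string can be passed as is to curl_setopt($ch, CURLOPT_POSTFIELDS, $body), so there is no need to strip quotes from the content beforehand.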
EDIT 1
You need to set the max_tokens parameter higher. Otherwise, the output will be shorter than your input: the completion is cut off at max_tokens, so you get back only part of the fixed text instead of the whole thing.
EDIT 2
Now you set the max_tokens parameter too high! Setting max_tokens = 5000 is too much even for the most capable GPT-3 model (i.e., text-davinci-003). The prompt and the completion together can be at most 4097 tokens.
You can figure this out if you take a look at the error you got: