Prompt Engineering Techniques From OpenAI

How to get the most out of AI models like GPT-4 with practical prompt engineering techniques that improve response accuracy and relevance, from OpenAI’s guides.

Jun 19, 2024
  • Use Case: Enhancing the effectiveness and accuracy of AI responses through strategic prompt engineering.
  • Tool: GPT-4 or similar large language models.
  • Learning Time: Approximately 20 minutes.

Summary

This guide outlines various strategies and tactics to optimize prompt engineering for large language models such as GPT-4. It includes tips on writing clear instructions, providing reference texts, splitting complex tasks into simpler ones, and using external tools to improve model performance. By applying these techniques, users can significantly enhance the quality and relevance of AI-generated responses.

Bear’s take

Bear: Prompt engineering is like fine-tuning a musical instrument – with the right adjustments, you can make it sing beautifully. One time, I was working on a project and needed a summary of a complex report. By breaking down the task into smaller steps and providing clear examples, I got a concise and accurate summary that saved me hours of work.

What you’ll learn

From this guide, you'll learn how to craft prompts that maximize the capabilities of AI models like GPT-4. You'll discover strategies for writing clear and detailed instructions, specifying the desired length and format of outputs, and using reference texts to guide the AI. The guide also covers how to split complex tasks into simpler subtasks, allowing the AI to handle them more effectively. Additionally, you'll explore the use of external tools to complement the AI's abilities, such as using code execution for precise calculations. By the end, you'll be equipped with practical tactics to enhance the accuracy and relevance of AI responses, making these tools more useful for your everyday tasks.

Key points

  1. Write Clear Instructions: Include specific details and context in your prompts.
  2. Provide Reference Text: Supply relevant information to guide the AI's responses.
  3. Split Complex Tasks: Break down large tasks into manageable subtasks.
  4. Specify Desired Output: Indicate the length and format of the response you need.
  5. Use External Tools: Leverage tools like code execution for precise calculations.

Next steps

  • Start experimenting with the tactics discussed in the guide.
  • Practice creating different types of prompts to see which techniques yield the best results.
  • Explore the OpenAI Cookbook and other resources for more advanced prompting strategies.

Links and resources


Six Prompting Strategies

1. Write Clear Instructions

These models can’t read your mind. If outputs are too long, ask for brief replies. If outputs are too simple, ask for expert-level writing. If you dislike the format, demonstrate the format you’d like to see. The less the model has to guess at what you want, the more likely you’ll get it.
Tactics:
  • Include details in your query to get more relevant answers
  • Ask the model to adopt a persona
  • Use delimiters to clearly indicate distinct parts of the input
  • Specify the steps required to complete a task
  • Provide examples
  • Specify the desired length of the output
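
Several of these tactics can be combined in a single request. Here is a minimal sketch using the OpenAI Python SDK; the model name, persona, and prompt wording are illustrative assumptions, not examples from the guide:

```python
# A minimal sketch of combining these tactics with the OpenAI Python SDK
# (pip install openai; assumes OPENAI_API_KEY is set in the environment).
from openai import OpenAI

client = OpenAI()

article = "..."  # paste the text you want summarized here

response = client.chat.completions.create(
    model="gpt-4",  # illustrative; any chat model name works here
    messages=[
        # Persona: tell the model who it should be.
        {"role": "system",
         "content": "You are a senior technical editor who writes tersely."},
        # Details, delimiters, explicit steps, and desired length together.
        {"role": "user",
         "content": (
             "Summarize the article delimited by triple quotes.\n"
             "Step 1: List the three main claims as bullets.\n"
             "Step 2: Write a one-sentence takeaway.\n"
             "Keep the whole answer under 100 words.\n\n"
             f'"""{article}"""'
         )},
    ],
)
print(response.choices[0].message.content)
```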

2. Provide Reference Text

Language models can confidently invent fake answers, especially when asked about esoteric topics or for citations and URLs. In the same way that a sheet of notes can help a student do better on a test, providing reference text to these models can help in answering with fewer fabrications.
Tactics:
  • Instruct the model to answer using a reference text
  • Instruct the model to answer with citations from a reference text
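
As a hedged illustration, one way to build such a grounded prompt looks like the sketch below; the exact instruction wording is an assumption, not OpenAI's canonical phrasing:

```python
# A sketch of grounding the answer in supplied reference text and asking
# the model to quote its supporting passage.
from openai import OpenAI

client = OpenAI()

reference = """..."""  # paste the document the answer must come from
question = "What does the policy say about refunds?"

prompt = (
    "Answer the question using only the document delimited by triple quotes. "
    "Quote the exact passage that supports your answer. If the answer is not "
    'in the document, reply "I could not find an answer."\n\n'
    f'"""{reference}"""\n\nQuestion: {question}'
)

response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": prompt}],
)
print(response.choices[0].message.content)
```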

3. Split Complex Tasks

Just as it is good practice in software engineering to decompose a complex system into a set of modular components, the same is true of tasks submitted to a language model. Complex tasks tend to have higher error rates than simpler tasks. Furthermore, complex tasks can often be re-defined as a workflow of simpler tasks in which the outputs of earlier tasks are used to construct the inputs to later tasks.
Tactics:
  • Use intent classification to identify the most relevant instructions for a user query
  • For dialogue applications that require very long conversations, summarize or filter previous dialogue
  • Summarize long documents piecewise and construct a full summary recursively
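
The piecewise-summarization tactic might look like the following sketch. Chunking by character count and the three-sentence instruction are simplifying assumptions; production code would split on token boundaries instead:

```python
# A sketch of piecewise summarization: summarize fixed-size chunks, then
# summarize the concatenated partial summaries.
from openai import OpenAI

client = OpenAI()

def ask(instruction: str, text: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user",
                   "content": f'{instruction}\n\n"""{text}"""'}],
    )
    return response.choices[0].message.content

def summarize_long_document(document: str, chunk_size: int = 8000) -> str:
    chunks = [document[i:i + chunk_size]
              for i in range(0, len(document), chunk_size)]
    # Earlier, simpler tasks: one summary per chunk.
    partials = [ask("Summarize this section in three sentences.", c)
                for c in chunks]
    # Later task consumes the outputs of the earlier ones.
    return ask("Combine these section summaries into one coherent summary.",
               "\n\n".join(partials))
```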

4. Give the Model Time to “Think”

If asked to multiply 17 by 28, you might not know it instantly, but can still work it out with time. Similarly, models make more reasoning errors when trying to answer right away, rather than taking time to work out an answer. Asking for a “chain of thought” before an answer can help the model reason its way toward correct answers more reliably.
Tactics:
  • Instruct the model to work out its own solution before rushing to a conclusion
  • Use inner monologue or a sequence of queries to hide the model’s reasoning process
  • Ask the model if it missed anything on previous passes
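
A minimal sketch of the "work out your own solution first" tactic, in the style of a grading task; the problem, the student's answer, and the instruction wording are all illustrative:

```python
# The model is told to solve the problem itself before judging the
# student's answer, instead of replying instantly.
from openai import OpenAI

client = OpenAI()

problem = "A jacket costs $80 and is discounted 25%. What is the sale price?"
student_answer = "$65"  # incorrect; the sale price is $60

response = client.chat.completions.create(
    model="gpt-4",
    messages=[
        {"role": "system",
         "content": ("First work out your own solution to the problem, "
                     "showing each step. Only then compare it to the "
                     "student's answer and say whether the student is "
                     "correct.")},
        {"role": "user",
         "content": f"Problem: {problem}\nStudent's answer: {student_answer}"},
    ],
)
print(response.choices[0].message.content)
```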

5. Use External Tools

Compensate for the weaknesses of the model by feeding it the outputs of other tools. For example, a text retrieval system (sometimes called RAG or retrieval augmented generation) can tell the model about relevant documents. A code execution engine like OpenAI’s Code Interpreter can help the model do math and run code. If a task can be done more reliably or efficiently by a tool rather than by a language model, offload it to get the best of both.
Tactics:
  • Use embeddings-based search to implement efficient knowledge retrieval
  • Use code execution to perform more accurate calculations or call external APIs
  • Give the model access to specific functions
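
Giving the model access to a specific function is supported through the chat-completions tools parameter. In this sketch the evaluate_expression calculator is a hypothetical local tool, not part of any library, and the code assumes the model chooses to call it:

```python
# Offload exact arithmetic to a tool instead of trusting the model's math.
import json
from openai import OpenAI

client = OpenAI()

tools = [{
    "type": "function",
    "function": {
        "name": "evaluate_expression",  # hypothetical local calculator
        "description": "Exactly evaluate an arithmetic expression.",
        "parameters": {
            "type": "object",
            "properties": {"expression": {"type": "string"}},
            "required": ["expression"],
        },
    },
}]

response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "What is 1234 * 5678?"}],
    tools=tools,
)

call = response.choices[0].message.tool_calls[0]  # assumes a tool call came back
args = json.loads(call.function.arguments)
# Perform the calculation in Python rather than in the model.
print(eval(args["expression"], {"__builtins__": {}}))  # demo only; never eval untrusted input
```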

6. Test Changes Systematically

Improving performance is easier if you can measure it. In some cases a modification to a prompt will achieve better performance on a few isolated examples but lead to worse overall performance on a more representative set of examples. Therefore, to be sure that a change is net positive to performance, it may be necessary to define a comprehensive test suite (also known as an “eval”).
Tactic:
  • Evaluate model outputs with reference to gold-standard answers
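
A toy version of such an eval might look like the sketch below. The two test cases and the exact-substring scoring are placeholder assumptions; real evals use larger, representative suites and often model-graded or fuzzier scoring:

```python
# Run a fixed test suite through the model and score outputs against
# gold-standard answers, so two prompt variants can be compared fairly.
from openai import OpenAI

client = OpenAI()

test_suite = [  # hypothetical gold-standard pairs
    {"prompt": "What is the capital of France?", "gold": "Paris"},
    {"prompt": "What is 2 + 2?", "gold": "4"},
]

def run_eval(system_prompt: str) -> float:
    hits = 0
    for case in test_suite:
        response = client.chat.completions.create(
            model="gpt-4",
            messages=[{"role": "system", "content": system_prompt},
                      {"role": "user", "content": case["prompt"]}],
        )
        answer = response.choices[0].message.content
        hits += case["gold"].lower() in answer.lower()
    return hits / len(test_suite)

# Compare two prompt variants on the same representative set.
print(run_eval("Answer concisely."))
print(run_eval("Answer with a single word or number."))
```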
