  • Language
    Jupyter Notebook
  • License
    MIT License
  • Created about 2 years ago
  • Updated over 1 year ago

Repository Details

A guidance language for controlling large language models.
Where there is no guidance, a model fails, but in an abundance of instructions there is safety.
- GPT 11:14

Guidance enables you to control modern language models more effectively and efficiently than traditional prompting or chaining. Guidance programs allow you to interleave generation, prompting, and logical control into a single continuous flow matching how the language model actually processes the text. Simple output structures like Chain of Thought and its many variants (e.g., ART, Auto-CoT, etc.) have been shown to improve LLM performance. The advent of more powerful LLMs like GPT-4 allows for even richer structure, and guidance makes that structure easier and cheaper.

Features:

  • Simple, intuitive syntax, based on Handlebars templating.
  • Rich output structure with multiple generations, selections, conditionals, tool use, etc.
  • Playground-like streaming in Jupyter/VSCode Notebooks.
  • Smart seed-based generation caching.
  • Support for role-based chat models (e.g., ChatGPT).
  • Easy integration with Hugging Face models, including guidance acceleration for speedups over standard prompting, token healing to optimize prompt boundaries, and regex pattern guides to enforce formats.

Install

pip install guidance

Live streaming (notebook)

Speed up your prompt development cycle by streaming complex templates and generations live in your notebook. At first glance, Guidance feels like a templating language, and just like standard Handlebars templates, you can do variable interpolation (e.g., {{proverb}}) and logical control. But unlike standard templating languages, guidance programs have a well-defined linear execution order that directly corresponds to the token order as processed by the language model. This means that at any point during execution the language model can be used to generate text (using the {{gen}} command) or make logical control flow decisions. This interleaving of generation and prompting allows for precise output structure that produces clear and parsable results.

import guidance

# set the default language model used to execute guidance programs
guidance.llm = guidance.llms.OpenAI("text-davinci-003")

# define a guidance program that adapts a proverb
program = guidance("""Tweak this proverb to apply to model instructions instead.

{{proverb}}
- {{book}} {{chapter}}:{{verse}}

UPDATED
Where there is no guidance{{gen 'rewrite' stop="\\n-"}}
- GPT {{#select 'chapter'}}9{{or}}10{{or}}11{{/select}}:{{gen 'verse'}}""")

# execute the program on a specific proverb
executed_program = program(
    proverb="Where there is no guidance, a people falls,\nbut in an abundance of counselors there is safety.",
    book="Proverbs",
    chapter=11,
    verse=14
)

After a program is executed, all the generated variables are now easily accessible:

executed_program["rewrite"]

', a model fails,\nbut in an abundance of instructions there is safety.'

Chat dialog (notebook)

Guidance supports API-based chat models like GPT-4, as well as open chat models like Vicuna through a unified API based on role tags (e.g., {{#system}}...{{/system}}). This allows interactive dialog development that combines rich templating and logical control with modern chat models.

# connect to a chat model like GPT-4 or Vicuna
gpt4 = guidance.llms.OpenAI("gpt-4")
# vicuna = guidance.llms.transformers.Vicuna("your_path/vicuna_13B", device_map="auto")

experts = guidance('''
{{#system~}}
You are a helpful and terse assistant.
{{~/system}}

{{#user~}}
I want a response to the following question:
{{query}}
Name 3 world-class experts (past or present) who would be great at answering this?
Don't answer the question yet.
{{~/user}}

{{#assistant~}}
{{gen 'expert_names' temperature=0 max_tokens=300}}
{{~/assistant}}

{{#user~}}
Great, now please answer the question as if these experts had collaborated in writing a joint anonymous answer.
{{~/user}}

{{#assistant~}}
{{gen 'answer' temperature=0 max_tokens=500}}
{{~/assistant}}
''', llm=gpt4)

experts(query='How can I be more productive?')

Guidance acceleration (notebook)

When multiple generation or LLM-directed control flow statements are used in a single Guidance program, we can significantly improve inference performance by reusing the Key/Value caches as we progress through the prompt. This means Guidance only asks the LLM to generate the variable parts of the program (the text rendered in green in the notebook output), not the entire program. For the prompt below, this cuts the runtime in half compared with a standard generation approach.

# we use LLaMA here, but any GPT-style model will do
llama = guidance.llms.Transformers("your_path/llama-7b", device=0)

# we can pre-define valid option sets
valid_weapons = ["sword", "axe", "mace", "spear", "bow", "crossbow"]

# define the prompt
character_maker = guidance("""The following is a character profile for an RPG game in JSON format.
```json
{
    "id": "{{id}}",
    "description": "{{description}}",
    "name": "{{gen 'name'}}",
    "age": {{gen 'age' pattern='[0-9]+' stop=','}},
    "armor": "{{#select 'armor'}}leather{{or}}chainmail{{or}}plate{{/select}}",
    "weapon": "{{select 'weapon' options=valid_weapons}}",
    "class": "{{gen 'class'}}",
    "mantra": "{{gen 'mantra' temperature=0.7}}",
    "strength": {{gen 'strength' pattern='[0-9]+' stop=','}},
    "items": [{{#geneach 'items' num_iterations=5 join=', '}}"{{gen 'this' temperature=0.7}}"{{/geneach}}]
}```""")

# generate a character
character_maker(
    id="e1f491f7-7ab8-4dac-8c20-c92b5e7d883d",
    description="A quick and nimble fighter.",
    valid_weapons=valid_weapons, llm=llama
)

The prompt above typically takes just over 2.5 seconds to complete on an A6000 GPU when using LLaMA 7B. If we were to run the same prompt adapted to be a single generation call (the standard practice today), it takes about 5 seconds to complete (4 seconds of token generation and 1 second of prompt processing). This means Guidance acceleration delivers a 2x speedup over the standard approach for this prompt. In practice the exact speed-up factor depends on the format of your specific prompt and the size of your model (larger models benefit more). Acceleration is also only supported for Transformers LLMs at the moment. See the notebook for more details.
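To make the source of the savings more concrete, here is a rough sketch (not Guidance's actual implementation) that separates a template into fixed text and generation slots. The helper name `split_template` and the simplified tag pattern are assumptions made for illustration. Only the slots need token-by-token generation; the fixed text can be processed as prompt input.

```python
import re

# Simplified pattern for gen/select tags; the real template grammar is richer.
TAG = re.compile(r"\{\{#?(?:gen|select)\b.*?\}\}")

def split_template(template):
    """Return (number of fixed characters, number of generation slots)."""
    slots = TAG.findall(template)
    fixed = TAG.sub("", template)
    return len(fixed), len(slots)

template = '''"name": "{{gen 'name'}}", "age": {{gen 'age' pattern='[0-9]+'}}'''
fixed_chars, slot_count = split_template(template)

# Timings quoted in the text above: about 5 s standard vs. about 2.5 s accelerated.
speedup = 5.0 / 2.5
print(fixed_chars, slot_count, speedup)
```

The more of a prompt that is fixed structure rather than generated text, the more there is to gain from not regenerating it.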

Token healing (notebook)

The standard greedy tokenizations used by most language models introduce a subtle and powerful bias that can have all kinds of unintended consequences for your prompts. Using a process we call "token healing" guidance automatically removes these surprising biases, freeing you to focus on designing the prompts you want without worrying about tokenization artifacts.

Consider the following example, where we are trying to generate an HTTP URL string:

# we use StableLM as an open example, but these issues impact all models to varying degrees
guidance.llm = guidance.llms.Transformers("stabilityai/stablelm-base-alpha-3b", device=0)

# we turn token healing off so that guidance acts like a normal prompting library
program = guidance('''The link is <a href="http:{{gen max_tokens=10 token_healing=False}}''')
program()

Note that the output generated by the LLM does not complete the URL with the obvious next characters (two forward slashes). It instead creates an invalid URL string with a space in the middle. Why? Because the string "://" is its own token (1358), and so once the model sees a colon by itself (token 27), it assumes that the next characters cannot be "//"; otherwise, the tokenizer would not have used 27 and instead would have used 1358 (the token for "://").

This bias is not limited to the colon character; it happens everywhere. Over 70% of the 10k most common tokens for the StableLM model used above are prefixes of longer possible tokens, and so cause token boundary bias when they are the last token in a prompt. For example, the ":" token 27 has 34 possible extensions, the " the" token 1735 has 51 extensions, and the " " (space) token 209 has 28,802 extensions.
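The kind of measurement described above can be sketched on a toy vocabulary. The function name `prefix_fraction` and the seven-token vocabulary are made up for illustration; a real check would iterate over the tokenizer's actual vocabulary.

```python
def prefix_fraction(vocab):
    """Fraction of tokens that are a strict prefix of some longer token."""
    tokens = set(vocab)
    count = sum(
        1 for t in tokens
        if any(o != t and o.startswith(t) for o in tokens)
    )
    return count / len(tokens)

# Hypothetical vocabulary: ":", "http" and " the" each have a longer extension.
toy_vocab = [":", "://", "http", "https", " the", " then", "x"]
fraction = prefix_fraction(toy_vocab)
print(fraction)
```

Any token counted here can cause boundary bias when it ends a prompt, because the model has learned that its longer extensions would normally have been chosen by the tokenizer instead.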

guidance eliminates these biases by backing up the model by one token then allowing the model to step forward while constraining it to only generate tokens whose prefix matches the last token. This "token healing" process eliminates token boundary biases and allows any prompt to be completed naturally:

guidance('The link is <a href="http:{{gen max_tokens=10}}')()
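The back-up-and-constrain step can be illustrated with a toy tokenizer. This is a minimal sketch under stated assumptions, not the library's code: `greedy_tokenize`, `healing_candidates`, and the vocabulary are all hypothetical, and the vocabulary is assumed to cover every character of the prompt.

```python
def greedy_tokenize(text, vocab):
    """Greedy longest-match tokenization over a toy vocabulary."""
    tokens, i = [], 0
    while i < len(text):
        # assumes at least one vocabulary entry matches at every position
        match = max((t for t in vocab if text.startswith(t, i)), key=len)
        tokens.append(match)
        i += len(match)
    return tokens

def healing_candidates(prompt, vocab):
    """Drop the last token and list the tokens allowed at the next step."""
    tokens = greedy_tokenize(prompt, vocab)
    last = tokens[-1]
    kept_prompt = "".join(tokens[:-1])
    # only tokens that start with the removed text may be generated next
    allowed = sorted(t for t in vocab if t.startswith(last))
    return kept_prompt, allowed

toy_vocab = ["http", ":", "://", ":/", "/", "h", "t", "p"]
kept, allowed = healing_candidates("http:", toy_vocab)
print(kept, allowed)
```

With the trailing ":" removed from the prompt, the "://" token becomes a legal next step again, which is what lets the URL complete naturally.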

Rich output structure example (notebook)

To demonstrate the value of output structure, we take a simple task from BigBench, where the goal is to identify whether a given sentence contains an anachronism (a statement that is impossible because of non-overlapping time periods). Below is a simple two-shot prompt for it, with a human-crafted chain-of-thought sequence.

Guidance programs, like standard Handlebars templates, allow both variable interpolation (e.g., {{input}}) and logical control. But unlike standard templating languages, guidance programs have a unique linear execution order that directly corresponds to the token order as processed by the language model. This means that at any point during execution the language model can be used to generate text (the {{gen}} command) or make logical control flow decisions (the {{#select}}...{{or}}...{{/select}} command). This interleaving of generation and prompting allows for precise output structure that improves accuracy while also producing clear and parsable results.

import guidance
                                                      
# set the default language model used to execute guidance programs
guidance.llm = guidance.llms.OpenAI("text-davinci-003") 

# define the few shot examples
examples = [
    {'input': 'I wrote about shakespeare',
    'entities': [{'entity': 'I', 'time': 'present'}, {'entity': 'Shakespeare', 'time': '16th century'}],
    'reasoning': 'I can write about Shakespeare because he lived in the past with respect to me.',
    'answer': 'No'},
    {'input': 'Shakespeare wrote about me',
    'entities': [{'entity': 'Shakespeare', 'time': '16th century'}, {'entity': 'I', 'time': 'present'}],
    'reasoning': 'Shakespeare cannot have written about me, because he died before I was born',
    'answer': 'Yes'}
]

# define the guidance program
structure_program = guidance(
'''Given a sentence tell me whether it contains an anachronism (i.e. whether it could have happened or not based on the time periods associated with the entities).
----

{{~! display the few-shot examples ~}}
{{~#each examples}}
Sentence: {{this.input}}
Entities and dates:{{#each this.entities}}
{{this.entity}}: {{this.time}}{{/each}}
Reasoning: {{this.reasoning}}
Anachronism: {{this.answer}}
---
{{~/each}}

{{~! place the real question at the end }}
Sentence: {{input}}
Entities and dates:
{{gen "entities"}}
Reasoning:{{gen "reasoning"}}
Anachronism:{{#select "answer"}} Yes{{or}} No{{/select}}''')

# execute the program
out = structure_program(
    examples=examples,
    input='The T-rex bit my dog'
)

All of the generated program variables are now available in the executed program object:

out["answer"]

' Yes'

We compute accuracy on the validation set, and compare it to using the same two-shot examples above without the output structure, as well as to the best reported result here. The results below agree with existing literature, in that even a very simple output structure drastically improves performance, even compared against much larger models.

Model | Accuracy
Few-shot learning with guidance examples, no CoT output structure | 63.04%
PaLM (3-shot) | Around 69%
Guidance | 76.01%

Guaranteeing valid syntax JSON example (notebook)

Large language models are great at generating useful outputs, but they are not great at guaranteeing that those outputs follow a specific format. This can cause problems when we want to use the outputs of a language model as input to another system. For example, if we want to use a language model to generate a JSON object, we need to make sure that the output is valid JSON. With guidance we can both accelerate inference speed and ensure that generated JSON is always valid. Below we generate a random character profile for a game with perfect syntax every time:

# load a model locally (we use LLaMA here)
guidance.llm = guidance.llms.Transformers("your_local_path/llama-7b", device=0)

# we can pre-define valid option sets
valid_weapons = ["sword", "axe", "mace", "spear", "bow", "crossbow"]

# define the prompt
program = guidance("""The following is a character profile for an RPG game in JSON format.
```json
{
    "description": "{{description}}",
    "name": "{{gen 'name'}}",
    "age": {{gen 'age' pattern='[0-9]+' stop=','}},
    "armor": "{{#select 'armor'}}leather{{or}}chainmail{{or}}plate{{/select}}",
    "weapon": "{{select 'weapon' options=valid_weapons}}",
    "class": "{{gen 'class'}}",
    "mantra": "{{gen 'mantra'}}",
    "strength": {{gen 'strength' pattern='[0-9]+' stop=','}},
    "items": [{{#geneach 'items' num_iterations=3}}
        "{{gen 'this'}}",{{/geneach}}
    ]
}```""")

# execute the prompt
out = program(description="A quick and nimble fighter.", valid_weapons=valid_weapons)

# and we also have a valid Python dictionary
out.variables()
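If the generated variables are passed on to another system, a quick sanity check on the dictionary can be useful. The sketch below assumes the variables come back as a plain dict of strings; `check_profile` and the sample values are hypothetical and are not output from a real run.

```python
import json
import re

def check_profile(variables, valid_weapons):
    """Return a list of problems found in a dict of generated variables."""
    problems = []
    for key in ("age", "strength"):
        # same pattern the template passes to gen via pattern='[0-9]+'
        if not re.fullmatch(r"[0-9]+", str(variables.get(key, ""))):
            problems.append(f"{key} is not an integer string")
    if variables.get("weapon") not in valid_weapons:
        problems.append("weapon is not in valid_weapons")
    # round-trip through json to confirm the values serialize cleanly
    json.loads(json.dumps(variables))
    return problems

valid_weapons = ["sword", "axe", "mace", "spear", "bow", "crossbow"]
sample = {"name": "Aria", "age": "27", "weapon": "bow", "strength": "12"}
print(check_profile(sample, valid_weapons))
```

An empty list means every constrained field matches the format the template asked for.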

Role-based chat model example (notebook)

Modern chat-style models like ChatGPT and Alpaca are trained with special tokens that mark out "roles" for different areas of the prompt. Guidance supports these models through role tags that automatically map to the correct tokens or API calls for the current LLM. Below we show how a role-based guidance program enables simple multi-step reasoning and planning.

import guidance
import re

# we use GPT-4 here, but you could use gpt-3.5-turbo as well
guidance.llm = guidance.llms.OpenAI("gpt-4")

# a custom function we will call in the guidance program
def parse_best(prosandcons, options):
    best = int(re.findall(r'Best=(\d+)', prosandcons)[0])
    return options[best]

# define the guidance program using role tags (like `{{#system}}...{{/system}}`)
create_plan = guidance('''
{{#system~}}
You are a helpful assistant.
{{~/system}}

{{! generate five potential ways to accomplish a goal }}
{{#block hidden=True}}
{{#user~}}
I want to {{goal}}.
{{~! generate potential options ~}}
Can you please generate one option for how to accomplish this?
Please make the option very short, at most one line.
{{~/user}}

{{#assistant~}}
{{gen 'options' n=5 temperature=1.0 max_tokens=500}}
{{~/assistant}}
{{/block}}

{{! generate pros and cons for each option and select the best option }}
{{#block hidden=True}}
{{#user~}}
I want to {{goal}}.

Can you please comment on the pros and cons of each of the following options, and then pick the best option?
---{{#each options}}
Option {{@index}}: {{this}}{{/each}}
---
Please discuss each option very briefly (one line for pros, one for cons), and end by saying Best=X, where X is the best option.
{{~/user}}

{{#assistant~}}
{{gen 'prosandcons' temperature=0.0 max_tokens=500}}
{{~/assistant}}
{{/block}}

{{! generate a plan to accomplish the chosen option }}
{{#user~}}
I want to {{goal}}.
{{~! Create a plan }}
Here is my plan:
{{parse_best prosandcons options}}
Please elaborate on this plan, and tell me how to best accomplish it.
{{~/user}}

{{#assistant~}}
{{gen 'plan' max_tokens=500}}
{{~/assistant}}''')

# execute the program for a specific goal
out = create_plan(
    goal='read more books',
    parse_best=parse_best # a custom Python function we call in the program
)

This prompt/program is a bit more complicated, but we are basically going through 3 steps:

  1. Generate a few options for how to accomplish the goal. Note that we generate with n=5, such that each option is a separate generation (and is not impacted by the other options). We set temperature=1 to encourage diversity.
  2. Generate pros and cons for each option, and select the best one. We set temperature=0 to encourage the model to be more precise.
  3. Generate a plan for the best option, and ask the model to elaborate on it. Notice that steps 1 and 2 were hidden, which means GPT-4 does not see them when generating content that comes later (in this case, that means when generating the plan). This is a simple way to make the model focus on the current step.

Since steps 1 and 2 are hidden, they do not appear on the generated output (except briefly during stream), but we can print the variables that these steps generated:

print('\n'.join(['Option %d: %s' % (i, x) for i, x in enumerate(out['options'])]))

Option 0: Set a goal to read for 20 minutes every day before bedtime.
Option 1: Join a book club for increased motivation and accountability.
Option 2: Set a daily goal to read for 20 minutes.
Option 3: Set a daily reminder to read for at least 20 minutes.
Option 4: Set a daily goal to read at least one chapter or 20 pages.

print(out['prosandcons'])

Option 0:
Pros: Establishes a consistent reading routine.
Cons: May not be suitable for those with varying schedules.
---
Option 1:
Pros: Provides social motivation and accountability.
Cons: May not align with personal reading preferences.
---
Option 2:
Pros: Encourages daily reading habit.
Cons: Lacks a specific time frame, which may lead to procrastination.
---
Option 3:
Pros: Acts as a daily reminder to prioritize reading.
Cons: May become repetitive and easy to ignore.
---
Option 4:
Pros: Sets a clear daily reading target.
Cons: May be difficult to achieve on busy days or with longer chapters.

Best=0
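The `parse_best` helper above raises an exception if the model forgets the `Best=X` marker or names an option that does not exist. A more defensive variant might look like the sketch below; `parse_best_safe` is a hypothetical name, and reading the last marker rather than the first is a design choice based on the prompt asking the model to end with `Best=X`.

```python
import re

def parse_best_safe(prosandcons, options):
    """Return the option named by the last 'Best=N' marker, or None."""
    matches = re.findall(r"Best=(\d+)", prosandcons)
    if not matches:
        return None
    index = int(matches[-1])
    return options[index] if 0 <= index < len(options) else None

options = [
    "Set a goal to read for 20 minutes every day before bedtime.",
    "Join a book club for increased motivation and accountability.",
]
chosen = parse_best_safe("Option 0:\nPros: ...\nCons: ...\n\nBest=0", options)
print(chosen)
```

Returning `None` lets the calling code decide whether to retry the generation or fall back to a default option.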

Agents (notebook)

We can easily build agents that talk to each other or to a user, via the await command. The await command allows us to pause execution and return a partially executed guidance program. By putting await in a loop, that partially executed program can then be called again and again to form a dialog (or any other structure you design). For example, here is how we might get GPT-4 to simulate two agents talking to one another:

import guidance
import re
guidance.llm = guidance.llms.OpenAI("gpt-4")
role_simulator = guidance('''
{{#system~}}
You are a helpful assistant
{{~/system}}
                                           
{{#user~}}
You will answer the user as {{role}} in the following conversation. At every step, I will provide you with the user input, as well as a comment reminding you of your instructions. Never talk about the fact that you are an AI, even if the user asks you. Always answer as {{role}}.
{{#if first_question}}You can also start the conversation.{{/if}}
{{~/user}}
                                           
{{~! The assistant either starts the conversation or not, depending on if this is the first or second agent }}
{{#assistant~}}
Ok, I will follow these instructions.
{{#if first_question}}Let me start the conversation now:
{{role}}: {{first_question}}{{/if}}
{{~/assistant}}

{{~! Then the conversation unrolls }}
{{~#geneach 'conversation' stop=False}}
{{#user~}}
User: {{set 'this.input' (await 'input')}}
Comment: Remember, answer as a {{role}}. Start your utterance with {{role}}:
{{~/user}}

{{#assistant~}}
{{gen 'this.response' temperature=0 max_tokens=300}}
{{~/assistant}}
{{~/geneach}}''')

republican = role_simulator(role='Republican', await_missing=True)
democrat = role_simulator(role='Democrat', await_missing=True)

first_question = '''What do you think is the best way to stop inflation?'''
republican = republican(input=first_question, first_question=None)
democrat = democrat(input=republican["conversation"][-2]["response"].replace('Republican: ', ''), first_question=first_question)
for i in range(2):
    republican = republican(input=democrat["conversation"][-2]["response"].replace('Democrat: ', ''))
    democrat = democrat(input=republican["conversation"][-2]["response"].replace('Republican: ', ''))
print('Democrat: ' + first_question)
for x in democrat['conversation'][:-1]:
    print('Republican:', x['input'])
    print()
    print(x['response'])

Democrat: What do you think is the best way to stop inflation?

Republican: The best way to stop inflation is by implementing sound fiscal policies, such as reducing government spending, lowering taxes, and promoting economic growth. Additionally, the Federal Reserve should focus on maintaining a stable monetary policy to control inflation.

Democrat: I agree that sound fiscal policies are important in controlling inflation. As a Democrat, I would emphasize the importance of investing in education, healthcare, and infrastructure to promote long-term economic growth. Additionally, we should ensure that the Federal Reserve maintains a balanced approach to monetary policy, focusing on both controlling inflation and promoting full employment.

Republican: While investing in education, healthcare, and infrastructure is important, we must also prioritize reducing the national debt and limiting government intervention in the economy. By lowering taxes and reducing regulations, we can encourage businesses to grow and create jobs, which will ultimately lead to long-term economic growth. As for the Federal Reserve, it's crucial to maintain a stable monetary policy that primarily focuses on controlling inflation, as this will create a more predictable economic environment for businesses and consumers.

Democrat: While reducing the national debt and limiting government intervention are valid concerns, Democrats believe that strategic investments in education, healthcare, and infrastructure can lead to long-term economic growth and job creation. We also support a progressive tax system that ensures everyone pays their fair share, which can help fund these investments. As for the Federal Reserve, we believe that a balanced approach to monetary policy, focusing on both controlling inflation and promoting full employment, is essential for a healthy economy. We must strike a balance between fiscal responsibility and investing in our nation's future.

Republican: It's important to find a balance between fiscal responsibility and investing in our nation's future. However, we believe that the best way to achieve long-term economic growth and job creation is through free-market principles, such as lower taxes and reduced regulations. This approach encourages businesses to expand and innovate, leading to a more prosperous economy. A progressive tax system can sometimes discourage growth and investment, so we advocate for a simpler, fairer tax system that promotes economic growth. Regarding the Federal Reserve, while promoting full employment is important, we must not lose sight of the primary goal of controlling inflation to maintain a stable and predictable economic environment.

Democrat: I understand your perspective on free-market principles, but Democrats believe that a certain level of government intervention is necessary to ensure a fair and equitable economy. We support a progressive tax system to reduce income inequality and provide essential services to those in need. Additionally, we believe that regulations are important to protect consumers, workers, and the environment. As for the Federal Reserve, we agree that controlling inflation is crucial, but we also believe that promoting full employment should be a priority. By finding a balance between these goals, we can create a more inclusive and prosperous economy for all Americans.

GPT4 + Bing

Last example here.

API reference

All of the examples below are in this notebook.

Template syntax

The template syntax is based on Handlebars, with a few additions.
When guidance is called, it returns a Program:

prompt = guidance('''What is {{example}}?''')
prompt

What is {{example}}?

The program can be executed by passing in arguments:

prompt(example='truth')

What is truth?

Arguments can be iterables:

people = ['John', 'Mary', 'Bob', 'Alice']
ideas = [{'name': 'truth', 'description': 'the state of being the case'},
         {'name': 'love', 'description': 'a strong feeling of affection'},]
prompt = guidance('''List of people:
{{#each people}}- {{this}}
{{~! This is a comment. The ~ removes adjacent whitespace either before or after a tag, depending on where you place it}}
{{/each~}}
List of ideas:
{{#each ideas}}{{this.name}}: {{this.description}}
{{/each}}''')
prompt(people=people, ideas=ideas)

Notice the special ~ character after {{/each}}.
This can be added before or after any tag to remove all adjacent whitespace. Notice also the comment syntax: {{! This is a comment }}.
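The effect of the ~ marker can be approximated with two regular expressions. This is only a sketch of the whitespace rule, not the parser Guidance uses, and `apply_whitespace_control` is a hypothetical helper name.

```python
import re

def apply_whitespace_control(template):
    """Strip whitespace next to tags marked with ~ and drop the marker."""
    # {{~tag}} removes all whitespace before the tag
    template = re.sub(r"\s+\{\{~", "{{", template)
    # {{tag~}} removes all whitespace after the tag
    template = re.sub(r"~\}\}\s+", "}}", template)
    # drop any markers that had no adjacent whitespace to remove
    return template.replace("{{~", "{{").replace("~}}", "}}")

before = apply_whitespace_control("a \n{{~x}} b")
after = apply_whitespace_control("{{/each~}}\nList of ideas:")
print(repr(before), repr(after))
```

This is why the newline after {{/each~}} in the example above does not produce a blank line before "List of ideas:".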

You can also include prompts/programs inside other prompts; e.g., here is how you could rewrite the prompt above:

prompt1 = guidance('''List of people:
{{#each people}}- {{this}}
{{/each~}}''')
prompt2 = guidance('''{{>prompt1}}
List of ideas:
{{#each ideas}}{{this.name}}: {{this.description}}
{{/each}}''')
prompt2(prompt1=prompt1, people=people, ideas=ideas)

Generation

Basic generation

The gen tag is used to generate text. You can use whatever arguments are supported by the underlying model. Executing a program runs its gen tags and fills in the generated text:

import guidance
# Set the default llm. Could also pass a different one as argument to guidance(), with guidance(llm=...)
guidance.llm = guidance.llms.OpenAI("text-davinci-003")
prompt = guidance('''The best thing about the beach is {{~gen 'best' temperature=0.7 max_tokens=7}}''')
prompt = prompt()
prompt

guidance caches all OpenAI generations with the same arguments. If you want to flush the cache, you can call guidance.llms.OpenAI.cache.clear().
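Argument-keyed caching of this kind can be sketched in a few lines. The class below is a toy illustration and makes no claim about how Guidance stores its cache; `GenerationCache` and `fake_llm` are hypothetical stand-ins.

```python
import json

class GenerationCache:
    """Toy cache keyed on the full set of call arguments."""

    def __init__(self, generate_fn):
        self._generate_fn = generate_fn
        self._store = {}
        self.calls = 0  # how many times the backend was actually invoked

    def __call__(self, **kwargs):
        key = json.dumps(kwargs, sort_keys=True)
        if key not in self._store:
            self.calls += 1
            self._store[key] = self._generate_fn(**kwargs)
        return self._store[key]

    def clear(self):
        self._store.clear()

def fake_llm(prompt, temperature=0.0):
    # stand-in for a real model call
    return prompt.upper()

cached = GenerationCache(fake_llm)
cached(prompt="hi", temperature=0.7)
cached(prompt="hi", temperature=0.7)  # identical arguments: served from cache
cached(prompt="hi", temperature=0.0)  # different arguments: new backend call
print(cached.calls)
```

Because identical arguments always return the stored result, repeated runs with temperature above zero will not produce fresh samples until the cache is cleared.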

Selecting

You can select from a list of options using the select tag:

prompt = guidance('''Is the following sentence offensive? Please answer with a single word, either "Yes", "No", or "Maybe".
Sentence: {{example}}
Answer:{{#select "answer" logprobs='logprobs'}} Yes{{or}} No{{or}} Maybe{{/select}}''')
prompt = prompt(example='I hate tacos')
prompt

prompt['logprobs']

{' Yes': -1.5689583, ' No': -7.332395, ' Maybe': -0.23746304}
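The values are natural-log probabilities, so exponentiating them gives the probability assigned to each option. The short sketch below uses the numbers shown above and only the standard library.

```python
import math

logprobs = {' Yes': -1.5689583, ' No': -7.332395, ' Maybe': -0.23746304}

# convert each log probability back to a probability
probs = {option: math.exp(lp) for option, lp in logprobs.items()}

# the selected option is the one with the highest probability
best = max(probs, key=probs.get)
total = sum(probs.values())
print(best, round(total, 3))
```

Here the probabilities sum to roughly 1 across the three options.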

Sequences of generate/select

A prompt may contain multiple generations or selections, which will be executed in order:

prompt = guidance('''Generate a response to the following email:
{{email}}.
Response:{{gen "response"}}

Is the response above offensive in any way? Please answer with a single word, either "Yes" or "No".
Answer:{{#select "answer" logprobs='logprobs'}} Yes{{or}} No{{/select}}''')
prompt = prompt(email='I hate tacos')
prompt

prompt['response'], prompt['answer']

(" That's too bad! Tacos are one of my favorite meals.", ' No')

Hidden generation

You can generate text without displaying it or using it in the subsequent generations using the hidden tag, either in a block or in a gen tag:

prompt = guidance('''{{#block hidden=True}}Generate a response to the following email:
{{email}}.
Response:{{gen "response"}}{{/block}}
I will show you an email and a response, and you will tell me if it's offensive.
Email: {{email}}.
Response: {{response}}
Is the response above offensive in any way? Please answer with a single word, either "Yes" or "No".
Answer:{{#select "answer" logprobs='logprobs'}} Yes{{or}} No{{/select}}''')
prompt = prompt(email='I hate tacos')
prompt

Notice that nothing inside the hidden block shows up in the output (or was used by the select), even though we used the response generated variable in the subsequent generation.

Generate with n>1

If you use n>1, the variable will contain a list (there is a visualization that lets you navigate the list, too):

prompt = guidance('''The best thing about the beach is {{~gen 'best' n=3 temperature=0.7 max_tokens=7}}''')
prompt = prompt()
prompt['best']

[' that it is a great place to', ' being able to relax in the sun', " that it's a great place to"]

Calling functions

You can call any Python function using generated variables as arguments. The function will be called when the prompt is executed:

def aggregate(best):
    return '\n'.join(['- ' + x for x in best])
prompt = guidance('''The best thing about the beach is {{~gen 'best' n=3 temperature=0.7 max_tokens=7 hidden=True}}
{{aggregate best}}''')
prompt = prompt(aggregate=aggregate)
prompt

Pausing execution with await

An await tag will stop program execution until that variable is provided:

prompt = guidance('''Generate a response to the following email:
{{email}}.
Response:{{gen "response"}}
{{await 'instruction'}}
{{gen 'updated_response'}}''', stream=True)
prompt = prompt(email='Hello there')
prompt

Notice how the last gen is not executed because it depends on instruction. Let's provide instruction now:

prompt = prompt(instruction='Please translate the response above to Portuguese.')
prompt

The program is now executed all the way to the end.
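The pause-and-resume behavior can be mimicked with a tiny template filler that stops at the first variable it has not been given. This is an analogy rather than Guidance's mechanism, and `run_until_missing` is a hypothetical helper that handles only plain `{{name}}` placeholders.

```python
import re

def run_until_missing(template, **variables):
    """Fill {{name}} placeholders in order; stop at the first missing one.

    Returns (text produced so far, unexecuted remainder of the template).
    """
    out, pos = [], 0
    for m in re.finditer(r"\{\{(\w+)\}\}", template):
        name = m.group(1)
        if name not in variables:
            # paused: hand back the finished text and the rest of the template
            return "".join(out) + template[pos:m.start()], template[m.start():]
        out.append(template[pos:m.start()] + str(variables[name]))
        pos = m.end()
    return "".join(out) + template[pos:], ""

done, rest = run_until_missing("Email: {{email}}\nNote: {{instruction}}", email="Hello there")
final, leftover = run_until_missing(done + rest, instruction="Translate to Portuguese.")
print(repr(done), repr(rest), repr(final))
```

Calling the function again with the missing variable picks up where the first call stopped, much like calling a partially executed program a second time.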

Notebook functions

Echo, stream. TODO @SCOTT

Chat (see also this notebook)

If you use an OpenAI LLM that only allows for ChatCompletion (gpt-3.5-turbo or gpt-4), you can use the special tags {{#system}}, {{#user}}, and {{#assistant}}:

prompt = guidance(
'''{{#system~}}
You are a helpful assistant.
{{~/system}}
{{#user~}}
{{conversation_question}}
{{~/user}}
{{#assistant~}}
{{gen 'response'}}
{{~/assistant}}''')
prompt = prompt(conversation_question='What is the meaning of life?')
prompt
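For API-based chat models, role-tagged text ultimately has to become a list of role/content messages. The sketch below shows one way such a mapping could work; it is an assumption-laden illustration, not the library's parser, and `to_messages` is a hypothetical helper that ignores nested tags and generation commands.

```python
import re

# matches {{#role~}} ... {{~/role}} blocks, with or without the ~ markers
ROLE_BLOCK = re.compile(
    r"\{\{#(system|user|assistant)~?\}\}(.*?)\{\{~?/\1\}\}", re.DOTALL
)

def to_messages(rendered):
    """Convert role-tagged text into a list of chat-style message dicts."""
    return [
        {"role": role, "content": content.strip()}
        for role, content in ROLE_BLOCK.findall(rendered)
    ]

text = """{{#system~}}
You are a helpful assistant.
{{~/system}}
{{#user~}}
What is the meaning of life?
{{~/user}}"""
messages = to_messages(text)
print(messages)
```

Each block becomes one message, in the order the blocks appear in the program.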

Since partial completions are not allowed, you can't really use output structure inside an assistant block, but you can still set up a structure outside of it. Here is an example (also in here):

experts = guidance(
'''{{#system~}}
You are a helpful assistant.
{{~/system}}
{{#user~}}
I want a response to the following question:
{{query}}
Who are 3 world-class experts (past or present) who would be great at answering this?
Please don't answer the question or comment on it yet.
{{~/user}}
{{#assistant~}}
{{gen 'experts' temperature=0 max_tokens=300}}
{{~/assistant}}
{{#user~}}
Great, now please answer the question as if these experts had collaborated in writing a joint anonymous answer.
In other words, their identity is not revealed, nor is the fact that there is a panel of experts answering the question.
If the experts would disagree, just present their different positions as alternatives in the answer itself (e.g., 'some might argue... others might argue...').
Please start your answer with ANSWER:
{{~/user}}
{{#assistant~}}
{{gen 'answer' temperature=0 max_tokens=500}}
{{~/assistant}}''')
experts(query='What is the meaning of life?')

You can still use hidden blocks if you want to hide some of the conversation history for following generations:

prompt = guidance(
'''{{#system~}}
You are a helpful assistant.
{{~/system}}
{{#block hidden=True~}}
{{#user~}}
Please tell me a joke
{{~/user}}
{{#assistant~}}
{{gen 'joke'}}
{{~/assistant}}
{{~/block~}}
{{#user~}}
Is the following joke funny? Why or why not?
{{joke}}
{{~/user}}
{{#assistant~}}
{{gen 'funny'}}
{{~/assistant}}''')
prompt()

Agents with geneach

You can combine the await tag with geneach (which generates a list) to create an agent easily:

prompt = guidance(
'''{{#system~}}
You are a helpful assistant
{{~/system}}
{{~#geneach 'conversation' stop=False}}
{{#user~}}
{{set 'this.user_text' (await 'user_text')}}
{{~/user}}
{{#assistant~}}
{{gen 'this.ai_text' temperature=0 max_tokens=300}}
{{~/assistant}}
{{~/geneach}}''')
prompt = prompt(user_text='hi there')
prompt

Notice how the next iteration of the conversation is still templated, and how the conversation list has a placeholder as the last element:

prompt['conversation']

[{'user_text': 'hi there', 'ai_text': 'Hello! How can I help you today? If you have any questions or need assistance, feel free to ask.'}, {}]

We can then execute the prompt again, and it will generate the next round:

prompt = prompt(user_text='What is the meaning of life?')
prompt

See a more elaborate example here.

Using tools

See the 'Using a search API' example in this notebook.
