====== AutoComplete ======

<WRAP 516px>
^  AutoComplete  ^^
|{{ hack:auto:autocomplete.webp?500x280 |A screenshot of AutoComplete in action, using the GPT-4 language model. The chat history shows "JohnDoe was blown up by Chicken." and GPT-4 suggests replying with "lol, how did that even happen?".}}||
^Type|Hack|
^Category|[[:Chat]]|
^In-game description|"Auto-completes your chat messages using large language models. Requires either an OpenAI account with API access or a locally installed language model with the oobabooga web UI."|
^[[:keybinds#default_keybinds|Default keybind]]|none|
^Source code|[[w7src>net/wurstclient/hacks/AutoCompleteHack.java]]|
|:::|[[w7src>net/wurstclient/hacks/autocomplete]]|
</WRAP>

AutoComplete is a Minecraft hack that generates auto-completions for the user's chat messages, using large language models like GPT-3, GPT-4 and LLaMA.

===== Settings =====

==== API provider ====
^  API provider  ^^
^Type|Enum|
^In-game description|"**OpenAI** lets you use models like ChatGPT, but requires an account with API access, costs money to use and sends your chat history to their servers. The name is a lie - it's closed source.\\ \\ **oobabooga** lets you use models like LLaMA and many others. It's a true open source alternative to OpenAI that you can run locally on your own computer. It's free to use and does not send your chat history to any servers."|
^Default value|oobabooga|
^Possible values|OpenAI, oobabooga|

The "API provider" setting allows the user to choose between the OpenAI API and a locally running oobabooga web UI instance. AutoComplete needs to be restarted for changes in this setting to take effect.

==== OpenAI model ====
^  OpenAI model  ^^
^Type|Enum|
^In-game description|"The model to use for OpenAI API calls.\\ \\ **Text-Davinci-003** (better known as GPT-3) is an older model that's less censored than ChatGPT, but it's also 10x more expensive to use.\\ \\ **GPT-3.5-Turbo** (better known as ChatGPT) is recommended for most use cases, as it's relatively cheap and powerful.\\ \\ **GPT-4** is more powerful, but only works if OpenAI has chosen you to be a beta tester. It can be anywhere from 15x to 60x more expensive than ChatGPT."|
^Default value|gpt-3.5-turbo|
^Possible values|gpt-3.5-turbo, gpt-3.5-turbo-0613, gpt-3.5-turbo-0301, gpt-3.5-turbo-16k, gpt-3.5-turbo-16k-0613, gpt-4, gpt-4-0613, gpt-4-0314, gpt-4-32k, gpt-4-32k-0613, text-davinci-003, text-davinci-002, text-davinci-001, davinci, text-curie-001, curie, text-babbage-001, babbage, text-ada-001, ada|

==== Max tokens ====
^  Max tokens  ^^
^Type|Slider|
^In-game description|"The maximum number of tokens that the model can generate.\\ \\ Higher values allow the model to predict longer chat messages, but also increase the time it takes to generate predictions.\\ \\ The default value of 16 is fine for most use cases."|
^Default value|16 tokens|
^Minimum|1 token|
^Maximum|100 tokens|
^Increment|1 token|

==== Temperature ====
^  Temperature  ^^
^Type|Slider|
^In-game description|"Controls the model's creativity and randomness. A higher value will result in more creative and sometimes nonsensical completions, while a lower value will result in more boring completions."|
^Default value|0.7|
^Minimum|0|
^Maximum|2|
^Increment|0.01|

Note: Temperature values above 1 will cause most language models to generate complete nonsense and should only be used for comedic effect.

==== Top P ====
^  Top P  ^^
^Type|Slider|
^In-game description|"An alternative to temperature. Makes the model less random by only letting it choose from the most likely tokens.\\ \\ A value of 100% disables this feature by letting the model choose from all tokens."|
^Default value|100%|
^Minimum|0%
^Maximum|100%|
^Increment|1%|

==== Presence penalty ====
^  Presence penalty  ^^
^Type|Slider|
^In-game description|"Penalty for choosing tokens that already appear in the chat history.\\ \\ Positive values encourage the model to use synonyms and talk about different topics. Negative values encourage the model to repeat the same word over and over again.\\ \\ Only works with OpenAI models."|
^Default value|0|
^Minimum|-2|
^Maximum|2|
^Increment|0.01|

==== Frequency penalty ====
^  Frequency penalty  ^^
^Type|Slider|
^In-game description|"Similar to presence penalty, but based on how often the token appears in the chat history.\\ \\ Positive values encourage the model to use synonyms and talk about different topics. Negative values encourage the model to repeat existing chat messages.\\ \\ Only works with OpenAI models."|
^Default value|0.6|
^Minimum|-2|
^Maximum|2|
^Increment|0.01|

==== Repetition penalty ====
^  Repetition penalty  ^^
^Type|Slider|
^In-game description|"Similar to presence penalty, but uses a different algorithm.\\ \\ 1.0 means no penalty, negative values are not possible and 1.5 is the maximum value.\\ \\ Only works with the oobabooga web UI."|
^Default value|1|
^Minimum|1|
^Maximum|1.5|
^Increment|0.01|

==== Encoder repetition penalty ====
^  Encoder repetition penalty  ^^
^Type|Slider|
^In-game description|"Similar to frequency penalty, but uses a different algorithm.\\ \\ 1.0 means no penalty, 0.8 behaves like a negative value and 1.5 is the maximum value.\\ \\ Only works with the oobabooga web UI."|
^Default value|1|
^Minimum|0.8|
^Maximum|1.5|
^Increment|0.01|

==== Stop sequence ====
^  Stop sequence  ^^
^Type|Enum|
^In-game description|"Controls how AutoComplete detects the end of a chat message.\\ \\ **Line Break** is the default value and is recommended for most language models.\\ \\ **Next Message** works better with certain code-optimized language models, which have a tendency to insert line breaks in the middle of a chat message."|
^Default value|Line Break|
^Possible values|Line Break, Next Message|

Note: "certain code-optimized language models" is a reference to OpenAI's ''code-davinci-002'' model, which worked much better when using the "Next Message" option and is unfortunately no longer available. It's possible that open source code models like StarCoder will see a similar improvement when using the "Next Message" option.

==== Context length ====
^  Context length  ^^
^Type|Slider|
^In-game description|"Controls how many messages from the chat history are used to generate predictions.\\ \\ Higher values improve the quality of predictions, but also increase the time it takes to generate them, as well as cost (for OpenAI API users) or RAM usage (for oobabooga users)."|
^Default value|10 messages|
^Minimum|0 (unlimited)|
^Maximum|100 messages|
^Increment|1 message|

==== Filter server messages ====
^  Filter server messages  ^^
^Type|Checkbox|
^In-game description|"Only shows player-made chat messages to the model.\\ \\ This can help you save tokens and get more out of a low context length, but it also means that the model will have no idea about events like players joining, leaving, dying, etc."|
^Default value|not checked|

==== OpenAI chat endpoint ====
^  OpenAI chat endpoint  ^^
^Type|TextField|
^In-game description|"Endpoint for OpenAI's chat completion API."|
^Default value|''https://api.openai.com/v1/chat/completions''|

The "OpenAI chat endpoint" setting allows the user to use OpenAI's chat completion API through a proxy. This is necessary in some countries where OpenAI's APIs are banned.

It may also be useful for Microsoft Azure customers who have their own endpoint, but this has not been tested yet. There are subtle differences in the Azure version of the API, so it's possible that it won't work with AutoComplete.

==== OpenAI legacy endpoint ====
^  OpenAI legacy endpoint  ^^
^Type|TextField|
^In-game description|"Endpoint for OpenAI's legacy completion API."|
^Default value|''https://api.openai.com/v1/completions''|

The "OpenAI legacy endpoint" setting allows the user to use OpenAI's legacy completion API through a proxy. This is necessary in some countries where OpenAI's APIs are banned.

It may also be useful for Microsoft Azure customers who have their own endpoint, but this has not been tested yet. There are subtle differences in the Azure version of the API, so it's possible that it won't work with AutoComplete.

==== Oobabooga endpoint ====
^  Oobabooga endpoint  ^^
^Type|TextField|
^In-game description|"Endpoint for your Oobabooga web UI instance.\\ Remember to start the Oobabooga server with the <color #FF5>--extensions api</color> flag."|
^Default value|''http://127.0.0.1:5000/api/v1/generate''|

The "Oobabooga endpoint" setting allows the user to set a custom endpoint for their Oobabooga web UI instance. This is useful for users who have the Oobabooga web UI running on a different computer than the one they're playing Minecraft on.

By running the Oobabooga web UI on a server, rented from a specialized AI hosting provider, it's possible to use much more powerful language models that would not be possible to run on a gaming computer.

==== Max suggestions per draft ====
^  Max suggestions per draft  ^^
^Type|Slider|
^In-game description|"How many suggestions the AI is allowed to generate for the same draft message.\\ \\ <color #F55>**WARNING:**</color> Higher values can use up a lot of tokens. Definitely limit this to 1 for expensive models like GPT-4."|
^Default value|3|
^Minimum|1|
^Maximum|10|
^Increment|1|

The "Max suggestions per draft" setting controls how many different suggestions the AI will try to generate for the same draft message. Higher values will result in more suggestions, but will also use up more tokens and be more expensive for OpenAI API users. This setting can be useful for exploring different response options.

Setting "Max suggestions per draft" to a higher value than "Max suggestions shown" is usually not a good idea, as there will be no way to see the additional suggestions.

==== Max suggestions kept ====
^  Max suggestions kept  ^^
^Type|Slider|
^In-game description|"Maximum number of suggestions kept in memory."|
^Default value|100 messages|
^Minimum|10 messages|
^Maximum|1000 messages|
^Increment|10 messages|

The "Max suggestions kept" setting only controls at what point old suggestions are deleted from memory. Higher values don't use any additional tokens and only consume a tiny amount of RAM. This is why the range of values is so much higher than for the other settings.

==== Max suggestions shown ====
^  Max suggestions shown  ^^
^Type|Slider|
^In-game description|"How many suggestions can be shown above the chat box.\\ \\ If this is set too high, the suggestions will obscure some of the existing chat messages. How high you can set this depends on your screen resolution and GUI scale."|
^Default value|5|
^Minimum|1|
^Maximum|10|
^Increment|1|

The "Max suggestions shown" setting controls how many suggestions can be shown at once on the screen. Depending on the user's screen resolution and GUI scale, higher values may cause the suggestions to cover up other parts of the UI.

Setting "Max suggestions per draft" to a higher value than "Max suggestions shown" is usually not a good idea, as there will be no way to see the additional suggestions.

===== Changes =====
^Version^Changes^
|[[update:Wurst 7.33]]|Added AutoComplete.|
|[[update:Wurst 7.35]]|Fixed the description of AutoComplete's "Max tokens" setting incorrectly claiming that stop sequences don't work when using the oobabooga web UI.|
|[[update:Wurst 7.36]]|Added "OpenAI chat endpoint", "OpenAI legacy endpoint" and "Oobabooga endpoint" settings to AutoComplete.|
|:::|AutoComplete now supports all of OpenAI's currently available language models.|

{{tag>client-side}}