Wolfram Language & System Documentation Center

"Cohere" (Service Connection)

See Also
- ServiceExecute
- ServiceConnect
- LLMFunction
- LLMSynthesize
- ChatEvaluate
- LLMConfiguration
- Service Connections
- AlephAlpha
- Anthropic
- DeepSeek
- GoogleGemini
- Groq
- MistralAI
- OpenAI
- TogetherAI

This service connection requires LLM access.

Use the Cohere API with the Wolfram Language.

Connecting & Authenticating

ServiceConnect["Cohere"] creates a connection to the Cohere API. If a previously saved connection can be found, it will be used; otherwise, a new authentication request will be launched.

Use of this connection requires internet access and a Cohere account.
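
As a minimal sketch of this flow, the following connects and then checks the connection with the "TestConnection" request (described under Requests) before any other request is sent. The variable names cohere and status are illustrative, and the FailureQ guard is one possible way to branch on the result, not a required pattern.

```wolfram
(* Reuse a saved connection, or start authentication if none is found *)
cohere = ServiceConnect["Cohere"];

(* Only test the connection if ServiceConnect itself did not fail *)
status = If[FailureQ[cohere], cohere, ServiceExecute[cohere, "TestConnection"]];

(* "TestConnection" is documented to return Success or Failure *)
If[FailureQ[status],
  Print["Cohere connection is not working."],
  Print["Cohere connection is ready."]
]
```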

Requests

ServiceExecute["Cohere","request",params] sends a request to the Cohere API using the parameters params. The possible requests are listed below.

Request:

"TestConnection" — returns Success for a working connection, Failure otherwise

Text

Request:

"Completion" — create text completion for a given prompt

Parameters:

"Prompt"	(required)	the prompt for which to generate completions
"MaxTokens"	Automatic	maximum number of tokens to generate
"FrequencyPenalty"	Automatic	penalize tokens based on their existing frequency in the text so far (between -2 and 2)
"Model"	Automatic	name of the model to use
"N"	Automatic	number of completions to return
"PresencePenalty"	Automatic	penalize new tokens based on whether they appear in the text so far
"StopTokens"	Automatic	strings where the API will stop generating further tokens
"Stream"	False	return the result as server-sent events
"Temperature"	Automatic	sampling temperature
"TopProbabilities"	Automatic	sample only among the k highest-probability classes
"TotalProbabilityCutoff"	Automatic	an alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with the requested probability mass

Request:

"Chat" — create a response for the given chat conversation

Parameters:

"Messages"	(required)	a list of messages in the conversation, each given as an association with "Role" and "Content" keys
"MaxTokens"	Automatic	maximum number of tokens to generate
"Model"	Automatic	name of the model to use
"StopTokens"	Automatic	strings where the API will stop generating further tokens
"Stream"	False	return the result as server-sent events
"Temperature"	Automatic	sampling temperature
"Tools"	Automatic	one or more LLMTool objects available to the model
"TopProbabilities"	Automatic	sample only among the k highest-probability classes
"TotalProbabilityCutoff"	Automatic	an alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with the requested probability mass

Request:

"Embedding" — create an embedding vector representing the input text

Parameters:

"Input"	(required)	one text or a list of texts to get embeddings for
"Model"	Automatic	name of the model to use

Model Lists

Request:

"ChatModelList" — list models available for the "Chat" request

Request:

"EmbeddingModelList" — list models available for the "Embedding" request

Examples

Basic Examples (1)

Create a new connection:

Wolfram Language code: cohere = ServiceConnect["Cohere"]

Complete a piece of text:

Wolfram Language code: ServiceExecute[cohere, "Completion", {"Prompt" -> "Hello there!"}]

Generate a response from a chat:

Wolfram Language code: ServiceExecute[cohere, "Chat", {"Messages" -> {<|"Role" -> "User", "Content" -> "Hello there!"|>}}]

Compute the embedding for a sentence:

Wolfram Language code: ServiceExecute[cohere, "Embedding", {"Input" -> "The quick brown fox ..."}]
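
The "Input" parameter also accepts a list of texts. The sketch below requests embeddings for two sentences in one call, reusing the cohere connection created above. The sentences are illustrative, and since the structure of the result is not described on this page, the result is stored and inspected rather than assumed.

```wolfram
texts = {"The quick brown fox ...", "A fast auburn fox ..."};

(* One request for several texts; "Model" is left at its Automatic default *)
embeddings = ServiceExecute[cohere, "Embedding", {"Input" -> texts}];

(* Inspect what kind of expression came back before relying on its shape *)
Head[embeddings]
```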

Scope (5)

Connection (1)

Test the connection:

Wolfram Language code: ServiceExecute["Cohere", "TestConnection"]

Text (4)

Completion (1)

Return multiple completions, limit the number of tokens in each, and specify a stop token:

Wolfram Language code:

ServiceExecute["Cohere", "Completion", {"Prompt" -> "Hello there!", "N" -> 2, "MaxTokens" -> 10, "StopTokens" -> {"Hello"}}]
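
The sampling parameters listed for "Completion" can be combined in the same way. The following sketch asks for a short, low-randomness completion. The prompt and the numeric values are illustrative assumptions, not recommended settings.

```wolfram
ServiceExecute["Cohere", "Completion", {
  "Prompt" -> "Write a one-line greeting.",
  "MaxTokens" -> 20,                 (* cap the length of the completion *)
  "Temperature" -> 0.2,              (* lower values give less random output *)
  "TotalProbabilityCutoff" -> 0.9    (* nucleus sampling over 90% of the probability mass *)
}]
```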

Chat (1)

Respond to a chat containing multiple messages:

Wolfram Language code:

ServiceExecute["Cohere", "Chat", {"Messages" -> {
	<|"Role" -> "User", "Content" -> "What's the tallest mountain?"|>, 
	<|"Role" -> "Assistant", "Content" -> "The highest mountain in the world is Mount Everest."|>, 
	<|"Role" -> "User", "Content" -> "How tall is it?"|>
	}}]
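
For longer conversations it can be convenient to build the "Messages" list programmatically. In the sketch below, toMessages is a hypothetical helper (not part of the service connection) that converts {role, text} pairs into associations with "Role" and "Content" keys. The "MaxTokens" and "Temperature" values are illustrative.

```wolfram
(* Hypothetical helper: convert {role, text} pairs into message associations *)
toMessages[pairs_List] := Map[<|"Role" -> First[#], "Content" -> Last[#]|> &, pairs];

messages = toMessages[{
  {"User", "What's the tallest mountain?"},
  {"Assistant", "The highest mountain in the world is Mount Everest."},
  {"User", "How tall is it?"}
}];

(* Send the assembled conversation, with a cap on the reply length *)
ServiceExecute["Cohere", "Chat", {"Messages" -> messages, "MaxTokens" -> 50, "Temperature" -> 0.3}]
```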

Allow the model to use an LLMTool:

Wolfram Language code: tool = LLMTool[{"countCharacter", "count the number of characters in a string"}, "string", StringLength[#string]&]

Wolfram Language code:

ServiceExecute["Cohere", "Chat", {"Model" -> "command-r", "Messages" -> {<|"Role" -> "User", "Content" -> "Please tell me how long this message is"|>}, "Tools" -> tool}]

ChatModelList (1)

List the available chat models:

Wolfram Language code: ServiceExecute["Cohere", "ChatModelList"]

EmbeddingModelList (1)

List the available embedding models:

Wolfram Language code: ServiceExecute["Cohere", "EmbeddingModelList"]
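
A name obtained from these lists can be passed as the "Model" parameter of the corresponding request. In this sketch the chat model list is retrieved for inspection and a model is then selected by hand. "command-r" is the name used in the LLMTool example, and it is an assumption that it appears in the list for a given account.

```wolfram
(* Retrieve the chat models available to this connection *)
chatModels = ServiceExecute["Cohere", "ChatModelList"]

(* Pass a chosen name explicitly instead of relying on the Automatic default *)
ServiceExecute["Cohere", "Chat", {
  "Model" -> "command-r",
  "Messages" -> {<|"Role" -> "User", "Content" -> "Hello there!"|>}
}]
```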
