In the previous guide I showed you how to set up a basic Rails app that interacts with Claude using the claude-ruby gem. But the examples I gave you were very rudimentary - basically just "tell me a joke", then print out the response that Claude gave us.
In this guide we'll expand on that with some more detailed examples.
- Claude-ruby options
  - model
  - temperature
  - max_tokens
- Handling conversation context
Models
The single biggest factor in your Claude API calls is the model you choose to use.
Claude has 3 "classes" of models, ranging from Haiku (the cheapest and fastest), through Sonnet (a balanced model), to Opus (the smartest, but most expensive).
Anthropic progressively releases new model versions, so it's always best to check what's available and choose the best model for your particular use case.
If you don't specify a model with your API call, then the claude-ruby gem will use a default model, which at the time of writing this guide (July 2024) is claude-3-5-sonnet-20240620.
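For example, if you pass an empty options hash, the gem falls back to its default model. A minimal sketch, assuming the client has already been initialized as in the previous guide:
# Empty options hash - the gem falls back to its default model
response = claude.messages(
  claude.user_message("Tell me a joke"),
  {})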
It's important to note here that pricing differs drastically between the models - there's a 60x price difference between Claude Haiku and Claude Opus.
Claude Haiku
Claude Haiku is very cheap at only $0.25 per million input tokens and $1.25 per million output tokens. It's also very fast. This makes it a great choice for realtime chat bots, basic summarization of large amounts of data, quick sentiment analysis of social media posts or news articles, or any job where cost is the main constraint.
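To put those prices in perspective, here's a rough back-of-the-envelope cost estimate in plain Ruby. The rates are the Haiku prices quoted above; the token counts are made-up example numbers:
# Claude Haiku pricing (USD per million tokens)
HAIKU_INPUT_PRICE  = 0.25
HAIKU_OUTPUT_PRICE = 1.25

# Hypothetical monthly usage
input_tokens  = 10_000_000
output_tokens = 2_000_000

cost = (input_tokens  / 1_000_000.0) * HAIKU_INPUT_PRICE +
       (output_tokens / 1_000_000.0) * HAIKU_OUTPUT_PRICE
puts "Estimated monthly cost: $#{cost}"  # => $5.0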
Claude Opus
Claude Opus, on the other hand, is 60 times more expensive than Haiku, and it's also slower. That makes it a poor candidate for chat bot applications, but an excellent choice for drafting blog posts and news articles, analysing complex data sets, or performing complex reasoning or calculations.
Claude Sonnet
Claude Sonnet fits in between Haiku and Opus on cost, intelligence and speed, and is a good general use model for many applications.
But since Claude 3.5 Sonnet is also the newest model (and currently the only 3.5 model), it temporarily has the highest intelligence according to benchmark tests. This will likely change once other 3.5 models are released later in 2024.
With the claude-ruby gem you can specify the model like so:
response = claude.messages(
  claude.user_message(message),
  { model: 'claude-3-haiku-20240307' })
Or you can use one of claude-ruby's predefined constants, such as MODEL_CLAUDE_SMARTEST, MODEL_CLAUDE_BALANCED, or MODEL_CLAUDE_CHEAPEST:
response = claude.messages(
  claude.user_message(message),
  { model: Claude::Client::MODEL_CLAUDE_SMARTEST })
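Whichever model you pick, you'll want to pull the generated text out of the response. Assuming claude.messages returns the parsed Anthropic API response as a hash with string keys (the API returns the generated text inside a content array), something like this should work:
# The API returns content as an array of blocks; for simple text
# requests the generated text lives in the first block
text = response['content'][0]['text']
puts text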
Temperature
Now let's take a look at the temperature option.
This option determines the randomness that's injected into the response. In Claude, the allowed temperatures range from 0.0 to 1.0. The default is 1.0.
A low temperature like 0.0 will give a more deterministic result - which is good for analytical type answers, or for when you're testing out other options and want a consistent baseline to compare against.
A high temperature like 1.0 is better for more creative or generative tasks, like generating blog post ideas, or creative writing.
With the claude-ruby gem the temperature can be specified in the options like this:
response = claude.messages(
  claude.user_message(message),
  { temperature: 0.0 })
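A simple way to see the effect is to run the same prompt a few times at each extreme: at 0.0 the responses should be near-identical, while at 1.0 they'll vary. A quick sketch, assuming the response-parsing shape shown earlier:
prompt = claude.user_message("Suggest a name for a coffee shop")

[0.0, 1.0].each do |temp|
  puts "--- temperature: #{temp} ---"
  3.times do
    response = claude.messages(prompt, { temperature: temp })
    puts response['content'][0]['text']
  end
end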
max_tokens
This option allows you to specify the maximum number of tokens to generate before stopping. Note that Claude may stop earlier, so the response can contain fewer tokens than you specify, but never more.
This can be useful if you want to reduce your API spend (since output tokens are 5 times more expensive than input tokens). It can also be used if you want a more concise result - but note that with a low max_tokens the response may be cut off mid-sentence, so this is better suited to a research context than to presenting live results to an end user.
Here's a claude-ruby example using max_tokens:
response = claude.messages(
  claude.user_message(message),
  { max_tokens: 50 })
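When a response is cut off by the token limit, the API tells you: the stop_reason field comes back as "max_tokens" instead of the usual "end_turn". Assuming the same parsed-hash response shape as above, you can detect truncation like this:
if response['stop_reason'] == 'max_tokens'
  # The reply was truncated mid-stream; handle it, or retry with a higher limit
  puts "Warning: response was cut off at the max_tokens limit"
end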
There are more options that can be explored, but they're only relevant for more specific use cases.
Conversations
In our earlier examples we provided just a single message to Claude and got back a single result.
But Claude allows us to pass in multiple messages for different roles, in a conversational context.
This not only helps guide Claude towards the type of response that we want, but it's also important for building chat bots, because API calls are stateless, with no memory of what's previously been discussed.
Claude supports the roles of "user" and "assistant". System messages are handled separately, via a top-level system parameter rather than a message role.
The Anthropic API guide shows an example with multiple conversational turns, and we can do that with the claude-ruby gem like so:
system_message = "You are a helpful assistant."
messages = [
  { role: "user", content: "Hello there." },
  { role: "assistant", content: "Hi, I'm Claude. How can I help you?" },
  { role: "user", content: "Can you explain LLMs in plain English?" },
]
response = claude.messages(messages, { system: system_message })
We define an array of message hashes, where each one has a "role" and a "content". The roles must alternate between "user" and "assistant", always starting with "user". You can repeat this pattern as many times as required to form the conversation memory, as the chat loop sketch below demonstrates.
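Because the API is stateless, a chat bot has to re-send the entire history on every turn. Here's a minimal console chat loop sketch that does exactly that, appending each user input and each assistant reply to the messages array (again assuming the parsed-hash response shape used earlier):
messages = []

loop do
  print "You: "
  input = gets.to_s.chomp
  break if input.empty?

  messages << { role: "user", content: input }
  response = claude.messages(messages, { system: "You are a helpful assistant." })
  reply = response['content'][0]['text']
  puts "Claude: #{reply}"

  # Append the assistant's reply so the next turn has the full context
  messages << { role: "assistant", content: reply }
end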
Another example using roles is to "put words into Claude's mouth", so to speak.
Here we partially fill in the assistant's response, and Claude will respond accordingly.
messages = [
  { role: "user", content: "Which wine goes best with steak? (A) Sauvignon blanc (B) Cabernet sauvignon (C) Pinot noir (D) Pinot blanc" },
  { role: "assistant", content: "The best answer is (" },
]
response = claude.messages(messages)
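Claude continues from the prefilled text, so the returned completion will typically be just the rest of the sentence, e.g. "B) Cabernet sauvignon". To reconstruct the full answer, concatenate the prefill with the response text (same assumed response shape as before):
answer = "The best answer is (" + response['content'][0]['text']
# => e.g. "The best answer is (B) Cabernet sauvignon"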
That's all for now. Join me in the next guide, where I'll go into detail on some natural language processing tasks with Claude and Ruby.
Examples from the video and article are available on GitHub:
https://github.com/devgabcom/claude102