NLP with Claude and Ruby

Posted on 8 July 2024
Claude with Ruby - Article 3

Welcome to the 3rd tutorial in the Claude with Ruby series.

In this guide we'll explore three key Natural Language Processing techniques: Sentiment Analysis, Named Entity Recognition, and Text Summarization. I'll show you how to perform each of them using Claude, Ruby, and the Anthropic API.

Natural Language Processing, or NLP for short, covers a set of techniques for understanding text data and extracting insights from it, which makes it useful in fields like customer service, content creation, and data analysis.

Claude is a powerful large language model that can perform a variety of natural language processing tasks. Using the Anthropic API and the claude-ruby gem, we can build these NLP tasks into our own applications.

Sentiment Analysis

Sentiment Analysis is the process of determining the emotional tone behind a series of words. It helps identify and extract subjective information from text, which is particularly useful for understanding customer opinions, feedback, and social media sentiment.

Some Applications of Sentiment Analysis are:

  • Analyzing customer reviews to understand product feedback.
  • Monitoring social media to gauge public opinion on a topic.
  • Evaluating the sentiment of news articles or blog posts.

Using Claude for Sentiment Analysis

Claude can be used to classify the sentiment of text as positive, negative, or neutral.

Before we do that let's refactor the AiAgent class from our earlier tutorials to make it more expandable.

We'll change our previous AiAgent class into a module that has a Base class and a Claude class. Then we'll move the previous methods out of AiAgent into the new Claude class, and in the Base class we'll add methods that we expect the subclasses to override.

This is going to allow us to create AiAgents for different providers in the future.

# services/ai_agent.rb
module AiAgent
  class Base
    attr_accessor :agent

    def initialize
      # be sure to set agent in the subclass initialize method
    end

    def client
      raise NotImplementedError, "Subclasses must implement the client method"
    end

    def send_messages(messages, options)
      raise NotImplementedError, "Subclasses must implement the send_messages method"
    end

    def format_response(response)
      raise NotImplementedError, "Subclasses must implement the format_response method"
    end
  end
end

# services/ai_agent/claude.rb
require "claude/client"

module AiAgent
  CLAUDE = 'claude'.freeze

  class Claude < Base
    def initialize
      super
      self.agent = CLAUDE
    end

    def client
      claude if agent == CLAUDE
    end

    def send_messages(messages, options)
      client.messages(messages, options)
    end

    def format_response(response)
      client.parse_response(response) rescue "ERROR: Couldn't extract text from Claude response"
    end

    private

    def claude
      @claude ||= ::Claude::Client.new(anthropic_api_key)
    end

    def anthropic_api_key
      @anthropic_api_key ||= ENV['ANTHROPIC_API_KEY']
    end
  end
end
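
To illustrate how the Base class leaves room for other providers, here's a minimal sketch of a second subclass. The Stub class is hypothetical and not part of the series code: it returns a canned reply instead of calling an API, which can be handy for trying things out offline. The Base class is repeated here in condensed form only so that the snippet runs on its own.

```ruby
# Condensed stand-in for AiAgent::Base so this snippet runs on its own.
module AiAgent
  class Base
    attr_accessor :agent

    def send_messages(messages, options)
      raise NotImplementedError, "Subclasses must implement the send_messages method"
    end

    def format_response(response)
      raise NotImplementedError, "Subclasses must implement the format_response method"
    end
  end

  # Hypothetical offline provider: returns a canned reply instead of calling an API.
  class Stub < Base
    def initialize(reply)
      super()
      self.agent = 'stub'
      @reply = reply
    end

    # Mimics a provider call by wrapping the canned reply in a hash.
    def send_messages(messages, options)
      { 'messages' => messages, 'options' => options, 'text' => @reply }
    end

    # Pulls the reply text back out of the hash built above.
    def format_response(response)
      response['text']
    end
  end
end

agent = AiAgent::Stub.new('positive')
response = agent.send_messages([{ role: 'user', content: 'Great product!' }], { max_tokens: 2 })
puts agent.format_response(response)
```

Because the stub implements the same two methods as the Claude class, any code written against the Base interface should work with either one.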

Now let's look at how to perform sentiment analysis with Claude and Ruby.

Example 1: Analyzing the Sentiment of a Customer Review

Let's say we have a customer review and we want to know whether it has a positive, neutral or negative sentiment.

In order for Claude to classify the sentiment we'll have to pass it a prompt something like this:

Analyze the sentiment of the following text and classify it as positive, negative, or neutral:\n\n#{text}\n\nSentiment:

To send this prompt to Claude, we'll create an analyze_sentiment method on our AiAgent Base class. That method prepares the prompt and any options, and then calls send_messages, which is implemented by our Claude class, a subclass of Base.

def analyze_sentiment(text, strict: true)
  prompt = "Analyze the sentiment of the following text and classify it as positive, negative, or neutral:\n\n#{text}\n\nSentiment: "
  if strict
    # Strict mode: ask for the bare label and cap the reply at a couple of tokens.
    system = "If you are asked to return a word, then return only that word with no preamble or postamble. "
    max_tokens = 2
  else
    # Relaxed mode: no system prompt, and room for Claude to explain its answer.
    system = nil
    max_tokens = 100
  end

  send_messages([ { role: 'user', content: prompt } ],
                { system: system, max_tokens: max_tokens })
end

The method also takes an optional parameter called "strict", which directs Claude to return only the sentiment label. When analysing a lot of data it's best to use strict mode, but for general usage or debugging, relaxed mode gives some hints about how Claude came to its answer.

Here's the set of commands we'll use to perform the sentiment analysis:

ai_agent = AiAgent::Claude.new
review = "The product quality is excellent and the customer service was very helpful!"
response = ai_agent.analyze_sentiment(review)
ai_agent.format_response(response)

And Claude will return:

positive

Example 2: Analyse the abstract of a technical paper

Similar to before, we'll run:

ai_agent = AiAgent::Claude.new
abstract = "This paper presents a new algorithm for deep learning, achieving state-of-the-art results in image recognition tasks. The proposed method outperforms existing techniques in both accuracy and computational efficiency."
response = ai_agent.analyze_sentiment(abstract, strict: true)
ai_agent.format_response(response)

And Claude will return:

positive

Or for an example of the same analysis using relaxed mode:

response = ai_agent.analyze_sentiment(abstract, strict: false)
ai_agent.format_response(response)

We'll get something like this:

Sentiment: Positive\n\nAnalysis:\nThe text presents information about a new algorithm in a favorable light, highlighting its achievements and advantages. Key elements that contribute to the positive sentiment include:\n\n1. \"achieving state-of-the-art results\": This phrase indicates exceptional performance, which is a positive outcome.\n\n2. \"outperforms existing techniques\": Directly states that the new method is superior to current methods, which is a positive comparison.\n\n
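
Relaxed mode wraps the label in an explanation, so if you still want the bare label you'll need to pull it out yourself. Here's a rough sketch of one way to do that. The extract_sentiment helper is hypothetical, and it simply assumes the first occurrence of one of the three words is the label, which won't hold for every reply.

```ruby
# Hypothetical helper: find the first positive/negative/neutral word in a reply.
# Returns the lower-cased label, or nil when none of the three words appears.
def extract_sentiment(reply)
  match = reply.match(/\b(positive|negative|neutral)\b/i)
  match && match[1].downcase
end

relaxed_reply = "Sentiment: Positive\n\nAnalysis:\nThe text presents information about a new algorithm in a favorable light."
puts extract_sentiment(relaxed_reply)
```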

Example 3: Analyse an array of short Social Media Posts

For one more example of sentiment analysis, we'll analyse an array of short social media posts, and return an array of hashes, with each hash containing the post and its sentiment:

posts = [
   "I love the new update!",
   "The latest feature is very disappointing.",
   "I'm indifferent about the changes."
]

ai_agent = AiAgent::Claude.new
results = posts.map do |post|
   response = ai_agent.analyze_sentiment(post)
   { post: post, sentiment: ai_agent.format_response(response).strip }
end
pp results

To which we get:

[{:post=>"I love the new update!", :sentiment=>"positive"},
 {:post=>"The latest feature is very disappointing.", :sentiment=>"Negative"},
 {:post=>"I'm indifferent about the changes.", :sentiment=>"neutral"}]

Which looks pretty accurate. Claude has classified each post correctly, although notice that the capitalization of the labels isn't consistent ("Negative" versus "positive").
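
If you plan to count or compare the labels, it may be worth normalizing them first, since the case can vary from one reply to the next. The following is a minimal sketch: normalize_sentiment and VALID_SENTIMENTS are hypothetical names, and the results array is copied from the output above rather than fetched from Claude.

```ruby
VALID_SENTIMENTS = %w[positive negative neutral].freeze

# Hypothetical helper: trim and lower-case a label, returning nil if it
# is not one of the three expected words.
def normalize_sentiment(label)
  cleaned = label.to_s.strip.downcase.delete_suffix('.')
  VALID_SENTIMENTS.include?(cleaned) ? cleaned : nil
end

# Copied from the output shown above.
results = [
  { post: "I love the new update!", sentiment: "positive" },
  { post: "The latest feature is very disappointing.", sentiment: "Negative" },
  { post: "I'm indifferent about the changes.", sentiment: "neutral" }
]

normalized = results.map { |row| row.merge(sentiment: normalize_sentiment(row[:sentiment])) }

# Count how many posts fall under each label (Array#tally needs Ruby 2.7+).
counts = normalized.map { |row| row[:sentiment] }.tally
pp counts
```

Returning nil for anything unexpected makes it easy to spot replies that need a second look, rather than silently counting them under a new label.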

Named Entity Recognition

Another use of NLP is Named Entity Recognition, or NER for short.

Named Entity Recognition is the process of identifying and classifying named entities in text into predefined categories such as names of people, organizations, locations, dates, etc.

Some Applications of NER are:

  • Extracting key information from news articles.
  • Identifying entities in legal documents.
  • Enhancing search engine accuracy by recognizing specific entities.

Using Claude for NER

Here's how we can use Claude to identify and classify named entities in a given text.

Example: Extracting Entities from a News Headline

Let's say we're working on a project that needs to extract named entities from news headlines so that some statistical analysis can be done for certain entities over time.

If we have a headline like this:

Apple Inc. announced its latest iPhone in Cupertino, California.

Then we'll need a prompt for Claude to instruct it to extract the named entities for us. Something like:

Identify and list the named entities in the following text:\n\n#{text}\n\nEntities:

As in our earlier examples, we'll create a new method on AiAgent which will handle sending the prompt and options to Claude to get back the response that we need. We'll call it recognize_entities.

We'll generate the prompt, along with the input text (which is a news headline in our example), and like before, we'll create a "strict" version that aims to minimise any waffle in the response from Claude.

def recognize_entities(text, strict: true)
  prompt = "Identify and list the named entities in the following text:\n\n#{text}\n\nEntities: "
  if strict
    system = "Be specific in your answer, with no preamble or postamble. If you are asked to list some names, then return only a list of those names, nothing else. "
    max_tokens = 100
  else
    max_tokens = 500
  end

  send_messages([ { 'role': 'user', 'content': prompt } ],
                { system: system, max_tokens: max_tokens })
end

Now to call this method we can do the following:

ai_agent = AiAgent::Claude.new
headline = "Apple Inc. announced its latest iPhone in Cupertino, California."
response = ai_agent.recognize_entities(headline, strict: true)
puts ai_agent.format_response(response)

And we get back a list of named entities from Claude:

  1. Apple Inc.
  2. iPhone
  3. Cupertino
  4. California

You could do something similar for an entire news article, or an entirely different domain such as scanning legal documents.
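
For the statistical analysis mentioned earlier, you'd probably want the entities as a Ruby array rather than a block of text. Here's a small sketch of how the list could be parsed. The parse_entity_list helper is hypothetical, and it assumes Claude replies with one entity per line, optionally prefixed with a number or a bullet, which may not always be the case.

```ruby
# Hypothetical helper: turn a numbered or bulleted list into an array of names.
def parse_entity_list(text)
  text.lines
      .map { |line| line.sub(/\A\s*(?:\d+[.)]|[-*•])\s*/, '').strip } # drop "1." or "-" markers
      .reject(&:empty?)                                               # skip blank lines
end

reply = "1. Apple Inc.\n2. iPhone\n3. Cupertino\n4. California"
entities = parse_entity_list(reply)
pp entities
```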

Text Summarization

As you'd expect, Text Summarization is the process of condensing a piece of text into its essential points, making it shorter while retaining its core meaning.

Some Applications of Text Summarization are:

  • Summarizing long research papers or articles
  • Creating concise summaries of lengthy emails or reports
  • Providing brief overviews of books or documents

Using Claude for Text Summarization

Claude is great at processing long texts and summarizing them into something shorter, making them easier to understand. Here's how:

Again, we'll need a prompt for Claude, and a method on AiAgent to prepare the prompt and pass it on to Claude. The prompt is pretty simple this time, just something like, "Summarize the following text".

We'll put that into a method called summarize_text, again with an optional "strict" version.

def summarize_text(text, strict: true)
  prompt = "Summarize the following text:\n\n#{text}\n\nSummary: "
  if strict
    system = "Be specific in your answer, with no preamble or postamble. I.e. return only what the user asks for, nothing else. "
    max_tokens = 100
  else
    max_tokens = 500
  end

  send_messages([ { 'role': 'user', 'content': prompt } ],
                { system: system, max_tokens: max_tokens })
end

Now it's a simple matter of calling the summarize_text method with the long text that we want to summarize.

Example: Summarizing a Research Paper

In this example we'll get Claude to summarize a two-paragraph research paper abstract into just a couple of sentences or so:

This paper introduces a groundbreaking algorithm designed to enhance deep learning capabilities, particularly in the realm of image recognition. The innovative approach leverages advanced neural network architectures and novel training techniques to achieve unprecedented levels of accuracy. By integrating these cutting-edge methods, the algorithm significantly improves the performance of image recognition systems, enabling more precise and reliable identification of objects within various datasets. This advancement not only sets a new benchmark for the field but also opens up new possibilities for applications in areas such as medical imaging, autonomous vehicles, and security systems.

Furthermore, the proposed method demonstrates remarkable computational efficiency, surpassing current state-of-the-art techniques. Through optimized processing algorithms and effective utilization of computational resources, the new algorithm achieves faster training times and reduced energy consumption. This efficiency is particularly beneficial for large-scale applications where processing power and speed are critical. The results of extensive testing and validation show that this method consistently outperforms existing solutions, making it a promising tool for researchers and practitioners seeking to push the boundaries of what is possible in deep learning and image recognition.

We'll use the "strict" form of the summarize_text method like so:

ai_agent = AiAgent::Claude.new
abstract = "This paper introduces a groundbreaking algorithm designed to enhance deep learning capabilities, particularly in the realm of image recognition. The innovative approach leverages advanced neural network architectures and novel training techniques to achieve unprecedented levels of accuracy. By integrating these cutting-edge methods, the algorithm significantly improves the performance of image recognition systems, enabling more precise and reliable identification of objects within various datasets. This advancement not only sets a new benchmark for the field but also opens up new possibilities for applications in areas such as medical imaging, autonomous vehicles, and security systems. Furthermore, the proposed method demonstrates remarkable computational efficiency, surpassing current state-of-the-art techniques. Through optimized processing algorithms and effective utilization of computational resources, the new algorithm achieves faster training times and reduced energy consumption. This efficiency is particularly beneficial for large-scale applications where processing power and speed are critical. The results of extensive testing and validation show that this method consistently outperforms existing solutions, making it a promising tool for researchers and practitioners seeking to push the boundaries of what is possible in deep learning and image recognition."
response = ai_agent.summarize_text(abstract, strict: true)
ai_agent.format_response(response)

And we get a much shorter, summarized version of the abstract, which is easier to read:

A new algorithm for deep learning in image recognition has been developed, offering superior accuracy and efficiency. It utilizes advanced neural networks and novel training techniques, significantly improving object identification in various fields. The algorithm demonstrates faster training times, reduced energy consumption, and consistently outperforms existing solutions, making it a promising tool for advancing deep learning and image recognition applications.
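
One thing to watch with a low max_tokens value is that a longer summary can get cut off mid-sentence. The Anthropic Messages API reports a stop_reason of "max_tokens" when that happens, so a check along these lines might help. This is only a sketch: truncated? is a hypothetical helper, and it assumes the response you get back is the parsed JSON body as a Hash with string keys, which you should confirm for your version of the gem. The two sample hashes are made up for illustration.

```ruby
# Hypothetical helper: true when the reply stopped because it hit max_tokens.
# Assumes `response` is the parsed JSON body as a Hash with string keys.
def truncated?(response)
  response.is_a?(Hash) && response['stop_reason'] == 'max_tokens'
end

# Made-up sample responses, shaped like a Messages API reply.
complete = { 'stop_reason' => 'end_turn',
             'content' => [{ 'type' => 'text', 'text' => 'A short summary.' }] }
cut_off  = { 'stop_reason' => 'max_tokens',
             'content' => [{ 'type' => 'text', 'text' => 'A summary that stops mid' }] }

puts truncated?(complete)
puts truncated?(cut_off)
```

If a summary does turn out to be truncated, you could retry with a higher max_tokens or fall back to the relaxed mode.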

In this tutorial, we've explored how to use Claude for the common NLP tasks of Sentiment Analysis, Named Entity Recognition, and Text Summarization.

These techniques can greatly enhance your ability to process and understand large volumes of text data. Experiment with these techniques and play around with the example code to discover new ways to apply them in your projects.

Examples from the video and article are available on GitHub:
https://github.com/devgabcom/claude103

Bye for now, and join us next time!
