
Build a scalable and up-to-date generative AI chatbot with Amazon Bedrock and Confluent Cloud for business loan specialists


In this post, we demonstrate how to build a robust, scalable generative artificial intelligence (GenAI) chatbot using Amazon Bedrock and Confluent Cloud. We walk through the architecture and implementation of the chatbot and show how it uses Confluent's real-time event streaming capabilities, together with Amazon's managed infrastructure, to stay continually up to date with the latest information.

In the rapidly evolving sphere of AI, building intelligent chatbots that seamlessly integrate into our daily lives is challenging. As businesses strive to remain at the forefront of innovation, the demand for scalable and current conversational AI solutions has become more critical than ever. The fusion of cutting-edge platforms is crucial to building a chatbot that not only understands but also adapts to human interaction. Real-time data plays a pivotal role in achieving the responsiveness and relevance of these chatbots: the ability to harness and analyze data in real time lets a chatbot stay abreast of the latest trends, user preferences, and contextual information, enabling more accurate, personalized, and timely responses.

In the world of virtual banking, the role of a business loan specialist has transcended routine transactions. Imagine a scenario where a specialist is equipped with a generative AI chatbot, a digital companion that not only comprehends the intricacies of a bank's diverse product offerings but also possesses a nuanced understanding of each client's unique profile. For the specialist, the major challenge lies in engaging in meaningful and well-informed discussions with clients. The challenge extends beyond transactional exchanges, emphasizing the importance of comprehending the client's business intricacies, identifying their financial needs, and strategically aligning them with a diverse array of available banking products.

The ongoing challenge is consistently achieving this depth of engagement, making sure each interaction contributes not just to a one-time transaction but to a long-term, mutually beneficial financial partnership. This demands not only financial acumen but also effective communication skills to navigate the unique nuances of each client's business requirements. Despite the many benefits of generative AI chatbots in the lending industry, lenders struggle to effectively implement and integrate these technologies into their existing systems and workflows, leading to missed opportunities to better serve customers, higher costs, and inefficiencies. An AI-driven solution serves as a virtual guide, empowering specialists to navigate the complexities of financial discussions with unparalleled acumen.

Solution overview

What sets this generative AI chatbot apart is its ability to seamlessly integrate client-specific data. Picture a scenario where the AI is not only aware of the client's name and business particulars, but also possesses insights into their credit score. This client-centric approach transforms each interaction into a secure, private, personalized, and insightful dialogue, laying the foundation for a more meaningful connection.

As the conversation progresses and aligns with the client's financial needs, the generative AI chatbot takes on a pivotal role. Not only does it provide information, it also generates a pre-approval form. This automated process streamlines the workflow, collecting all mandatory information needed for the approval process. The pre-approval form is subsequently prepared and queued for examination by authorized bank personnel with access to client data.

The architecture of this generative AI chatbot consists of two major components: Confluent Cloud and Amazon Bedrock.

Confluent Cloud is a cloud-native data streaming platform that keeps data fresh in real time and supports the microservices paradigm. With Apache Kafka® as its foundation, Confluent Cloud orchestrates the flow of information between the various components. This real-time streaming capability lets the generative AI agent stay abreast of the latest updates, so client interactions are not just informed but reflect the latest information.

The microservices architecture enabled by Confluent Cloud breaks down the monolithic structure into modular, independently deployable components. Each microservice handles specific tasks, fostering agility and scalability. This architecture not only enhances the maintainability of the system but also allows for seamless updates and additions, making sure the generative AI chatbot remains at the forefront of technological innovation.

Amazon Bedrock is a fully managed service that offers a choice of high-performing foundation models (FMs) from leading AI companies such as AI21 Labs, Anthropic, Cohere, Meta, Stability AI, and Amazon through a single API. To power this application, an Anthropic chat-oriented model is used. This FM is capable of text summarization and data protection, which are at the core of this AI solution: it comprehends user queries, generates contextually rich responses, and makes sure sensitive data is securely handled. The Anthropic models hosted on Amazon Bedrock provide a robust foundation for the chatbot's ability to engage in informed and secure conversations.
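As a rough illustration of what a request to such a model looks like, the following self-contained sketch assembles the JSON body for an Anthropic text-completion model on Bedrock. The `ClaudePayload` class, its helper names, and the default parameters are assumptions for illustration only; in a real service this string would be passed as the request body to the Bedrock runtime's InvokeModel operation.

```java
// Illustrative sketch (not from the post): builds the request body for an
// Anthropic text-completion model on Amazon Bedrock. The class name, helper
// names, and parameter defaults are assumptions for illustration.
public class ClaudePayload {

    // Anthropic's text-completion format expects the prompt wrapped in
    // "\n\nHuman: ... \n\nAssistant:" turns.
    public static String buildClaudeBody(String userPrompt, int maxTokens) {
        String wrapped = "\\n\\nHuman: " + escape(userPrompt) + "\\n\\nAssistant:";
        return "{\"prompt\":\"" + wrapped + "\","
                + "\"max_tokens_to_sample\":" + maxTokens + ","
                + "\"temperature\":0.5}";
    }

    // Escape backslashes and quotes so the prompt stays valid inside a
    // JSON string literal.
    private static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }
}
```

A real call would hand this body to the AWS SDK's Bedrock runtime client; the sketch stops at the payload so it stays self-contained.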

The following diagram illustrates the solution architecture.

The use of Confluent Cloud with Amazon Bedrock creates an elastic architecture that scales dynamically in response to varying workloads. This makes the generative AI chatbot a flexible and adaptive solution that can absorb higher client demand or evolving banking requirements without compromising performance or responsiveness.


Prerequisites

The architecture described in this post requires the following:

  • Java 11 or higher

  • Access to Confluent Cloud

  • Access to Amazon Bedrock

  • Familiarity with Apache Kafka and microservices architecture

Chatbot conversation processing

The chatbot conversation processing microservice consists of four essential components: the product cache, prompt cache, summary cache, and user cache. Each of these is powered by KCache. At its core, the KStream API orchestrates real-time responses, enabling near-instantaneous interactions and scalable infrastructure tailored to the demands of the digital landscape. The implementation details are provided in the following code:

public class ChatProcessing implements Processor<String, ChatInput, String, Discussion> {

    private final PromptProvider promptProvider;
    private final BedrockService bedrockService;
    private final DiscussionsCache discussionsCache;
    private ProcessorContext<String, Discussion> context;

    public ChatProcessing(PromptProvider promptProvider,
                          BedrockService bedrockService,
                          DiscussionsCache discussionsCache) {
        this.promptProvider = promptProvider;
        this.bedrockService = bedrockService;
        this.discussionsCache = discussionsCache;
    }

    @Override
    public void init(ProcessorContext<String, Discussion> context) {
        this.context = context;
    }

    @Override
    public void process(Record<String, ChatInput> record) {
        log.info("Processing chat input: {}", record.value().getSession_id());
        final ChatInput chatInput = record.value();
        final String prompt = promptProvider.getPrompt(chatInput.getUser_id(), chatInput.getSession_id());
        final String discussionsKey = chatInput.getUser_id();
        if (StringUtils.isEmpty(discussionsKey)) {
            log.error("No user id...");
            return;
        }

        // Reuse the cached history for this user, or start a new one
        Discussions discussions = discussionsCache.containsKey(discussionsKey)
                ? discussionsCache.get(discussionsKey)
                : new Discussions(discussionsKey, chatInput.getSession_id());
        if (!StringUtils.equals(discussions.getSessionId(), chatInput.getSession_id())) {
            // New session, new discussions
            discussions = new Discussions(discussionsKey, chatInput.getSession_id());
        }

        // Inject the conversation history into the prompt template
        final String promptWithDiscussion = prompt.replace("{history}", discussions.toString());

        // Call Bedrock
        final Map<String, Object> response = bedrockService.chat(promptWithDiscussion, chatInput.getInput());
        final String output = (String) response.get("completion");
        final ChatOutput chatOutput = new ChatOutput(chatInput.getSession_id(), chatInput.getUser_id(), output);
        final Discussion discussion = new Discussion(chatInput, chatOutput);

        // Append the new exchange and update the cache (Discussions#add assumed)
        discussions.add(discussion);
        discussionsCache.put(discussionsKey, discussions);

        // Forward to downstream processing or topics
        context.forward(new Record<>(chatInput.getSession_id(), discussion, record.timestamp()));
        log.info("Chat output forwarded: {}", chatOutput.getSession_id());
    }
}


The chatbot processor encapsulates the essence of real-time processing. By using KCache, it intertwines dynamic cache updates with the processing logic, making sure user requests are met with responses grounded in the latest data.
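To see that cache-then-compute pattern in isolation, here is a simplified sketch in which a `ConcurrentHashMap` stands in for the Kafka-backed KCache and `Discussions` is reduced to the two fields the session logic needs. All names here are stand-ins for illustration, not the article's actual classes.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Simplified stand-in for the processor's cache handling: a ConcurrentHashMap
// replaces the Kafka-backed KCache, and Discussions is reduced to the two
// fields the lookup logic needs (user id and current session id).
public class DiscussionLookup {

    public static class Discussions {
        final String userId;
        final String sessionId;

        Discussions(String userId, String sessionId) {
            this.userId = userId;
            this.sessionId = sessionId;
        }

        public String getSessionId() {
            return sessionId;
        }
    }

    private final Map<String, Discussions> cache = new ConcurrentHashMap<>();

    // Reuse the cached history only while the session id still matches;
    // otherwise start a fresh Discussions for the new session.
    public Discussions getOrReset(String userId, String sessionId) {
        Discussions d = cache.get(userId);
        if (d == null || !d.getSessionId().equals(sessionId)) {
            d = new Discussions(userId, sessionId);
            cache.put(userId, d);
        }
        return d;
    }
}
```

In the real system the cache is replicated through a Kafka topic, so every instance of the microservice converges on the same history; the in-memory map above only mirrors the lookup-and-reset decision.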

Orchestrating the approval process

In the generative AI chatbot's architecture, the pre-approval microservice is the component responsible for crafting the JSON document required for the client's approval. It is built on Amazon Bedrock and is invoked by a specific tag configured in the chatbot microservice, which signals that the pre-approval document should be created.

The following is an example of the prompt:

            Extract mandatory information from the conversation in json format.

            ### Instructions
            - The required information are First name, Last Name, Address, Date of Birth, Citizenship, Credit Score, SSN and selected product index.
            - Return only the json result

            ### Conversation

The following is the sample Java code:

        final String newSessionId = (value != null) ? value.getSessionId() : null;
        final String oldSessionId = (oldValue != null) ? oldValue.getSessionId() : null;

        // Pick the discussion that just completed: the old one when the
        // session changed, otherwise the current one
        final Discussion lastDiscussion;
        if (!StringUtils.equals(newSessionId, oldSessionId) && oldValue != null) {
            lastDiscussion = oldValue.getLastEntry();
        } else if (value != null) {
            lastDiscussion = value.getLastEntry();
        } else {
            log.error("No discussions...");
            return;
        }

        // The #EOF# marker signals that the conversation has collected
        // everything needed for the pre-approval form
        final String lastResponse = lastDiscussion.getChat_output().getOutput();
        if (!StringUtils.isEmpty(lastResponse) && lastResponse.endsWith("#EOF#") && value != null) {
            final String prompt = PROMPT.replace("{conversation}", value.toString());
            final Map<String, Object> information = bedrockService.submit(prompt);
            try {
                final Map<String, Object> submission = BedrockService.extractJson(information);
                kafkaTemplate.send("submit", value.getUserId() + value.getSessionId(), submission);
            } catch (JsonProcessingException e) {
                log.error("Error parsing json.", e);
            }
        }

Using a specific prompt, the chatbot microservice signals the pre-approval microservice to generate a JSON document based on the collected information, a culmination of conversations between the generative AI chatbot and the user.
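The `BedrockService.extractJson` helper referenced above is not shown in the post. A minimal stand-in (the class and method below are hypothetical) can locate the JSON object inside the model's free-text reply by finding the outermost braces, before handing the slice to a JSON parser.

```java
// Hypothetical stand-in for the post's BedrockService.extractJson: pull the
// first top-level JSON object out of a model's free-text reply by locating
// the outermost braces. Real code would then parse the slice with a JSON
// library and validate the required fields.
public class JsonExtractor {

    public static String extractJson(String modelReply) {
        int start = modelReply.indexOf('{');
        int end = modelReply.lastIndexOf('}');
        if (start < 0 || end <= start) {
            throw new IllegalArgumentException("No JSON object found in reply");
        }
        return modelReply.substring(start, end + 1);
    }
}
```

This brace-slicing step matters because foundation models often wrap the requested JSON in conversational text ("Here is the extracted information: ..."), even when the prompt asks for JSON only.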


Data masking

In the final component of our generative AI chatbot, the masking microservice takes center stage, orchestrating real-time data masking to safeguard sensitive information. This vital piece of the architecture works with Amazon Bedrock, using carefully crafted prompt instructions to make sure data masking aligns with the bank staff's permissions.

The masking microservice enforces the confidentiality of sensitive data. Its integration with Amazon Bedrock enables real-time masking, driven by specially designed prompt instructions that align with each bank staff member's permissions.

The following is the sample prompt used for data masking:

    - De-identify the json document including date of birth and address.
    - Keep the first and last character, replace other characters with an X.
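The masking rule the prompt describes can also be written down deterministically. The following sketch is only an illustration of that rule (the article performs masking through the Bedrock prompt, not through this code), and `FieldMasker` is a hypothetical name.

```java
// Deterministic illustration of the masking rule from the prompt: keep the
// first and last character of a field and replace everything in between
// with 'X'. Values shorter than three characters are returned unchanged,
// since there is nothing in between to mask.
public class FieldMasker {

    public static String mask(String value) {
        if (value == null || value.length() < 3) {
            return value;
        }
        return value.charAt(0)
                + "X".repeat(value.length() - 2)
                + value.charAt(value.length() - 1);
    }
}
```

For example, a date of birth of `1985-04-12` becomes `1XXXXXXXX2` under this rule, which matches what the prompt asks the model to produce.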

The specially crafted prompt instructions direct Amazon Bedrock to mask sensitive data in real time, in line with the permissions set for bank staff. This integration makes sure confidentiality is not compromised and data remains shielded from unauthorized eyes.

Achieving scalability

In the culmination of our proof of concept, we've embarked on a journey that transcends traditional chatbot paradigms. By seamlessly integrating Confluent Cloud and Amazon Bedrock, we've orchestrated a symphony of intelligence that redefines the realm of virtual banking interactions:

  • Real-time brilliance – The generative AI chatbot, fueled by Confluent Cloud, showcases real-time brilliance in its interactions. Microservices orchestrated with precision and powered by KCache and KStream make sure every conversation is not just informed but dynamically responsive to the evolving needs of clients.

  • Scalability unleashed – Confluent Cloud's elasticity and the adaptive architecture of Amazon Bedrock jointly unleash unprecedented scalability. The generative AI chatbot effortlessly adapts to surges in user interactions, meeting the demands of the dynamic virtual banking landscape with unparalleled efficiency.

  • Amazon Bedrock magic – At the heart of our success lies the magic of Amazon Bedrock. The integration with Anthropic's models empowers the generative AI chatbot with conversational finesse, summarization capabilities, and robust data protection. Amazon Bedrock doesn't just serve as a foundation; it's the bedrock upon which intelligent banking conversations thrive.

  • Security beyond boundaries – With a masking microservice fortified by Amazon Bedrock, we've made sure sensitive data remains a fortress. Real-time data masking, driven by specially crafted prompt instructions, aligns seamlessly with bank staff permissions, setting a new standard for security in virtual banking interactions.


The generative AI chatbot, powered by the synergy of Confluent Cloud and Amazon Bedrock, is a pioneering solution in the realm of intelligent virtual banking. The chatbot delivers real-time responsiveness, scalability, and robust security measures. This paves the way for innovative possibilities in the digital banking landscape.

Next steps

Not yet a Confluent customer? Try Confluent Cloud in AWS Marketplace. New sign-ups receive $1,000* in free credits to spend during their first 30 days. Your credits will be immediately visible on your Confluent account after subscribing to Confluent through the AWS Marketplace.

*Confluent is offering a limited-time promotion: in addition to the standard $400 in free credits, you will receive a $600 bonus, for $1,000 in total. Finish signing up for Confluent Cloud and a $600 promo code will be sent to the email address used to create your account. During the free trial period, you'll have full access to all features, enabling you to build multiple use cases, connect to your databases, and get technical support whenever needed.

Explore the possibilities of building a scalable and up-to-date generative AI chatbot for your virtual banking needs. Contact our team to learn more about integrating Confluent Cloud and Amazon Bedrock.

  • Pascal Vantrepote is a Director of Innovation carving a path in the exciting realm of generative AI. Hailing from Lille, France, he holds a degree in electrical engineering and boasts a diverse background, from software engineering to serving as an enterprise architect in the financial sector. Recognized for his expertise in designing scalable applications, Pascal leads cutting-edge projects with a visionary touch. Beyond the professional sphere, he finds joy in outdoor activities, whether it's conquering ski slopes, hitting the trails for a run, or exploring the beauty of nature through hiking.

  • Mario Bourgoin is a data scientist with broad knowledge of deep learning, machine learning, artificial intelligence, statistics, and computational mathematics. He has worked with Fortune 500 customers to develop product requirements and used those to design and deliver predictive analytics solutions, led teams in the successful delivery of 13 software products and two hardware products, and is an expert at developing algorithms for intelligent solutions, time-series mining, and predictive analytics, including parallel and distributed algorithms.

  • Shruti is a Solutions Architect at Amazon Web Services. She works with several small-medium businesses in the Betting and Gaming industry and helps them accelerate their cloud journey.
