How to Build an AI Chatbot With React and Node.js: Complete Beginner Guide

Most tutorials on building AI chatbots show you how to call an API and display a response in a text box. That is not a chatbot — that is a form with an API call attached to it.

A real chatbot maintains context across messages, handles errors gracefully, streams responses so users are not staring at a blank screen for eight seconds, and does not send your API key to every client browser in the world.

This guide builds a complete AI chatbot from scratch — React frontend, Node.js backend, OpenAI integration, streaming responses, and the conversation memory that makes it feel like an actual conversation.

AI chatbot interface on computer screen

Why the Architecture Matters Before You Write a Line of Code

The most common beginner mistake is calling the OpenAI API directly from React:

// NEVER do this in production
const response = await fetch('https://api.openai.com/v1/chat/completions', {
  headers: { 'Authorization': `Bearer ${process.env.REACT_APP_OPENAI_KEY}` }
})

When you prefix a variable with REACT_APP_, it gets baked into the JavaScript bundle served to every browser. Anyone who opens your page and views the source can find your API key. Automated bots scrape for these keys and use them to run API calls at your expense. This has cost developers thousands of dollars in hours.

The backend exists to keep your API key on the server. Every request from React goes to your Node.js server, which adds the API key server-side and calls OpenAI. Users never see anything except the response.

The second reason for a separate backend: conversation history. OpenAI's chat API is stateless — it does not remember previous messages. To maintain conversation context, you need to send the full conversation history with every request. The server is the right place to manage this history.

Project Structure

Set this up before writing any code:

chatbot-app/
├── client/          ← React frontend (Vite)
│   ├── src/
│   │   ├── components/
│   │   │   ├── ChatWindow.jsx
│   │   │   ├── MessageBubble.jsx
│   │   │   └── InputBar.jsx
│   │   ├── hooks/
│   │   │   └── useChat.js
│   │   └── App.jsx
│   └── package.json
├── server/          ← Node.js backend
│   ├── routes/
│   │   └── chat.js
│   ├── services/
│   │   └── openai.js
│   ├── middleware/
│   │   └── rateLimit.js
│   ├── server.js
│   └── package.json
└── .gitignore       ← Include .env files

Server: Node.js with Express

Install dependencies:

cd server
npm init -y
npm install express openai dotenv cors express-rate-limit

server/services/openai.js — the OpenAI service:

import OpenAI from 'openai';

const openai = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
});

export async function streamChatResponse(messages, res) {
  const stream = await openai.chat.completions.create({
    model: 'gpt-4o',
    messages: [
      {
        role: 'system',
        content: 'You are a helpful assistant. Be concise and clear in your responses.'
      },
      ...messages,
    ],
    max_tokens: 1000,
    stream: true,
  });
  
  let fullResponse = '';
  
  for await (const chunk of stream) {
    const token = chunk.choices[0]?.delta?.content || '';
    if (token) {
      fullResponse += token;
      res.write(`data: ${JSON.stringify({ token })}\n\n`);
    }
  }
  
  res.write(`data: ${JSON.stringify({ done: true })}\n\n`);
  return fullResponse;
}

server/routes/chat.js — the chat route:

import express from 'express';
import { streamChatResponse } from '../services/openai.js';

const router = express.Router();

// Validate conversation history
function validateMessages(messages) {
  if (!Array.isArray(messages)) return false;
  if (messages.length > 50) return false; // Prevent excessively long histories
  
  return messages.every(msg =>
    typeof msg.role === 'string' &&
    ['user', 'assistant'].includes(msg.role) &&
    typeof msg.content === 'string' &&
    msg.content.length > 0 &&
    msg.content.length <= 4000
  );
}

router.post('/stream', async (req, res) => {
  const { messages } = req.body;
  
  if (!validateMessages(messages)) {
    return res.status(400).json({ error: 'Invalid messages format' });
  }
  
  // SSE headers
  res.setHeader('Content-Type', 'text/event-stream');
  res.setHeader('Cache-Control', 'no-cache, no-transform');
  res.setHeader('X-Accel-Buffering', 'no');
  res.setHeader('Connection', 'keep-alive');
  
  // Handle client disconnect
  req.on('close', () => {
    // Connection closed by client — nothing to clean up in this stateless setup
  });
  
  try {
    await streamChatResponse(messages, res);
  } catch (err) {
    if (err.status === 429) {
      res.write(`data: ${JSON.stringify({ error: 'Rate limit reached — please try again in a moment' })}\n\n`);
    } else {
      res.write(`data: ${JSON.stringify({ error: 'Something went wrong — please try again' })}\n\n`);
    }
  } finally {
    res.end();
  }
});

export default router;

server/server.js:

import express from 'express';
import cors from 'cors';
import dotenv from 'dotenv';
import rateLimit from 'express-rate-limit';
import chatRouter from './routes/chat.js';

dotenv.config();

const app = express();

app.use(cors({ origin: process.env.CLIENT_URL || 'http://localhost:5173' }));
app.use(express.json({ limit: '100kb' }));

// Rate limiting: 20 requests per minute per IP
const limiter = rateLimit({
  windowMs: 60 * 1000,
  max: 20,
  message: { error: 'Too many requests — please slow down' },
});

app.use('/api/chat', limiter);
app.use('/api/chat', chatRouter);

app.listen(3001, () => console.log('Server running on port 3001'));

React and Node.js development setup

Client: React with Vite

client/src/hooks/useChat.js — the chat logic hook:

import { useState, useCallback } from 'react';

export function useChat() {
  const [messages, setMessages] = useState([]);
  const [isStreaming, setIsStreaming] = useState(false);
  const [error, setError] = useState(null);
  
  const sendMessage = useCallback(async (userMessage) => {
    if (isStreaming || !userMessage.trim()) return;
    
    setError(null);
    
    // Optimistically add user message
    const userMsg = { role: 'user', content: userMessage };
    const updatedMessages = [...messages, userMsg];
    setMessages(updatedMessages);
    
    // Add empty assistant message that will be filled by streaming
    const assistantMsg = { role: 'assistant', content: '' };
    setMessages([...updatedMessages, assistantMsg]);
    setIsStreaming(true);
    
    try {
      const response = await fetch('http://localhost:3001/api/chat/stream', {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({ messages: updatedMessages }),
      });
      
      if (!response.ok) {
        throw new Error(`HTTP error: ${response.status}`);
      }
      
      const reader = response.body.getReader();
      const decoder = new TextDecoder();
      let buffer = '';
      
      while (true) {
        const { done, value } = await reader.read();
        if (done) break;
        
        buffer += decoder.decode(value, { stream: true });
        const lines = buffer.split('\n');
        buffer = lines.pop() || '';
        
        for (const line of lines) {
          if (!line.startsWith('data: ')) continue;
          const data = JSON.parse(line.slice(6));
          
          if (data.done) {
            setIsStreaming(false);
            return;
          }
          
          if (data.error) {
            setError(data.error);
            // Remove the empty assistant message on error
            setMessages(prev => prev.slice(0, -1));
            setIsStreaming(false);
            return;
          }
          
          if (data.token) {
            // Append token to the last message (the streaming assistant message)
            setMessages(prev => {
              const updated = [...prev];
              updated[updated.length - 1] = {
                ...updated[updated.length - 1],
                content: updated[updated.length - 1].content + data.token,
              };
              return updated;
            });
          }
        }
      }
    } catch (err) {
      setError('Connection error — please check your network and try again');
      setMessages(prev => prev.slice(0, -1)); // Remove empty assistant message
    } finally {
      setIsStreaming(false);
    }
  }, [messages, isStreaming]);
  
  const clearChat = useCallback(() => {
    setMessages([]);
    setError(null);
  }, []);
  
  return { messages, isStreaming, error, sendMessage, clearChat };
}

client/src/components/ChatWindow.jsx:

import { useState } from 'react';
import { useChat } from '../hooks/useChat';

export function ChatWindow() {
  const { messages, isStreaming, error, sendMessage, clearChat } = useChat();
  const [input, setInput] = useState('');
  
  const handleSubmit = (e) => {
    e.preventDefault();
    if (!input.trim() || isStreaming) return;
    sendMessage(input);
    setInput('');
  };
  
  return (
    <div className="chat-container">
      <div className="messages">
        {messages.map((msg, i) => (
          <div key={i} className={`message ${msg.role}`}>
            <span className="role">{msg.role === 'user' ? 'You' : 'AI'}</span>
            <p>{msg.content}</p>
          </div>
        ))}
        {isStreaming && messages[messages.length - 1]?.content === '' && (
          <div className="typing-indicator">AI is thinking...</div>
        )}
      </div>
      
      {error && (
        <div className="error-banner">{error}</div>
      )}
      
      <form onSubmit={handleSubmit} className="input-form">
        <input
          value={input}
          onChange={(e) => setInput(e.target.value)}
          placeholder="Type a message..."
          disabled={isStreaming}
          maxLength={4000}
        />
        <button type="submit" disabled={isStreaming || !input.trim()}>
          {isStreaming ? 'Sending...' : 'Send'}
        </button>
      </form>
    </div>
  );
}

What This Architecture Gets Right

API key security: The key lives in server/.env and never reaches the browser.

Conversation memory: The messages array is sent to the server on every request. OpenAI receives the full context and can reference previous turns.

Streaming: Users see the first token within 400ms instead of waiting 5–8 seconds for a complete response.

Error handling: Rate limit errors, network failures, and API errors all produce meaningful messages to the user, not silent failures.

Input validation: The server validates message format and length before calling the LLM API. Invalid or malicious inputs are rejected before they consume API credits.

Rate limiting: 20 requests per minute per IP prevents a single user from exhausting your API quota.

This is the baseline architecture. From here, you can add authentication to associate conversations with users, persistent storage to save conversation history across sessions, and more sophisticated system prompts to specialise the assistant's behaviour.

The feature that took your first tutorial five minutes will take this architecture a few days. The difference is the feature that survives production.