ChatGPT API — Getting Started
The ChatGPT web interface at chat.openai.com is a polished product built on top of an API. That same API is available to developers, which means you can embed GPT-4o's capabilities directly into your own applications — a customer support bot for your e-commerce site, an automated document summariser, a coding assistant inside your IDE, or a backend service that processes thousands of pieces of text overnight.
This chapter demystifies the API: what it is, how to get access, how pricing works, and how to make your first calls — both with raw curl and with the Python openai library.
1. API vs Chat Interface — When to Use Which
The chat interface at chat.openai.com and the API both use the same underlying model, but they serve fundamentally different purposes.
| Dimension | Chat Interface | API |
|---|---|---|
| Target user | Non-technical end users | Developers and businesses |
| Interaction | Manual, one at a time | Programmatic, can be automated |
| Customisation | Limited (system prompt via Custom GPTs) | Full control over every parameter |
| Volume | Limited to rate-of-typing | Thousands of requests per hour |
| Integration | Standalone product | Embedded in your application |
| Cost model | Flat subscription (Plus = $20/month) | Pay per token consumed |
| Memory | Built-in conversation history UI | Stateless — you manage history |
Use the chat interface when you are exploring ideas, drafting content, or doing research manually.
Use the API when you want to:
- Automate repetitive text processing tasks
- Build products or internal tools powered by AI
- Process data in bulk (summarising 500 customer reviews, classifying support tickets)
- Integrate AI into an existing codebase
- Control the model's behaviour precisely with system prompts and parameters
2. Getting an API Key
Create an OpenAI Account
If you do not already have one, create an account at platform.openai.com. API billing is separate from any ChatGPT subscription, though you can sign in to both with the same email.
Add a Payment Method
The API is billed on usage — you pay for what you use, not a flat fee. You must add a credit or debit card (international cards work; Visa and Mastercard issued by Indian banks that support international transactions are typically accepted). You can also prepay a specific amount.
OpenAI offers a small amount of free credits for new API accounts, though this changes periodically — check the current offer when you sign up.
Generate Your API Key
- Go to platform.openai.com/api-keys
- Click "Create new secret key"
- Give it a descriptive name (e.g., "meritshot-tutorial-key")
- Copy the key immediately — OpenAI shows it only once
Your key will look like this:
sk-proj-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Protect Your Key
An API key is a credential with billing attached. Treat it like a password:
- Never paste it into code you commit to a public GitHub repository
- Never share it in a chat message, email, or screenshot
- Store it in an environment variable, not hardcoded in source files
- If you suspect a key is compromised, revoke it immediately from the dashboard and generate a new one
The standard practice is to store keys in a local .env file, load that file with a helper such as python-dotenv, and read the value through os.environ in Python:
# .env file (never commit this to version control)
OPENAI_API_KEY=sk-proj-your-key-here
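As a minimal sketch of reading the key defensively, the helper below looks up OPENAI_API_KEY in the environment and fails with a clear message when it is missing. The function name get_api_key is hypothetical, and the sketch assumes the variable has already been exported in the shell or loaded from .env by a tool such as python-dotenv.

```python
import os


def get_api_key(var_name: str = "OPENAI_API_KEY") -> str:
    """Return the API key from the environment, or raise if it is not set."""
    key = os.environ.get(var_name, "").strip()
    if not key:
        raise RuntimeError(
            f"{var_name} is not set. Export it in your shell or add it to .env."
        )
    return key
```

Failing early like this tends to give a clearer error than letting a request go out with an empty key. If you log anything for debugging, log only a short prefix of the key, never the full value.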
3. Understanding the Pricing Model
The API charges you per token. A token is roughly 4 characters of English text, or about 0.75 words. "Meritshot is a great learning platform" is approximately 8 tokens.
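The two rules of thumb above can be turned into quick estimators. This is only a rough sketch: the function names are hypothetical, and exact counts come from the model's own tokenizer (for example via the tiktoken library), not from these heuristics.

```python
import math


def estimate_tokens_by_chars(text: str) -> int:
    """Rough estimate: about 4 characters of English text per token."""
    return math.ceil(len(text) / 4)


def estimate_tokens_by_words(text: str) -> int:
    """Rough estimate: one token is about 0.75 words."""
    return math.ceil(len(text.split()) / 0.75)


sample = "Meritshot is a great learning platform"
print(estimate_tokens_by_chars(sample))  # 10 (38 characters / 4, rounded up)
print(estimate_tokens_by_words(sample))  # 8  (6 words / 0.75)
```

The two heuristics disagree slightly, which is expected. Treat either as a budgeting aid and rely on the usage field in real responses for actual counts.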
Pricing has two components:
- Input tokens — everything you send to the model (system prompt + conversation history + your current message)
- Output tokens — everything the model generates in response
Output tokens are typically more expensive than input tokens. Prices are listed per million tokens (per 1M).
Current Model Pricing Comparison
OpenAI offers several models at different price points. Commonly used options for applications include:
| Model | Input (per 1M tokens) | Output (per 1M tokens) | Best for |
|---|---|---|---|
| gpt-4o-mini | ~$0.15 | ~$0.60 | High-volume tasks, classification, simple Q&A |
| gpt-4o | ~$2.50 | ~$10.00 | Complex reasoning, nuanced generation, coding |
| gpt-4.1-nano | ~$0.10 | ~$0.40 | Very high volume, simple tasks |
| gpt-4.1 | ~$2.00 | ~$8.00 | Balanced performance and cost |
Note: Prices change and new models are released frequently. Always check platform.openai.com/docs/models for the current pricing before budgeting.
Practical Cost Calculation
Imagine you are building a product description generator for an Indian e-commerce seller. Each product description request involves:
- System prompt: 200 tokens
- Product details input: 100 tokens
- Generated description output: 300 tokens
- Total per request: 600 tokens (300 input + 300 output)
Using gpt-4o-mini:
- Input cost: 300 tokens x ($0.15 / 1,000,000) = $0.000045
- Output cost: 300 tokens x ($0.60 / 1,000,000) = $0.00018
- Total per request: approximately $0.000225
At this rate, processing 10,000 product descriptions costs about $2.25, roughly ₹190 at about ₹84 to the dollar. That is dramatically cheaper than hiring a copywriter.
Using gpt-4o for the same task:
- Total per request: approximately $0.00375
- 10,000 descriptions: ~$37.50 (about ₹3,150 at roughly ₹84 to the dollar)
The choice between models depends on how much quality you need. For simple, structured tasks, gpt-4o-mini is usually sufficient.
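The arithmetic above can be wrapped in a small helper so you can re-run it when prices or token counts change. This is a sketch: the function name and the PRICES dictionary are illustrative, and the figures are the approximate per-1M-token prices from the table in this section, which may be out of date.

```python
def request_cost_usd(input_tokens: int, output_tokens: int,
                     input_price_per_m: float, output_price_per_m: float) -> float:
    """Cost of one request in USD, given prices per 1M tokens."""
    return (input_tokens * input_price_per_m
            + output_tokens * output_price_per_m) / 1_000_000


# Approximate prices (input, output) in USD per 1M tokens; check current pricing.
PRICES = {
    "gpt-4o-mini": (0.15, 0.60),
    "gpt-4o": (2.50, 10.00),
}

for model, (input_price, output_price) in PRICES.items():
    per_request = request_cost_usd(300, 300, input_price, output_price)
    print(f"{model}: ${per_request:.6f} per request, "
          f"${per_request * 10_000:.2f} per 10,000 requests")
```

With 300 input and 300 output tokens per request, this reproduces the $2.25 and $37.50 totals worked out above.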
4. Your First API Call with curl
curl is a command-line tool for making HTTP requests. It is available on macOS and Linux by default, and on Windows via WSL or Git Bash. This is the most direct way to see the API in action without any code setup.
Open your terminal and run:
curl https://api.openai.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{
    "model": "gpt-4o-mini",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of Rajasthan?"
      }
    ],
    "max_tokens": 50
  }'
The $OPENAI_API_KEY refers to your environment variable. If you have not set it yet, you can set it temporarily in your terminal session:
export OPENAI_API_KEY="sk-proj-your-key-here"
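To make it clearer what curl is doing, here is a sketch of the same HTTP request built with Python's standard library instead of the openai package. The helper names build_chat_request and send are hypothetical. Building the request is separated from sending it, so you can inspect the URL, headers, and JSON body without spending any tokens.

```python
import json
import urllib.request

API_URL = "https://api.openai.com/v1/chat/completions"


def build_chat_request(api_key: str, prompt: str, model: str = "gpt-4o-mini",
                       max_tokens: int = 50) -> urllib.request.Request:
    """Build (but do not send) the same POST request the curl command makes."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Send the request and decode the JSON reply (needs network and a valid key)."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Calling send(build_chat_request(key, "What is the capital of Rajasthan?")) with a valid key should return the same kind of JSON object that curl prints.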
Understanding the Response
The API returns a JSON object:
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1720000000,
  "model": "gpt-4o-mini",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "The capital of Rajasthan is Jaipur."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 17,
    "completion_tokens": 9,
    "total_tokens": 26
  }
}
The fields you care about most:
- choices[0].message.content: the model's response text
- choices[0].finish_reason: why it stopped (stop = natural end, length = hit the max_tokens limit)
- usage: token counts for billing purposes
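As a small illustration of pulling those fields out of the raw JSON, the sketch below parses a trimmed copy of the sample response. The helper name summarise_response and the keys of the dictionary it returns are hypothetical.

```python
import json

# Trimmed copy of the sample response from this section.
raw = """
{
  "choices": [
    {
      "index": 0,
      "message": {"role": "assistant", "content": "The capital of Rajasthan is Jaipur."},
      "finish_reason": "stop"
    }
  ],
  "usage": {"prompt_tokens": 17, "completion_tokens": 9, "total_tokens": 26}
}
"""


def summarise_response(data: dict) -> dict:
    """Extract the reply text, a truncation flag, and the total token count."""
    choice = data["choices"][0]
    return {
        "text": choice["message"]["content"],
        "truncated": choice["finish_reason"] == "length",
        "total_tokens": data["usage"]["total_tokens"],
    }


summary = summarise_response(json.loads(raw))
print(summary["text"])          # The capital of Rajasthan is Jaipur.
print(summary["truncated"])     # False
print(summary["total_tokens"])  # 26
```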
5. Your First API Call with Python
Python is the most common language for working with the OpenAI API. Install the official library:
pip install openai python-dotenv
Then create a file called first_call.py:
import os

from dotenv import load_dotenv
from openai import OpenAI

# Load API key from .env file
load_dotenv()

# Initialise the client
client = OpenAI(api_key=os.environ.get("OPENAI_API_KEY"))

# Make the API call
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {
            "role": "user",
            "content": "Explain compound interest in two sentences, using a ₹10,000 SIP example."
        }
    ],
    max_tokens=150
)

# Extract and print the response
print(response.choices[0].message.content)
print(f"\nTokens used: {response.usage.total_tokens}")
Run it:
python first_call.py
You should see the model's explanation printed to the terminal, followed by the token count.
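Once the first call works, it can help to wrap it in a reusable function. The sketch below assumes only that the client object exposes chat.completions.create with the arguments used in first_call.py. The function name ask is hypothetical.

```python
from typing import Tuple


def ask(client, prompt: str, model: str = "gpt-4o-mini",
        max_tokens: int = 150) -> Tuple[str, int]:
    """Send one user message and return (reply text, total tokens used)."""
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        max_tokens=max_tokens,
    )
    return response.choices[0].message.content, response.usage.total_tokens
```

Passing the client in as an argument, rather than creating it inside the function, keeps the key handling in one place and lets you substitute a stand-in object when testing without network access. With the real library, ask(OpenAI(), "your question") should work once OPENAI_API_KEY is set, because the client reads that variable by default.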
6. The Messages Array — Understanding Roles
The core data structure of the Chat Completions API is the messages array. Every call to the API takes a list of messages, each with a role and content. This chapter uses three roles:
system
The system message sets the context, persona, and rules for the model. It comes first in the array and shapes how the model interprets everything that follows. If you do not include a system message, the model uses its default helpful-assistant behaviour.
messages = [
    {
        "role": "system",
        "content": "You are a helpful assistant for an Indian tax filing platform. Answer questions about ITR filing clearly and accurately. Always recommend consulting a CA for complex cases."
    }
]
user
The user message is the input from the person (or from your application on the user's behalf). This is the question or instruction.
messages.append({
    "role": "user",
    "content": "I am a freelancer earning ₹8 lakh per year. Which ITR form should I use?"
})
assistant
The assistant message is the model's response. When you are building a multi-turn conversation, you include previous assistant responses in the messages array so the model has context about what it already said.
messages.append({
    "role": "assistant",
    "content": "As a freelancer, you would typically file using ITR-3 or ITR-4 (Sugam), depending on whether you opt for the presumptive taxation scheme under Section 44ADA."
})
A Complete Multi-Turn Example
messages = [
    {"role": "system", "content": "You are a knowledgeable assistant for Indian tax questions."},
    {"role": "user", "content": "Which ITR form should a freelancer use?"},
    {"role": "assistant", "content": "Freelancers typically use ITR-4 (Sugam) under Section 44ADA if their income is under ₹75 lakh."},
    {"role": "user", "content": "What if my income is above ₹75 lakh?"},
]

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=messages
)
print(response.choices[0].message.content)
By including the prior exchange in the messages array, the model understands that "above ₹75 lakh" refers to the freelancer's income from the previous turn. Without that history, it would have no context.
Why the API Is Stateless
The API has no memory between calls. Every request must include the full conversation history you want the model to be aware of. This is different from the chat interface, which maintains history automatically. The implication for application developers is that you are responsible for storing and managing conversation history — which we cover in depth in the next chapter.
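A minimal sketch of what managing history yourself can look like: one function that appends the user's message, sends the whole list, and stores the reply for the next turn. The name chat_turn is hypothetical, and the client is assumed to expose chat.completions.create as in the earlier examples.

```python
def chat_turn(client, history: list, user_text: str,
              model: str = "gpt-4o-mini") -> str:
    """Run one conversation turn, mutating `history` in place."""
    # The new user message joins everything said so far.
    history.append({"role": "user", "content": user_text})

    # The full history goes out with every request, because the API keeps none.
    response = client.chat.completions.create(model=model, messages=history)
    reply = response.choices[0].message.content

    # Store the reply so the next turn can refer back to it.
    history.append({"role": "assistant", "content": reply})
    return reply
```

You would start with a list holding only the system message and call chat_turn once per user input. Note that the list grows with every turn, so input token counts (and cost) grow with it.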
7. Key Parameters
Beyond model and messages, the API accepts several important parameters:
| Parameter | Type | What it does |
|---|---|---|
| max_tokens | integer | Maximum tokens in the response. Prevents runaway long outputs. |
| temperature | float 0.0–2.0 | Controls randomness. 0 = nearly deterministic, 1 = default, 2 = very creative. |
| top_p | float 0.0–1.0 | Alternative to temperature for sampling control. Usually leave at 1.0 if using temperature. |
| n | integer | Number of response choices to generate. Default 1. |
| stop | string or list | Sequences at which the model stops generating. |
| stream | boolean | If true, streams tokens as they are generated. Covered in the next chapter. |
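One way to avoid sending out-of-range values is to validate parameters locally before making the call. The sketch below uses the ranges from the table above. The function name build_params is hypothetical, and the checks are a convenience of this example, not something the openai library requires.

```python
def build_params(model, messages, max_tokens=None, temperature=1.0,
                 top_p=1.0, n=1, stop=None, stream=False) -> dict:
    """Validate common parameters and return keyword arguments for the API call."""
    if not 0.0 <= temperature <= 2.0:
        raise ValueError("temperature must be between 0.0 and 2.0")
    if not 0.0 <= top_p <= 1.0:
        raise ValueError("top_p must be between 0.0 and 1.0")
    if n < 1:
        raise ValueError("n must be at least 1")
    if max_tokens is not None and max_tokens < 1:
        raise ValueError("max_tokens must be at least 1")

    params = {
        "model": model,
        "messages": messages,
        "temperature": temperature,
        "top_p": top_p,
        "n": n,
        "stream": stream,
    }
    # Optional values are included only when the caller set them.
    if max_tokens is not None:
        params["max_tokens"] = max_tokens
    if stop is not None:
        params["stop"] = stop
    return params
```

The result can be unpacked straight into the call, for example client.chat.completions.create(**build_params("gpt-4o-mini", messages, max_tokens=50)).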
Temperature in Practice
Temperature is the parameter you will tune most often:
# For a classification task (always want consistent output)
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Classify this review as Positive, Negative, or Neutral: 'Great product, fast delivery!'"}],
    temperature=0.0
)

# For creative writing (want variety)
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Write a tagline for a premium Darjeeling tea brand."}],
    temperature=1.2
)
Common Pitfalls
Pitfall 1: Hardcoding the API key in source code. Bots continuously scan public GitHub repositories for leaked keys. A committed key can be found within minutes and used to run up charges on your account. Always use environment variables.
Pitfall 2: Not setting max_tokens. Without a limit, the model will generate until it naturally stops. For classification tasks or short-answer queries, this wastes tokens and adds latency. Set a reasonable upper bound.
Pitfall 3: Choosing the wrong model for the task. Using gpt-4o for simple classification tasks is like using a sledgehammer to crack a nut: expensive and slow. Use gpt-4o-mini for high-volume simple tasks; reserve gpt-4o for tasks requiring nuanced reasoning.
Pitfall 4: Not checking finish_reason. If finish_reason is length rather than stop, the model hit your max_tokens limit and the response was cut off. When you see this, increase max_tokens or ask for a shorter answer.
Pitfall 5: Ignoring the usage field. Track token consumption from the start. In a production application, log usage for every call so you can monitor costs, spot unexpected spikes, and optimise your prompts.
Pitfall 6: Using temperature=0 for creative tasks. At temperature 0, the model is nearly deterministic: it will generate the same output for the same input almost every time. This is ideal for classification or data extraction, but produces repetitive, uncreative output for marketing copy or storytelling.
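Pitfalls 4 and 5 can be handled together with a small accumulator that you feed every response. This is a sketch: the class name UsageTracker and its attributes are hypothetical, and it assumes response objects shaped like the ones shown earlier in this chapter (choices[0].finish_reason and usage.prompt_tokens / usage.completion_tokens).

```python
class UsageTracker:
    """Accumulate token counts and count truncated replies across calls."""

    def __init__(self) -> None:
        self.calls = 0
        self.prompt_tokens = 0
        self.completion_tokens = 0
        self.truncated = 0

    def record(self, response) -> None:
        """Add one API response to the running totals."""
        self.calls += 1
        self.prompt_tokens += response.usage.prompt_tokens
        self.completion_tokens += response.usage.completion_tokens
        # finish_reason == "length" means the reply hit the max_tokens limit.
        if response.choices[0].finish_reason == "length":
            self.truncated += 1

    @property
    def total_tokens(self) -> int:
        return self.prompt_tokens + self.completion_tokens
```

Calling tracker.record(response) after each request gives you running totals to log, and a non-zero truncated count is a hint that max_tokens is set too low for some prompts.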
Practice Exercises
- Set up your API key as an environment variable and make your first curl call to the Chat Completions endpoint. Request a 3-sentence explanation of how UPI works. Print the response and the token usage.
- Write a Python script that reads a product name and price from user input on the command line and generates a 50-word product description for an Indian e-commerce listing. Use gpt-4o-mini and print the token cost alongside the description.
- Build a multi-turn conversation in Python: start with a system prompt setting the model as an Indian cooking assistant. Send at least 3 user messages and include the full conversation history in each request. Print each response.
- Experiment with temperature: send the prompt "Write a slogan for a vegetarian fast-food chain in India" with temperatures 0.0, 0.7, 1.0, and 1.5. Run each 3 times and observe how the outputs vary across temperatures.
- Write a script that processes a list of 10 customer reviews (you can make them up) and classifies each as Positive, Negative, or Neutral using the API with temperature=0. Print the results in a table alongside each review and calculate the total tokens used.
Summary
- The ChatGPT API exposes the same underlying model as the chat interface but gives developers programmatic, automated, and high-volume access with full parameter control.
- An API key is generated at platform.openai.com/api-keys and should always be stored in environment variables, never hardcoded or committed to version control.
- Pricing is per token, with separate rates for input and output tokens. gpt-4o-mini is significantly cheaper than gpt-4o and is the right default for high-volume or simple tasks.
- The curl command lets you test API calls directly from the terminal without writing any Python.
- The Python openai library wraps the API in a clean interface: client.chat.completions.create() is the primary method you will use.
- The messages array is the core data structure: system sets context and rules, user provides input, and assistant holds the model's previous responses for multi-turn conversations.
- The API is stateless: you must include the full conversation history in every request, because there is no built-in memory between calls.
- Key tuning parameters include max_tokens (limits response length), temperature (controls randomness), and model (balances cost vs. capability).