How to Build a Paid AI Tool API with Ollama + Qwen and Earn Extra Income
Looking for a practical way to make side income from AI without building a full-fledged SaaS?
With Ollama + Qwen, you can turn a low-cost VPS into your own paid AI microservice API.
This guide walks you through every step, from deployment to payment setup, so you can start selling API access to clients or small businesses and generate recurring side income within hours.
1. Prerequisites
1. Choose a Server
Recommended VPS: LightNode (cheap, hourly billing, great for testing an MVP)
Minimum Specs:
2 vCPU / 4GB RAM / 30GB SSD
Ubuntu 22.04 OS
1TB monthly bandwidth
Cost: ~$7-10/month or $0.084/hour
Tip: Start small with a CPU VPS (3B model). Upgrade to a GPU VPS later as demand grows.
2. Connect to Your Server via SSH
ssh root@your_server_ip
3. Install Required Packages
sudo apt update && sudo apt upgrade -y
sudo apt install git curl python3-pip docker.io docker-compose -y
sudo systemctl enable docker
2. Install Ollama and Qwen Model
1. Install Ollama
curl -fsSL https://ollama.com/install.sh | sh
2. Pull the Qwen model (choose 3B or 7B version)
ollama pull qwen2.5:3b
# or
ollama pull qwen2.5:7b
3. Test the model
ollama run qwen2.5:3b
Try:
Hello, introduce yourself
You should receive a response.
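The same check can be scripted against Ollama's local HTTP API, which is the endpoint the API service later in this guide calls. This is a minimal sketch, assuming Ollama is listening on its default port 11434 and that qwen2.5:3b has been pulled. The helper names build_payload and ask_ollama are illustrative, not part of Ollama.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint


def build_payload(prompt: str, model: str = "qwen2.5:3b") -> dict:
    """Build a non-streaming generate request body for Ollama."""
    return {"model": model, "prompt": prompt, "stream": False}


def ask_ollama(prompt: str, model: str = "qwen2.5:3b", timeout: float = 300.0) -> str:
    """Send the prompt to the local Ollama server and return the generated text."""
    body = json.dumps(build_payload(prompt, model)).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    # CPU-only generation can be slow, hence the generous timeout
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read())["response"]
```

Calling ask_ollama("Hello, introduce yourself") on the server should return the same kind of reply as the interactive session.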
3. Build a Paid API Service
We'll use FastAPI to expose a paid HTTP API.
- Install packages:
pip install fastapi uvicorn requests
- Create a file app.py:
from fastapi import FastAPI, Request
import requests

app = FastAPI()

OLLAMA_URL = "http://localhost:11434/api/generate"
VALID_TOKENS = ["abc123", "vipuser"]  # demo tokens; replace before going live

@app.post("/qwen")
async def run_qwen(request: Request):
    data = await request.json()
    prompt = data.get("prompt", "")
    token = data.get("token", "")

    # Only callers holding a valid token may reach the model
    if token not in VALID_TOKENS:
        return {"error": "Invalid or missing token. Please recharge."}

    payload = {
        "model": "qwen2.5:3b",
        "prompt": prompt,
        "stream": False
    }
    # Forward the prompt to Ollama; the timeout stops a stuck request from hanging forever
    response = requests.post(OLLAMA_URL, json=payload, timeout=300)
    return response.json()
- Start the service:
uvicorn app:app --host 0.0.0.0 --port 8000
Test endpoint:
POST http://<your_server_ip>:8000/qwen
{
  "token": "abc123",
  "prompt": "Write a short AI marketing copy."
}
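If you would rather test from a script than from an HTTP client, the request above can be sent with Python's standard library. This is a sketch only: build_qwen_request and call_qwen are hypothetical helper names, and the base URL is whatever address your server is reachable at.

```python
import json
import urllib.request


def build_qwen_request(base_url: str, token: str, prompt: str) -> urllib.request.Request:
    """Build the POST request for the /qwen endpoint defined in app.py."""
    body = json.dumps({"token": token, "prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + "/qwen",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def call_qwen(base_url: str, token: str, prompt: str, timeout: float = 300.0) -> dict:
    """Send the request and return the decoded JSON body."""
    req = build_qwen_request(base_url, token, prompt)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read())
```

For example, call_qwen("http://<your_server_ip>:8000", "abc123", "Write a short AI marketing copy.") sends the same JSON body shown above.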
4. Add a Credit System for Paid Access
Start by manually issuing tokens to paying users.
Later, automate with a basic SQLite-based credit system:
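# Replaces the /qwen handler in app.py; reuses app, Request, requests and OLLAMA_URL from there
import sqlite3

conn = sqlite3.connect('users.db')
cur = conn.cursor()
# PRIMARY KEY plus INSERT OR IGNORE keeps restarts from adding duplicate rows
cur.execute("CREATE TABLE IF NOT EXISTS users (token TEXT PRIMARY KEY, credits INTEGER)")
cur.execute("INSERT OR IGNORE INTO users VALUES ('abc123', 100)")
conn.commit()

@app.post("/qwen")
async def run_qwen(request: Request):
    data = await request.json()
    prompt = data.get("prompt", "")
    token = data.get("token", "")

    # Unknown tokens and empty balances are both refused
    cur.execute("SELECT credits FROM users WHERE token=?", (token,))
    row = cur.fetchone()
    if not row or row[0] <= 0:
        return {"error": "No credits left. Please recharge."}

    # Deduct one credit per call, then forward the prompt to Ollama as before
    cur.execute("UPDATE users SET credits=credits-1 WHERE token=?", (token,))
    conn.commit()

    payload = {"model": "qwen2.5:3b", "prompt": prompt, "stream": False}
    response = requests.post(OLLAMA_URL, json=payload, timeout=300)
    return response.json()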
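Issuing tokens and topping up balances can be scripted against the same users table. The sketch below is one possible layout, not the only one: the function names are hypothetical, tokens are generated with Python's secrets module instead of being typed by hand, and the deduction is done in a single UPDATE so a balance cannot drop below zero.

```python
import secrets
import sqlite3


def open_store(path: str = "users.db") -> sqlite3.Connection:
    """Open the database and make sure the users table exists."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS users (token TEXT PRIMARY KEY, credits INTEGER NOT NULL)"
    )
    conn.commit()
    return conn


def issue_token(conn: sqlite3.Connection, credits: int) -> str:
    """Create a random token for a new customer with a starting balance."""
    token = secrets.token_urlsafe(16)
    conn.execute("INSERT INTO users (token, credits) VALUES (?, ?)", (token, credits))
    conn.commit()
    return token


def add_credits(conn: sqlite3.Connection, token: str, amount: int) -> bool:
    """Top up an existing token; returns False if the token is unknown."""
    cur = conn.execute(
        "UPDATE users SET credits = credits + ? WHERE token = ?", (amount, token)
    )
    conn.commit()
    return cur.rowcount == 1


def spend_credit(conn: sqlite3.Connection, token: str) -> bool:
    """Deduct one credit; returns False for unknown tokens or empty balances."""
    cur = conn.execute(
        "UPDATE users SET credits = credits - 1 WHERE token = ? AND credits > 0",
        (token,),
    )
    conn.commit()
    return cur.rowcount == 1
```

After a customer pays, issue_token(conn, 1000) returns the token to send them, and the handler can call spend_credit(conn, token) before forwarding the prompt.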
5. Run API in Background
Keep your service running even after closing SSH:
nohup uvicorn app:app --host 0.0.0.0 --port 8000 &
or:
npm install -g pm2   # pm2 is a Node.js tool (requires Node.js and npm), not a pip package
pm2 start "uvicorn app:app --host 0.0.0.0 --port 8000" --name qwen-api
6. Provide API Documentation to Clients
Example usage guide:
Endpoint: http://<your_ip>:8000/qwen
Method: POST
Parameters:
{
  "token": "abc123",
  "prompt": "Write a 50-word SEO blog intro about AI startups."
}
Response:
{
  "response": "Here's your AI-generated text..."
}
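Because the service in this guide reports problems such as a bad token or an empty balance inside the JSON body, under an "error" key, clients should check for that key before reading "response". A small sketch of that check, with QwenAPIError and extract_text as hypothetical names you could include in your client docs:

```python
class QwenAPIError(Exception):
    """Raised when the service reports a problem in the response body."""


def extract_text(body: dict) -> str:
    """Return the generated text, or raise if the body carries an error message."""
    if "error" in body:
        raise QwenAPIError(body["error"])
    return body["response"]
```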
Pricing suggestions:
$5 = 1,000 calls
$15/month = unlimited access
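To see how the two suggested plans relate, it helps to work out the per-call price and the monthly volume at which the unlimited plan becomes the cheaper option for a client. The figures come from the suggestions above; the function names are illustrative.

```python
PACK_PRICE_USD = 5.00        # pay-as-you-go pack
PACK_CALLS = 1_000           # calls included in one pack
UNLIMITED_PRICE_USD = 15.00  # monthly unlimited plan


def price_per_call() -> float:
    """Price of a single call when bought through the pack."""
    return PACK_PRICE_USD / PACK_CALLS


def unlimited_break_even_calls() -> int:
    """Calls per month at which the unlimited plan costs the same as packs."""
    return round(UNLIMITED_PRICE_USD * PACK_CALLS / PACK_PRICE_USD)
```

With these numbers a call costs half a cent, and a client making more than 3,000 calls a month pays less on the unlimited plan, so heavy users on that plan are the ones to watch as server load grows.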
7. Scale and Optimize for More Revenue
- Add Nginx + SSL for secure HTTPS endpoints
- Offer multiple models (Qwen-Coder, Qwen-7B, etc.) with tiered pricing
- Create a simple front-end landing page for better conversions
- Upgrade to GPU VPS for faster responses and more clients
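Tiered pricing across models can be expressed as a lookup from model tag to credits charged per call. The sketch below is an assumption about how you might structure it: the credit costs are made-up examples, and each tag must be pulled with ollama pull before it can be served.

```python
from typing import Optional, Tuple

# Hypothetical tiers: credits charged per call for each Ollama model tag
MODEL_CREDIT_COST = {
    "qwen2.5:3b": 1,
    "qwen2.5:7b": 2,
    "qwen2.5-coder:7b": 2,
}
DEFAULT_MODEL = "qwen2.5:3b"


def resolve_model(requested: Optional[str]) -> Tuple[str, int]:
    """Map a client's requested model to (model tag, credit cost).

    A missing model falls back to the default; an unknown one is rejected.
    """
    model = requested or DEFAULT_MODEL
    if model not in MODEL_CREDIT_COST:
        raise ValueError(f"Unsupported model: {model}")
    return model, MODEL_CREDIT_COST[model]
```

The handler could read an optional "model" field from the request, call resolve_model on it, and deduct the returned cost instead of a flat one credit.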
Side Hustle Potential
Cost: ~$7-10/month LightNode VPS
Revenue: 10 clients × $5 = $50/month (starter level)
Scalability: Easily grow to $500-1,000/month with better hosting and marketing
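The starter-level numbers above can be checked with a couple of lines of arithmetic. This sketch only uses the figures already given (VPS cost and $5 per client per month) and ignores payment fees and your own time, which the guide does not quantify.

```python
import math


def monthly_net_usd(clients: int, price_per_client: float, vps_cost: float) -> float:
    """Monthly revenue minus the VPS bill."""
    return clients * price_per_client - vps_cost


def break_even_clients(price_per_client: float, vps_cost: float) -> int:
    """Smallest number of paying clients that covers the VPS bill."""
    return math.ceil(vps_cost / price_per_client)
```

At 10 clients paying $5 each, the net is $40 to $43 a month depending on where the VPS cost lands in the $7-10 range, and two paying clients are enough to cover the server.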