
2 posts tagged with "Cloudflare"


A Step-by-Step Guide to Building Your First AI Search Engine

· 9 min read

This is a step-by-step guide to building your first AI search engine using Cloudflare's Vectorize and Workers AI. It covers everything from setting up the environment to querying the vector database, with clear explanations and runnable code examples.


You'll see both paths:


  1. Manual embeddings (Euclidean, 32‑dim vectors) for learning and quick demos.
  2. AI embeddings (BGE Base, 768‑dim, cosine) using Workers AI, for real-world semantic search.

By the end, you’ll be able to seed your own index, query it via API, and understand exactly what’s going on.



What is Cloudflare Vectorize?


Definition:


A globally distributed vector database for AI-powered apps, tightly integrated with Cloudflare Workers.


Use cases:


  1. Semantic search
  2. Recommendations
  3. Anomaly detection
  4. LLM context support

Key Features of Vectorize


  1. Globally distributed, no additional infrastructure needed
  2. Store embeddings via Workers AI or external models
  3. Connect search results back to content in R2, KV, D1 — all within Workers

Meet Cloudflare Vectorize (Fun Version)


Think of Cloudflare Vectorize as your app’s super-powered librarian — except this one lives everywhere in the world at once, never sleeps, and can find what you want faster than you can say “AI.”


Instead of just searching for the exact words you type, Vectorize understands meaning.
It can match your "cute dog" search to a picture of a fluffy golden retriever, or "relaxing music" to an audio clip that feels like a spa day.

Real-World Uses


  1. Shopping → “Find me shoes like these” (and it actually gets it right).
  2. Customer Service → Instantly suggest relevant help articles before you even finish typing your problem.
  3. Streaming → Recommend movies that actually match your vibe, not just “because you watched one rom-com in 2018.”

The Cool Part


All this runs on Cloudflare’s global network, so your search results pop up in milliseconds,
even if your user is sipping coffee in Paris while your data’s hanging out in Tokyo.


Why Developers Love It


  1. No extra servers
  2. No complicated setup
  3. Just plug it into Cloudflare Workers, toss in your AI-generated “embeddings”
    (fancy word for math-y fingerprints of text, images, or audio)
    and you’ve got instant, intelligent search.

In short:


It’s like giving your app a brain, without giving yourself a headache.


Getting Started: Overview of Steps


Steps overview (from “Get started” docs):

  1. Create Worker
  2. Create Vectorize Index
  3. Bind Worker to Index
  4. (Optional) Add metadata
  5. Insert & query vectors
  6. Deploy & test

Step 1 – Create a Cloudflare Vectorize Index


First, ensure you have Wrangler installed:


npm install -g wrangler

Path A — Manual Embeddings (Euclidean)


Manual vectors are perfect for understanding how vector search works—no AI needed.
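To make the idea concrete before touching Vectorize, here is a minimal brute-force sketch of nearest-neighbour search with Euclidean distance. It is only an illustration: the names `ToyVector`, `euclideanDistance`, and `nearestNeighbours` and the 3-dim data are made up for this example, and Vectorize does not necessarily search this way internally.

```typescript
// Toy brute-force search; names and data are hypothetical, for learning only.
interface ToyVector {
  id: string;
  values: number[];
}

// Straight-line distance between two vectors of equal length.
function euclideanDistance(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("Vectors must have the same dimensions");
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    const diff = a[i] - b[i];
    sum += diff * diff;
  }
  return Math.sqrt(sum);
}

// Score every stored vector against the query and keep the topK closest.
function nearestNeighbours(
  query: number[],
  vectors: ToyVector[],
  topK: number,
): Array<{ id: string; distance: number }> {
  return vectors
    .map((v) => ({ id: v.id, distance: euclideanDistance(query, v.values) }))
    .sort((x, y) => x.distance - y.distance) // smaller distance = closer match
    .slice(0, topK);
}

const toyVectors: ToyVector[] = [
  { id: "a", values: [0, 0, 0] },
  { id: "b", values: [1, 1, 1] },
  { id: "c", values: [5, 5, 5] },
];

// The query sits right next to "b", so "b" comes first, then "a".
const toyResult = nearestNeighbours([0.9, 1.0, 1.1], toyVectors, 2);
```

The 32-dim demo below follows the same pattern, except that the index stores the vectors and does the ranking for you.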

Create an index (Euclidean)


# 32 dimensions, Euclidean distance
wrangler vectorize create youtube-index --dimensions=32 --metric=euclidean

Worker code: insert & query (32 dims)


// src/index.ts (Manual demo)
// Run this as a separate Worker or bind to a different index than your AI demo.

export interface Env {
  VECTORIZE: Vectorize; // bound to the 32-dim "youtube-index"
}

const sampleVectors: Array<VectorizeVector> = [
  {
    id: "1",
    values: [
      0.12, 0.45, 0.67, 0.89, 0.23, 0.56, 0.34, 0.78,
      0.12, 0.90, 0.24, 0.67, 0.89, 0.35, 0.48, 0.70,
      0.22, 0.58, 0.74, 0.33, 0.88, 0.66, 0.45, 0.27,
      0.81, 0.54, 0.39, 0.76, 0.41, 0.29, 0.83, 0.55
    ],
    metadata: { url: "/products/sku/13913913" },
  },
  {
    id: "2",
    values: [
      0.14, 0.23, 0.36, 0.51, 0.62, 0.47, 0.59, 0.74,
      0.33, 0.89, 0.41, 0.53, 0.68, 0.29, 0.77, 0.45,
      0.24, 0.66, 0.71, 0.34, 0.86, 0.57, 0.62, 0.48,
      0.78, 0.52, 0.37, 0.61, 0.69, 0.28, 0.80, 0.53
    ],
    metadata: { url: "/products/sku/10148191" },
  },
  {
    id: "3",
    values: [
      0.21, 0.33, 0.55, 0.67, 0.80, 0.22, 0.47, 0.63,
      0.31, 0.74, 0.35, 0.53, 0.68, 0.45, 0.55, 0.70,
      0.28, 0.64, 0.71, 0.30, 0.77, 0.60, 0.43, 0.39,
      0.85, 0.55, 0.31, 0.69, 0.52, 0.29, 0.72, 0.48
    ],
    metadata: { url: "/products/sku/97913813" },
  },
  {
    id: "4",
    values: [
      0.17, 0.29, 0.42, 0.57, 0.64, 0.38, 0.51, 0.72,
      0.22, 0.85, 0.39, 0.66, 0.74, 0.32, 0.53, 0.48,
      0.21, 0.69, 0.77, 0.34, 0.80, 0.55, 0.41, 0.29,
      0.70, 0.62, 0.35, 0.68, 0.53, 0.30, 0.79, 0.49
    ],
    metadata: { url: "/products/sku/418313" },
  },
  {
    id: "5",
    values: [
      0.11, 0.46, 0.68, 0.82, 0.27, 0.57, 0.39, 0.75,
      0.16, 0.92, 0.28, 0.61, 0.85, 0.40, 0.49, 0.67,
      0.19, 0.58, 0.76, 0.37, 0.83, 0.64, 0.53, 0.30,
      0.77, 0.54, 0.43, 0.71, 0.36, 0.26, 0.80, 0.53
    ],
    metadata: { url: "/products/sku/55519183" },
  },
];

const DIMENSIONS = sampleVectors[0].values.length; // 32

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    const url = new URL(request.url);
    const path = url.pathname;

    if (path === "/insert") {
      const inserted = await env.VECTORIZE.insert(sampleVectors);
      return Response.json({ ok: true, inserted });
    }

    if (path === "/query") {
      // Demo vector that should be closest to id=4
      const query = [
        0.13, 0.25, 0.44, 0.53, 0.62, 0.41, 0.59, 0.68,
        0.29, 0.82, 0.37, 0.50, 0.74, 0.46, 0.57, 0.64,
        0.28, 0.61, 0.73, 0.35, 0.78, 0.58, 0.42, 0.32,
        0.77, 0.65, 0.49, 0.54, 0.31, 0.29, 0.71, 0.57
      ];

      if (query.length !== DIMENSIONS) {
        return Response.json({ ok: false, error: `Vector must have ${DIMENSIONS} dimensions` }, { status: 400 });
      }

      const matches = await env.VECTORIZE.query(query, {
        topK: 3,
        returnValues: true,
        returnMetadata: "all",
      });

      return Response.json({ ok: true, matches });
    }

    return Response.json({ ok: false, error: "Try /insert then /query" }, { status: 404 });
  },
} satisfies ExportedHandler<Env>;

Bindings for this manual Worker


// wrangler.jsonc (manual demo)
{
  "$schema": "node_modules/wrangler/config-schema.json",
  "name": "vectorize-manual-euclidean",
  "main": "src/index.ts",
  "compatibility_date": "2025-08-10",
  // "vectorize" takes an array of bindings
  "vectorize": [{ "binding": "VECTORIZE", "index_name": "youtube-index" }]
}

Test with curl (Euclidean)


# Insert the sample 5 vectors
curl https://<your-manual-worker>.workers.dev/insert

# Query for top 3 nearest neighbors
curl https://<your-manual-worker>.workers.dev/query

Path B — AI Embeddings with Workers AI (Cosine)


This is the real-world path: you index human text, and search by meaning.


Create a preset index (Cosine)


wrangler vectorize create youtube-index-preset --preset=@cf/baai/bge-base-en-v1.5
# Preset sets: 768 dimensions, cosine distance

wrangler.jsonc bindings


{
  "$schema": "node_modules/wrangler/config-schema.json",
  "name": "vectorize-youtube",
  "main": "src/index.ts",
  "compatibility_date": "2025-08-10",
  "assets": { "directory": "./public" },
  "observability": { "enabled": true },

  // "vectorize" takes an array of bindings; "ai" is a single object
  "vectorize": [{ "binding": "VECTORIZE", "index_name": "youtube-index-preset" }],
  "ai": { "binding": "AI" }
}

Worker code: index, seed, search, embed debug


This version contains a robust embedText that handles multiple response shapes from Workers AI.


// src/index.ts (AI demo)

export interface Env {
  AI: any; // Workers AI binding
  VECTORIZE: any; // Vectorize binding (youtube-index-preset)
}

interface InsertItem {
  text: string;
  id?: string;
  metadata?: Record<string, any>;
}

async function embedText(env: Env, text: string): Promise<number[]> {
  const model = "@cf/baai/bge-base-en-v1.5";
  // send as array for best compatibility
  const result: any = await env.AI.run(model, { text: [text] });

  // Normalize across possible shapes:
  if (Array.isArray(result?.data) && Array.isArray(result.data[0]) && result.shape) {
    return (result.data[0] as number[]).map(Number); // { shape:[1,768], data:[[...]] }
  }
  if (Array.isArray(result?.data) && Array.isArray(result.data[0]?.embedding)) {
    return (result.data[0].embedding as number[]).map(Number); // { data:[{ embedding:[...] }] }
  }
  if (Array.isArray(result?.embedding)) {
    return (result.embedding as number[]).map(Number); // { embedding:[...] }
  }
  if (Array.isArray(result?.data) && typeof result.data[0] === "number") {
    return (result.data as number[]).map(Number); // { data:[...] }
  }

  console.error("Unexpected embedding response shape:", JSON.stringify(result).slice(0, 500));
  throw new Error("Unexpected embedding response");
}

// Add permissive CORS headers so a browser front end can call the API.
function cors(resp: Response) {
  resp.headers.set("Access-Control-Allow-Origin", "*");
  resp.headers.set("Access-Control-Allow-Methods", "GET,POST,OPTIONS");
  resp.headers.set("Access-Control-Allow-Headers", "Content-Type");
  return resp;
}

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    const url = new URL(request.url);
    if (request.method === "OPTIONS") return cors(new Response(null, { status: 204 }));

    if (url.pathname === "/health") return cors(Response.json({ ok: true }));

    // Debug: see dims/sample
    if (url.pathname === "/embed" && request.method === "GET") {
      const text = url.searchParams.get("text") || "";
      if (!text) return cors(Response.json({ ok: false, error: "Missing ?text" }, { status: 400 }));
      const vec = await embedText(env, text);
      return cors(Response.json({ ok: true, dims: vec.length, sample: vec.slice(0, 8) }));
    }

    // Index a single item
    if (url.pathname === "/index" && request.method === "POST") {
      const body = (await request.json()) as InsertItem;
      const text = (body?.text || "").trim();
      if (!text) return cors(Response.json({ ok: false, error: "text is required" }, { status: 400 }));

      const values = await embedText(env, text);
      const id = body.id || crypto.randomUUID();
      const inserted = await env.VECTORIZE.insert([{
        id,
        values,
        metadata: { text, ...(body.metadata || {}) }
      }]);

      return cors(Response.json({ ok: true, id, inserted }));
    }

    // Bulk seed
    if (url.pathname === "/seed" && request.method === "POST") {
      const items = (await request.json()) as InsertItem[];
      if (!Array.isArray(items) || items.length === 0)
        return cors(Response.json({ ok: false, error: "Provide a non-empty array" }, { status: 400 }));

      const vectors: Array<{ id: string; values: number[]; metadata?: Record<string, any> }> = [];
      for (const item of items) {
        const text = (item?.text || "").trim();
        if (!text) continue; // skip items without usable text
        const values = await embedText(env, text);
        vectors.push({
          id: item.id || crypto.randomUUID(),
          values,
          metadata: { text, ...(item.metadata || {}) }
        });
      }

      if (vectors.length === 0) return cors(Response.json({ ok: false, error: "No valid items" }, { status: 400 }));
      const inserted = await env.VECTORIZE.insert(vectors);
      return cors(Response.json({ ok: true, count: vectors.length, inserted }));
    }

    // Search by meaning
    if (url.pathname === "/search" && request.method === "GET") {
      const text = (url.searchParams.get("text") || "").trim();
      if (!text) return cors(Response.json({ ok: false, error: "Missing ?text" }, { status: 400 }));
      // topK must be a positive whole number; fall back to 3 otherwise
      let topK = Math.floor(Number(url.searchParams.get("topK") || 3));
      if (!Number.isFinite(topK) || topK <= 0) topK = 3;

      const queryVec = await embedText(env, text);
      const matches = await env.VECTORIZE.query(queryVec, {
        topK,
        returnValues: false,
        returnMetadata: "all"
      });

      return cors(Response.json({ ok: true, query: text, matches }));
    }

    return cors(Response.json({ ok: false, error: "Not found" }, { status: 404 }));
  }
};
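The /seed route above embeds one text per model call. Because the model input is already sent as an array, several texts can in principle share one call. The following is a hedged sketch of that idea, not part of the Worker above: `EmbeddingRunner`, `chunkTexts`, `embedInBatches`, and the batch size of 50 are hypothetical, and it assumes the response has the `{ data: number[][] }` shape (the first shape `embedText` checks for).

```typescript
// Minimal stand-in for the Workers AI binding (assumption: the real binding's type is broader).
interface EmbeddingRunner {
  run(model: string, input: { text: string[] }): Promise<{ data: number[][] }>;
}

// Split a list of texts into consecutive batches of at most `size` items.
function chunkTexts(texts: string[], size: number): string[][] {
  if (!Number.isInteger(size) || size <= 0) throw new Error("size must be a positive integer");
  const batches: string[][] = [];
  for (let i = 0; i < texts.length; i += size) {
    batches.push(texts.slice(i, i + size));
  }
  return batches;
}

// Embed all texts, one model call per batch, preserving input order.
async function embedInBatches(
  ai: EmbeddingRunner,
  texts: string[],
  batchSize = 50, // hypothetical default; check the model's documented input limits
): Promise<number[][]> {
  const vectors: number[][] = [];
  for (const batch of chunkTexts(texts, batchSize)) {
    const result = await ai.run("@cf/baai/bge-base-en-v1.5", { text: batch });
    // Expect exactly one vector per input text.
    if (!Array.isArray(result?.data) || result.data.length !== batch.length) {
      throw new Error("Unexpected embedding response for batch");
    }
    vectors.push(...result.data);
  }
  return vectors;
}
```

Inside the Worker you would pass `env.AI` as the first argument, then pair each returned vector with its item by index before calling `insert`.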

Test with curl (Cosine)


# Deploy first
wrangler deploy

# Quick embedding sanity check
curl "https://<your-worker>.workers.dev/embed?text=lightweight+waterproof+jacket"

# Seed 3 items
curl -X POST "https://<your-worker>.workers.dev/seed" \
-H "content-type: application/json" \
-d '[
{"text":"Red running shoes with breathable mesh and foam sole","metadata":{"url":"/products/sku/1001","category":"shoes"}},
{"text":"Lightweight waterproof hiking jacket with hood","metadata":{"url":"/products/sku/2002","category":"jackets"}},
{"text":"Noise-cancelling wireless headphones with 30h battery","metadata":{"url":"/products/sku/3003","category":"audio"}}
]'

# Search by meaning
curl "https://<your-worker>.workers.dev/search?text=rain+proof+jacket+for+hiking&topK=2"
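If you would rather call the search route from code than from curl, a small client might look like the sketch below. It assumes the response body has the form `{ ok, query, matches: { count, matches: [...] } }`, which is what the Worker returns when it passes the Vectorize query result straight through. `JsonFetcher`, `buildSearchUrl`, and `searchByMeaning` are hypothetical names; the fetch function is injected so the sketch does not depend on a particular runtime.

```typescript
// Minimal fetch-like signature (hypothetical); the global fetch in Workers or Node 18+ satisfies it.
type JsonFetcher = (
  url: string,
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

interface SearchMatch {
  id: string;
  score: number;
  metadata?: Record<string, unknown>;
}

// Build the /search URL, percent-encoding the query text.
function buildSearchUrl(baseUrl: string, text: string, topK = 3): string {
  const base = baseUrl.replace(/\/+$/, ""); // drop trailing slashes
  return `${base}/search?text=${encodeURIComponent(text)}&topK=${topK}`;
}

// Call the Worker and return the list of matches (empty if none).
async function searchByMeaning(
  fetcher: JsonFetcher,
  baseUrl: string,
  text: string,
  topK = 3,
): Promise<SearchMatch[]> {
  const res = await fetcher(buildSearchUrl(baseUrl, text, topK));
  if (!res.ok) throw new Error(`Search failed with HTTP ${res.status}`);
  const body = (await res.json()) as { ok?: boolean; matches?: { matches?: SearchMatch[] } };
  return body.matches?.matches ?? [];
}
```

A call such as `searchByMeaning(fetch, "https://<your-worker>.workers.dev", "rain proof jacket for hiking", 2)` mirrors the last curl command above.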

Troubleshooting & Logs


  1. “Worker threw exception” HTML page → open live logs:
     wrangler tail
  2. AI binding errors → confirm the binding in wrangler.jsonc:
     "ai": { "binding": "AI" }
     and that Workers AI is enabled for your account.
  3. Dimension mismatch → your index must match your embeddings:
     • Manual demo: 32 dims (Euclidean)
     • AI demo: 768 dims (Cosine preset)
  4. Debug embedding shape → call the /embed route (GET /embed?text=...) to check the vector's dimension count and a sample of its values.
  5. Check index config:
     wrangler vectorize describe youtube-index-preset
     wrangler vectorize describe youtube-index
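A dimension mismatch can also be caught in your own code before anything is sent to the index. The helper below is a small sketch of that check; `CandidateVector` and `findDimensionMismatches` are hypothetical names, and the expected dimension (32 or 768 in this guide) is something you pass in yourself.

```typescript
interface CandidateVector {
  id: string;
  values: number[];
}

// Return the ids of vectors whose length differs from the index's dimensions.
function findDimensionMismatches(vectors: CandidateVector[], expectedDims: number): string[] {
  return vectors.filter((v) => v.values.length !== expectedDims).map((v) => v.id);
}

// Example: one valid 32-dim vector and one that is too short.
const mismatched = findDimensionMismatches(
  [
    { id: "ok", values: Array.from({ length: 32 }, () => 0.5) },
    { id: "too-short", values: [0.1, 0.2] },
  ],
  32,
);
// mismatched contains only "too-short"
```

Returning the offending ids in a 400 response is usually easier to debug than a failed insert.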

FAQ

Q: Can I store images or PDFs?
Store their embeddings plus metadata/URLs. Fetch the original via the metadata you saved.
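As a rough sketch of that pattern: the vector record carries a pointer to the original file, and search results are turned back into URLs. Everything here is illustrative. `SourceDocument`, `toVectorRecord`, `resolveUrls`, and the example paths are hypothetical, and the embedding is assumed to come from whatever model you use for that content type.

```typescript
interface SourceDocument {
  id: string;
  url: string; // where the original image or PDF lives
  contentType: string;
  title: string;
}

interface VectorRecord {
  id: string;
  values: number[];
  metadata: Record<string, string>;
}

interface MatchLike {
  id: string;
  score: number;
  metadata?: Record<string, unknown>;
}

// Build the record to insert: the embedding plus a pointer back to the original file.
function toVectorRecord(doc: SourceDocument, embedding: number[]): VectorRecord {
  return {
    id: doc.id,
    values: embedding,
    metadata: { url: doc.url, contentType: doc.contentType, title: doc.title },
  };
}

// Turn query matches into URLs of the original files, skipping matches without a URL.
function resolveUrls(matches: MatchLike[]): string[] {
  const urls: string[] = [];
  for (const match of matches) {
    const url = match.metadata?.url;
    if (typeof url === "string") urls.push(url);
  }
  return urls;
}
```

For this to work at query time, the query must ask for metadata (`returnMetadata: "all"` in the Worker code above).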

Q: Cosine vs Euclidean?

  1. Cosine is great for language/meaning.
  2. Euclidean is intuitive for numeric feature spaces.
    Use the model/preset’s recommended metric.
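To see the difference in numbers, here is a small sketch comparing the two metrics on a pair of vectors that point in the same direction but differ in length. The function names and the data are made up for illustration, and the cosine function assumes neither vector is all zeros.

```typescript
function dotProduct(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("Vectors must have the same dimensions");
  let sum = 0;
  for (let i = 0; i < a.length; i++) sum += a[i] * b[i];
  return sum;
}

// Cosine similarity: compares direction only, ignoring magnitude.
function cosineSimilarity(a: number[], b: number[]): number {
  return dotProduct(a, b) / (Math.sqrt(dotProduct(a, a)) * Math.sqrt(dotProduct(b, b)));
}

// Euclidean distance: compares absolute position, so magnitude matters.
function straightLineDistance(a: number[], b: number[]): number {
  const diff = a.map((value, i) => value - b[i]);
  return Math.sqrt(dotProduct(diff, diff));
}

const original = [1, 2, 3];
const doubled = [2, 4, 6]; // same direction, twice the length

const cosine = cosineSimilarity(original, doubled); // approximately 1: identical direction
const distance = straightLineDistance(original, doubled); // approximately 3.74: clearly apart
```

Cosine treats the two vectors as the same "meaning", while Euclidean sees them as different points.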

Q: Do I need Git?
No. wrangler deploy uploads directly from your machine.


How to Transfer a Website Domain from AWS Route 53 to Cloudflare: A Step-by-Step Guide

· 5 min read

This guide explains how to transfer a domain from AWS Route 53 to Cloudflare. The process might seem a bit complicated, but we'll break it down into manageable steps to ensure a smooth transition.

If you haven't seen our previous video on transferring a domain from GoDaddy to AWS, you can check it out for reference. Now, let's dive into the process of moving a domain from AWS to Cloudflare.


See: How To Transfer Your Domain from GoDaddy to AWS


Step 1: Preparing the Domain for Transfer


Before we proceed, ensure you have:

  • Access to your AWS Route 53 account.
  • A Cloudflare account ready for the transfer.


In AWS Route 53:

  1. Navigate to the Registered Domains section in the AWS Management Console.
  2. Find the domain you want to transfer.
  3. Before initiating the transfer, disable Auto-Renew for this domain. This prevents automatic renewal during the transfer process.
  4. Turn off the Transfer Lock to allow other registrars, like Cloudflare, to accept the domain. This change may take a few minutes to propagate.

Step 2: Obtain the Authorization Code

  1. In the Registered Domains section, look for the Transfer Out option.
  2. Click on Transfer to Another Registrar and request the Authorization Code. AWS will generate a code for you. Copy this code, as it will be needed to authorize the transfer to Cloudflare.

Step 3: Starting the Transfer on Cloudflare

  1. Go to Cloudflare and log into your account.
  2. In the Cloudflare dashboard, navigate to the Domain Registration section.
  3. Choose the Transfer Domain option and enter the domain name you wish to transfer. Click Continue.
  4. If you haven't already added the domain in Cloudflare, you'll be prompted to do so now. Enter the domain name and click Continue.

Step 4: Updating Nameservers in AWS Route 53


To successfully transfer your domain, you need to update the nameservers:

  1. Go back to your AWS Route 53 account.
  2. Navigate to the Registered Domains section and select your domain.
  3. Note down your current nameserver settings first, so you can revert the change if needed.
  4. Click on Edit Name Servers. Update the nameservers to the ones provided by Cloudflare.
  5. Save the changes. Note that DNS propagation can take up to 24 hours.

Step 5: Validate the Nameservers

  • After a few hours, return to your Cloudflare dashboard. Cloudflare will automatically validate the updated nameservers.
  • Once validated, your domain will be marked as Ready to Transfer in Cloudflare.
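If you want to check propagation yourself while you wait, one option is to look up the domain's NS records and compare them with the nameservers Cloudflare assigned. The sketch below assumes Cloudflare's public DNS-over-HTTPS JSON endpoint; `DohFetcher`, `normalizeHost`, `sameNameservers`, and `lookupNameservers` are hypothetical names, and the fetch function is injected so the code is not tied to one runtime.

```typescript
// Minimal fetch-like signature (hypothetical); the global fetch in Node 18+ satisfies it.
type DohFetcher = (
  url: string,
  init: { headers: Record<string, string> },
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

// Lowercase and strip the trailing dot so "NS1.Example.com." equals "ns1.example.com".
function normalizeHost(name: string): string {
  return name.trim().toLowerCase().replace(/\.$/, "");
}

// True when both lists contain the same set of hosts, ignoring order and case.
function sameNameservers(actual: string[], expected: string[]): boolean {
  const a = Array.from(new Set(actual.map(normalizeHost))).sort();
  const e = Array.from(new Set(expected.map(normalizeHost))).sort();
  return a.length === e.length && a.every((host, i) => host === e[i]);
}

// Query the NS records for a domain over DNS-over-HTTPS (JSON format).
async function lookupNameservers(fetcher: DohFetcher, domain: string): Promise<string[]> {
  const url = `https://cloudflare-dns.com/dns-query?name=${encodeURIComponent(domain)}&type=NS`;
  const res = await fetcher(url, { headers: { accept: "application/dns-json" } });
  if (!res.ok) throw new Error(`DNS lookup failed with HTTP ${res.status}`);
  const body = (await res.json()) as { Answer?: Array<{ type: number; data: string }> };
  // DNS record type 2 is NS.
  return (body.Answer ?? []).filter((r) => r.type === 2).map((r) => normalizeHost(r.data));
}
```

Compare the result of `lookupNameservers(fetch, "<your-domain>")` with the two nameservers shown in your Cloudflare dashboard using `sameNameservers`. Resolver caches may still return the old values for a while after the change.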

Step 6: Finalize the Transfer in Cloudflare

  1. In the Cloudflare dashboard, select the domain that is ready for transfer.
  2. Enter the Authorization Code you copied from AWS Route 53.
  3. Review the pricing details and add a payment method in Cloudflare to cover the transfer fee.
  4. Confirm and proceed with the transfer. Cloudflare will then initiate the transfer process.

Step 7: Approve the Transfer in AWS Route 53


AWS will send an email to the domain's registered email address to approve the transfer:

  1. Check your email and click the link provided to approve the domain transfer. This step is crucial to confirm that you consent to the transfer.
  2. Approve the transfer in the AWS console to expedite the process.

Step 8: Monitoring the Transfer

  • The transfer might take a few hours to complete. You can monitor the status in the Cloudflare dashboard.
  • Once the transfer is complete, the domain status in Cloudflare will change to Active.

Step 9: Post-Transfer Configurations

  1. Verify that the domain is properly set up in Cloudflare and that the DNS records are configured correctly.
  2. Adjust any additional settings in Cloudflare, such as automatic renewals and domain locks.

Conclusion

Congratulations! You've successfully transferred your domain from AWS Route 53 to Cloudflare. Remember, it's always a good idea to keep backups of your original settings and follow each step carefully to avoid any interruptions.
