LangSmith | Part 3/3 of Generative AI for JS Developers
Dive deep into LangSmith and its integration with JS/TS

Introduction
In the first two parts of this series, we explored LangChain (framework for building LLM-powered apps) and LangGraph (stateful reasoning and agent orchestration). Now, let’s dive into the final piece of the puzzle: LangSmith.
LangSmith is your observability and evaluation platform for LLM applications. It helps you debug, monitor, test, and improve chains, agents, and workflows built using LangChain (and beyond).
If LangChain is your code, and LangGraph is your control flow, then LangSmith is your debugging dashboard and analytics brain.
Why LangSmith?
When building with LLMs, things often break in non-obvious ways:
The model “hallucinates” incorrect answers.
Your prompt isn’t structured properly.
A single missing memory update derails an agent.
You don’t know why your pipeline is slow.
LangSmith solves this by giving you:
Tracing – Full visibility into every step of your chain/agent.
Dataset Management – Run systematic evaluations on test cases.
Feedback & Scoring – Collect user or automated feedback.
Monitoring – Track performance in production.
Setting Up LangSmith in a JavaScript Project
Package Installation
First, install the necessary packages:
npm install langchain @langchain/openai langsmith
Environment Variables
You’ll need to set up your LangSmith API key:
export LANGSMITH_API_KEY="your_api_key_here"
Or, in a .env file for Node.js projects:
LANGSMITH_API_KEY=your_api_key_here
LANGSMITH_TRACING=true
Setting LANGSMITH_TRACING=true enables tracing, so every run is logged to LangSmith automatically (older SDK releases use LANGCHAIN_TRACING_V2=true instead).
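It pays to fail fast when this configuration is incomplete. Here is a minimal sketch of a startup check that mirrors how the SDK reads these variables; the `tracingEnabled` helper is our own, not part of the langsmith package:

```javascript
// Decide whether LangSmith tracing is effectively active for a given
// environment. `tracingEnabled` is a hypothetical helper for illustration,
// not an SDK function.
function tracingEnabled(env) {
  const flag = (env.LANGSMITH_TRACING || "").toLowerCase() === "true";
  const hasKey = Boolean(env.LANGSMITH_API_KEY);
  if (flag && !hasKey) {
    // Tracing was requested but no API key is set: runs would never upload.
    console.warn("LANGSMITH_TRACING=true but LANGSMITH_API_KEY is missing");
  }
  return flag && hasKey;
}

console.log(tracingEnabled(process.env));
```

Calling this once at startup surfaces the classic "I enabled tracing but see no runs" misconfiguration immediately.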
Basic Example: Tracing a LangChain Call
Let’s start with a simple LLM chain and log it to LangSmith:
import { ChatOpenAI } from "@langchain/openai";
import { PromptTemplate } from "@langchain/core/prompts";

const model = new ChatOpenAI({
  temperature: 0.7,
  model: "gpt-4o-mini",
});

const prompt = PromptTemplate.fromTemplate(
  "Translate the following sentence to French: {text}"
);

// Compose prompt → model into a runnable chain
const chain = prompt.pipe(model);

const result = await chain.invoke({ text: "I love programming!" });
console.log(result.content); // e.g. "J'aime programmer !"
With LANGSMITH_TRACING=true, this run is automatically logged in LangSmith. You can view:
Prompt sent
Response received
Latency
Tokens used
Viewing Traces in LangSmith
After running the above, head to the LangSmith dashboard. You’ll see:
A tree view of your run (chain → LLM call).
Full input/output history.
Logs of intermediate steps.
This is invaluable for debugging nested chains and agents.
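Conceptually, a trace is just a tree of runs with timing attached. A toy model of that idea (the shapes here are ours for illustration, not LangSmith's actual schema) shows how a chain → LLM call tree rolls up into the latency figure you see in the dashboard:

```javascript
// Toy model of a trace tree: each run has a name, its own latency in ms,
// and child runs. This illustrates the tree view, not the real wire format.
const trace = {
  name: "chain",
  latencyMs: 20, // overhead spent in the chain itself
  children: [
    { name: "ChatOpenAI", latencyMs: 850, children: [] },
  ],
};

// Total latency of a run is its own time plus that of all descendants.
function totalLatency(run) {
  return (
    run.latencyMs +
    run.children.reduce((sum, child) => sum + totalLatency(child), 0)
  );
}

console.log(totalLatency(trace)); // 870
```

Walking the tree this way is exactly how you spot which nested step dominates a slow pipeline.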
Advanced Use Case: Tracing Agents with Tools
LangSmith shines when working with agents. Let’s trace an agent using tools:
import { initializeAgentExecutorWithOptions } from "langchain/agents";
import { SerpAPI } from "@langchain/community/tools/serpapi";
import { ChatOpenAI } from "@langchain/openai";

const model = new ChatOpenAI({ temperature: 0 });
const tools = [
  new SerpAPI(process.env.SERPAPI_API_KEY, {
    location: "San Francisco, California, United States",
    hl: "en",
    gl: "us",
  }),
];

const executor = await initializeAgentExecutorWithOptions(tools, model, {
  agentType: "openai-functions",
});

console.log("Agent loaded. Ask it something...");
const result = await executor.invoke({
  input: "What is the latest news about LangChain?",
});
console.log(result.output);
With LangSmith tracing enabled, you’ll see:
Agent reasoning steps (thoughts).
Tool calls (e.g., SerpAPI queries).
Final answer.
Timing breakdown.
This makes agent debugging so much easier.
Evaluating Models with Datasets
One of the most powerful features of LangSmith is evaluation. You can upload a dataset of inputs and expected outputs, then run chains/agents against it.
Step 1: Create a Dataset
In the LangSmith dashboard, create a dataset, e.g., “Translation Tests”.
Step 2: Run Against Dataset in JS
import { runOnDataset } from "langchain/smith";

const datasetName = "Translation Tests";

const results = await runOnDataset(chain, datasetName, {
  projectMetadata: { purpose: "French translation testing" },
});

console.log("Evaluation run finished:", results);
Step 3: Add Feedback
LangSmith lets you add manual or automated feedback:
import { Client } from "langsmith";

const client = new Client();

// runId identifies the traced run you want to score
await client.createFeedback(runId, "accuracy", {
  score: 0.9,
  comment: "Close, but missed nuance in translation",
});
This makes it possible to quantitatively track improvements as you tweak prompts and models.
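Once feedback accumulates, you can fetch it back (e.g. via the client's feedback listing) and aggregate it yourself to compare prompt versions. A sketch of just the aggregation step, over already-fetched records whose `{ key, score }` shape is an assumption for illustration:

```javascript
// Average feedback scores per key across runs, e.g. to track whether a
// prompt tweak moved "accuracy" up or down. The record shape is assumed.
function averageScores(feedback) {
  const totals = {};
  for (const { key, score } of feedback) {
    if (!totals[key]) totals[key] = { sum: 0, n: 0 };
    totals[key].sum += score;
    totals[key].n += 1;
  }
  return Object.fromEntries(
    Object.entries(totals).map(([key, { sum, n }]) => [key, sum / n])
  );
}

console.log(averageScores([
  { key: "accuracy", score: 1.0 },
  { key: "accuracy", score: 0.5 },
  { key: "fluency", score: 1.0 },
])); // { accuracy: 0.75, fluency: 1 }
```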
Monitoring in Production
LangSmith isn’t just for dev-time debugging. You can:
Log user interactions in production.
Track metrics (latency, tokens, costs).
Analyze failures by searching run history.
Example: logging custom metadata:
const result = await chain.invoke(
  { text: "Hello World" },
  {
    tags: ["prod", "user123"],
    metadata: { sessionId: "abc-123", feature: "translation" },
  }
);
Now, in LangSmith, you can filter all runs for user123.
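If you tag runs in several places, it is worth centralizing the convention so filters stay consistent. A small sketch of one way to do it; the helper and the `user:` tag convention are ours, not a LangSmith rule:

```javascript
// Build a consistent tags/metadata config object to pass to every traced
// call. The helper name and tag convention are illustrative assumptions.
function makeRunConfig({ env, userId, sessionId, feature }) {
  return {
    tags: [env, `user:${userId}`],
    metadata: { sessionId, feature },
  };
}

const config = makeRunConfig({
  env: "prod",
  userId: "user123",
  sessionId: "abc-123",
  feature: "translation",
});
console.log(config.tags); // ["prod", "user:user123"]
```

With every call site using the same builder, a dashboard filter like `user:user123` is guaranteed to match all of that user's runs.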
Automated Evaluation Example
LangSmith supports LLM-as-a-judge evaluation. For example, checking if translations are “fluent”:
import { runOnDataset } from "langchain/smith";

const evalConfig = {
  evaluators: [
    {
      evaluatorType: "criteria",
      criteria: {
        fluency: "Is the text fluent and grammatically correct?",
      },
    },
  ],
};

const evalResult = await runOnDataset(chain, "Translation Tests", {
  evaluationConfig: evalConfig,
});

console.log(evalResult);
This automatically scores runs using an LLM.
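Under the hood, a criteria-style judge prompts a model with the criterion and parses its verdict into a numeric score. Here is a sketch of just the parsing step; the function name and the "answer ends in Y or N" verdict format are illustrative assumptions, not the evaluator's actual implementation:

```javascript
// Turn a judge model's free-text verdict into a 0/1 score. Assumes the
// judge was instructed to end its answer with "Y" or "N"; that format
// is an assumption for this sketch.
function parseJudgeVerdict(text) {
  const last = text.trim().split(/\s+/).pop().toUpperCase();
  if (last === "Y" || last === "YES") return 1;
  if (last === "N" || last === "NO") return 0;
  return null; // unparseable verdict: surface it for manual review
}

console.log(parseJudgeVerdict("The translation reads naturally. Y")); // 1
```

Returning `null` rather than guessing on malformed verdicts keeps bad judge outputs from silently skewing your scores.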
Best Practices for JS Developers
Enable tracing early – Don’t wait until production bugs appear.
Use metadata/tags – Organize runs by environment, user, or feature.
Automate evaluations – Don’t rely only on manual testing.
Log tool usage – Especially for agents, since they often fail silently.
Monitor costs – LangSmith shows token/cost breakdowns.
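On the cost point, the arithmetic behind LangSmith's cost breakdowns can be reproduced locally from token counts. A sketch with made-up per-million-token rates; check your provider's current pricing, as the numbers below are placeholders:

```javascript
// Estimate the USD cost of a run from token counts and per-million-token
// rates. The rates used below are placeholders, not real pricing.
function estimateCostUSD(promptTokens, completionTokens, rates) {
  return (
    (promptTokens / 1_000_000) * rates.inputPerMillion +
    (completionTokens / 1_000_000) * rates.outputPerMillion
  );
}

const rates = { inputPerMillion: 0.15, outputPerMillion: 0.6 }; // placeholders
console.log(estimateCostUSD(1000, 500, rates).toFixed(6)); // "0.000450"
```

Summing this over traced runs (grouped by the tags from earlier) gives per-user or per-feature cost attribution.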
Conclusion
With LangSmith, your LangChain and LangGraph apps move from “black boxes” to transparent, measurable, improvable systems.
Use tracing to debug.
Use datasets and evaluations to improve.
Use monitoring to keep things reliable in production.
This completes our 3-part series:
LangChain – Building chains and agents.
LangGraph – Controlling agent workflows.
LangSmith – Debugging, monitoring, and improving.
Together, these tools form a full-stack framework for AI applications.




