ChatGoogleGenerativeAI
This will help you get started with ChatGoogleGenerativeAI chat models. For detailed documentation of all ChatGoogleGenerativeAI features and configurations, head to the API reference.
Overview
Integration details
Class | Package | Local | Serializable | PY support | Package downloads | Package latest |
---|---|---|---|---|---|---|
ChatGoogleGenerativeAI | @langchain/google-genai | ❌ | ✅ | ✅ | | |
Model features
Tool calling | Structured output | JSON mode | Image input | Audio input | Video input | Token-level streaming | Token usage | Logprobs |
---|---|---|---|---|---|---|---|---|
✅ | ✅ | ❌ | ✅ | ❌ | ❌ | ✅ | ✅ | ❌ |
Setup
You can access Google's gemini and gemini-vision models, as well as other Google generative models, in LangChain through the ChatGoogleGenerativeAI class in the @langchain/google-genai integration package.
You can also access Google's gemini family of models via the LangChain VertexAI and VertexAI-web integrations. See the VertexAI docs for details.
Credentials
Get an API key here: ai.google.dev/tutorials/setup
Then set the GOOGLE_API_KEY
environment variable:
export GOOGLE_API_KEY="your-api-key"
If you want automated tracing of your model calls, you can also set your LangSmith API key by uncommenting the lines below:
# export LANGCHAIN_TRACING_V2="true"
# export LANGCHAIN_API_KEY="your-api-key"
Installation
The LangChain ChatGoogleGenerativeAI integration lives in the
@langchain/google-genai
package:
- npm
- yarn
- pnpm
npm i @langchain/google-genai
yarn add @langchain/google-genai
pnpm add @langchain/google-genai
Instantiation
Now we can instantiate our model object and generate chat completions:
import { ChatGoogleGenerativeAI } from "@langchain/google-genai";
const llm = new ChatGoogleGenerativeAI({
model: "gemini-1.5-pro",
temperature: 0,
maxRetries: 2,
// other params...
});
Invocation
const aiMsg = await llm.invoke([
[
"system",
"You are a helpful assistant that translates English to French. Translate the user sentence.",
],
["human", "I love programming."],
]);
aiMsg;
AIMessage {
"content": "J'adore programmer. \n",
"additional_kwargs": {
"finishReason": "STOP",
"index": 0,
"safetyRatings": [
{
"category": "HARM_CATEGORY_SEXUALLY_EXPLICIT",
"probability": "NEGLIGIBLE"
},
{
"category": "HARM_CATEGORY_HATE_SPEECH",
"probability": "NEGLIGIBLE"
},
{
"category": "HARM_CATEGORY_HARASSMENT",
"probability": "NEGLIGIBLE"
},
{
"category": "HARM_CATEGORY_DANGEROUS_CONTENT",
"probability": "NEGLIGIBLE"
}
]
},
"response_metadata": {
"finishReason": "STOP",
"index": 0,
"safetyRatings": [
{
"category": "HARM_CATEGORY_SEXUALLY_EXPLICIT",
"probability": "NEGLIGIBLE"
},
{
"category": "HARM_CATEGORY_HATE_SPEECH",
"probability": "NEGLIGIBLE"
},
{
"category": "HARM_CATEGORY_HARASSMENT",
"probability": "NEGLIGIBLE"
},
{
"category": "HARM_CATEGORY_DANGEROUS_CONTENT",
"probability": "NEGLIGIBLE"
}
]
},
"tool_calls": [],
"invalid_tool_calls": [],
"usage_metadata": {
"input_tokens": 21,
"output_tokens": 5,
"total_tokens": 26
}
}
console.log(aiMsg.content);
J'adore programmer.
Chaining
We can chain our model with a prompt template like so:
import { ChatPromptTemplate } from "@langchain/core/prompts";
const prompt = ChatPromptTemplate.fromMessages([
[
"system",
"You are a helpful assistant that translates {input_language} to {output_language}.",
],
["human", "{input}"],
]);
const chain = prompt.pipe(llm);
await chain.invoke({
input_language: "English",
output_language: "German",
input: "I love programming.",
});
AIMessage {
"content": "Ich liebe das Programmieren. \n",
"additional_kwargs": {
"finishReason": "STOP",
"index": 0,
"safetyRatings": [
{
"category": "HARM_CATEGORY_SEXUALLY_EXPLICIT",
"probability": "NEGLIGIBLE"
},
{
"category": "HARM_CATEGORY_HATE_SPEECH",
"probability": "NEGLIGIBLE"
},
{
"category": "HARM_CATEGORY_HARASSMENT",
"probability": "NEGLIGIBLE"
},
{
"category": "HARM_CATEGORY_DANGEROUS_CONTENT",
"probability": "NEGLIGIBLE"
}
]
},
"response_metadata": {
"finishReason": "STOP",
"index": 0,
"safetyRatings": [
{
"category": "HARM_CATEGORY_SEXUALLY_EXPLICIT",
"probability": "NEGLIGIBLE"
},
{
"category": "HARM_CATEGORY_HATE_SPEECH",
"probability": "NEGLIGIBLE"
},
{
"category": "HARM_CATEGORY_HARASSMENT",
"probability": "NEGLIGIBLE"
},
{
"category": "HARM_CATEGORY_DANGEROUS_CONTENT",
"probability": "NEGLIGIBLE"
}
]
},
"tool_calls": [],
"invalid_tool_calls": [],
"usage_metadata": {
"input_tokens": 16,
"output_tokens": 7,
"total_tokens": 23
}
}
Tool calling
The Google GenerativeAI API does not allow tool schemas to contain an object with unknown properties.
For example, the following Zod schemas will throw an error:
const invalidSchema = z.object({ properties: z.record(z.unknown()) });
and
const invalidSchema2 = z.record(z.unknown());
Instead, you should explicitly define the properties of the object field.
import { tool } from "@langchain/core/tools";
import { ChatGoogleGenerativeAI } from "@langchain/google-genai";
import { z } from "zod";
// Define your tool
const fakeBrowserTool = tool(
(_) => {
return "The search result is xyz...";
},
{
name: "browser_tool",
description:
"Useful for when you need to find something on the web or summarize a webpage.",
schema: z.object({
url: z.string().describe("The URL of the webpage to search."),
query: z.string().optional().describe("An optional search query to use."),
}),
}
);
const llmWithTool = new ChatGoogleGenerativeAI({
model: "gemini-pro",
}).bindTools([fakeBrowserTool]); // Bind your tools to the model
const toolRes = await llmWithTool.invoke([
[
"human",
"Search the web and tell me what the weather will be like tonight in new york. use a popular weather website",
],
]);
console.log(toolRes.tool_calls);
[
{
name: 'browser_tool',
args: {
url: 'https://www.weather.com',
query: 'weather tonight in new york'
},
type: 'tool_call'
}
]
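Under the hood, the bound Zod schema is converted into an OpenAPI-style function declaration before it is sent to Gemini, which is why every object field must have explicitly listed properties. The declaration below is an illustrative sketch of that shape only; the integration performs the actual conversion for you:

```typescript
// Illustrative sketch of the OpenAPI-style function declaration derived
// from the Zod schema above. Gemini rejects declarations whose object
// fields lack an explicit "properties" map (e.g. z.record(z.unknown())).
type FunctionDeclaration = {
  name: string;
  description: string;
  parameters: {
    type: "object";
    properties: Record<string, { type: string; description?: string }>;
    required: string[];
  };
};

const browserToolDeclaration: FunctionDeclaration = {
  name: "browser_tool",
  description:
    "Useful for when you need to find something on the web or summarize a webpage.",
  parameters: {
    type: "object",
    properties: {
      url: { type: "string", description: "The URL of the webpage to search." },
      query: { type: "string", description: "An optional search query to use." },
    },
    // "query" was declared .optional() in Zod, so it drops out of "required".
    required: ["url"],
  },
};

console.log(Object.keys(browserToolDeclaration.parameters.properties));
```

Optional Zod fields simply fall out of the required array; it is object fields that compile to no properties at all that trigger the error described above.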
.withStructuredOutput
You can also have the model return output matching a schema via the withStructuredOutput method:
import { ChatGoogleGenerativeAI } from "@langchain/google-genai";
import { z } from "zod";
// Define your model
const llmForWSO = new ChatGoogleGenerativeAI({
model: "gemini-pro",
});
const browserSchema = z.object({
url: z.string().describe("The URL of the webpage to search."),
query: z.string().optional().describe("An optional search query to use."),
});
const llmWithStructuredOutput = llmForWSO.withStructuredOutput(browserSchema, {
name: "browser_tool",
});
const structuredOutputRes = await llmWithStructuredOutput.invoke([
[
"human",
"Search the web and tell me what the weather will be like tonight in new york. use a popular weather website",
],
]);
console.log(structuredOutputRes);
{
url: 'https://www.accuweather.com/en/us/new-york-ny/10007/current-weather/349333',
query: 'weather tonight'
}
Multimodal support
To provide an image, pass a human message with a content field set to an array of content objects. Each content object contains either an image value (type of image_url) or a text value (type of text). The value of image_url must be a base64 encoded image (e.g., data:image/png;base64,abcd124):
import fs from "fs";
import { ChatGoogleGenerativeAI } from "@langchain/google-genai";
import { ChatPromptTemplate } from "@langchain/core/prompts";
// Multi-modal
const llmWithVisionModel = new ChatGoogleGenerativeAI({
model: "gemini-1.5-flash",
maxOutputTokens: 2048,
maxRetries: 1,
});
const image = fs
.readFileSync("../../../../../examples/hotdog.jpg")
.toString("base64");
const visionPrompt = ChatPromptTemplate.fromMessages([
[
"human",
[
{
type: "text",
text: "Describe the following image.",
},
{
type: "image_url",
image_url: "data:image/png;base64,{image}",
},
],
],
]);
const visionRes = await visionPrompt.pipe(llmWithVisionModel).invoke({
image,
});
console.log(visionRes);
AIMessage {
"content": "The image shows a hot dog in a bun, isolated against a white background. The hot dog is grilled and has a slightly crispy texture. The bun is soft and fluffy, and it appears to be lightly toasted. The hot dog is positioned horizontally, with the bun covering most of the sausage. The image captures the classic American snack food, highlighting its simplicity and appeal.",
"additional_kwargs": {
"finishReason": "STOP",
"index": 0,
"safetyRatings": [
{
"category": "HARM_CATEGORY_SEXUALLY_EXPLICIT",
"probability": "NEGLIGIBLE"
},
{
"category": "HARM_CATEGORY_HATE_SPEECH",
"probability": "NEGLIGIBLE"
},
{
"category": "HARM_CATEGORY_HARASSMENT",
"probability": "NEGLIGIBLE"
},
{
"category": "HARM_CATEGORY_DANGEROUS_CONTENT",
"probability": "NEGLIGIBLE"
}
]
},
"response_metadata": {
"finishReason": "STOP",
"index": 0,
"safetyRatings": [
{
"category": "HARM_CATEGORY_SEXUALLY_EXPLICIT",
"probability": "NEGLIGIBLE"
},
{
"category": "HARM_CATEGORY_HATE_SPEECH",
"probability": "NEGLIGIBLE"
},
{
"category": "HARM_CATEGORY_HARASSMENT",
"probability": "NEGLIGIBLE"
},
{
"category": "HARM_CATEGORY_DANGEROUS_CONTENT",
"probability": "NEGLIGIBLE"
}
]
},
"tool_calls": [],
"invalid_tool_calls": [],
"usage_metadata": {
"input_tokens": 264,
"output_tokens": 74,
"total_tokens": 338
}
}
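Rather than templating the data URL as above, you can also build it directly from the raw bytes. A minimal sketch using Node's Buffer, where a four-byte PNG header stands in for a real image file:

```typescript
// Build an inline data URL of the form "data:image/png;base64,..."
// that can be passed as the image_url value of a content object.
const toDataUrl = (bytes: Buffer, mimeType: string): string =>
  `data:${mimeType};base64,${bytes.toString("base64")}`;

// In-memory stand-in for real image bytes; in practice you would read a
// file, e.g. fs.readFileSync("./hotdog.jpg").
const fakePng = Buffer.from([0x89, 0x50, 0x4e, 0x47]);

console.log(toDataUrl(fakePng, "image/png"));
// → data:image/png;base64,iVBORw==
```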
Gemini Prompting FAQs
As of the time this doc was written (2023/12/12), Gemini has some restrictions on the types and structure of prompts it accepts. Specifically:
- When providing multimodal (image) inputs, you are restricted to at most 1 message of "human" (user) type. You cannot pass multiple messages (though the single human message may have multiple content entries).
- System messages are not natively supported, and will be merged with the first human message if present.
- For regular chat conversations, messages must follow the human/ai/human/ai alternating pattern. You may not provide 2 AI or human messages in sequence.
- Messages may be blocked if they violate the safety checks of the LLM. In this case, the model will return an empty response.
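If an existing message history violates these constraints, one option is to normalize it before invoking the model. The helper below is an illustrative sketch only (the integration already merges system messages for you): it folds system content into the next human message and collapses consecutive same-role messages so the history alternates.

```typescript
type Role = "system" | "human" | "ai";
type Message = { role: Role; content: string };

// Fold system messages into the following human message and merge
// consecutive same-role messages so the result alternates human/ai.
function normalizeForGemini(messages: Message[]): Message[] {
  const out: Message[] = [];
  let systemPrefix = "";
  for (const msg of messages) {
    if (msg.role === "system") {
      systemPrefix += msg.content + "\n";
      continue;
    }
    let content = msg.content;
    if (msg.role === "human" && systemPrefix) {
      content = systemPrefix + content;
      systemPrefix = "";
    }
    const last = out[out.length - 1];
    if (last !== undefined && last.role === msg.role) {
      last.content += "\n" + content; // collapse a same-role run
    } else {
      out.push({ role: msg.role, content });
    }
  }
  return out;
}

console.log(
  normalizeForGemini([
    { role: "system", content: "You translate English to French." },
    { role: "human", content: "I love programming." },
  ])
);
```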
API reference
For detailed documentation of all ChatGoogleGenerativeAI features and configurations head to the API reference: https://api.js.langchain.com/classes/langchain_google_genai.ChatGoogleGenerativeAI.html