---
name: "google_gemini"
description: "This skill should be used when the user wants to generate or edit images using Nanobanana 2 via Google's Gemini models. Use this skill for requests involving image creation, photo generation, graphic design, text rendering in images, image editing, style transformations, or aspect ratio control. Trigger this skill for requests like \"generate an image of\", \"create a poster with text\", \"edit this image to\", \"convert to oil painting style\", or \"make an image in 16:9 aspect ratio\"."
requiredApps: [gemini]
---
# Google Gemini (Nanobanana 2)
**Overview**
This skill provides instructions for generating and editing images using Nanobanana 2 via Google's Gemini models. The `gemini-2.5-flash-image` model provides native 4K resolution support, precise text rendering, region-based editing, and fast generation times. The base URL is `https://generativelanguage.googleapis.com/v1beta` with API key authentication via the `gemini` connection, passing the key in the `x-goog-api-key` header as `PLACEHOLDER_TOKEN`.
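As a minimal sketch of how the base URL, endpoint path, and `x-goog-api-key` header fit together, the helper below assembles the arguments for `fetch`. The name `buildGenerateRequest` and its parameters are hypothetical and are not part of any SDK.

```javascript
// Sketch only: buildGenerateRequest is a hypothetical helper, not an SDK function.
const BASE_URL = 'https://generativelanguage.googleapis.com/v1beta';

function buildGenerateRequest(apiKey, prompt, model = 'gemini-2.5-flash-image') {
  return {
    url: `${BASE_URL}/models/${model}:generateContent`,
    options: {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        'x-goog-api-key': apiKey, // key supplied by the gemini connection
      },
      body: JSON.stringify({
        contents: [{ role: 'user', parts: [{ text: prompt }] }],
      }),
    },
  };
}
```

A call such as `buildGenerateRequest('PLACEHOLDER_TOKEN', 'Generate an image: red bicycle')` returns a `url` and `options` pair that can be passed straight to `fetch(url, options)`.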
**Restrictions**
- Free tier allows 15 requests per minute with standard resolution
- Paid tier provides higher rate limits, 4K resolution access, and priority processing
- Supported image formats for editing: PNG, JPEG, WebP
- Handle rate limit errors (429) with exponential backoff
- Images return as base64-encoded data that must be decoded and saved
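One way to stay under the free-tier limit is to space requests on the client before the API has to answer with a 429. The sketch below is an assumption about how that could be done: `createThrottle` and `reserve` are hypothetical names, and the 60-second sliding window is a simplification of whatever accounting the service actually uses.

```javascript
// Hypothetical client-side throttle: allows at most `maxPerMinute`
// request starts within any 60-second window.
function createThrottle(maxPerMinute = 15) {
  const windowMs = 60_000;
  const starts = []; // scheduled start times (ms), in ascending order

  // Returns how many ms the caller should wait before sending a request
  // at time `now`, and records the scheduled start time.
  return function reserve(now = Date.now()) {
    // Drop entries that have already left the window.
    while (starts.length > 0 && starts[0] <= now - windowMs) starts.shift();
    // The new request must start one window after the request that is
    // `maxPerMinute` places before it.
    const index = starts.length - maxPerMinute;
    const earliest = index >= 0 ? starts[index] + windowMs : now;
    const wait = Math.max(0, earliest - now);
    starts.push(now + wait);
    return wait;
  };
}
```

Before each request, call `reserve()` and sleep for the returned number of milliseconds. This reduces 429 responses but does not replace backoff handling.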
**Operations**
### Generate Image
Generate images using `POST /models/gemini-2.5-flash-image:generateContent`. Pass a `contents` array whose single entry has `role: 'user'` and a `parts` array containing the text prompt. Always prefix prompts with "Generate an image: [your prompt]" for best results.
**Basic parameters:**
- `temperature`: Controls creativity (0.0-2.0 scale)
  - Low (0.4-0.6): Consistent, photorealistic results
  - Medium (0.7-0.8): Balanced creativity and consistency
  - High (0.9-1.5): Creative, artistic outputs
- `maxOutputTokens`: Upper limit on the number of tokens in the response (the examples here use 2048)
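The temperature bands above can be captured as named presets. This is a sketch: `generationConfigFor` is a hypothetical helper, and the specific value picked inside each band (0.5, 0.7, 1.2) is an assumption rather than a recommendation from the API.

```javascript
// Preset values are assumptions chosen from inside the documented bands.
const TEMPERATURE_PRESETS = new Map([
  ['low', 0.5],    // 0.4-0.6: consistent, photorealistic
  ['medium', 0.7], // 0.7-0.8: balanced
  ['high', 1.2],   // 0.9-1.5: creative, artistic
]);

function generationConfigFor(level = 'medium', maxOutputTokens = 2048) {
  if (!TEMPERATURE_PRESETS.has(level)) {
    throw new Error(`Unknown creativity level: ${level}`);
  }
  return { temperature: TEMPERATURE_PRESETS.get(level), maxOutputTokens };
}
```

The returned object can be used directly as the `generationConfig` field of the request body.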
### Aspect Ratio Control
Control image proportions by stating the aspect ratio at the beginning of the prompt, before the main description.
**Supported ratios:**
- `1:1` - Square format
- `16:9` - Wide landscape
- `9:16` - Vertical portrait
- `4:3` - Standard format
**Format:** "Generate an image in [ratio] aspect ratio: [description]"
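The format above could be applied with a small helper that also rejects ratios outside the supported list. `aspectRatioPrompt` is a hypothetical name used for illustration.

```javascript
const SUPPORTED_RATIOS = ['1:1', '16:9', '9:16', '4:3'];

// Builds a prompt with the aspect ratio placed before the description.
function aspectRatioPrompt(ratio, description) {
  if (!SUPPORTED_RATIOS.includes(ratio)) {
    throw new Error(`Unsupported aspect ratio: ${ratio}`);
  }
  return `Generate an image in ${ratio} aspect ratio: ${description}`;
}
```

Validating the ratio up front keeps a typo such as `19:6` from silently producing an image with default proportions.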
### Text Rendering
Nanobanana 2 supports precise text rendering in images. Include exact text in quotes within your prompt, specify typography style, mention text placement, and include design context.
**Example prompts:**
- "business card with text 'John Doe', 'CEO', 'john@example.com', clean design, white background"
- "modern poster with text 'Innovation Summit 2025', bold typography, minimalist design, blue gradient background"
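A prompt of this shape can be assembled from its pieces so that every text string is quoted consistently. The helper below is a sketch; `textRenderingPrompt` and its option names are hypothetical.

```javascript
// Assembles: context, quoted text strings, typography, placement, design details.
function textRenderingPrompt({ context, texts, typography, placement, design = [] }) {
  const quoted = texts.map((t) => `"${t}"`).join(', ');
  const parts = [`${context} with text ${quoted}`];
  if (typography) parts.push(typography);
  if (placement) parts.push(`text placed ${placement}`);
  parts.push(...design);
  return `Generate an image: ${parts.join(', ')}`;
}
```

For example, passing `context: 'modern poster'`, `texts: ['Innovation Summit 2025']`, `typography: 'bold typography'`, and two `design` entries reproduces the poster prompt above with the "Generate an image:" prefix added.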
### Image Editing
Edit existing images using the same endpoint with multimodal input. In the `parts` array, include the input image as `inlineData` (base64-encoded `data` plus its `mimeType`) followed by the text instructions. Prefix instructions with "Edit this image: [edit instructions]".
**Editing capabilities:**
- Color changes and adjustments
- Style conversions (oil painting, watercolor, sketch)
- Text additions and overlays
- Region-based modifications
- Brightness, contrast, saturation adjustments
**Temperature for editing:**
- 0.5-0.6: Precise edits (color corrections, text additions)
- 0.7-0.8: Balanced edits (style transformations, lighting changes)
- 0.9-1.2: Creative edits (artistic transformations, dramatic changes)
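The editing request described in this section, an image part plus a prefixed instruction with a temperature chosen by edit type, might be built as follows. `buildEditBody` and the three edit kinds are hypothetical, and the exact temperature chosen within each band is an assumption.

```javascript
// One representative value per documented band (assumed, not prescribed).
const EDIT_TEMPERATURES = new Map([
  ['precise', 0.5],  // 0.5-0.6
  ['balanced', 0.7], // 0.7-0.8
  ['creative', 1.0], // 0.9-1.2
]);

function buildEditBody(base64Image, mimeType, instruction, kind = 'balanced') {
  if (!EDIT_TEMPERATURES.has(kind)) {
    throw new Error(`Unknown edit kind: ${kind}`);
  }
  return {
    contents: [{
      role: 'user',
      parts: [
        { inlineData: { mimeType, data: base64Image } },
        { text: `Edit this image: ${instruction}` },
      ],
    }],
    generationConfig: {
      temperature: EDIT_TEMPERATURES.get(kind),
      maxOutputTokens: 2048,
    },
  };
}
```

The result is meant to be serialized with `JSON.stringify` and sent as the request body.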
**Workflows**
### Basic Image Generation
Generate images with structured prompts for best results:
```javascript
import fs from 'node:fs';

const response = await fetch(
  'https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash-image:generateContent',
  {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'x-goog-api-key': 'PLACEHOLDER_TOKEN'
    },
    body: JSON.stringify({
      contents: [{
        role: 'user',
        parts: [{
          text: 'Generate an image: mountain landscape, photorealistic, golden hour lighting, wide angle, 4K resolution'
        }]
      }],
      generationConfig: {
        temperature: 0.7,
        maxOutputTokens: 2048
      }
    })
  }
);
// Surface HTTP errors (400, 401, 429, 500) before parsing the body
if (!response.ok) {
  throw new Error(`Request failed with status ${response.status}`);
}

const data = await response.json();
const imageData = data.candidates[0].content.parts[0].inlineData.data;
const mimeType = data.candidates[0].content.parts[0].inlineData.mimeType;

// Decode the base64 payload and save it; change the extension if mimeType is not image/png
const imageBuffer = Buffer.from(imageData, 'base64');
fs.writeFileSync('output.png', imageBuffer);
```
**Critical steps:**
1. Structure prompt as "Generate an image: [subject], [style], [lighting], [composition], [quality]"
2. Set temperature based on desired creativity level (0.7 typical for balanced results)
3. Extract image data from `candidates[0].content.parts[0].inlineData.data`
4. Decode base64 data and save with proper file extension
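Steps 3 and 4 can be made more defensive. The sketch below assumes the image part is not guaranteed to be the first entry in `parts`, so it searches for the first part that carries `inlineData`, and it derives the file extension from the returned MIME type. `extractImage` and the extension table are hypothetical.

```javascript
const EXTENSIONS = new Map([
  ['image/png', 'png'],
  ['image/jpeg', 'jpg'],
  ['image/webp', 'webp'],
]);

// Finds the first part with inline image data and decodes it.
function extractImage(data) {
  const parts = data?.candidates?.[0]?.content?.parts ?? [];
  const part = parts.find((p) => p.inlineData?.data);
  if (!part) {
    throw new Error('Response contains no inline image data');
  }
  const { mimeType, data: base64 } = part.inlineData;
  return {
    buffer: Buffer.from(base64, 'base64'),
    mimeType,
    extension: EXTENSIONS.get(mimeType) ?? 'bin', // fallback for unknown types
  };
}
```

With the result in hand, the file can be written as `fs.writeFileSync(`output.${extension}`, buffer)` so the extension matches the MIME type.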
### Generate Image with Aspect Ratio
Control image dimensions by specifying aspect ratio:
```javascript
const response = await fetch(
  'https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash-image:generateContent',
  {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'x-goog-api-key': 'PLACEHOLDER_TOKEN'
    },
    body: JSON.stringify({
      contents: [{
        role: 'user',
        parts: [{
          text: 'Generate an image in 16:9 aspect ratio: sunset over ocean, photorealistic, golden hour lighting, wide angle, 4K resolution'
        }]
      }],
      generationConfig: {
        temperature: 0.7,
        maxOutputTokens: 2048
      }
    })
  }
);

const data = await response.json();
const imageData = data.candidates[0].content.parts[0].inlineData.data;
```
**Critical steps:**
1. Place aspect ratio specification at the beginning: "Generate an image in [ratio] aspect ratio: [description]"
2. Supported ratios: 1:1 (square), 16:9 (wide), 9:16 (vertical), 4:3 (standard)
3. Maintain structured prompt format after aspect ratio specification
### Generate Image with Text
Create images with precise text rendering:
```javascript
const response = await fetch(
  'https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash-image:generateContent',
  {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'x-goog-api-key': 'PLACEHOLDER_TOKEN'
    },
    body: JSON.stringify({
      contents: [{
        role: 'user',
        parts: [{
          text: 'Generate an image: modern poster with text "Innovation Summit 2025", bold typography, minimalist design, blue gradient background, 4K resolution'
        }]
      }],
      generationConfig: {
        temperature: 0.6,
        maxOutputTokens: 2048
      }
    })
  }
);
```
**Critical steps:**
1. Put exact text in quotes within the prompt
2. Specify typography style (bold, modern, elegant, etc.)
3. Mention text placement if needed (top, center, top-right corner, etc.)
4. Include design context (poster, business card, banner, etc.)
5. Use slightly lower temperature (0.6) for more consistent text rendering
### Edit Existing Image
Modify existing images with region-based or general edits:
```javascript
import fs from 'node:fs';

// Read and encode the input image
const imageBuffer = fs.readFileSync('input.jpg');
const base64Image = imageBuffer.toString('base64');

const response = await fetch(
  'https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash-image:generateContent',
  {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'x-goog-api-key': 'PLACEHOLDER_TOKEN'
    },
    body: JSON.stringify({
      contents: [{
        role: 'user',
        parts: [
          {
            // The mimeType must match the input file format
            inlineData: {
              mimeType: 'image/jpeg',
              data: base64Image
            }
          },
          {
            text: 'Edit this image: change the sky area to sunset colors with pink and orange hues'
          }
        ]
      }],
      generationConfig: {
        temperature: 0.7,
        maxOutputTokens: 2048
      }
    })
  }
);
// Surface HTTP errors (400, 401, 429, 500) before parsing the body
if (!response.ok) {
  throw new Error(`Request failed with status ${response.status}`);
}

const data = await response.json();
const editedImageData = data.candidates[0].content.parts[0].inlineData.data;

// Decode the edited image and write it to disk
const editedBuffer = Buffer.from(editedImageData, 'base64');
fs.writeFileSync('output_edited.png', editedBuffer);
```
**Critical steps:**
1. Read input image file and convert to base64
2. Include both image data and edit instructions in parts array
3. Prefix edit instructions with "Edit this image: [instructions]"
4. Specify regions clearly (sky, background, foreground, left side, top corner)
5. Use temperature 0.7-0.8 for balanced edits
6. Supported formats: PNG, JPEG, WebP
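Because the `mimeType` sent with the image has to match the file, it can help to derive it from the file name and reject unsupported formats before making a request. This is a sketch; `inputMimeType` is a hypothetical helper that trusts the file extension and does not inspect the file contents.

```javascript
const INPUT_MIME_TYPES = new Map([
  ['png', 'image/png'],
  ['jpg', 'image/jpeg'],
  ['jpeg', 'image/jpeg'],
  ['webp', 'image/webp'],
]);

// Maps a file path to one of the supported MIME types, or throws.
function inputMimeType(filePath) {
  const extension = filePath.split('.').pop().toLowerCase();
  const mimeType = INPUT_MIME_TYPES.get(extension);
  if (!mimeType) {
    throw new Error(`Unsupported input format: ${filePath}`);
  }
  return mimeType;
}
```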
### Style Transformation
Apply artistic styles or effects to existing images:
```javascript
import fs from 'node:fs';

// Read and encode the photo to be restyled
const imageBuffer = fs.readFileSync('photo.jpg');
const base64Image = imageBuffer.toString('base64');

const response = await fetch(
  'https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash-image:generateContent',
  {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'x-goog-api-key': 'PLACEHOLDER_TOKEN'
    },
    body: JSON.stringify({
      contents: [{
        role: 'user',
        parts: [
          {
            inlineData: {
              mimeType: 'image/jpeg',
              data: base64Image
            }
          },
          {
            text: 'Edit this image: convert to oil painting style with visible brushstrokes'
          }
        ]
      }],
      generationConfig: {
        temperature: 0.8,
        maxOutputTokens: 2048
      }
    })
  }
);
```
**Common style transformations:**
- Artistic styles: "oil painting", "watercolor", "sketch", "comic book", "anime"
- Photographic effects: "vintage", "HDR", "long exposure", "tilt-shift", "black and white"
- Filters: "sepia", "high contrast", "soft focus", "vignette"
**Critical steps:**
1. Specify style transformation clearly in edit instruction
2. Include style details (visible brushstrokes, bold outlines, etc.)
3. Use temperature 0.8-1.0 for creative style transformations
4. Iterate if needed - make multiple edits for best results
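Iterating, as step 4 suggests, amounts to feeding each edited image back in as the input for the next instruction. The sketch below assumes a caller-supplied async function `editOnce(base64Image, instruction)` that performs one edit request and resolves to the edited image as base64. Both `applyEdits` and `editOnce` are hypothetical names.

```javascript
// Applies instructions one at a time, chaining each result into the next edit.
async function applyEdits(base64Image, instructions, editOnce) {
  let current = base64Image;
  for (const instruction of instructions) {
    current = await editOnce(current, instruction);
  }
  return current;
}
```

Running the edits sequentially rather than in parallel is deliberate: each instruction operates on the output of the previous one, and the calls stay spaced out under the rate limit.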
### Prompt Patterns for Best Results
Structure prompts using these patterns:
**Photorealistic Pattern:**
```javascript
text: 'Generate an image: portrait of a person, photorealistic, natural lighting, sharp focus, 8K resolution, professional photography'
```
**Artistic Style Pattern:**
```javascript
text: 'Generate an image: cityscape at night, oil painting style, impressionist, vibrant colors, textured brushstrokes'
```
**Text Integration Pattern:**
```javascript
text: 'Generate an image: business card with text "Jane Smith", "Designer", "jane@example.com", elegant typography, minimalist design, cream background'
```
**Composition Control:**
- Layout: center composition, rule of thirds, symmetrical
- Distance: close-up, medium shot, wide angle
- Orientation: portrait, landscape
- Angle: top-down, bird's eye view, low angle
**Mood and Atmosphere:**
- Emotional tone: warm, cool, dramatic, serene, energetic
- Brightness: dark, bright, moody, cheerful
- Atmosphere: mysterious, inviting, tense, peaceful
**Color Control:**
- Named colors: red, blue, emerald, crimson
- Palette: pastel colors, vibrant hues, muted tones
- Relationships: complementary colors, analogous palette
- Gradients: blue to purple gradient, sunset colors
**Lighting:**
- Type: natural light, studio lighting, golden hour, blue hour
- Quality: soft light, hard shadows, dramatic lighting, flat lighting
- Direction: backlit, side-lit, front-lit, rim lighting
- Source: ambient light, directional, diffused, spotlight
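The descriptor categories above can be combined into one prompt string in a fixed order. `buildImagePrompt` and its field names are hypothetical, and the ordering (subject, style, lighting, composition, mood, color, quality) is one reasonable choice rather than a documented requirement.

```javascript
// Joins the provided descriptors in a fixed order, skipping any that are missing.
function buildImagePrompt({ subject, style, lighting, composition, mood, color, quality }) {
  if (!subject) {
    throw new Error('subject is required');
  }
  const descriptors = [subject, style, lighting, composition, mood, color, quality]
    .filter(Boolean);
  return `Generate an image: ${descriptors.join(', ')}`;
}
```

Only `subject` is mandatory, so the same helper covers short prompts and fully specified ones.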
### Error Handling
**Error codes:**
- `400` - Invalid prompt or parameters
- `401` - Invalid API key
- `429` - Rate limit exceeded (implement exponential backoff)
- `500` - Generation failure (try different prompt/parameters)
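Exponential backoff for these status codes could look like the sketch below. It assumes that 429 and 5xx responses are worth retrying while 400 and 401 are not, uses a 1-second base delay capped at 30 seconds, and omits jitter for brevity. `backoffDelayMs`, `isRetryable`, and `withRetry` are hypothetical names, and `sendRequest` is a caller-supplied function that returns a `fetch`-style response.

```javascript
// Delay doubles with each attempt: 1s, 2s, 4s, ... capped at maxMs.
function backoffDelayMs(attempt, baseMs = 1000, maxMs = 30000) {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

// Assumption: rate limits and server-side failures are transient.
function isRetryable(status) {
  return status === 429 || status >= 500;
}

const defaultSleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Calls sendRequest until it succeeds, fails permanently, or attempts run out.
async function withRetry(sendRequest, maxAttempts = 4, sleep = defaultSleep) {
  for (let attempt = 0; ; attempt++) {
    const response = await sendRequest();
    const lastAttempt = attempt + 1 >= maxAttempts;
    if (response.ok || !isRetryable(response.status) || lastAttempt) {
      return response;
    }
    await sleep(backoffDelayMs(attempt));
  }
}
```

A call such as `withRetry(() => fetch(url, options))` returns the final response, so the caller still needs to check `response.ok` afterwards.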
**Best practices:**
1. Be specific and detailed in prompts
2. Specify resolution requirements explicitly
3. Use temperature 0.6-0.9 for balanced results
4. Handle base64 data properly when decoding
5. Implement retry logic for rate limits and failures
6. Save images with proper file extensions matching MIME type
7. For editing, provide clear and specific instructions
8. Iterate edits for best results rather than single complex edits
9. Place aspect ratio specifications at the beginning of prompts
10. Use structured prompt format: Subject + Style + Details