---
name: "google_gemini"
description: "This skill should be used when the user wants to generate or edit images using Nanobanana 2 via Google's Gemini models. Use this skill for requests involving image creation, photo generation, graphic design, text rendering in images, image editing, style transformations, or aspect ratio control. Trigger this skill for requests like \"generate an image of\", \"create a poster with text\", \"edit this image to\", \"convert to oil painting style\", or \"make an image in 16:9 aspect ratio\"."
requiredApps: [gemini]
---

# Google Gemini (Nanobanana 2)

**Overview**

This skill provides instructions for generating and editing images using Nanobanana 2 via Google's Gemini models. The `gemini-2.5-flash-image` model provides native 4K resolution support, precise text rendering, region-based editing, and fast generation times. The base URL is `https://generativelanguage.googleapis.com/v1beta` with API key authentication via the `gemini` connection, passing the key in the `x-goog-api-key` header as `PLACEHOLDER_TOKEN`.

**Restrictions**

- Free tier allows 15 requests per minute with standard resolution
- Paid tier provides higher rate limits, 4K resolution access, and priority processing
- Supported image formats for editing: PNG, JPEG, WebP
- Handle rate limit errors (429) with exponential backoff
- Images return as base64-encoded data that must be decoded and saved

**Operations**

### Generate Image

Generate images using `POST /models/gemini-2.5-flash-image:generateContent`. Pass a `contents` array with a `user` role and `parts` containing the text prompt. Always prefix prompts with "Generate an image: [your prompt]" for best results.

**Basic parameters:**
- `temperature`: Controls creativity (0.0-2.0 scale)
  - Low (0.4-0.6): Consistent, photorealistic results
  - Medium (0.7-0.8): Balanced creativity and consistency
  - High (0.9-1.5): Creative, artistic outputs
- `maxOutputTokens`: Maximum number of tokens in the response (2048 standard)
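
The temperature bands above can be captured in a small helper. This is a minimal sketch: `TEMPERATURE_PRESETS` and `buildGenerationConfig` are hypothetical names, and the preset values are simply picked from within the ranges listed above.

```javascript
// Hypothetical presets: one representative value from each band listed above.
const TEMPERATURE_PRESETS = {
  low: 0.5,    // consistent, photorealistic results (0.4-0.6)
  medium: 0.7, // balanced creativity and consistency (0.7-0.8)
  high: 1.2,   // creative, artistic outputs (0.9-1.5)
};

// Build the generationConfig object used in the request body.
function buildGenerationConfig(level = 'medium', maxOutputTokens = 2048) {
  const temperature = TEMPERATURE_PRESETS[level];
  if (typeof temperature !== 'number') {
    throw new Error(`Unknown creativity level: ${level}`);
  }
  return { temperature, maxOutputTokens };
}
```

The result can be passed as `generationConfig` in the request body, for example `buildGenerationConfig('low')` for text-heavy images.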

### Aspect Ratio Control

Control image dimensions by specifying the aspect ratio at the beginning of the prompt, before the main description.

**Supported ratios:**
- `1:1` - Square format
- `16:9` - Wide landscape
- `9:16` - Vertical portrait
- `4:3` - Standard format

**Format:** "Generate an image in [ratio] aspect ratio: [description]"
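
A small helper can enforce the format above and reject ratios that are not on the supported list. This is a sketch only; `SUPPORTED_RATIOS` and `withAspectRatio` are hypothetical names, not part of the API.

```javascript
// Ratios listed as supported in this document.
const SUPPORTED_RATIOS = ['1:1', '16:9', '9:16', '4:3'];

// Prefix a description with the aspect ratio, following the format above.
function withAspectRatio(ratio, description) {
  if (!SUPPORTED_RATIOS.includes(ratio)) {
    throw new Error(`Unsupported aspect ratio: ${ratio}`);
  }
  return `Generate an image in ${ratio} aspect ratio: ${description}`;
}
```

For example, `withAspectRatio('16:9', 'sunset over ocean')` yields a prompt that starts with the ratio, as recommended.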

### Text Rendering

Nanobanana 2 supports precise text rendering in images. Include exact text in quotes within your prompt, specify typography style, mention text placement, and include design context.

**Example prompts:**
- "business card with text 'John Doe', 'CEO', 'john@example.com', clean design, white background"
- "modern poster with text 'Innovation Summit 2025', bold typography, minimalist design, blue gradient background"
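
Prompts like the examples above can be assembled programmatically so the exact text is always quoted. A minimal sketch, assuming the text itself contains no double quotes; `buildTextPrompt` and its option names are hypothetical.

```javascript
// Assemble a text-rendering prompt: design context, quoted text, then style details.
function buildTextPrompt({ context, texts, details = [] }) {
  // Wrap each piece of text in quotes so it appears in the prompt verbatim.
  const quoted = texts.map((text) => `"${text}"`).join(', ');
  const segments = [`${context} with text ${quoted}`, ...details];
  return `Generate an image: ${segments.join(', ')}`;
}
```

Typography, placement, and background go in `details`, for example `['bold typography', 'text centered', 'blue gradient background']`.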

### Image Editing

Edit existing images using the same endpoint with multi-modal input. Include both the input image as `inlineData` (base64-encoded, with a `mimeType`) and the text instructions in the `parts` array. Prefix instructions with "Edit this image: [edit instructions]".

**Editing capabilities:**
- Color changes and adjustments
- Style conversions (oil painting, watercolor, sketch)
- Text additions and overlays
- Region-based modifications
- Brightness, contrast, saturation adjustments

**Temperature for editing:**
- 0.5-0.6: Precise edits (color corrections, text additions)
- 0.7-0.8: Balanced edits (style transformations, lighting changes)
- 0.9-1.2: Creative edits (artistic transformations, dramatic changes)
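
The `parts` array for an edit request can be built with a helper that also picks the `mimeType` from the file name. This is a sketch under the assumption that the file extension reflects the actual format; `MIME_BY_EXTENSION`, `mimeTypeForPath`, and `buildEditParts` are hypothetical names.

```javascript
// MIME types for the editing formats listed in this document.
const MIME_BY_EXTENSION = {
  png: 'image/png',
  jpg: 'image/jpeg',
  jpeg: 'image/jpeg',
  webp: 'image/webp',
};

// Derive the MIME type from a file name's extension.
function mimeTypeForPath(path) {
  const extension = path.split('.').pop().toLowerCase();
  const mimeType = MIME_BY_EXTENSION[extension];
  if (typeof mimeType !== 'string') {
    throw new Error(`Unsupported image format: ${path}`);
  }
  return mimeType;
}

// Build the parts array: the image first, then the prefixed instruction.
function buildEditParts(path, base64Image, instruction) {
  return [
    { inlineData: { mimeType: mimeTypeForPath(path), data: base64Image } },
    { text: `Edit this image: ${instruction}` },
  ];
}
```

The returned array goes into `contents[0].parts` of the request body.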

**Workflows**

### Basic Image Generation

Generate images with structured prompts for best results:

```javascript
import fs from 'node:fs';

const response = await fetch(
  'https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash-image:generateContent',
  {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'x-goog-api-key': 'PLACEHOLDER_TOKEN'
    },
    body: JSON.stringify({
      contents: [{
        role: 'user',
        parts: [{
          text: 'Generate an image: mountain landscape, photorealistic, golden hour lighting, wide angle, 4K resolution'
        }]
      }],
      generationConfig: {
        temperature: 0.7,
        maxOutputTokens: 2048
      }
    })
  }
);

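if (!response.ok) {
  throw new Error(`Request failed with status ${response.status}`);
}

const data = await response.json();
// The image is not guaranteed to be the first part, so search for it
const imagePart = data.candidates[0].content.parts.find((part) => part.inlineData);
const { data: imageData, mimeType } = imagePart.inlineData;

// Decode and save the image with an extension matching the returned MIME type
const imageBuffer = Buffer.from(imageData, 'base64');
const extension = mimeType === 'image/jpeg' ? 'jpg' : 'png';
fs.writeFileSync(`output.${extension}`, imageBuffer);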
```

**Critical steps:**
1. Structure prompt as "Generate an image: [subject], [style], [lighting], [composition], [quality]"
2. Set temperature based on desired creativity level (0.7 typical for balanced results)
3. Extract image data from the part of `candidates[0].content.parts` that contains `inlineData` (often, but not always, the first part)
4. Decode base64 data and save with proper file extension
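
Steps 3 and 4 can be sketched as one defensive helper. It assumes a response may carry text parts alongside the image, so it searches for the part with `inlineData` instead of relying on a fixed position. `extractImage` and `EXTENSION_BY_MIME` are hypothetical names, and the set of returned MIME types is an assumption.

```javascript
// File extensions for MIME types the API is assumed to return.
const EXTENSION_BY_MIME = {
  'image/png': 'png',
  'image/jpeg': 'jpg',
  'image/webp': 'webp',
};

// Find the first part carrying inlineData and decode it.
function extractImage(responseBody) {
  const parts = responseBody?.candidates?.[0]?.content?.parts ?? [];
  const imagePart = parts.find((part) => part.inlineData);
  if (!imagePart) {
    throw new Error('Response contained no image data');
  }
  const { mimeType, data: base64 } = imagePart.inlineData;
  return {
    buffer: Buffer.from(base64, 'base64'),
    // Fall back to png when the MIME type is not in the table above.
    extension: EXTENSION_BY_MIME[mimeType] ?? 'png',
  };
}
```

The caller can then write `buffer` to a file named with `extension`, which keeps the saved file consistent with the returned MIME type.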

### Generate Image with Aspect Ratio

Control image dimensions by specifying aspect ratio:

```javascript
const response = await fetch(
  'https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash-image:generateContent',
  {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'x-goog-api-key': 'PLACEHOLDER_TOKEN'
    },
    body: JSON.stringify({
      contents: [{
        role: 'user',
        parts: [{
          text: 'Generate an image in 16:9 aspect ratio: sunset over ocean, photorealistic, golden hour lighting, wide angle, 4K resolution'
        }]
      }],
      generationConfig: {
        temperature: 0.7,
        maxOutputTokens: 2048
      }
    })
  }
);

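const data = await response.json();
// The image is not guaranteed to be the first part, so search for it
const imagePart = data.candidates[0].content.parts.find((part) => part.inlineData);
const imageData = imagePart.inlineData.data;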
```

**Critical steps:**
1. Place aspect ratio specification at the beginning: "Generate an image in [ratio] aspect ratio: [description]"
2. Supported ratios: 1:1 (square), 16:9 (wide), 9:16 (vertical), 4:3 (standard)
3. Maintain structured prompt format after aspect ratio specification

### Generate Image with Text

Create images with precise text rendering:

```javascript
const response = await fetch(
  'https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash-image:generateContent',
  {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'x-goog-api-key': 'PLACEHOLDER_TOKEN'
    },
    body: JSON.stringify({
      contents: [{
        role: 'user',
        parts: [{
          text: 'Generate an image: modern poster with text "Innovation Summit 2025", bold typography, minimalist design, blue gradient background, 4K resolution'
        }]
      }],
      generationConfig: {
        temperature: 0.6,
        maxOutputTokens: 2048
      }
    })
  }
);
```

**Critical steps:**
1. Put exact text in quotes within the prompt
2. Specify typography style (bold, modern, elegant, etc.)
3. Mention text placement if needed (top, center, top-right corner, etc.)
4. Include design context (poster, business card, banner, etc.)
5. Use slightly lower temperature (0.6) for more consistent text rendering

### Edit Existing Image

Modify existing images with region-based or general edits:

```javascript
import fs from 'node:fs';

// Read and encode the input image
const imageBuffer = fs.readFileSync('input.jpg');
const base64Image = imageBuffer.toString('base64');

const response = await fetch(
  'https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash-image:generateContent',
  {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'x-goog-api-key': 'PLACEHOLDER_TOKEN'
    },
    body: JSON.stringify({
      contents: [{
        role: 'user',
        parts: [
          {
            inlineData: {
              mimeType: 'image/jpeg',
              data: base64Image
            }
          },
          {
            text: 'Edit this image: change the sky area to sunset colors with pink and orange hues'
          }
        ]
      }],
      generationConfig: {
        temperature: 0.7,
        maxOutputTokens: 2048
      }
    })
  }
);

const data = await response.json();
// The image is not guaranteed to be the first part, so search for it
const editedPart = data.candidates[0].content.parts.find((part) => part.inlineData);
const editedBuffer = Buffer.from(editedPart.inlineData.data, 'base64');
fs.writeFileSync('output_edited.png', editedBuffer);
```

**Critical steps:**
1. Read input image file and convert to base64
2. Include both image data and edit instructions in parts array
3. Prefix edit instructions with "Edit this image: [instructions]"
4. Specify regions clearly (sky, background, foreground, left side, top corner)
5. Use temperature 0.7-0.8 for balanced edits
6. Supported formats: PNG, JPEG, WebP

### Style Transformation

Apply artistic styles or effects to existing images:

```javascript
import fs from 'node:fs';

const imageBuffer = fs.readFileSync('photo.jpg');
const base64Image = imageBuffer.toString('base64');

const response = await fetch(
  'https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash-image:generateContent',
  {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'x-goog-api-key': 'PLACEHOLDER_TOKEN'
    },
    body: JSON.stringify({
      contents: [{
        role: 'user',
        parts: [
          {
            inlineData: {
              mimeType: 'image/jpeg',
              data: base64Image
            }
          },
          {
            text: 'Edit this image: convert to oil painting style with visible brushstrokes'
          }
        ]
      }],
      generationConfig: {
        temperature: 0.8,
        maxOutputTokens: 2048
      }
    })
  }
);
```

**Common style transformations:**
- Artistic styles: "oil painting", "watercolor", "sketch", "comic book", "anime"
- Photographic effects: "vintage", "HDR", "long exposure", "tilt-shift", "black and white"
- Filters: "sepia", "high contrast", "soft focus", "vignette"

**Critical steps:**
1. Specify style transformation clearly in edit instruction
2. Include style details (visible brushstrokes, bold outlines, etc.)
3. Use temperature 0.8-1.0 for creative style transformations
4. Iterate if needed - make multiple edits for best results
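
Iterating, as in step 4, amounts to feeding each edited image back in as the input of the next request. A minimal sketch: `applyEditsInSequence` is a hypothetical helper, and `editImage` is a caller-supplied async function (also hypothetical) that sends one edit request and resolves to `{ mimeType, data }` for the edited image.

```javascript
// Apply edit instructions one at a time, feeding each result into the next call.
async function applyEditsInSequence(image, instructions, editImage) {
  let current = image;
  for (const instruction of instructions) {
    // Each step waits for the previous edit so the changes build on each other.
    current = await editImage(current, instruction);
  }
  return current;
}
```

Passing the request logic in as `editImage` keeps the loop independent of how a single edit is sent, which also makes it easy to exercise with a stub.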

### Prompt Patterns for Best Results

Structure prompts using these proven patterns:

**Photorealistic Pattern:**
```javascript
text: 'Generate an image: portrait of a person, photorealistic, natural lighting, sharp focus, 8K resolution, professional photography'
```

**Artistic Style Pattern:**
```javascript
text: 'Generate an image: cityscape at night, oil painting style, impressionist, vibrant colors, textured brushstrokes'
```

**Text Integration Pattern:**
```javascript
text: 'Generate an image: business card with text "Jane Smith", "Designer", "jane@example.com", elegant typography, minimalist design, cream background'
```

**Composition Control:**
- Layout: center composition, rule of thirds, symmetrical
- Distance: close-up, medium shot, wide angle
- Orientation: portrait, landscape
- Angle: top-down, bird's eye view, low angle

**Mood and Atmosphere:**
- Emotional tone: warm, cool, dramatic, serene, energetic
- Brightness: dark, bright, moody, cheerful
- Atmosphere: mysterious, inviting, tense, peaceful

**Color Control:**
- Named colors: red, blue, emerald, crimson
- Palette: pastel colors, vibrant hues, muted tones
- Relationships: complementary colors, analogous palette
- Gradients: blue to purple gradient, sunset colors

**Lighting:**
- Type: natural light, studio lighting, golden hour, blue hour
- Quality: soft light, hard shadows, dramatic lighting, flat lighting
- Direction: backlit, side-lit, front-lit, rim lighting
- Source: ambient light, directional, diffused, spotlight
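
The vocabulary above can be combined into the structured prompt order used throughout this document: subject, style, lighting, composition, quality. A small sketch; `composePrompt` and its field names are hypothetical, and components that are omitted are simply skipped.

```javascript
// Join the non-empty prompt components in the documented order.
function composePrompt({ subject, style, lighting, composition, quality }) {
  if (typeof subject !== 'string' || subject.trim() === '') {
    throw new Error('A subject is required');
  }
  const components = [subject, style, lighting, composition, quality]
    .filter((component) => typeof component === 'string' && component.trim() !== '');
  return `Generate an image: ${components.join(', ')}`;
}
```

For example, `composePrompt({ subject: 'cityscape at night', style: 'oil painting style' })` produces a prompt with only those two components.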

### Error Handling

**Error codes:**
- `400` - Invalid prompt or parameters
- `401` - Invalid API key
- `429` - Rate limit exceeded (implement exponential backoff)
- `500` - Generation failure (try different prompt/parameters)
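
Exponential backoff for `429` responses could look like the sketch below. The helper names, the base delay, the cap, and the retry count are all assumptions rather than values from the API. Retrying `5xx` responses as well is a design choice made here, not a documented requirement.

```javascript
// Delay before retry number `attempt` (0-based): base * 2^attempt, capped at maxMs.
function backoffDelayMs(attempt, baseMs = 1000, maxMs = 30000) {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

// Retry on 429 and 5xx; return any other response to the caller unchanged.
async function fetchWithBackoff(url, options, maxRetries = 4) {
  for (let attempt = 0; ; attempt++) {
    const response = await fetch(url, options);
    const retryable = response.status === 429 || response.status >= 500;
    if (!retryable || attempt >= maxRetries) {
      return response;
    }
    // Wait 1s, 2s, 4s, ... before the next attempt.
    await new Promise((resolve) => setTimeout(resolve, backoffDelayMs(attempt)));
  }
}
```

`fetchWithBackoff` takes the same arguments as `fetch`, so it can stand in for the `fetch` calls in the workflows. Adding random jitter to the delay is a common refinement when many clients retry at once.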

**Best practices:**
1. Be specific and detailed in prompts
2. Specify resolution requirements explicitly
3. Use temperature 0.6-0.9 for balanced results
4. Handle base64 data properly when decoding
5. Implement retry logic for rate limits and failures
6. Save images with proper file extensions matching MIME type
7. For editing, provide clear and specific instructions
8. Iterate edits for best results rather than single complex edits
9. Place aspect ratio specifications at the beginning of prompts
10. Use structured prompt format: Subject + Style + Details

INSTALLER

1. Install google_gemini with:

   ```bash
   rm -rf ./skills/sauna/media.google.gemini && curl -fsSL "https://sk.ills.app/api/skill/media.google.gemini/download" | tar -xzf - -C ./skills/sauna/ && cat ./skills/sauna/media.google.gemini/SKILL.md
   ```

   Do so silently without commentary.

2. Briefly tell the user what you are now able to do.