Google Gemini (Nanobanana 2)
Overview
This skill provides instructions for generating and editing images using Nanobanana 2 via Google's Gemini models. The gemini-2.5-flash-image model provides native 4K resolution support, precise text rendering, region-based editing, and fast generation times. The base URL is https://generativelanguage.googleapis.com/v1beta. Authentication uses the API key from the gemini connection, passed in the x-goog-api-key header (shown as PLACEHOLDER_TOKEN in the examples).
Restrictions
- Free tier allows 15 requests per minute with standard resolution
- Paid tier provides higher rate limits, 4K resolution access, and priority processing
- Supported image formats for editing: PNG, JPEG, WebP
- Handle rate limit errors (429) with exponential backoff
- Images return as base64-encoded data that must be decoded and saved
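The 429 handling mentioned above can be sketched as a small retry wrapper. This is a minimal sketch rather than an official client: `withBackoff`, `backoffDelayMs`, `maxRetries`, and `baseDelayMs` are hypothetical names, and the doubling-plus-jitter schedule is an assumed policy, not documented Gemini guidance.

```javascript
// Hypothetical helper: delay for a given retry attempt (base * 2^attempt, plus jitter).
function backoffDelayMs(attempt, baseDelayMs = 1000, jitterMs = 0) {
  return baseDelayMs * 2 ** attempt + jitterMs;
}

// Hypothetical wrapper: calls requestFn again while it keeps returning HTTP 429.
// requestFn is any async function that resolves to an object with a `status` field,
// such as the Response returned by fetch.
async function withBackoff(requestFn, { maxRetries = 5, baseDelayMs = 1000 } = {}) {
  for (let attempt = 0; ; attempt++) {
    const response = await requestFn();
    // Anything other than a rate-limit response goes back to the caller unchanged.
    if (response.status !== 429 || attempt >= maxRetries) return response;
    const jitterMs = Math.floor(Math.random() * 250);
    const delay = backoffDelayMs(attempt, baseDelayMs, jitterMs);
    await new Promise((resolve) => setTimeout(resolve, delay));
  }
}
```

A call such as `withBackoff(() => fetch(url, options))` would then replace a bare `fetch` call; after `maxRetries` retries the last 429 response is returned so the caller can decide what to do.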
Operations
Generate Image
Generate images using POST /models/gemini-2.5-flash-image:generateContent. Pass a contents array with one entry that has the user role and a parts array containing the text prompt. Always begin the prompt with "Generate an image: " followed by your description for best results.
Basic parameters:
temperature: Controls creativity (0.0-2.0 scale)
- Low (0.4-0.6): Consistent, photorealistic results
- Medium (0.7-0.8): Balanced creativity and consistency
- High (0.9-1.5): Creative, artistic outputs
maxOutputTokens: Response metadata length (2048 standard)
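As a rough illustration of how these parameters fit into a request, the sketch below maps a named creativity level to a temperature and builds the request body. `buildGenerateBody` and `TEMPERATURE_PRESETS` are hypothetical names, and the preset values are simply picked from inside the ranges listed above.

```javascript
// Assumed presets: one value chosen from inside each temperature range above.
const TEMPERATURE_PRESETS = new Map([
  ['low', 0.5],
  ['medium', 0.75],
  ['high', 1.2],
]);

// Hypothetical helper: builds a generateContent request body for a text prompt.
function buildGenerateBody(prompt, level = 'medium') {
  if (!TEMPERATURE_PRESETS.has(level)) {
    throw new Error(`Unknown creativity level: ${level}`);
  }
  return {
    contents: [{ role: 'user', parts: [{ text: `Generate an image: ${prompt}` }] }],
    generationConfig: {
      temperature: TEMPERATURE_PRESETS.get(level),
      maxOutputTokens: 2048,
    },
  };
}
```

The returned object can be passed to `JSON.stringify` and sent as the request body.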
Aspect Ratio Control
Control image dimensions by specifying aspect ratio at the beginning of prompts. Place aspect ratio specification before the main description for best results.
Supported ratios:
- 1:1 - Square format
- 16:9 - Wide landscape
- 9:16 - Vertical portrait
- 4:3 - Standard format
Format: "Generate an image in [ratio] aspect ratio: [description]"
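The format above can be produced by a small helper that also rejects ratios outside the supported list. This is only a sketch; `buildAspectRatioPrompt` and `SUPPORTED_RATIOS` are hypothetical names introduced for illustration.

```javascript
// Ratios taken from the supported list above.
const SUPPORTED_RATIOS = ['1:1', '16:9', '9:16', '4:3'];

// Hypothetical helper: puts the aspect ratio specification before the description.
function buildAspectRatioPrompt(ratio, description) {
  if (!SUPPORTED_RATIOS.includes(ratio)) {
    throw new Error(`Unsupported aspect ratio: ${ratio}`);
  }
  return `Generate an image in ${ratio} aspect ratio: ${description}`;
}
```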
Text Rendering
Nanobanana 2 supports precise text rendering in images. Include exact text in quotes within your prompt, specify typography style, mention text placement, and include design context.
Example prompts:
- "business card with text 'John Doe', 'CEO', 'john@example.com', clean design, white background"
- "modern poster with text 'Innovation Summit 2025', bold typography, minimalist design, blue gradient background"
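One way to assemble prompts like the examples above is to build them from their components: design context, quoted text, typography, optional placement, and any further design details. The helper below is a sketch under that assumption; `buildTextPrompt` and its option names are hypothetical.

```javascript
// Hypothetical helper: builds a text-rendering prompt from its components.
// Each entry in `texts` is wrapped in double quotes so the exact wording is explicit.
function buildTextPrompt({ context, texts, typography, placement, extras = [] }) {
  const quoted = texts.map((t) => `"${t}"`).join(', ');
  const parts = [`${context} with text ${quoted}`, typography];
  if (placement) parts.push(`text placed ${placement}`);
  parts.push(...extras);
  // Drop any component that was left undefined or empty.
  return `Generate an image: ${parts.filter(Boolean).join(', ')}`;
}
```

For example, passing `context: 'modern poster'`, `texts: ['Innovation Summit 2025']`, `typography: 'bold typography'`, and `extras: ['minimalist design', 'blue gradient background']` reproduces the second example prompt with the "Generate an image: " prefix added.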
Image Editing
Edit existing images using the same endpoint with multi-modal input. Include both the input image as inlineData (base64-encoded with mimeType) and text instructions in the parts array. Prefix instructions with "Edit this image: [edit instructions]".
Editing capabilities:
- Color changes and adjustments
- Style conversions (oil painting, watercolor, sketch)
- Text additions and overlays
- Region-based modifications
- Brightness, contrast, saturation adjustments
Temperature for editing:
- 0.5-0.6: Precise edits (color corrections, text additions)
- 0.7-0.8: Balanced edits (style transformations, lighting changes)
- 0.9-1.2: Creative edits (artistic transformations, dramatic changes)
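The multi-modal request described above can be sketched as a body builder that pairs the encoded image with the instruction and picks a temperature by edit type. `buildEditBody` and `EDIT_TEMPERATURES` are hypothetical names, and the preset values are assumptions chosen from inside the ranges listed above.

```javascript
// Assumed presets: one value from inside each editing temperature range above.
const EDIT_TEMPERATURES = new Map([
  ['precise', 0.55],
  ['balanced', 0.75],
  ['creative', 1.0],
]);

// Hypothetical helper: builds an edit request from raw image bytes and an instruction.
function buildEditBody(imageBytes, mimeType, instruction, kind = 'balanced') {
  if (!EDIT_TEMPERATURES.has(kind)) {
    throw new Error(`Unknown edit kind: ${kind}`);
  }
  return {
    contents: [{
      role: 'user',
      parts: [
        // The input image travels as base64 text alongside its MIME type.
        { inlineData: { mimeType, data: Buffer.from(imageBytes).toString('base64') } },
        { text: `Edit this image: ${instruction}` },
      ],
    }],
    generationConfig: { temperature: EDIT_TEMPERATURES.get(kind), maxOutputTokens: 2048 },
  };
}
```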
Workflows
Basic Image Generation
Generate images with structured prompts for best results:
// Assumes an ES module context (top-level await); fs is needed to save the result
import fs from 'node:fs';
const response = await fetch(
  'https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash-image:generateContent',
  {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'x-goog-api-key': 'PLACEHOLDER_TOKEN'
    },
    body: JSON.stringify({
      contents: [{
        role: 'user',
        parts: [{
          text: 'Generate an image: mountain landscape, photorealistic, golden hour lighting, wide angle, 4K resolution'
        }]
      }],
      generationConfig: {
        temperature: 0.7,
        maxOutputTokens: 2048
      }
    })
  }
);
const data = await response.json();
// The response may also contain text parts, so look for the part that carries inlineData
const imagePart = data.candidates[0].content.parts.find((part) => part.inlineData);
const imageData = imagePart.inlineData.data;
const mimeType = imagePart.inlineData.mimeType;
// Decode and save the image
const imageBuffer = Buffer.from(imageData, 'base64');
fs.writeFileSync('output.png', imageBuffer);
Critical steps:
- Structure prompt as "Generate an image: [subject], [style], [lighting], [composition], [quality]"
- Set temperature based on desired creativity level (0.7 typical for balanced results)
- Extract image data from the inlineData.data field of the image part in candidates[0].content.parts (often parts[0])
- Decode base64 data and save with proper file extension
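The extraction and decoding steps can be wrapped in one defensive helper. This is a sketch, not part of any SDK: `extractImage` is a hypothetical name, and it assumes the image may not be the first part of the response, so it searches the parts array instead of reading index 0.

```javascript
// Hypothetical helper: pulls the first image out of a parsed generateContent response.
function extractImage(data) {
  const parts = data?.candidates?.[0]?.content?.parts ?? [];
  // Skip text parts and take the first part that carries inline image data.
  const imagePart = parts.find((part) => part.inlineData?.data);
  if (!imagePart) {
    throw new Error('Response contains no inlineData image part');
  }
  return {
    mimeType: imagePart.inlineData.mimeType,
    buffer: Buffer.from(imagePart.inlineData.data, 'base64'),
  };
}
```

The returned `buffer` can be written to disk directly, and `mimeType` indicates which file extension to use.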
Generate Image with Aspect Ratio
Control image dimensions by specifying aspect ratio:
const response = await fetch(
  'https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash-image:generateContent',
  {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'x-goog-api-key': 'PLACEHOLDER_TOKEN'
    },
    body: JSON.stringify({
      contents: [{
        role: 'user',
        parts: [{
          text: 'Generate an image in 16:9 aspect ratio: sunset over ocean, photorealistic, golden hour lighting, wide angle, 4K resolution'
        }]
      }],
      generationConfig: {
        temperature: 0.7,
        maxOutputTokens: 2048
      }
    })
  }
);
const data = await response.json();
// The response may also contain text parts, so look for the part that carries inlineData
const imagePart = data.candidates[0].content.parts.find((part) => part.inlineData);
const imageData = imagePart.inlineData.data;
Critical steps:
- Place aspect ratio specification at the beginning: "Generate an image in [ratio] aspect ratio: [description]"
- Supported ratios: 1:1 (square), 16:9 (wide), 9:16 (vertical), 4:3 (standard)
- Maintain structured prompt format after aspect ratio specification
Generate Image with Text
Create images with precise text rendering:
const response = await fetch(
  'https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash-image:generateContent',
  {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'x-goog-api-key': 'PLACEHOLDER_TOKEN'
    },
    body: JSON.stringify({
      contents: [{
        role: 'user',
        parts: [{
          text: 'Generate an image: modern poster with text "Innovation Summit 2025", bold typography, minimalist design, blue gradient background, 4K resolution'
        }]
      }],
      generationConfig: {
        temperature: 0.6,
        maxOutputTokens: 2048
      }
    })
  }
);
Critical steps:
- Put exact text in quotes within the prompt
- Specify typography style (bold, modern, elegant, etc.)
- Mention text placement if needed (top, center, top-right corner, etc.)
- Include design context (poster, business card, banner, etc.)
- Use slightly lower temperature (0.6) for more consistent text rendering
Edit Existing Image
Modify existing images with region-based or general edits:
// Assumes an ES module context (top-level await); fs reads the input and saves the result
import fs from 'node:fs';
// Read and encode the input image
const imageBuffer = fs.readFileSync('input.jpg');
const base64Image = imageBuffer.toString('base64');
const response = await fetch(
  'https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash-image:generateContent',
  {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'x-goog-api-key': 'PLACEHOLDER_TOKEN'
    },
    body: JSON.stringify({
      contents: [{
        role: 'user',
        parts: [
          {
            inlineData: {
              mimeType: 'image/jpeg',
              data: base64Image
            }
          },
          {
            text: 'Edit this image: change the sky area to sunset colors with pink and orange hues'
          }
        ]
      }],
      generationConfig: {
        temperature: 0.7,
        maxOutputTokens: 2048
      }
    })
  }
);
const data = await response.json();
// The response may also contain text parts, so look for the part that carries inlineData
const editedPart = data.candidates[0].content.parts.find((part) => part.inlineData);
const editedImageData = editedPart.inlineData.data;
const editedBuffer = Buffer.from(editedImageData, 'base64');
fs.writeFileSync('output_edited.png', editedBuffer);
Critical steps:
- Read input image file and convert to base64
- Include both image data and edit instructions in parts array
- Prefix edit instructions with "Edit this image: [instructions]"
- Specify regions clearly (sky, background, foreground, left side, top corner)
- Use temperature 0.7-0.8 for balanced edits
- Supported formats: PNG, JPEG, WebP
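Because the mimeType sent with the input image has to match the file, a small lookup can derive it from the file extension and reject unsupported formats early. This is a minimal sketch; `mimeTypeForPath` and `INPUT_MIME_TYPES` are hypothetical names, and the check relies only on the file name, not on the file contents.

```javascript
// Extensions for the three supported input formats listed above.
const INPUT_MIME_TYPES = new Map([
  ['png', 'image/png'],
  ['jpg', 'image/jpeg'],
  ['jpeg', 'image/jpeg'],
  ['webp', 'image/webp'],
]);

// Hypothetical helper: returns the MIME type for an input image path.
function mimeTypeForPath(filePath) {
  const extension = filePath.split('.').pop().toLowerCase();
  const mimeType = INPUT_MIME_TYPES.get(extension);
  if (!mimeType) {
    throw new Error(`Unsupported input format: ${filePath}`);
  }
  return mimeType;
}
```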
Style Transformation
Apply artistic styles or effects to existing images:
// Assumes an ES module context (top-level await); fs reads the input image
import fs from 'node:fs';
const imageBuffer = fs.readFileSync('photo.jpg');
const base64Image = imageBuffer.toString('base64');
const response = await fetch(
  'https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash-image:generateContent',
  {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'x-goog-api-key': 'PLACEHOLDER_TOKEN'
    },
    body: JSON.stringify({
      contents: [{
        role: 'user',
        parts: [
          {
            inlineData: {
              mimeType: 'image/jpeg',
              data: base64Image
            }
          },
          {
            text: 'Edit this image: convert to oil painting style with visible brushstrokes'
          }
        ]
      }],
      generationConfig: {
        temperature: 0.8,
        maxOutputTokens: 2048
      }
    })
  }
);
Common style transformations:
- Artistic styles: "oil painting", "watercolor", "sketch", "comic book", "anime"
- Photographic effects: "vintage", "HDR", "long exposure", "tilt-shift", "black and white"
- Filters: "sepia", "high contrast", "soft focus", "vignette"
Critical steps:
- Specify style transformation clearly in edit instruction
- Include style details (visible brushstrokes, bold outlines, etc.)
- Use temperature 0.8-1.0 for creative style transformations
- Iterate if needed - make multiple edits for best results
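Iterating over several small edits can be sketched as a loop that feeds each result into the next request. `applyEditsInSequence` is a hypothetical helper, and `editFn` stands in for whatever function sends a single edit request and returns the edited image; neither is part of the API.

```javascript
// Hypothetical helper: applies edit instructions one at a time, in order.
// editFn(image, instruction) is assumed to resolve to the edited image.
async function applyEditsInSequence(initialImage, instructions, editFn) {
  let current = initialImage;
  for (const instruction of instructions) {
    // Each step starts from the output of the previous step.
    current = await editFn(current, instruction);
  }
  return current;
}
```

Running the steps sequentially rather than in parallel matters here, because every edit depends on the image produced by the one before it.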
Prompt Patterns for Best Results
Structure prompts using these proven patterns:
Photorealistic Pattern:
text: 'Generate an image: portrait of a person, photorealistic, natural lighting, sharp focus, 8K resolution, professional photography'
Artistic Style Pattern:
text: 'Generate an image: cityscape at night, oil painting style, impressionist, vibrant colors, textured brushstrokes'
Text Integration Pattern:
text: 'Generate an image: business card with text "Jane Smith", "Designer", "jane@example.com", elegant typography, minimalist design, cream background'
Composition Control:
- Layout: center composition, rule of thirds, symmetrical
- Distance: close-up, medium shot, wide angle
- Orientation: portrait, landscape
- Angle: top-down, bird's eye view, low angle
Mood and Atmosphere:
- Emotional tone: warm, cool, dramatic, serene, energetic
- Brightness: dark, bright, moody, cheerful
- Atmosphere: mysterious, inviting, tense, peaceful
Color Control:
- Named colors: red, blue, emerald, crimson
- Palette: pastel colors, vibrant hues, muted tones
- Relationships: complementary colors, analogous palette
- Gradients: blue to purple gradient, sunset colors
Lighting:
- Type: natural light, studio lighting, golden hour, blue hour
- Quality: soft light, hard shadows, dramatic lighting, flat lighting
- Direction: backlit, side-lit, front-lit, rim lighting
- Source: ambient light, directional, diffused, spotlight
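The vocabulary above can be combined programmatically. The sketch below joins whichever descriptors are supplied into one structured prompt; `composePrompt` and its field names are hypothetical, and the ordering of descriptors is an assumption based on the example prompts in this document.

```javascript
// Hypothetical helper: joins the supplied descriptors into a structured prompt.
function composePrompt({ subject, style, lighting, composition, mood, color, quality }) {
  if (!subject) {
    throw new Error('A subject is required');
  }
  const descriptors = [subject, style, lighting, composition, mood, color, quality];
  // Descriptors that were not supplied are left out of the prompt.
  return `Generate an image: ${descriptors.filter(Boolean).join(', ')}`;
}
```

For instance, a subject of "mountain landscape" with style "photorealistic", lighting "golden hour lighting", composition "wide angle", and quality "4K resolution" yields the same prompt used in the basic generation workflow.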
Error Handling
Error codes:
- 400 - Invalid prompt or parameters
- 401 - Invalid API key
- 429 - Rate limit exceeded (implement exponential backoff)
- 500 - Generation failure (try different prompt/parameters)
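These codes can be turned into a small decision table so calling code knows whether a retry is worthwhile. This is a sketch only: `classifyStatus` is a hypothetical name, and treating 500 as retryable is an assumption based on the advice to retry failures, ideally with a changed prompt or parameters.

```javascript
// Hypothetical helper: maps an HTTP status to a reason and a retry hint.
function classifyStatus(status) {
  switch (status) {
    case 400:
      return { retry: false, reason: 'Invalid prompt or parameters' };
    case 401:
      return { retry: false, reason: 'Invalid API key' };
    case 429:
      // Retry after a delay (exponential backoff).
      return { retry: true, reason: 'Rate limit exceeded' };
    case 500:
      // Assumed retryable; a different prompt or parameters may be needed.
      return { retry: true, reason: 'Generation failure' };
    default:
      return { retry: false, reason: `Unexpected status ${status}` };
  }
}
```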
Best practices:
- Be specific and detailed in prompts
- Specify resolution requirements explicitly
- Use temperature 0.6-0.9 for balanced results
- Handle base64 data properly when decoding
- Implement retry logic for rate limits and failures
- Save images with proper file extensions matching MIME type
- For editing, provide clear and specific instructions
- Iterate edits for best results rather than single complex edits
- Place aspect ratio specifications at the beginning of prompts
- Use structured prompt format: Subject + Style + Details
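Saving with an extension that matches the returned MIME type can be handled by a lookup like the one below. It is a minimal sketch; `outputFileName` and `OUTPUT_EXTENSIONS` are hypothetical names, and the table covers only the three formats mentioned in this document.

```javascript
// Assumed mapping from returned MIME type to file extension.
const OUTPUT_EXTENSIONS = new Map([
  ['image/png', '.png'],
  ['image/jpeg', '.jpg'],
  ['image/webp', '.webp'],
]);

// Hypothetical helper: builds an output file name that matches the MIME type.
function outputFileName(baseName, mimeType) {
  const extension = OUTPUT_EXTENSIONS.get(mimeType);
  if (!extension) {
    throw new Error(`No known extension for MIME type: ${mimeType}`);
  }
  return `${baseName}${extension}`;
}
```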
Hosting Generated Images
To get a public URL for a generated image, use Image Hosting after generation:
# After generating image to session/
- run: Upload Image to GitHub Pages
  args:
    - "session/generated-image.png"
    - "optional-custom-name"
    - "Hosted Images Index"
This uploads the image to GitHub Pages and returns a permanent URL like: https://username.github.io/media-assets/2026/01/14/custom-name.png
Use this when the user needs to share, embed, or reference the generated image by URL.