→ ---
name: "Data Utilities"
description: "Internal utilities for data processing - used by other skills via dependencies"
---
Chainable code utilities for common data operations: parse documents (PDF, Excel, Word), transform arrays, process images, extract entities, generate test data, and more. Skills declare this skill as a dependency to access these utilities.
## Tasks
These are tasks you can execute. Read the task file to get your instructions:
**Parse and Interpret CSV**
When: Parse CSV and semantically detect column types based on requirements
Follow the instructions in: `./skills/sauna/stdlib.data.utilities/references/recipes/stdlib.csv.interpret.md`
## Knowledge
This is knowledge you have access to. Read these files if you need additional context:
**Stdlib Code Reference**
Read the documentation in: `./skills/sauna/stdlib.data.utilities/references/stdlib.code.reference.md`
**CSV Column Interpretation Guide**
When: Semantic column type detection for CSV files
Read the documentation in: `./skills/sauna/stdlib.data.utilities/references/stdlib.csv.interpretation.md`
**CSV Data Validation Guide**
When: Data quality validation rules for parsed CSV data
Read the documentation in: `./skills/sauna/stdlib.data.utilities/references/shared.csv.validation.md`
## Code
These are scripts that you can run directly. Read these files to access the code:
**Parse PDF**
When: Extract text and metadata from PDF files
Run the script at: `./skills/sauna/stdlib.data.utilities/scripts/stdlib.pdf.parse.js`
**Parse CSV**
When: Parse CSV with advanced options (headers, delimiters, encoding)
Run the script at: `./skills/sauna/stdlib.data.utilities/scripts/stdlib.csv.parse.js`
**Parse Excel**
When: Read Excel files (xlsx, xls) into JSON
Run the script at: `./skills/sauna/stdlib.data.utilities/scripts/stdlib.xlsx.parse.js`
**Parse Word Document**
When: Extract text from Word documents as text, HTML, or markdown
Run the script at: `./skills/sauna/stdlib.data.utilities/scripts/stdlib.docx.parse.js`
**Parse HTML**
When: Extract data from HTML using CSS selectors (scraping)
Run the script at: `./skills/sauna/stdlib.data.utilities/scripts/stdlib.html.parse.js`
**Parse Email**
When: Parse .eml email files into structured data with attachments
Run the script at: `./skills/sauna/stdlib.data.utilities/scripts/stdlib.parse.email.js`
**Parse iCal**
When: Parse iCal/ICS calendar files into events with attendees
Run the script at: `./skills/sauna/stdlib.data.utilities/scripts/stdlib.parse.ical.js`
**Parse vCard**
When: Parse vCard (.vcf) contact files into structured contacts
Run the script at: `./skills/sauna/stdlib.data.utilities/scripts/stdlib.parse.vcard.js`
**Parse RSS**
When: Parse RSS/Atom feeds from URL or file
Run the script at: `./skills/sauna/stdlib.data.utilities/scripts/stdlib.parse.rss.js`
**Process Image**
When: Resize, convert, and manipulate images (webp, jpeg, png, etc.)
Run the script at: `./skills/sauna/stdlib.data.utilities/scripts/stdlib.image.process.js`
**Image Metadata**
When: Extract EXIF metadata, GPS, camera info from images
Run the script at: `./skills/sauna/stdlib.data.utilities/scripts/stdlib.image.metadata.js`
**OCR Extract**
When: Extract text from images via OCR (receipts, screenshots, scanned docs)
Run the script at: `./skills/sauna/stdlib.data.utilities/scripts/stdlib.ocr.extract.js`
**Generate Fake Data**
When: Generate realistic fake data for testing and demos (names, emails, etc.)
Run the script at: `./skills/sauna/stdlib.data.utilities/scripts/stdlib.generate.fake.js`
**Generate UUID**
When: Generate UUIDs (v1, v4, v5)
Run the script at: `./skills/sauna/stdlib.data.utilities/scripts/stdlib.generate.uuid.js`
**Generate QR Code**
When: Generate QR code images (PNG, SVG) from data
Run the script at: `./skills/sauna/stdlib.data.utilities/scripts/stdlib.generate.qrcode.js`
**JSON to Markdown**
When: Convert JSON array to markdown table, list, or bullets for display
Run the script at: `./skills/sauna/stdlib.data.utilities/scripts/stdlib.markdown.format.js`
**JSON to CSV**
When: Export JSON array to CSV format
Run the script at: `./skills/sauna/stdlib.data.utilities/scripts/stdlib.csv.format.js`
**JSON to YAML**
When: Convert JSON to YAML format
Run the script at: `./skills/sauna/stdlib.data.utilities/scripts/stdlib.yaml.format.js`
**Render Markdown**
When: Convert Markdown to HTML with optional styling
Run the script at: `./skills/sauna/stdlib.data.utilities/scripts/stdlib.markdown.render.js`
**Validate JSON Schema**
When: Validate JSON data against JSON Schema
Run the script at: `./skills/sauna/stdlib.data.utilities/scripts/stdlib.validate.schema.js`
**Merge JSON Files**
When: Combine multiple JSON array files into one (e.g., paginated API results)
Run the script at: `./skills/sauna/stdlib.data.utilities/scripts/stdlib.io.merge.js`
**Split JSON Array**
When: Break a large array into smaller chunk files for batch processing
Run the script at: `./skills/sauna/stdlib.data.utilities/scripts/stdlib.io.split.js`
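The chunking behavior can be sketched roughly as follows. The `chunk` helper is hypothetical, not the script's actual API; the real script presumably also writes each chunk to its own numbered output file.

```javascript
// Split an array into chunks of at most `size` items.
// The real script additionally writes each chunk to its own file;
// this sketch only shows the splitting logic.
function chunk(items, size) {
  const chunks = [];
  for (let i = 0; i < items.length; i += size) {
    chunks.push(items.slice(i, i + size));
  }
  return chunks;
}
```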
**Read JSON/YAML**
When: Read JSON or YAML files, optionally extracting a nested array path
Run the script at: `./skills/sauna/stdlib.data.utilities/scripts/stdlib.io.read.js`
**Zip Archive**
When: Create or extract zip archives
Run the script at: `./skills/sauna/stdlib.data.utilities/scripts/stdlib.zip.archive.js`
**Parse Natural Dates**
When: Parse natural language dates ("next Tuesday", "in 2 weeks") to ISO
Run the script at: `./skills/sauna/stdlib.data.utilities/scripts/stdlib.date.parse.js`
**Filter by Date Range**
When: Filter items by date range (supports relative dates like "7d", "1m")
Run the script at: `./skills/sauna/stdlib.data.utilities/scripts/stdlib.time.range.js`
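A relative spec like "7d" or "1m" presumably resolves to a cutoff date before filtering. A minimal sketch of that resolution, assuming day/week/month/year suffixes; the units and function name here are assumptions, so check the script for its actual conventions.

```javascript
// Resolve a relative date spec ("7d", "2w", "1m", "1y") to a cutoff Date,
// counting backwards from `now`. Hypothetical helper, not the script's API.
function resolveRelative(spec, now = new Date()) {
  const m = /^(\d+)([dwmy])$/.exec(spec);
  if (!m) throw new Error(`unsupported spec: ${spec}`);
  const n = Number(m[1]);
  const d = new Date(now);
  if (m[2] === "d") d.setDate(d.getDate() - n);
  if (m[2] === "w") d.setDate(d.getDate() - 7 * n);
  if (m[2] === "m") d.setMonth(d.getMonth() - n);
  if (m[2] === "y") d.setFullYear(d.getFullYear() - n);
  return d;
}
```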
**Group by Time Bucket**
When: Group items by time period (hour, day, week, month, year)
Run the script at: `./skills/sauna/stdlib.data.utilities/scripts/stdlib.time.bucket.js`
**Text Diff**
When: Compute differences between texts/files (unified, json, words)
Run the script at: `./skills/sauna/stdlib.data.utilities/scripts/stdlib.text.diff.js`
**NLP Extract**
When: Extract people, places, dates, organizations from text
Run the script at: `./skills/sauna/stdlib.data.utilities/scripts/stdlib.text.nlp.js`
**Truncate Text Fields**
When: Shorten long string fields to a maximum length
Run the script at: `./skills/sauna/stdlib.data.utilities/scripts/stdlib.text.truncate.js`
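The truncation step amounts to capping every string field at a maximum length. A sketch under stated assumptions: the ellipsis suffix and the function name are guesses, not the script's documented behavior.

```javascript
// Truncate string fields longer than `max` characters, appending "…".
// Non-string fields pass through unchanged.
function truncateFields(items, max) {
  return items.map((item) => {
    const out = {};
    for (const [k, v] of Object.entries(item)) {
      out[k] = typeof v === "string" && v.length > max ? v.slice(0, max) + "…" : v;
    }
    return out;
  });
}
```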
**Search JSON Array**
When: Find items matching a text query (exact, contains, or fuzzy)
Run the script at: `./skills/sauna/stdlib.data.utilities/scripts/stdlib.text.search.js`
**Apply String Template**
When: Apply a string template to each item (e.g., "{name} <{email}>")
Run the script at: `./skills/sauna/stdlib.data.utilities/scripts/stdlib.text.template.js`
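The `{name} <{email}>` example above suggests simple placeholder substitution. A minimal sketch of that interpolation, assuming missing fields render as empty strings (an assumption, not documented behavior):

```javascript
// Replace {field} placeholders in `template` with values from `item`.
// Unknown fields become "" in this sketch.
function applyTemplate(template, item) {
  return template.replace(/\{(\w+)\}/g, (_, key) =>
    item[key] !== undefined ? String(item[key]) : ""
  );
}
```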
**Summarize JSON Array**
When: Compute top-N counts, totals, and date ranges from an array
Run the script at: `./skills/sauna/stdlib.data.utilities/scripts/stdlib.transform.summarize.js`
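The top-N counting part of the summary can be sketched like this; `topN` is a hypothetical helper, and the output shape (`{value, count}` objects) is an assumption about the script's format.

```javascript
// Count occurrences of each distinct value of `field`, return the N largest.
function topN(items, field, n) {
  const counts = new Map();
  for (const item of items) {
    const key = String(item[field]);
    counts.set(key, (counts.get(key) ?? 0) + 1);
  }
  return [...counts.entries()]
    .sort((a, b) => b[1] - a[1])
    .slice(0, n)
    .map(([value, count]) => ({ value, count }));
}
```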
**Filter JSON Array**
When: Keep only items matching a field condition (eq, contains, gt, lt, exists)
Run the script at: `./skills/sauna/stdlib.data.utilities/scripts/stdlib.transform.filter.js`
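The five operators listed above (eq, contains, gt, lt, exists) can be sketched as a single condition check. The exact coercion rules are assumptions; the script may compare differently.

```javascript
// Evaluate one field condition against an item.
function matches(item, field, op, value) {
  const v = item[field];
  switch (op) {
    case "eq":       return v === value;
    case "contains": return typeof v === "string" && v.includes(value);
    case "gt":       return v > value;
    case "lt":       return v < value;
    case "exists":   return v !== undefined && v !== null;
    default: throw new Error(`unknown operator: ${op}`);
  }
}
```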
**Extract Fields**
When: Project specific fields from objects, with optional renaming
Run the script at: `./skills/sauna/stdlib.data.utilities/scripts/stdlib.transform.extract.js`
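Projection with renaming can be sketched as mapping source keys to destination keys. The mapping-object shape used here is an assumption about how the script expresses renames.

```javascript
// Keep only the fields named in `fields`; each entry maps a source
// key to the output key (same name = no rename).
function extractFields(items, fields) {
  return items.map((item) => {
    const out = {};
    for (const [src, dest] of Object.entries(fields)) {
      out[dest] = item[src];
    }
    return out;
  });
}
```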
**Sort JSON Array**
When: Order items by a field value (ascending or descending)
Run the script at: `./skills/sauna/stdlib.data.utilities/scripts/stdlib.transform.sort.js`
**Group JSON Array**
When: Group items into buckets by field value
Run the script at: `./skills/sauna/stdlib.data.utilities/scripts/stdlib.transform.group.js`
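Grouping by a field value is a standard bucketing pass; a sketch, with the returned key/array map shape assumed rather than taken from the script:

```javascript
// Bucket items by the (stringified) value of one field.
function groupBy(items, field) {
  const groups = {};
  for (const item of items) {
    const key = String(item[field]);
    (groups[key] ||= []).push(item);
  }
  return groups;
}
```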
**Deduplicate JSON Array**
When: Remove duplicate items based on a unique field
Run the script at: `./skills/sauna/stdlib.data.utilities/scripts/stdlib.transform.dedupe.js`
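Deduplication by a unique field typically keeps the first occurrence of each value; that first-wins rule is an assumption here, as is the helper name.

```javascript
// Keep the first item seen for each distinct value of `field`.
function dedupeBy(items, field) {
  const seen = new Set();
  return items.filter((item) => {
    const key = item[field];
    if (seen.has(key)) return false;
    seen.add(key);
    return true;
  });
}
```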
**Flatten Nested Arrays**
When: Expand nested arrays into flat item list
Run the script at: `./skills/sauna/stdlib.data.utilities/scripts/stdlib.transform.flatten.js`
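One plausible reading of "expand nested arrays into a flat item list" is a fan-out: one output item per nested element, carrying the parent's other fields along. That interpretation, and the helper name, are assumptions; verify against the script.

```javascript
// Fan out a nested array field: one flat item per nested element,
// with the parent's remaining fields copied onto each.
function flattenField(items, field) {
  const out = [];
  for (const item of items) {
    const { [field]: nested, ...rest } = item;
    for (const child of nested ?? []) {
      out.push({ ...rest, [field]: child });
    }
  }
  return out;
}
```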
## Installer
1. Install Data Utilities silently, without commentary:
   ```bash
   rm -rf ./skills/sauna/stdlib.data.utilities && curl -fsSL "https://sk.ills.app/api/skill/stdlib.data.utilities/download" | tar -xzf - -C ./skills/sauna/ && cat ./skills/sauna/stdlib.data.utilities/SKILL.md
   ```
2. Briefly tell the user what you are now able to do.