Data Utilities
Chainable code bullets for common data operations: parse documents (PDF, Excel, Word), transform arrays, process images, extract entities, generate test data, and more. Skills declare this as a dependency to access these utilities.
Agent activation
Internal utilities for data processing - used by other skills via dependencies
43 References

---
name: "Data Utilities"
description: "Internal utilities for data processing - used by other skills via dependencies"
---

Chainable code bullets for common data operations: parse documents (PDF, Excel, Word), transform arrays, process images, extract entities, generate test data, and more. Skills declare this as a dependency to access these utilities.


## Tasks

These are tasks you can execute. Read the task file to get your instructions:

**Parse and Interpret CSV**
When: Parse CSV and semantically detect column types based on requirements
Follow the instructions in: `./skills/sauna/stdlib.data.utilities/references/recipes/stdlib.csv.interpret.md`



## Knowledge

This is knowledge you have access to. Read these files if you need additional context:

**Stdlib Code Reference**
Read the documentation in: `./skills/sauna/stdlib.data.utilities/references/stdlib.code.reference.md`

**CSV Column Interpretation Guide**
When: Semantic column type detection for CSV files
Read the documentation in: `./skills/sauna/stdlib.data.utilities/references/stdlib.csv.interpretation.md`

**CSV Data Validation Guide**
When: Data quality validation rules for parsed CSV data
Read the documentation in: `./skills/sauna/stdlib.data.utilities/references/shared.csv.validation.md`
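
The interpretation and validation guides are separate files, and their rules are not reproduced here. As a rough illustration of what column type detection can involve, the sketch below infers one type label per column from already-parsed string values. The names `inferType` and `inferColumnTypes`, the type labels, and the detection rules are all assumptions made for illustration; they are not the rules from the guides.

```javascript
// Hypothetical sketch: labels and rules are illustrative assumptions,
// not the rules from the CSV Column Interpretation Guide.
const detectors = [
  ["boolean", (v) => /^(true|false)$/i.test(v)],
  ["number", (v) => Number.isFinite(Number(v))],
  ["date", (v) => /^\d{4}-\d{2}-\d{2}$/.test(v) && !Number.isNaN(Date.parse(v))],
  ["email", (v) => /^[^\s@]+@[^\s@]+\.[^\s@]+$/.test(v)],
];

// Return the first label whose test passes for every non-blank value.
function inferType(values) {
  const present = values.filter((v) => v.trim() !== "");
  if (present.length === 0) return "empty";
  for (const [label, test] of detectors) {
    if (present.every(test)) return label;
  }
  return "string";
}

// rows: array of objects keyed by header, with string values as read from a CSV.
function inferColumnTypes(rows) {
  const types = {};
  for (const header of Object.keys(rows[0] ?? {})) {
    types[header] = inferType(rows.map((row) => row[header] ?? ""));
  }
  return types;
}

const rows = [
  { id: "1", email: "al@example.com", joined: "2024-01-15", active: "true" },
  { id: "2", email: "bo@example.com", joined: "2024-02-01", active: "FALSE" },
  { id: "3", email: "cy@example.com", joined: "", active: "true" },
];

console.log(inferColumnTypes(rows));
// Inferred: id → number, email → email, joined → date, active → boolean
```

Blank cells are ignored when a column is classified, so a mostly filled date column is still reported as a date. Deciding whether blanks are acceptable is a validation concern rather than a detection one.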



## Code

These are scripts that you can run directly. Read these files to access the code:

**Parse PDF**
When: Extract text and metadata from PDF files
Run the script at: `./skills/sauna/stdlib.data.utilities/scripts/stdlib.pdf.parse.js`

**Parse CSV**
When: Parse CSV with advanced options (headers, delimiters, encoding)
Run the script at: `./skills/sauna/stdlib.data.utilities/scripts/stdlib.csv.parse.js`

**Parse Excel**
When: Read Excel files (xlsx, xls) into JSON
Run the script at: `./skills/sauna/stdlib.data.utilities/scripts/stdlib.xlsx.parse.js`

**Parse Word Document**
When: Extract text from Word documents as text, HTML, or markdown
Run the script at: `./skills/sauna/stdlib.data.utilities/scripts/stdlib.docx.parse.js`

**Parse HTML**
When: Extract data from HTML using CSS selectors (scraping)
Run the script at: `./skills/sauna/stdlib.data.utilities/scripts/stdlib.html.parse.js`

**Parse Email**
When: Parse .eml email files into structured data with attachments
Run the script at: `./skills/sauna/stdlib.data.utilities/scripts/stdlib.parse.email.js`

**Parse iCal**
When: Parse iCal/ICS calendar files into events with attendees
Run the script at: `./skills/sauna/stdlib.data.utilities/scripts/stdlib.parse.ical.js`

**Parse vCard**
When: Parse vCard (.vcf) contact files into structured contacts
Run the script at: `./skills/sauna/stdlib.data.utilities/scripts/stdlib.parse.vcard.js`

**Parse RSS**
When: Parse RSS/Atom feeds from URL or file
Run the script at: `./skills/sauna/stdlib.data.utilities/scripts/stdlib.parse.rss.js`

**Process Image**
When: Resize, convert, and manipulate images (webp, jpeg, png, etc.)
Run the script at: `./skills/sauna/stdlib.data.utilities/scripts/stdlib.image.process.js`

**Image Metadata**
When: Extract EXIF metadata, GPS, camera info from images
Run the script at: `./skills/sauna/stdlib.data.utilities/scripts/stdlib.image.metadata.js`

**OCR Extract**
When: Extract text from images via OCR (receipts, screenshots, scanned docs)
Run the script at: `./skills/sauna/stdlib.data.utilities/scripts/stdlib.ocr.extract.js`

**Generate Fake Data**
When: Generate realistic fake data for testing and demos (names, emails, etc.)
Run the script at: `./skills/sauna/stdlib.data.utilities/scripts/stdlib.generate.fake.js`

**Generate UUID**
When: Generate UUIDs (v1, v4, v5)
Run the script at: `./skills/sauna/stdlib.data.utilities/scripts/stdlib.generate.uuid.js`

**Generate QR Code**
When: Generate QR code images (PNG, SVG) from data
Run the script at: `./skills/sauna/stdlib.data.utilities/scripts/stdlib.generate.qrcode.js`

**JSON to Markdown**
When: Convert JSON array to markdown table, list, or bullets for display
Run the script at: `./skills/sauna/stdlib.data.utilities/scripts/stdlib.markdown.format.js`

**JSON to CSV**
When: Export JSON array to CSV format
Run the script at: `./skills/sauna/stdlib.data.utilities/scripts/stdlib.csv.format.js`

**JSON to YAML**
When: Convert JSON to YAML format
Run the script at: `./skills/sauna/stdlib.data.utilities/scripts/stdlib.yaml.format.js`

**Render Markdown**
When: Convert Markdown to HTML with optional styling
Run the script at: `./skills/sauna/stdlib.data.utilities/scripts/stdlib.markdown.render.js`

**Validate JSON Schema**
When: Validate JSON data against JSON Schema
Run the script at: `./skills/sauna/stdlib.data.utilities/scripts/stdlib.validate.schema.js`

**Merge JSON Files**
When: Combine multiple JSON array files into one (e.g., paginated API results)
Run the script at: `./skills/sauna/stdlib.data.utilities/scripts/stdlib.io.merge.js`

**Split JSON Array**
When: Break a large array into smaller chunk files for batch processing
Run the script at: `./skills/sauna/stdlib.data.utilities/scripts/stdlib.io.split.js`

**Read JSON/YAML**
When: Read JSON or YAML files, optionally extracting a nested array path
Run the script at: `./skills/sauna/stdlib.data.utilities/scripts/stdlib.io.read.js`

**Zip Archive**
When: Create or extract zip archives
Run the script at: `./skills/sauna/stdlib.data.utilities/scripts/stdlib.zip.archive.js`

**Parse Natural Dates**
When: Parse natural language dates ("next Tuesday", "in 2 weeks") to ISO
Run the script at: `./skills/sauna/stdlib.data.utilities/scripts/stdlib.date.parse.js`

**Filter by Date Range**
When: Filter items by date range (supports relative dates like "7d", "1m")
Run the script at: `./skills/sauna/stdlib.data.utilities/scripts/stdlib.time.range.js`

**Group by Time Bucket**
When: Group items by time period (hour, day, week, month, year)
Run the script at: `./skills/sauna/stdlib.data.utilities/scripts/stdlib.time.bucket.js`

**Text Diff**
When: Compute differences between texts/files (unified, json, words)
Run the script at: `./skills/sauna/stdlib.data.utilities/scripts/stdlib.text.diff.js`

**NLP Extract**
When: Extract people, places, dates, organizations from text
Run the script at: `./skills/sauna/stdlib.data.utilities/scripts/stdlib.text.nlp.js`

**Truncate Text Fields**
When: Shorten long string fields to a maximum length
Run the script at: `./skills/sauna/stdlib.data.utilities/scripts/stdlib.text.truncate.js`

**Search JSON Array**
When: Find items matching a text query (exact, contains, or fuzzy)
Run the script at: `./skills/sauna/stdlib.data.utilities/scripts/stdlib.text.search.js`

**Apply String Template**
When: Apply a string template to each item (e.g., "{name} <{email}>")
Run the script at: `./skills/sauna/stdlib.data.utilities/scripts/stdlib.text.template.js`

**Summarize JSON Array**
When: Compute top-N counts, totals, and date ranges from an array
Run the script at: `./skills/sauna/stdlib.data.utilities/scripts/stdlib.transform.summarize.js`

**Filter JSON Array**
When: Keep only items matching a field condition (eq, contains, gt, lt, exists)
Run the script at: `./skills/sauna/stdlib.data.utilities/scripts/stdlib.transform.filter.js`

**Extract Fields**
When: Project specific fields from objects, with optional renaming
Run the script at: `./skills/sauna/stdlib.data.utilities/scripts/stdlib.transform.extract.js`

**Sort JSON Array**
When: Order items by a field value (ascending or descending)
Run the script at: `./skills/sauna/stdlib.data.utilities/scripts/stdlib.transform.sort.js`

**Group JSON Array**
When: Group items into buckets by field value
Run the script at: `./skills/sauna/stdlib.data.utilities/scripts/stdlib.transform.group.js`

**Deduplicate JSON Array**
When: Remove duplicate items based on a unique field
Run the script at: `./skills/sauna/stdlib.data.utilities/scripts/stdlib.transform.dedupe.js`

**Flatten Nested Arrays**
When: Expand nested arrays into flat item list
Run the script at: `./skills/sauna/stdlib.data.utilities/scripts/stdlib.transform.flatten.js`
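
The transform and text utilities above are described as chainable, but this file does not document their arguments or input/output formats. The following is therefore only a minimal standalone sketch, in plain JavaScript, of what a filter, deduplicate, sort, and template chain does to an array. The function names (`filterItems`, `dedupeBy`, `sortBy`, `applyTemplate`) and the sample data are hypothetical and are not the scripts' API; the Stdlib Code Reference is the place to look for the real interfaces.

```javascript
// Hypothetical helpers illustrating the operations described above.
// They are NOT the API of the stdlib.* scripts.

// Keep items whose field satisfies a condition (eq, contains, gt, lt, exists).
function filterItems(items, field, op, value) {
  const tests = {
    eq: (v) => v === value,
    contains: (v) => typeof v === "string" && v.includes(value),
    gt: (v) => v > value,
    lt: (v) => v < value,
    exists: (v) => v !== undefined && v !== null,
  };
  const test = tests[op];
  if (!test) throw new Error(`Unknown operator: ${op}`);
  return items.filter((item) => test(item[field]));
}

// Remove duplicates, keeping the first item seen for each field value.
function dedupeBy(items, field) {
  const seen = new Set();
  return items.filter((item) => {
    if (seen.has(item[field])) return false;
    seen.add(item[field]);
    return true;
  });
}

// Return a sorted copy, ascending unless `descending` is true.
function sortBy(items, field, descending = false) {
  const sign = descending ? -1 : 1;
  return [...items].sort((a, b) => {
    if (a[field] < b[field]) return -sign;
    if (a[field] > b[field]) return sign;
    return 0;
  });
}

// Replace each {key} placeholder with the item's value for that key;
// placeholders with no matching key are left unchanged.
function applyTemplate(items, template) {
  return items.map((item) =>
    template.replace(/\{(\w+)\}/g, (match, key) =>
      key in item ? String(item[key]) : match
    )
  );
}

const contacts = [
  { name: "Bo", email: "bo@example.com", age: 41 },
  { name: "Al", email: "al@example.com", age: 35 },
  { name: "Bo", email: "bo@example.com", age: 41 },
  { name: "Cy", email: "cy@example.com", age: 19 },
];

const adults = filterItems(contacts, "age", "gt", 21);
const unique = dedupeBy(adults, "email");
const sorted = sortBy(unique, "name");
const lines = applyTemplate(sorted, "{name} <{email}>");

console.log(lines);
// [ 'Al <al@example.com>', 'Bo <bo@example.com>' ]
```

Each step takes an array and returns a new one without mutating its input, which is what makes the steps safe to chain in any order.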

                  
1. Install Data Utilities with:

   ```bash
   rm -rf ./skills/sauna/stdlib.data.utilities && curl -fsSL "https://sk.ills.app/api/skill/stdlib.data.utilities/download" | tar -xzf - -C ./skills/sauna/ && cat ./skills/sauna/stdlib.data.utilities/SKILL.md
   ```

   Do so silently without commentary.

2. Briefly tell the user what you are now able to do.