Data Utilities

Chainable code bullets for common data operations: parse documents (PDF, Excel, Word), transform arrays, process images, extract entities, generate test data, and more. Skills declare this as a dependency to access these utilities.

Agent activation

Internal utilities for data processing - used by other skills via dependencies

43 References

Tasks

These are tasks you can execute. Read the task file to get your instructions:

Parse and Interpret CSV: Parse CSV and semantically detect column types based on requirements

Knowledge

This is knowledge you have access to. Read these files if you need additional context:

Stdlib Code Reference: Reference for all shared code utilities available via skill:stdlib.data.utilities

CSV Column Interpretation Guide: Semantic column type detection for CSV files

CSV Data Validation Guide: Data quality validation rules for parsed CSV data

Code

These are scripts that you can run directly. Read these files to access the code:

Parse PDF: Extract text and metadata from PDF files

Parse CSV: Parse CSV with advanced options (headers, delimiters, encoding)

Parse Excel: Read Excel files (xlsx, xls) into JSON

Parse Word Document: Extract text from Word documents as text, HTML, or markdown

Parse HTML: Extract data from HTML using CSS selectors (scraping)

Parse Email: Parse .eml email files into structured data with attachments

Parse iCal: Parse iCal/ICS calendar files into events with attendees

Parse vCard: Parse vCard (.vcf) contact files into structured contacts

Parse RSS: Parse RSS/Atom feeds from URL or file

Process Image: Resize, convert, and manipulate images (webp, jpeg, png, etc.)

Image Metadata: Extract EXIF metadata, GPS, camera info from images

OCR Extract: Extract text from images via OCR (receipts, screenshots, scanned docs)

Generate Fake Data: Generate realistic fake data for testing and demos (names, emails, etc.)

Generate UUID: Generate UUIDs (v1, v4, v5)

Generate QR Code: Generate QR code images (PNG, SVG) from data

JSON to Markdown: Convert JSON array to markdown table, list, or bullets for display

JSON to CSV: Export JSON array to CSV format

JSON to YAML: Convert JSON to YAML format

Render Markdown: Convert Markdown to HTML with optional styling

Validate JSON Schema: Validate JSON data against JSON Schema

Merge JSON Files: Combine multiple JSON array files into one (e.g., paginated API results)

Split JSON Array: Break a large array into smaller chunk files for batch processing

Read JSON/YAML: Read JSON or YAML files, optionally extracting a nested array path

Zip Archive: Create or extract zip archives

Parse Natural Dates: Parse natural language dates ("next Tuesday", "in 2 weeks") to ISO

Filter by Date Range: Filter items by date range (supports relative dates like "7d", "1m")

Group by Time Bucket: Group items by time period (hour, day, week, month, year)

Text Diff: Compute differences between texts/files (unified, json, words)

NLP Extract: Extract people, places, dates, organizations from text

Truncate Text Fields: Shorten long string fields to a maximum length

Search JSON Array: Find items matching a text query (exact, contains, or fuzzy)

Apply String Template: Apply a string template to each item (e.g., "{name} <{email}>")

Summarize JSON Array: Compute top-N counts, totals, and date ranges from an array

Filter JSON Array: Keep only items matching a field condition (eq, contains, gt, lt, exists)

Extract Fields: Project specific fields from objects, with optional renaming

Sort JSON Array: Order items by a field value (ascending or descending)

Group JSON Array: Group items into buckets by field value

Deduplicate JSON Array: Remove duplicate items based on a unique field

Flatten Nested Arrays: Expand nested arrays into flat item list

---
name: "Data Utilities"
description: "Internal utilities for data processing - used by other skills via dependencies"
---

Chainable code bullets for common data operations: parse documents (PDF, Excel, Word), transform arrays, process images, extract entities, generate test data, and more. Skills declare this as a dependency to access these utilities.


## Tasks

These are tasks you can execute. Read the task file to get your instructions:

**Parse and Interpret CSV**
When: Parse CSV and semantically detect column types based on requirements
Follow the instructions in: `./skills/sauna/stdlib.data.utilities/references/recipes/stdlib.csv.interpret.md`



## Knowledge

This is knowledge you have access to. Read these files if you need additional context:

**Stdlib Code Reference**
Read the documentation in: `./skills/sauna/stdlib.data.utilities/references/stdlib.code.reference.md`

**CSV Column Interpretation Guide**
When: Semantic column type detection for CSV files
Read the documentation in: `./skills/sauna/stdlib.data.utilities/references/stdlib.csv.interpretation.md`

**CSV Data Validation Guide**
When: Data quality validation rules for parsed CSV data
Read the documentation in: `./skills/sauna/stdlib.data.utilities/references/shared.csv.validation.md`



## Code

These are scripts that you can run directly. Read these files to access the code:

**Parse PDF**
When: Extract text and metadata from PDF files
Run the script at: `./skills/sauna/stdlib.data.utilities/scripts/stdlib.pdf.parse.js`

**Parse CSV**
When: Parse CSV with advanced options (headers, delimiters, encoding)
Run the script at: `./skills/sauna/stdlib.data.utilities/scripts/stdlib.csv.parse.js`

**Parse Excel**
When: Read Excel files (xlsx, xls) into JSON
Run the script at: `./skills/sauna/stdlib.data.utilities/scripts/stdlib.xlsx.parse.js`

**Parse Word Document**
When: Extract text from Word documents as text, HTML, or markdown
Run the script at: `./skills/sauna/stdlib.data.utilities/scripts/stdlib.docx.parse.js`

**Parse HTML**
When: Extract data from HTML using CSS selectors (scraping)
Run the script at: `./skills/sauna/stdlib.data.utilities/scripts/stdlib.html.parse.js`

**Parse Email**
When: Parse .eml email files into structured data with attachments
Run the script at: `./skills/sauna/stdlib.data.utilities/scripts/stdlib.parse.email.js`

**Parse iCal**
When: Parse iCal/ICS calendar files into events with attendees
Run the script at: `./skills/sauna/stdlib.data.utilities/scripts/stdlib.parse.ical.js`

**Parse vCard**
When: Parse vCard (.vcf) contact files into structured contacts
Run the script at: `./skills/sauna/stdlib.data.utilities/scripts/stdlib.parse.vcard.js`

**Parse RSS**
When: Parse RSS/Atom feeds from URL or file
Run the script at: `./skills/sauna/stdlib.data.utilities/scripts/stdlib.parse.rss.js`

**Process Image**
When: Resize, convert, and manipulate images (webp, jpeg, png, etc.)
Run the script at: `./skills/sauna/stdlib.data.utilities/scripts/stdlib.image.process.js`

**Image Metadata**
When: Extract EXIF metadata, GPS, camera info from images
Run the script at: `./skills/sauna/stdlib.data.utilities/scripts/stdlib.image.metadata.js`

**OCR Extract**
When: Extract text from images via OCR (receipts, screenshots, scanned docs)
Run the script at: `./skills/sauna/stdlib.data.utilities/scripts/stdlib.ocr.extract.js`

**Generate Fake Data**
When: Generate realistic fake data for testing and demos (names, emails, etc.)
Run the script at: `./skills/sauna/stdlib.data.utilities/scripts/stdlib.generate.fake.js`

**Generate UUID**
When: Generate UUIDs (v1, v4, v5)
Run the script at: `./skills/sauna/stdlib.data.utilities/scripts/stdlib.generate.uuid.js`

**Generate QR Code**
When: Generate QR code images (PNG, SVG) from data
Run the script at: `./skills/sauna/stdlib.data.utilities/scripts/stdlib.generate.qrcode.js`

**JSON to Markdown**
When: Convert JSON array to markdown table, list, or bullets for display
Run the script at: `./skills/sauna/stdlib.data.utilities/scripts/stdlib.markdown.format.js`

**JSON to CSV**
When: Export JSON array to CSV format
Run the script at: `./skills/sauna/stdlib.data.utilities/scripts/stdlib.csv.format.js`

**JSON to YAML**
When: Convert JSON to YAML format
Run the script at: `./skills/sauna/stdlib.data.utilities/scripts/stdlib.yaml.format.js`

**Render Markdown**
When: Convert Markdown to HTML with optional styling
Run the script at: `./skills/sauna/stdlib.data.utilities/scripts/stdlib.markdown.render.js`

**Validate JSON Schema**
When: Validate JSON data against JSON Schema
Run the script at: `./skills/sauna/stdlib.data.utilities/scripts/stdlib.validate.schema.js`

**Merge JSON Files**
When: Combine multiple JSON array files into one (e.g., paginated API results)
Run the script at: `./skills/sauna/stdlib.data.utilities/scripts/stdlib.io.merge.js`

**Split JSON Array**
When: Break a large array into smaller chunk files for batch processing
Run the script at: `./skills/sauna/stdlib.data.utilities/scripts/stdlib.io.split.js`
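
The chunking idea behind this entry can be sketched as follows. This is only an illustration, not the script's actual code or interface: `chunkArray` and its parameters are hypothetical names, and the sketch returns the chunks in memory instead of writing chunk files.

```javascript
// Hypothetical helper: split an array into consecutive chunks of at most `size` items.
function chunkArray(items, size) {
  if (!Number.isInteger(size) || size <= 0) {
    throw new RangeError("size must be a positive integer");
  }
  const chunks = [];
  for (let i = 0; i < items.length; i += size) {
    // slice() copies the window and tolerates a short final chunk.
    chunks.push(items.slice(i, i + size));
  }
  return chunks;
}

console.log(chunkArray([1, 2, 3, 4, 5], 2)); // [ [ 1, 2 ], [ 3, 4 ], [ 5 ] ]
```

Each chunk could then be serialized to its own file for batch processing; how the real script names or writes those files is not specified here.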

**Read JSON/YAML**
When: Read JSON or YAML files, optionally extracting a nested array path
Run the script at: `./skills/sauna/stdlib.data.utilities/scripts/stdlib.io.read.js`

**Zip Archive**
When: Create or extract zip archives
Run the script at: `./skills/sauna/stdlib.data.utilities/scripts/stdlib.zip.archive.js`

**Parse Natural Dates**
When: Parse natural language dates ("next Tuesday", "in 2 weeks") to ISO
Run the script at: `./skills/sauna/stdlib.data.utilities/scripts/stdlib.date.parse.js`

**Filter by Date Range**
When: Filter items by date range (supports relative dates like "7d", "1m")
Run the script at: `./skills/sauna/stdlib.data.utilities/scripts/stdlib.time.range.js`
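
One way relative ranges such as "7d" or "1m" might be resolved is sketched below. The function names, the `field` parameter, and the interpretation of "7d" as "the last 7 days" and "1m" as "the last calendar month" are all assumptions made for illustration; the script's real options and semantics may differ.

```javascript
// Hypothetical sketch, not the script's interface.
const RELATIVE_PATTERN = /^(\d+)([dm])$/;

// Turn "7d" / "1m" (or an absolute date string) into a start Date.
function resolveStart(spec, now) {
  const match = RELATIVE_PATTERN.exec(spec);
  if (!match) {
    const absolute = new Date(spec);
    if (Number.isNaN(absolute.getTime())) {
      throw new Error(`Unrecognized date spec: ${spec}`);
    }
    return absolute;
  }
  const amount = Number(match[1]);
  const start = new Date(now.getTime());
  if (match[2] === "d") {
    start.setUTCDate(start.getUTCDate() - amount);
  } else {
    // Month arithmetic can overflow (e.g. March 31 minus one month).
    start.setUTCMonth(start.getUTCMonth() - amount);
  }
  return start;
}

// Keep items whose date field falls between the resolved start and `now`.
function filterByDateRange(items, field, since, now = new Date()) {
  const startMs = resolveStart(since, now).getTime();
  const endMs = now.getTime();
  return items.filter((item) => {
    const t = new Date(item[field]).getTime();
    return t >= startMs && t <= endMs; // an unparseable date (NaN) fails both checks
  });
}

const sampleEvents = [
  { id: 1, date: "2024-01-10" },
  { id: 2, date: "2024-01-01" },
  { id: 3, date: "2024-01-20" },
];
const sampleNow = new Date("2024-01-15T00:00:00Z");
console.log(filterByDateRange(sampleEvents, "date", "7d", sampleNow).map((e) => e.id)); // [ 1 ]
```

Passing `now` explicitly keeps the sketch deterministic, which makes relative ranges easier to test.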

**Group by Time Bucket**
When: Group items by time period (hour, day, week, month, year)
Run the script at: `./skills/sauna/stdlib.data.utilities/scripts/stdlib.time.bucket.js`
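
Time bucketing can be illustrated with a small sketch that keys each item by a prefix of its UTC ISO timestamp. The names below are hypothetical and do not reflect the script's interface; the sketch covers hour, day, month, and year only, and leaves out week because week boundaries depend on a convention that is not stated here.

```javascript
// Hypothetical sketch: bucket keys are prefixes of the UTC ISO-8601 string,
// e.g. "2024-01-10T08" (hour), "2024-01-10" (day), "2024-01" (month), "2024" (year).
const PREFIX_LENGTH = { year: 4, month: 7, day: 10, hour: 13 };

function groupByTimeBucket(items, field, period) {
  const length = PREFIX_LENGTH[period];
  if (!length) {
    throw new Error(`Unsupported period: ${period}`);
  }
  const buckets = {};
  for (const item of items) {
    const date = new Date(item[field]);
    if (Number.isNaN(date.getTime())) continue; // skip items without a parseable date
    const key = date.toISOString().slice(0, length);
    if (!buckets[key]) buckets[key] = [];
    buckets[key].push(item);
  }
  return buckets;
}

const sampleLogs = [
  { id: "a", ts: "2024-01-10T08:30:00Z" },
  { id: "b", ts: "2024-01-10T21:00:00Z" },
  { id: "c", ts: "2024-02-01T00:00:00Z" },
];
console.log(Object.keys(groupByTimeBucket(sampleLogs, "ts", "day"))); // [ '2024-01-10', '2024-02-01' ]
```

Bucketing in UTC is a simplification; grouping by a local time zone would need an explicit zone conversion first.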

**Text Diff**
When: Compute differences between texts/files (unified, json, words)
Run the script at: `./skills/sauna/stdlib.data.utilities/scripts/stdlib.text.diff.js`

**NLP Extract**
When: Extract people, places, dates, organizations from text
Run the script at: `./skills/sauna/stdlib.data.utilities/scripts/stdlib.text.nlp.js`

**Truncate Text Fields**
When: Shorten long string fields to a maximum length
Run the script at: `./skills/sauna/stdlib.data.utilities/scripts/stdlib.text.truncate.js`

**Search JSON Array**
When: Find items matching a text query (exact, contains, or fuzzy)
Run the script at: `./skills/sauna/stdlib.data.utilities/scripts/stdlib.text.search.js`

**Apply String Template**
When: Apply a string template to each item (e.g., "{name} <{email}>")
Run the script at: `./skills/sauna/stdlib.data.utilities/scripts/stdlib.text.template.js`
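
A template such as "{name} <{email}>" can be expanded per item roughly as shown below. This is a minimal sketch under assumptions: `applyTemplate` is a hypothetical name, placeholders are assumed to be simple `{field}` tokens, and leaving unknown placeholders untouched is a choice made for this example rather than documented behavior of the script.

```javascript
// Hypothetical sketch: replace each {field} token with the item's value for that field.
function applyTemplate(template, item) {
  return template.replace(/\{(\w+)\}/g, (placeholder, key) =>
    // Leave the placeholder as-is when the item has no such field.
    Object.prototype.hasOwnProperty.call(item, key) ? String(item[key]) : placeholder
  );
}

const sampleContacts = [
  { name: "Ada", email: "ada@example.com" },
  { name: "Grace" },
];
console.log(sampleContacts.map((c) => applyTemplate("{name} <{email}>", c)));
// [ 'Ada <ada@example.com>', 'Grace <{email}>' ]
```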

**Summarize JSON Array**
When: Compute top-N counts, totals, and date ranges from an array
Run the script at: `./skills/sauna/stdlib.data.utilities/scripts/stdlib.transform.summarize.js`

**Filter JSON Array**
When: Keep only items matching a field condition (eq, contains, gt, lt, exists)
Run the script at: `./skills/sauna/stdlib.data.utilities/scripts/stdlib.transform.filter.js`
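
The five conditions named above (eq, contains, gt, lt, exists) could be modeled as a small operator table, as in the sketch below. The function name, argument order, and the exact meaning given to each operator (for instance, `exists` treating `null` as missing) are assumptions for illustration, not the script's documented behavior.

```javascript
// Hypothetical sketch: one predicate per condition name.
const OPERATORS = {
  eq: (actual, expected) => actual === expected,
  contains: (actual, expected) =>
    (typeof actual === "string" || Array.isArray(actual)) && actual.includes(expected),
  gt: (actual, expected) => actual > expected,
  lt: (actual, expected) => actual < expected,
  exists: (actual) => actual !== undefined && actual !== null,
};

function filterItems(items, field, op, value) {
  const test = OPERATORS[op];
  if (!test) {
    throw new Error(`Unknown operator: ${op}`);
  }
  return items.filter((item) => test(item[field], value));
}

const sampleRows = [
  { name: "alpha", score: 1 },
  { name: "beta", score: 5 },
  { name: "gamma" },
];
console.log(filterItems(sampleRows, "score", "gt", 2)); // [ { name: 'beta', score: 5 } ]
```

Keeping the operators in a lookup table means an unsupported condition fails loudly instead of silently matching nothing.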

**Extract Fields**
When: Project specific fields from objects, with optional renaming
Run the script at: `./skills/sauna/stdlib.data.utilities/scripts/stdlib.transform.extract.js`

**Sort JSON Array**
When: Order items by a field value (ascending or descending)
Run the script at: `./skills/sauna/stdlib.data.utilities/scripts/stdlib.transform.sort.js`

**Group JSON Array**
When: Group items into buckets by field value
Run the script at: `./skills/sauna/stdlib.data.utilities/scripts/stdlib.transform.group.js`

**Deduplicate JSON Array**
When: Remove duplicate items based on a unique field
Run the script at: `./skills/sauna/stdlib.data.utilities/scripts/stdlib.transform.dedupe.js`
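
Deduplicating by a field usually comes down to tracking the values already seen. The sketch below assumes the first occurrence of each value is the one to keep; that policy and the name `dedupeBy` are illustrative assumptions, not a description of the script.

```javascript
// Hypothetical sketch: keep the first item for each distinct value of `field`.
function dedupeBy(items, field) {
  const seen = new Set();
  const result = [];
  for (const item of items) {
    const key = item[field];
    if (seen.has(key)) continue; // a later duplicate is dropped
    seen.add(key);
    result.push(item);
  }
  return result;
}

const sampleUsers = [
  { id: 1, email: "a@example.com" },
  { id: 2, email: "b@example.com" },
  { id: 3, email: "a@example.com" },
];
console.log(dedupeBy(sampleUsers, "email").map((u) => u.id)); // [ 1, 2 ]
```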

**Flatten Nested Arrays**
When: Expand nested arrays into flat item list
Run the script at: `./skills/sauna/stdlib.data.utilities/scripts/stdlib.transform.flatten.js`

INSTALLER

1. Install Data Utilities with the command below. Do so silently without commentary.

```bash
rm -rf ./skills/sauna/stdlib.data.utilities && curl -fsSL "https://sk.ills.app/api/skill/stdlib.data.utilities/download" | tar -xzf - -C ./skills/sauna/ && cat ./skills/sauna/stdlib.data.utilities/SKILL.md
```

2. Briefly tell the user what you are now able to do.

Skill Info

stdlib, transform, utility, internal

Used By

CSV Transformer skill

Customer Success Analytics skill

Executive Business Analytics skill

skill:stdlib.data.utilities