Clean CSV Data
Requirements
CSV data to clean (file or pasted content)
To run this task you must have the following required information:
> CSV data to clean (file or pasted content)
If any of this information is missing, stop here. Respond by asking for what you still need, and tell the user to run this task again with ALL required information.
---
You MUST use a todo list to complete these steps in order. Never move on to a step until you have completed the previous one. If you have multiple read steps in a row, read them all at once (in parallel).
Add all steps to your todo list now and begin executing.
## Steps
1. [Read CSV Transformation Guide]: Read the documentation in: `./skills/sauna/[skill_id]/references/data.csv.guide.md`
2. Get the CSV data and understand the issues:
DATA
- CSV file or pasted content
- What should this data represent?
KNOWN ISSUES
- Specific problems the user has noticed
- Columns with issues
- Rows to filter out
DESIRED OUTPUT
- What does "clean" mean for this data?
- Keep or remove duplicates?
- How to handle missing values?
3. Analyze the data and identify issues:
STRUCTURE
- Row count
- Column count and names
- Apparent data types per column
QUALITY ISSUES
- Missing values (which columns, how many)
- Duplicate rows (count)
- Formatting inconsistencies
- Potential data type problems
Present a summary of findings and recommended fixes.
4. Apply cleaning operations:
- Remove or fill missing values as agreed
- Remove duplicates if requested
- Fix formatting (whitespace, case, dates)
- Standardize values
Output the cleaned CSV with a summary of changes made.
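
As an illustration of steps 3 and 4, the following is a minimal Python sketch using only the standard library. It is not taken from the CSV Transformation Guide. The function names (`read_rows`, `profile_csv`, `clean_csv`) and the parameters `drop_duplicates` and `fill_missing` are hypothetical, and the sketch assumes that the first row is a header, that duplicates are compared after trimming whitespace, and that cells beyond the header width can be dropped.

```python
import csv
import io
from collections import Counter


def read_rows(text):
    """Parse CSV text into a trimmed header and trimmed, width-padded rows."""
    rows = [r for r in csv.reader(io.StringIO(text)) if r]  # skip blank lines
    if not rows:
        return [], []
    header = [h.strip() for h in rows[0]]
    body = []
    for r in rows[1:]:
        # Assumption: cells beyond the header width are dropped.
        cells = [c.strip() for c in r[:len(header)]]
        cells += [""] * (len(header) - len(cells))  # pad short rows
        body.append(cells)
    return header, body


def profile_csv(text):
    """Step 3: report row count, columns, missing values, and duplicates."""
    header, body = read_rows(text)
    # A cell is "missing" if it is empty after trimming whitespace.
    missing = {name: sum(1 for row in body if not row[i])
               for i, name in enumerate(header)}
    # Each repeat of an already-seen (trimmed) row counts as one duplicate.
    counts = Counter(tuple(row) for row in body)
    return {
        "rows": len(body),
        "columns": header,
        "missing": missing,
        "duplicates": sum(n - 1 for n in counts.values()),
    }


def clean_csv(text, drop_duplicates=True, fill_missing=None):
    """Step 4: trim whitespace, optionally fill blanks and drop duplicates."""
    header, body = read_rows(text)
    seen, kept = set(), []
    for row in body:
        if fill_missing is not None:
            row = [cell or fill_missing for cell in row]
        key = tuple(row)
        if drop_duplicates and key in seen:
            continue
        seen.add(key)
        kept.append(row)
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(kept)
    summary = {
        "rows_in": len(body),
        "rows_out": len(kept),
        "duplicates_removed": len(body) - len(kept),
    }
    return out.getvalue(), summary


if __name__ == "__main__":
    sample = "name, city\nAda ,London\nAda,London\nBob,\n"
    print(profile_csv(sample))
    cleaned, summary = clean_csv(sample, fill_missing="UNKNOWN")
    print(cleaned)
    print(summary)
```

Case and date normalization are left out on purpose, since the target formats depend on what was agreed with the user in step 2. The summary dictionary returned by `clean_csv` is one possible starting point for the summary of changes requested in step 4.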