Merge CSV Files

To run this task you must have the following required information:

> Two or more CSV files to merge and the merge criteria

If you don't have all of this information, stop here. Respond by asking for whatever is missing, and include instructions to run this task again with ALL required information.

---

You MUST use a todo list to complete these steps in order. Never move on to a step until you have completed the previous one. If you have multiple read steps in a row, read them all at once (in parallel).

Add all steps to your todo list now and begin executing.

## Steps

1. [Read CSV Transformation Guide]: Read the documentation in: `./skills/sauna/[skill_id]/references/data.csv.guide.md`

2. Get the files and merge requirements:

FILES
- First CSV file
- Second CSV file (and any additional)

MERGE TYPE
- Stack/append: Add rows from second file below first
- Join: Combine on matching key columns
  - Which columns to match on?
  - Inner, left, right, or outer join?

COLUMN HANDLING
- What to do if column names differ?
- What to do if columns exist in only one file?
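For illustration only, here is a minimal sketch of the stack/append option where columns that exist in only one file are kept and filled with a default value. It assumes Python's standard `csv` module; the helper name `stack_csv`, the `fill` parameter, and the sample data are hypothetical and not prescribed by this task.

```python
import csv
import io


def stack_csv(texts, fill=""):
    """Append the rows of each CSV below the previous one.

    The output uses the union of all headers, in first-seen order.
    """
    columns = []
    rows = []
    for text in texts:
        reader = csv.DictReader(io.StringIO(text))
        for name in reader.fieldnames or []:
            if name not in columns:
                columns.append(name)
        rows.extend(reader)
    # Cells for columns a source file does not have are set to `fill`.
    stacked = [
        {c: fill if r.get(c) is None else r[c] for c in columns}
        for r in rows
    ]
    return columns, stacked


# Hypothetical sample data: the files share `id` but differ otherwise.
first = "id,name\n1,Ada\n2,Grace\n"
second = "id,email\n3,linus@example.com\n"

columns, rows = stack_csv([first, second])
```

Dropping the extra columns instead of filling them is an equally valid choice; which one applies should come from the user's answer to the column-handling questions.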


3. Analyze both files:

FILE 1
- Row count
- Columns
- Key column values (sample)

FILE 2
- Row count
- Columns
- Key column values (sample)

COMPATIBILITY
- Matching columns
- Key overlap (if joining)
- Potential issues

Confirm merge approach before proceeding.
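The analysis above could be sketched as follows, assuming Python's standard `csv` module and that the key column exists in both files. The function names `profile` and `compatibility`, the report field names, and the sample data are hypothetical.

```python
import csv
import io


def profile(text, key):
    """Collect row count, columns, and the distinct key values of one CSV."""
    reader = csv.DictReader(io.StringIO(text))
    rows = list(reader)
    return {
        "rows": len(rows),
        "columns": list(reader.fieldnames or []),
        "keys": {row[key] for row in rows},
    }


def compatibility(p1, p2):
    """Compare two profiles: shared columns, key overlap, duplicate keys."""
    return {
        "matching_columns": [c for c in p1["columns"] if c in p2["columns"]],
        "only_in_file1": [c for c in p1["columns"] if c not in p2["columns"]],
        "only_in_file2": [c for c in p2["columns"] if c not in p1["columns"]],
        "keys_in_both": len(p1["keys"] & p2["keys"]),
        "keys_only_in_file1": len(p1["keys"] - p2["keys"]),
        "keys_only_in_file2": len(p2["keys"] - p1["keys"]),
        # Rows beyond the first for a repeated key; these multiply join output.
        "duplicate_keys_file1": p1["rows"] - len(p1["keys"]),
        "duplicate_keys_file2": p2["rows"] - len(p2["keys"]),
    }


# Hypothetical sample data keyed on `id`.
file1 = "id,name\n1,Ada\n2,Grace\n3,Linus\n"
file2 = "id,email\n2,grace@example.com\n3,linus@example.com\n4,ken@example.com\n"

p1 = profile(file1, "id")
p2 = profile(file2, "id")
report = compatibility(p1, p2)
```

A report like this gives the user concrete numbers (key overlap, duplicate keys, one-sided columns) to confirm or adjust the merge approach before anything is written.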


4. Execute merge:

- Perform the merge operation
- Handle any mismatches or duplicates
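- Validate that the row count matches expectations (for a stack, the sum of the source row counts; for a join, it depends on key overlap and duplicate keys)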

Output the merged data with summary:
- Rows from each source
- Matched/unmatched counts (for joins)
- Final row and column count
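One possible shape for the join and its summary, sketched with Python's standard `csv` module. This is an assumption-laden example, not the required implementation: `join_csv`, `read_csv`, the summary field names, and the sample data are hypothetical, a single key column is assumed, and when a non-key column name appears in both files the left value is kept.

```python
import csv
import io


def read_csv(text):
    reader = csv.DictReader(io.StringIO(text))
    rows = list(reader)
    return list(reader.fieldnames or []), rows


def join_csv(left_text, right_text, key, how="inner", fill=""):
    """Join two CSVs on one key column; `how` is inner, left, right, or outer."""
    left_cols, left_rows = read_csv(left_text)
    right_cols, right_rows = read_csv(right_text)
    # Output columns: all left columns, then right columns not already present.
    columns = left_cols + [c for c in right_cols if c not in left_cols]

    # Index right rows by key; duplicate keys produce one output row per pair.
    index = {}
    for row in right_rows:
        index.setdefault(row[key], []).append(row)

    merged, matched_keys = [], set()
    matched_rows = left_unmatched = 0
    for row in left_rows:
        partners = index.get(row[key], [])
        if partners:
            matched_keys.add(row[key])
            matched_rows += len(partners)
            # Left values win when both sides share a non-key column name.
            merged.extend({**p, **row} for p in partners)
        else:
            left_unmatched += 1
            if how in ("left", "outer"):
                merged.append(dict(row))

    unmatched_right = [r for r in right_rows if r[key] not in matched_keys]
    if how in ("right", "outer"):
        # Unmatched right rows are appended after all left-driven rows.
        merged.extend(dict(r) for r in unmatched_right)

    merged = [
        {c: fill if r.get(c) is None else r[c] for c in columns}
        for r in merged
    ]
    summary = {
        "left_rows": len(left_rows),
        "right_rows": len(right_rows),
        "matched_rows": matched_rows,
        "left_unmatched": left_unmatched,
        "right_unmatched": len(unmatched_right),
        "final_rows": len(merged),
        "final_columns": len(columns),
    }
    return columns, merged, summary


# Hypothetical sample data keyed on `id`.
file1 = "id,name\n1,Ada\n2,Grace\n3,Linus\n"
file2 = "id,email\n2,grace@example.com\n3,linus@example.com\n4,ken@example.com\n"

columns, rows, summary = join_csv(file1, file2, key="id", how="outer")
```

The `summary` dictionary maps directly onto the items listed above: rows from each source, matched and unmatched counts, and the final row and column count.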