Combine Multiple Files Using Claude Code
Merge data from 2-3 sources into one unified output.
Combining Multiple Files
You know how to work with CSVs, spreadsheets, and PDFs individually. But real work rarely lives in one file. Your customer data is in a CRM export, their payment history is in a spreadsheet, and the contract details are in a PDF. Now you need all of it in one place.
The simplest case: stacking similar files
Sometimes you have multiple files with the same structure (monthly reports, weekly exports, regional data) and you just need them combined into one:
Read all CSV files in the /monthly-reports folder.
Stack them into a single file with all rows combined.
Add a "source_file" column so I can tell which file each row came from.
Remove any duplicate header rows.
Save as full_year_report.csv
This is the equivalent of copying and pasting sheets together in Excel, except you don't have to open twelve files and do it row by row.
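If you're curious what a stacking instruction like this might turn into behind the scenes, here is a minimal sketch in Python using only the standard library. The function name `stack_csv_files` is made up for illustration, and the script Claude actually writes may look quite different.

```python
import csv
from pathlib import Path


def stack_csv_files(folder, output_path):
    """Stack every CSV in `folder` into one file, tagging each row with its source file."""
    fieldnames = ["source_file"]
    rows = []
    output = Path(output_path).resolve()
    for path in sorted(Path(folder).glob("*.csv")):
        if path.resolve() == output:
            continue  # skip the combined file if it was saved into the same folder
        with path.open(newline="", encoding="utf-8") as handle:
            reader = csv.DictReader(handle)  # reads each file's own header row
            for name in reader.fieldnames or []:
                if name not in fieldnames:
                    fieldnames.append(name)
            for row in reader:
                # Drop header rows that were repeated in the middle of the data.
                if all(row.get(name) == name for name in reader.fieldnames):
                    continue
                rows.append({"source_file": path.name, **row})
    with open(output_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames, restval="")
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)
```

Calling `stack_csv_files("monthly-reports", "full_year_report.csv")` would write the combined file and return the number of data rows. The sketch assumes every file is a well-formed, UTF-8 encoded CSV.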
Merging files that share a key
More often, you have two files with different columns but a shared identifier, like an email address or order ID:
Read customers.csv and orders.csv.
Match rows by the "email" column.
Create a combined file that includes:
- All columns from customers.csv
- The "order_total" and "order_date" columns from orders.csv
If a customer has multiple orders, create one row per order.
If a customer has no orders, still include them with empty order fields.
Save as customers_with_orders.csv
That last detail matters. "If a customer has no orders, still include them" versus "only include customers who have orders" will give you very different results. Tell Claude which one you want.
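To see why the "no orders" rule changes the result, here is a rough sketch of the matching logic in Python. It assumes the rows have already been read into dictionaries (for example with `csv.DictReader`), and the function name and sample data are invented for illustration. Comparing emails without regard to case or stray spaces is also an assumption, not something the prompt asks for.

```python
from collections import defaultdict


def left_join_on_email(customers, orders):
    """Return one row per order; customers without orders keep empty order fields."""
    orders_by_email = defaultdict(list)
    for order in orders:
        orders_by_email[order["email"].strip().lower()].append(order)

    combined = []
    for customer in customers:
        key = customer["email"].strip().lower()
        # Fallback row: this is the "still include them with empty order fields" rule.
        matches = orders_by_email.get(key) or [{"order_total": "", "order_date": ""}]
        for order in matches:
            combined.append({
                **customer,
                "order_total": order["order_total"],
                "order_date": order["order_date"],
            })
    return combined


# Made-up sample data: Ada has two orders, Bo has none.
customers = [
    {"name": "Ada", "email": "ada@example.com"},
    {"name": "Bo", "email": "bo@example.com"},
]
orders = [
    {"email": "ada@example.com", "order_total": "40", "order_date": "2026-01-10"},
    {"email": "ada@example.com", "order_total": "15", "order_date": "2026-02-02"},
]
rows = left_join_on_email(customers, orders)
```

Removing the fallback row (so unmatched customers are skipped) would give the "only include customers who have orders" version instead.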
Mixing file formats
Your inputs don't all need to be the same format. You can combine a CSV, a spreadsheet export, and data pulled from a PDF in one instruction:
Read these three files:
- contacts.csv (our CRM export)
- payments.xlsx (accounting spreadsheet)
- vendor_list.pdf (the approved vendor directory)
Match contacts to payments using the "email" column.
Match contacts to vendors using "company_name".
Create a single CSV that shows each contact with:
- Their name and email (from contacts.csv)
- Their total payments to date (from payments.xlsx)
- Whether their company is an approved vendor (yes/no, from vendor_list.pdf)
Save as contact_overview.csv
Manually, this is an hour of VLOOKUPs and copy-pasting between windows. Here, you describe the logic once.
Resolving conflicts between files
When two files have different values for the same thing, Claude needs to know which one wins:
Read crm_contacts.csv and email_tool_contacts.csv.
Merge by email address.
Where both files have a phone number for the same person, keep the one from crm_contacts.csv.
Where one file has a value and the other is empty, use the non-empty value.
Flag any rows where the names don't match between files — add a "name_mismatch" column with "yes" or "no".
Save as merged_contacts.csv
Without conflict rules, Claude will make a reasonable guess. But "reasonable" might not match what you want. Spell it out.
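Conflict rules are easier to check when you can see them as logic. The sketch below is one possible Python reading of the prompt above for a single pair of matching records. The function name `merge_contact` and the `name` field are assumptions, and so is the choice to let the CRM value win for every field rather than only for the phone number.

```python
def merge_contact(crm, email_tool):
    """Merge two records for the same email address using explicit conflict rules."""
    merged = {}
    # Keep the CRM's column order, then add any columns only the email tool has.
    for field in dict.fromkeys([*crm, *email_tool]):
        crm_value = (crm.get(field) or "").strip()
        tool_value = (email_tool.get(field) or "").strip()
        # The CRM value wins when both are filled; otherwise use whichever is non-empty.
        merged[field] = crm_value or tool_value

    crm_name = (crm.get("name") or "").strip().lower()
    tool_name = (email_tool.get("name") or "").strip().lower()
    # Only flag a mismatch when both files actually have a name to compare.
    differs = bool(crm_name and tool_name) and crm_name != tool_name
    merged["name_mismatch"] = "yes" if differs else "no"
    return merged
```

Every line here corresponds to a decision. If the prompt leaves one of those decisions out, Claude has to make it for you.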
Building a summary from multiple sources
Sometimes the goal isn't a merged dataset. It's a report that pulls from several places:
Read these files:
- q1_sales.csv
- q1_expenses.csv
- q1_headcount.csv
Create a Q1 business summary that includes:
- Total revenue and total expenses
- Net profit (revenue minus expenses)
- Revenue per employee (total revenue ÷ headcount)
- Top 5 deals by revenue
- The 3 largest expense categories
Format as a markdown document. Save as q1_summary.md
Three inputs, one readable output. No merged CSV needed, just answers.
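The arithmetic behind a summary like this is simple, which makes it easy to spot-check. Here is a minimal Python sketch of the calculations. The column names (`deal`, `revenue`, `category`, `amount`) are hypothetical, since the lesson doesn't say what the Q1 files contain, and headcount is assumed to be a single number.

```python
from collections import Counter


def q1_summary(sales, expenses, headcount):
    """Compute the headline Q1 figures from rows that have already been read in."""
    total_revenue = sum(float(row["revenue"]) for row in sales)
    total_expenses = sum(float(row["amount"]) for row in expenses)

    # Add up expenses per category so the largest ones can be ranked.
    by_category = Counter()
    for row in expenses:
        by_category[row["category"]] += float(row["amount"])

    top_deals = sorted(sales, key=lambda row: float(row["revenue"]), reverse=True)[:5]
    return {
        "total_revenue": total_revenue,
        "total_expenses": total_expenses,
        "net_profit": total_revenue - total_expenses,
        # Avoid dividing by zero if the headcount is missing or zero.
        "revenue_per_employee": total_revenue / headcount if headcount else None,
        "top_deals": [row["deal"] for row in top_deals],
        "largest_expense_categories": [name for name, _ in by_category.most_common(3)],
    }
```

If a number in the finished report looks off, recomputing one or two of these figures by hand from the source files is a quick sanity check.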
Folder-based workflows
If you regularly combine files from a folder, make it part of your instruction:
Read all .csv files in the /weekly-drops folder.
Each file has columns: date, store_id, item, quantity, revenue.
Combine all files into one dataset.
Then create two outputs:
1. combined_weekly.csv: all rows merged, sorted by date
2. weekly_summary.md: total revenue by store, and total revenue by item, for the full period
This works well as a repeatable task. Drop new files into the folder each week, run the same instruction, and get updated outputs.
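For a sense of what the two outputs involve, here is a short Python sketch that works on rows already loaded from the weekly files. The function names are invented for illustration, and the sketch assumes dates are already in YYYY-MM-DD form so that sorting them as text gives the right order.

```python
from collections import defaultdict


def total_by(rows, column):
    """Add up revenue for each distinct value in `column`."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[column]] += float(row["revenue"])
    return dict(totals)


def build_weekly_outputs(rows):
    """Return the date-sorted rows and the text of a markdown summary."""
    combined = sorted(rows, key=lambda row: row["date"])
    lines = ["# Weekly summary"]
    for heading, column in (("Revenue by store", "store_id"), ("Revenue by item", "item")):
        lines += ["", f"## {heading}", ""]
        for key, total in sorted(total_by(rows, column).items()):
            lines.append(f"- {key}: {total:.2f}")
    return combined, "\n".join(lines) + "\n"
```

Because the function only depends on the rows it is given, running it again after new files arrive produces updated outputs from the same logic.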
When things go wrong
Column names rarely match across files. One calls it "email", another "Email Address", another "e_mail". Tell Claude explicitly:
Match using the email columns. They may be labeled "email", "Email Address", or "e_mail" across files. Treat them as the same field.
Date formats are another common mismatch. One file uses "03/09/2026", another "2026-03-09", another "March 9, 2026". Add this to your instruction:
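Standardize all dates to YYYY-MM-DD format in the output.
If a file uses slash-separated dates, also say whether the month or the day comes first: "03/09/2026" could mean March 9 or September 3.
If files have different numbers of columns, that's fine. Just tell Claude which columns you want in the output. You don't need every column from every file.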
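Both clean-up steps are small once they are spelled out. The Python sketch below shows one way they could work. The alias list and date formats come from the examples in this section, the function names are made up, and the sketch assumes slash dates are month-first and that month names are in English.

```python
from datetime import datetime

# Labels that should all be treated as the "email" column.
EMAIL_ALIASES = {"email", "email address", "e_mail"}

# Formats tried in order; "%m/%d/%Y" assumes the month comes first.
DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%B %d, %Y")


def normalize_header(name):
    """Map any known email label to "email"; leave other headers alone."""
    cleaned = name.strip()
    return "email" if cleaned.lower() in EMAIL_ALIASES else cleaned


def normalize_date(text):
    """Convert a date in any known format to YYYY-MM-DD."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date().isoformat()
        except ValueError:
            continue  # not this format, try the next one
    raise ValueError(f"Unrecognized date format: {text!r}")
```

Raising an error on an unknown format, rather than guessing, is a design choice: a date that fails loudly is easier to fix than one that was silently read the wrong way.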
The instruction pattern for combining files
Most multi-file tasks follow this shape:
1. List the input files (be explicit about what each one contains).
2. Specify the join key (what column connects them).
3. Define what to include (which columns from which files).
4. Handle edge cases (missing matches, conflicts, duplicates).
5. Describe the output (file name, format, sort order).
Here's the template:
Read [file 1] and [file 2].
Match rows using the [column name] column.
Create a combined file with:
- [columns] from [file 1]
- [columns] from [file 2]
When [conflict scenario], [what to do].
If a row in [file 1] has no match in [file 2], [include/exclude it].
Save as [output file].
Recap
Combining files is where all the earlier lessons come together. You're reading CSVs, working with spreadsheet exports, pulling data from PDFs, and merging it all into something usable.
The approach is the same as every other lesson: tell Claude what to read, how to match it, what to include, and how to handle the messy parts. The more precise your instructions, the less time you spend checking the output.