skillsFirst-partyReview first Safety · Privacy ·

CSV/Excel Data Wrangler Skill

Clean, filter, join, pivot, and export CSV/XLSX data reliably with reproducible steps. Transform messy spreadsheets into production-ready datasets with pandas. Handle encoding issues, data type conversion, missing values, duplicates, and complex merges.

by JSONbored·added 2025-10-15·

Claude CodeCodexWindsurfGeminiCursorCLI

HarnessClaude CodeCodexWindsurfGeminiCursorCLI

Level:advancedType:generalVerified:draft

Review first — review before installing

Open the source and read safety notes before installing.

Prerequisites

Python 3.11+
pandas
openpyxl
pyarrow (optional for Parquet)
File system read/write access for input CSV/Excel files and output processed files
Sufficient memory for data processing (minimum 2GB RAM recommended for large files, use chunked processing for files >100MB)

Schema details

Install type: package
Reading time: 4 min
Difficulty score: 94
Troubleshooting: Yes
Breaking changes: No

Package metadata

Download URL: /downloads/skills/csv-excel-data-wrangler.zip
Package verified: Yes
SHA-256: 7a075dd66f661d1f928dff7924159040bd6ab2c898f782cbb2862ae79658178e

Skill and platform metadata

Skill type: general
Skill level: advanced
Verification: draft
Verified at: 2025-10-15

Retrieval sources

https://pandas.pydata.org/

Tested platforms

ClaudeCodexOpenClawCursorWindsurfGemini

Platform	Support	Install path
claude-code	Native	.claude/skills/<skill-name>/SKILL.md
codex	Native	.agents/skills/<skill-name>/SKILL.md
windsurf	Native	.windsurf/skills/<skill-name>/SKILL.md
gemini	Native	.gemini/skills/<skill-name>/SKILL.md or .agents/skills/<skill-name>/SKILL.md
cursor	Adapter	.cursor/rules/<skill-name>.mdc
cli	Manual	AGENTS.md or tool-specific context file

Full copyable content

import pandas as pd

customers = pd.read_csv('customers.csv', dtype=str)
orders = pd.read_excel('orders.xlsx')

# Normalize and dedupe
customers['email'] = customers['email'].str.strip().str.lower()
customers = customers.drop_duplicates(subset=['email'])

# Join and summarize
df = orders.merge(customers, on='customer_id', how='left')
sales_by_region = df.groupby('region', dropna=False)['total'].sum().reset_index()

sales_by_region.to_excel('sales_by_region.xlsx', index=False)

About this resource

What This Skill Enables

Claude can clean, transform, analyze, and merge CSV and Excel files with pandas. Upload messy spreadsheets and get production-ready data pipelines, statistical summaries, and formatted exports.

Compatibility

Native

Claude Code / Claude: native skill usage via SKILL.md.
Codex/OpenAI workflows: compatible with Agent Skills-style SKILL.md content as reusable workflow instructions.

Manual Adaptation

Gemini CLI: native skill usage via .gemini/skills/<skill-name>/SKILL.md or .agents/skills/<skill-name>/SKILL.md where supported.
Cursor: use the generated .cursor/rules/*.mdc adapter for project rules.
OpenClaw and similar agents: use the same skill content as a reusable prompt/workflow file when native skill import is unavailable.

Prerequisites

Required:

Claude Pro subscription
Code Interpreter feature enabled
CSV or Excel file uploaded to conversation

What Claude handles:

Installing pandas, openpyxl, and data processing libraries
Detecting file encodings and formats
Type inference and conversion
Memory-efficient processing of large files

How to Use This Skill

Quick Data Cleaning

Prompt: "Clean this CSV file: remove duplicates, fix missing values, standardize column names, and export as clean.csv"

Claude will:

Load and analyze the file structure
Identify data quality issues
Apply cleaning transformations
Export cleaned version

Data Merging & Joining

Prompt: "Merge customers.csv and orders.csv on customer_id. Show me the combined data and export as customer_orders.xlsx"

Claude will:

Load both files
Detect join keys
Perform the merge (inner/left/right/outer)
Validate results
Export formatted Excel file

Data Analysis & Summaries

Prompt: "Analyze this sales data: show me summary statistics, identify top products, calculate monthly trends, and create a pivot table by region."

Claude will:

Generate descriptive statistics
Perform aggregations
Create pivot tables
Calculate trends
Present insights

Format Conversion

Prompt: "Convert this Excel workbook to CSV files, one per sheet, with UTF-8 encoding."

Claude will:

Read all Excel sheets
Export each as separate CSV
Handle encoding properly
Preserve data types where possible

Common Workflows

CRM Data Cleanup

"Clean this customer export:
1. Remove duplicate emails (keep most recent)
2. Standardize phone numbers to (XXX) XXX-XXXX format
3. Fill missing company names with 'Unknown'
4. Split full_name into first_name and last_name
5. Export as customers_clean.xlsx"

Sales Report Generation

"Analyze this sales data:
1. Calculate total revenue by product category
2. Identify top 10 customers by revenue
3. Show month-over-month growth
4. Create a pivot table: rows=salesperson, columns=month, values=revenue
5. Export summary as sales_report.xlsx with formatted numbers"

Data Validation

"Validate this CSV:
1. Check for duplicate IDs
2. Identify rows with missing required fields (name, email, phone)
3. Flag invalid email formats
4. Report data quality issues
5. Export clean rows and error rows separately"

Multi-File Consolidation

"Combine all CSV files I upload into one master file:
1. Ensure columns match (add missing ones)
2. Add a 'source_file' column
3. Remove duplicates across all files
4. Sort by date column
5. Export as consolidated_data.csv"

Tips for Best Results

Be Specific About Columns: Name the exact columns you want to work with
Describe Your Data: Mention what each column represents for better context
Specify Output Format: Tell Claude exactly how you want the result formatted
Handle Missing Data: Be explicit about how to handle nulls (drop, fill with value, forward-fill, etc.)
Large Files: For files >100MB, ask Claude to process in chunks or sample first
Date Formats: Specify your expected date format (MM/DD/YYYY vs DD/MM/YYYY)
Encoding Issues: If you see garbled text, ask Claude to try different encodings (UTF-8, latin-1, etc.)

Advanced Operations

Complex Transformations

Unpivoting (melt) wide data to long format
Creating calculated columns with business logic
Grouping and aggregating with custom functions
Handling multi-index data
Time series resampling and rolling windows

Data Quality Checks

Outlier detection and reporting
Referential integrity validation
Format consistency checks
Statistical anomaly detection

Troubleshooting

Issue: File encoding errors or garbled characters Solution: Ask Claude to detect encoding or try: "Read this with UTF-8-SIG encoding" or "Try latin-1 encoding"

Issue: Memory errors on large files Solution: "Process this file in 10,000 row chunks" or "Sample 10% of rows first to test"

Issue: Wrong data types (dates as strings, numbers as text) Solution: Be explicit: "Convert created_at column to datetime" or "Cast price to float"

Issue: Merge produces unexpected results Solution: Ask Claude to show sample rows before/after merge and explain the join type used

Issue: Excel export loses formatting Solution: "Export with formatted numbers, bold headers, and auto-column-width"

Learn More

Pandas Documentation - Comprehensive data manipulation guide
Excel to Pandas Mapping - Translate Excel operations
Data Cleaning Best Practices - Common data issues and solutions
Claude Code Interpreter Guide - How Claude processes data

Features

Import/export with explicit schema control
Deduplicate and null-safe transformations
Join/merge/pivot with predictable results
Encoding-aware IO with UTF-8/UTF-8-SIG handling
Parquet round-trips for performance
Memory-efficient processing for large files (chunksize, Parquet)
Data validation and quality checks (outlier detection, referential integrity)
Multi-file consolidation and batch processing

Use Cases

Clean messy CRM exports
Join sales and marketing datasets
Generate analyst-ready summary tables
Data migration and format conversion workflows
ETL pipelines for business intelligence
Automated data quality reporting and validation

Content outline

What This Skill Enables
Compatibility
Native
Manual Adaptation
Prerequisites
How to Use This Skill
Quick Data Cleaning
Data Merging & Joining
Data Analysis & Summaries
Format Conversion
Common Workflows
CRM Data Cleanup
Sales Report Generation
Data Validation
Multi-File Consolidation
Tips for Best Results

#csv#xlsx#data-cleaning#pandas#python

Source citations

Signals

Loading live community signals…

Prerequisites

Schema details

About this resource

What This Skill Enables

Compatibility

Native

Manual Adaptation

Prerequisites

How to Use This Skill

Quick Data Cleaning

Data Merging & Joining

Data Analysis & Summaries

Format Conversion

Common Workflows

CRM Data Cleanup

Sales Report Generation

Data Validation

Multi-File Consolidation

Tips for Best Results

Advanced Operations

Complex Transformations

Data Quality Checks

Troubleshooting

Learn More

Features

Use Cases

Source citations

Related resources

CLI Data Visualization Quickstart Skill

DOCX Report Generator Skill

FastAPI Python API Development Skill

Python Data Science - CLAUDE.md Rules for Claude Code

Signals