Sales Data Analysis Project

1. Project Overview

Objective:
To analyze sales performance, identify trends, and evaluate product category performance using historical sales data.

Scope:

  • Time period covered in the dataset
  • Key metrics such as total sales, average order value, and customer count
  • Monthly sales trends and product category performance

Outcome:

  • Clear understanding of sales patterns
  • Identification of top-performing product categories
  • Data-driven insights for business decision-making

2. Key Components

A. Data Collection

Sources:

  • Sales data extracted from a CSV file (sales_data_sample.csv)

Metrics Tracked:

  • Total sales
  • Average order value
  • Number of unique customers
  • Monthly sales trends
  • Sales by product category

B. Analysis Framework

Quantitative Analysis:

  • Aggregation of sales by month and product category
  • Calculation of key performance indicators (KPIs)

Qualitative Analysis:

  • Interpretation of trends in monthly sales
  • Comparison of product category performance

Tools Used:

  • Python (Pandas for data manipulation, Matplotlib & Seaborn for visualization)

C. Key Findings

  • The dataset covers sales from January 29, 2003 to May 31, 2005.
  • Total sales amounted to $2,261,756.77.
  • The average order value was $3,842.61.
  • 92 unique customers were recorded in the dataset.
  • Monthly sales were stable: the data shows consistent sales activity without a clear upward or downward trend over time.
  • The top-performing product category was Classic Cars.

D. Recommendations

  • Focus on promoting high-performing product categories.
  • Investigate fluctuations in monthly sales to identify underlying causes.
  • Enhance customer engagement strategies to increase order value.

3. Implementation

  • Data was loaded and cleaned, ensuring correct date formatting.
  • Key metrics were calculated and visualized using bar charts for trends and horizontal bar charts for category performance.

4. Results & Impact

  • The analysis provided actionable insights into sales performance.
  • Visualizations helped quickly identify trends and top performers.

5. Visual Aids

  • Monthly Sales Trend: Bar chart showing sales over time.

  • Sales by Product Category: Horizontal bar chart ranking categories by total sales.

6. Lessons Learned

  • Proper data encoding is crucial when handling datasets with special characters.
  • Visualizations significantly improve the interpretability of sales trends.
  • Regular analysis of sales data can help in proactive decision-making.

The rest of this document is a structured breakdown of the Python code used for the sales data analysis:

1. Importing Libraries

  • Purpose:
    • pandas: Data manipulation and analysis.
    • matplotlib.pyplot and seaborn: Data visualization.
    • datetime: Handling date-time conversions.

2. Loading the Data

  • Key Actions:
    • Attempts to read a CSV file named sales_data_sample.csv with Korean encoding (cp949).
    • Uses a try-except block to handle potential errors (e.g., file not found, encoding issues).
    • Prints a success message if loaded correctly.
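
The loading step described above might look roughly like the following. This is a minimal sketch rather than the project's actual code: the function name `load_sales` and the exact messages are assumptions made for illustration, while the file name and the `cp949` encoding come from the description.

```python
import pandas as pd


def load_sales(path="sales_data_sample.csv", encoding="cp949"):
    """Read the sales CSV; return a DataFrame, or None if loading fails.

    `load_sales` is a hypothetical helper name, not taken from the project.
    """
    try:
        df = pd.read_csv(path, encoding=encoding)
    except (FileNotFoundError, UnicodeDecodeError) as exc:
        # Missing file or wrong encoding: report the problem and give up.
        print(f"Failed to load {path}: {exc}")
        return None
    print(f"Data loaded successfully ({len(df)} rows)")
    return df
```

Returning `None` on failure keeps the sketch short; a caller would need to check for it before continuing with the analysis.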

3. Displaying Basic Data Info

  • Outputs:
    • Data Overview: Shows a snapshot of the first 3 rows.
    • Data Structure: Summary of columns, data types, and missing values.
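
A small sketch of these two outputs, using a few made-up rows in place of the real CSV. The column name `CUSTOMERNAME` and all values are assumptions for illustration; only `ORDERDATE`, `SALES`, and `PRODUCTLINE` are named in this document.

```python
import pandas as pd

# Hypothetical stand-in rows; the real data comes from sales_data_sample.csv.
df = pd.DataFrame({
    "ORDERDATE": ["2/24/2003 0:00", "5/7/2003 0:00", "7/1/2003 0:00", "8/25/2003 0:00"],
    "SALES": [2871.0, 2765.9, None, 3746.7],
    "CUSTOMERNAME": ["Customer A", "Customer B", "Customer A", "Customer C"],
})

print("Data overview (first 3 rows):")
print(df.head(3))

print("\nData structure:")
df.info()  # prints column names, dtypes, and non-null counts

# Explicit count of missing values per column.
missing = df.isna().sum()
print("\nMissing values per column:")
print(missing)
```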

4. Data Cleaning & Transformation

  • Purpose:
    • Converts ORDERDATE to a datetime format for time-based analysis.
    • Creates a new column YEAR_MONTH to aggregate sales by month.
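
The two transformations can be sketched as follows. The date format string is an assumption about how `ORDERDATE` is stored in the CSV and may need adjusting; the sample rows are made up.

```python
import pandas as pd

# Hypothetical rows; the "%m/%d/%Y %H:%M" layout is an assumed date format.
df = pd.DataFrame({
    "ORDERDATE": ["2/24/2003 0:00", "5/7/2003 0:00", "5/31/2005 0:00"],
    "SALES": [2871.0, 2765.9, 3746.7],
})

# Parse the text dates into proper datetime values.
df["ORDERDATE"] = pd.to_datetime(df["ORDERDATE"], format="%m/%d/%Y %H:%M")

# Reduce each date to its calendar month so rows can be grouped by month.
df["YEAR_MONTH"] = df["ORDERDATE"].dt.to_period("M")

print(df[["ORDERDATE", "YEAR_MONTH"]])
```

Storing `YEAR_MONTH` as a monthly period (rather than a formatted string) keeps the months in chronological order when they are later sorted or grouped.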

5. Generating Key Metrics

  • Metrics Calculated:
    • Time Period: Date range of the dataset.
    • Total Sales: Sum of all sales.
    • Avg. Order Value: Mean sales per order.
    • Unique Customers: Count of distinct customer names.
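
One way these four metrics could be computed is sketched below. The rows are made up, `CUSTOMERNAME` is an assumed column name, and the average treats each row as one order, which may not match how the project defines an order.

```python
import pandas as pd

# Hypothetical rows for illustration only.
df = pd.DataFrame({
    "ORDERDATE": pd.to_datetime(["2003-01-29", "2003-02-24", "2005-05-31"]),
    "SALES": [1000.0, 2500.0, 4000.0],
    "CUSTOMERNAME": ["Customer A", "Customer B", "Customer A"],
})

start, end = df["ORDERDATE"].min(), df["ORDERDATE"].max()
total_sales = df["SALES"].sum()
avg_order_value = df["SALES"].mean()  # assumes one row per order
unique_customers = df["CUSTOMERNAME"].nunique()

print(f"Time period:      {start.date()} to {end.date()}")
print(f"Total sales:      ${total_sales:,.2f}")
print(f"Avg. order value: ${avg_order_value:,.2f}")
print(f"Unique customers: {unique_customers}")
```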

6. Data Visualizations

A. Monthly Sales Trend

  • Steps:
    1. Groups data by YEAR_MONTH and sums SALES.
    2. Plots a bar chart with formatted labels/titles.
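
A rough sketch of these two steps, using plain Matplotlib (Seaborn styling could be layered on top). The function name `plot_monthly_sales`, the figure size, and the axis labels are assumptions, and the sample rows are made up.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so this runs without a display; remove to view interactively
import matplotlib.pyplot as plt
import pandas as pd


def plot_monthly_sales(df):
    """Sum SALES per YEAR_MONTH and draw the totals as a bar chart."""
    monthly = df.groupby("YEAR_MONTH")["SALES"].sum()
    fig, ax = plt.subplots(figsize=(10, 5))
    ax.bar(monthly.index.astype(str).tolist(), monthly.values)
    ax.set_title("Monthly Sales Trend")
    ax.set_xlabel("Month")
    ax.set_ylabel("Total sales")
    ax.tick_params(axis="x", rotation=45)  # keep month labels readable
    fig.tight_layout()
    return monthly, ax


# Hypothetical rows: two orders in January 2003, one in February 2003.
sample = pd.DataFrame({
    "YEAR_MONTH": pd.PeriodIndex(["2003-01", "2003-01", "2003-02"], freq="M"),
    "SALES": [100.0, 200.0, 50.0],
})
monthly, ax = plot_monthly_sales(sample)
```

Calling `ax.figure.savefig(...)` afterwards would write the chart to a file.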

B. Product Category Performance

  • Steps:
    1. Groups sales by PRODUCTLINE, sorts values in ascending order.
    2. Plots a horizontal bar chart for easy comparison.
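
These steps might be sketched as below, again with plain Matplotlib. The function name `plot_category_sales` and the sample figures are assumptions; the category names other than Classic Cars are placeholders.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so this runs without a display; remove to view interactively
import matplotlib.pyplot as plt
import pandas as pd


def plot_category_sales(df):
    """Sum SALES per PRODUCTLINE and draw a ranked horizontal bar chart."""
    by_category = df.groupby("PRODUCTLINE")["SALES"].sum().sort_values(ascending=True)
    fig, ax = plt.subplots(figsize=(8, 5))
    # barh draws the first entry at the bottom, so ascending order
    # places the largest category at the top of the chart.
    ax.barh(by_category.index.tolist(), by_category.values)
    ax.set_title("Sales by Product Category")
    ax.set_xlabel("Total sales")
    fig.tight_layout()
    return by_category, ax


# Hypothetical rows for illustration only.
sample = pd.DataFrame({
    "PRODUCTLINE": ["Classic Cars", "Category B", "Classic Cars", "Category C"],
    "SALES": [500.0, 200.0, 300.0, 100.0],
})
by_category, ax = plot_category_sales(sample)
```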

7. Error Handling

  • Purpose:
    • Catches and displays errors (e.g., file not found, encoding issues).
    • Provides troubleshooting steps for common problems.
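
As an illustration of pairing errors with troubleshooting steps, the sketch below maps a few exception types to hints. The helper names `troubleshooting_hint` and `run_analysis` and the hint wording are hypothetical, not taken from the project's code.

```python
import pandas as pd


def troubleshooting_hint(exc):
    """Return a short suggestion for a few common failure types (hypothetical wording)."""
    if isinstance(exc, FileNotFoundError):
        return "Check that the CSV path is correct and the file exists."
    if isinstance(exc, UnicodeDecodeError):
        return "Try a different encoding, for example 'utf-8' or 'latin-1'."
    if isinstance(exc, KeyError):
        return "Check that the expected column names are present in the CSV."
    return "Unexpected error; inspect the message above."


def run_analysis(path, encoding="cp949"):
    """Load the file and compute total sales, printing a hint on failure."""
    try:
        df = pd.read_csv(path, encoding=encoding)
        return df["SALES"].sum()
    except Exception as exc:
        print(f"Error: {exc}")
        print(f"Hint: {troubleshooting_hint(exc)}")
        return None
```

Catching `Exception` broadly is a simplification for the sketch; narrower `except` clauses make unexpected failures easier to spot.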

Key Takeaways

  1. Data Loading: Explicit encoding handling for non-ASCII characters (e.g., Korean).
  2. Transformations: Date parsing and period extraction for time-based analysis.
  3. Metrics: Focus on sales performance and customer engagement.
  4. Visualizations: Bar charts for trends, horizontal bars for categorical comparisons.
  5. Robustness: Error handling to guide debugging.