Analysis of US avocado sales and prices (2015–2020)
2025

Exploratory analysis and modelling of avocado sales data (organic and conventional) in multiple US markets using R. The project includes outlier detection, correlation analysis, calculation of price-sales elasticities and price forecasting using time series models.

Year: 2025 · Tools: 1 (R) · Reading: 3 min

The problem

Avocados have become a staple of the American diet, but their price is highly volatile: it ranges from $0.44 to $3.25 per unit. Producers and distributors need to understand how price changes affect sales, and whether organic avocados respond differently from conventional ones.

This project analyses 33,045 weekly records from multiple markets (2015‑2020) to answer those questions using exploratory analysis, price-sales elasticity, and time series modelling.


The approach

I designed a three-phase analysis pipeline.

Phase 1 · Exploratory analysis (EDA) — R

I used R with tidyverse, ggplot2 and dplyr to examine the dataset structure. I detected massive outliers (especially in organic avocados), analysed price distribution by region and product type, and calculated key statistics.

  • Average price: $1.38 (min $0.44, max $3.25; Q3 = $1.62)
  • Average weekly volume: 968,400 units (Q3 ≈ 505,828, max ≈ 63.7M)
  • Regions with highest average price: Albany ($1.68) and Boston ($1.74)
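
A minimal sketch of this kind of summary and outlier check, written in base R rather than the tidyverse used in the project. The data are synthetic and the column names (`type`, `price`, `volume`) are hypothetical stand-ins, not the dataset's real headers. The interquartile-range rule shown here is one common way to flag outliers and is an assumption, not necessarily the rule used in the analysis.

```r
# Synthetic stand-in for the avocado data; column names are hypothetical.
set.seed(42)
n <- 400
avocados <- data.frame(
  type   = rep(c("conventional", "organic"), each = n / 2),
  price  = round(runif(n, min = 0.44, max = 3.25), 2),
  volume = round(rlnorm(n, meanlog = 12, sdlog = 1.5))
)

# Flag values outside the Tukey fences: Q1 - k * IQR and Q3 + k * IQR.
flag_outliers <- function(x, k = 1.5) {
  q <- quantile(x, probs = c(0.25, 0.75), names = FALSE)
  fence <- k * (q[2] - q[1])
  x < q[1] - fence | x > q[2] + fence
}

# Minimum, quartiles, mean and maximum for price and volume.
summary(avocados$price)
summary(avocados$volume)

# Flag volume outliers within each product type, then count them per type.
# ave() returns 0/1 for a numeric input, so convert back to logical.
avocados$volume_outlier <- ave(avocados$volume, avocados$type,
                               FUN = flag_outliers) == 1
table(avocados$type, avocados$volume_outlier)

# Mean price by type; swapping `type` for a region column would rank regions.
aggregate(price ~ type, data = avocados, FUN = mean)
```

A mean volume far above the third quartile, as in the figures above, is the usual sign of a strongly right-skewed distribution, which is why a quartile-based rule is a reasonable first pass.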

Phase 2 · Price-sales elasticity — R

I fitted log-log linear regression models to measure the sensitivity of sales to price, with a separate model for each avocado type. The covariance and correlation between price and sales volume are negative but weak for both types:

Type           Covariance    Correlation
Conventional   -122,979.7    -0.092
Organic        -3,027.0      -0.047
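
The log-log approach can be sketched as follows. This is an illustration on simulated data, not the project's code: the column names are hypothetical, and the elasticities used to generate the data (-1.5 and -0.5) are arbitrary simulation inputs, not results. In a log-log model the slope on `log(price)` is read directly as the price elasticity.

```r
# Simulate one product type from a constant-elasticity demand curve.
# All names and parameter values here are assumptions for illustration.
set.seed(1)
n <- 300
simulate_type <- function(type, elasticity, intercept) {
  price  <- runif(n, min = 0.5, max = 3)
  volume <- exp(intercept + elasticity * log(price) + rnorm(n, sd = 0.3))
  data.frame(type = type, price = price, volume = volume)
}
avocados <- rbind(
  simulate_type("conventional", elasticity = -1.5, intercept = 13),
  simulate_type("organic",      elasticity = -0.5, intercept = 10)
)

# One log-log model per type; the slope on log(price) is the elasticity.
fit_elasticity <- function(df) {
  fit <- lm(log(volume) ~ log(price), data = df)
  unname(coef(fit)["log(price)"])
}
elasticities <- sapply(split(avocados, avocados$type), fit_elasticity)
elasticities

# Covariance and correlation of price and volume on the raw scale, per type.
raw_stats <- sapply(split(avocados, avocados$type), function(df) {
  c(covariance  = cov(df$price, df$volume),
    correlation = cor(df$price, df$volume))
})
raw_stats
```

Covariance depends on the units and scale of volume, so its magnitude is not comparable across types with very different sales levels. The correlation and the fitted elasticity are the comparable quantities.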

Phase 3 · Time series forecasting — R

I selected the organic avocado price series for Albany (one of the highest-priced regions). I applied seasonal decomposition using decompose() and generated a 12-week forecast using ARIMA models combined with exponential smoothing (forecast package).
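
A rough sketch of this workflow using only base R (`stats`), so it runs without the forecast package the project used. The weekly series is synthetic, the ARIMA order is a placeholder rather than a selected model, and averaging the two forecasts is only one possible reading of "combined"; all three are assumptions.

```r
# Synthetic weekly price series with a yearly cycle (four years of data).
set.seed(7)
weeks <- 4 * 52
week_index <- seq_len(weeks)
price <- 1.7 + 0.15 * sin(2 * pi * week_index / 52) + rnorm(weeks, sd = 0.05)
price_ts <- ts(price, start = c(2015, 1), frequency = 52)

# Classical additive decomposition into trend, seasonal and random parts.
# decompose() needs at least two full seasonal periods of data.
parts <- decompose(price_ts)
# plot(parts)  # uncomment to inspect the components

# ARIMA(1,0,0) with a mean term; the order is a placeholder, not a tuned choice.
arima_fit <- arima(price_ts, order = c(1, 0, 0))
arima_fc  <- as.numeric(predict(arima_fit, n.ahead = 12)$pred)

# Simple exponential smoothing: HoltWinters() without trend or seasonality.
ses_fit <- HoltWinters(price_ts, beta = FALSE, gamma = FALSE)
ses_fc  <- as.numeric(predict(ses_fit, n.ahead = 12))

# Assumed combination rule: equal-weight average of the two 12-week forecasts.
combined_fc <- (arima_fc + ses_fc) / 2
round(combined_fc, 3)
```

With the forecast package installed, `auto.arima()` and `ets()` followed by `forecast(..., h = 12)` would play the same roles and also return prediction intervals.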


The findings

Finding 1 · Conventional avocados are highly price sensitive

Elasticity of -1.32, so demand is elastic. A 10% price increase is associated with a drop in sales of roughly 13.2%. Conventional avocados therefore need careful pricing to avoid damaging demand.

Finding 2 · Organic avocados tolerate price increases better

Elasticity of -0.767, so demand is inelastic. A 10% price increase is associated with a drop in sales of only about 7.67%. One plausible explanation is that organic buyers prioritise quality or sustainability, which would leave room for higher margins and more flexible pricing policies.
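
Multiplying the elasticity by the percentage price change, as in the two findings above, is a first-order approximation. Under a constant-elasticity (log-log) model the exact change in sales is (1 + price change)^elasticity - 1, which for a 10% increase gives roughly -11.8% for conventional and -7.0% for organic. The helper name below is hypothetical.

```r
# Exact percentage change in sales implied by a constant-elasticity model.
# `price_change` is a proportion, e.g. 0.10 for a 10% increase.
pct_change_in_sales <- function(elasticity, price_change) {
  ((1 + price_change)^elasticity - 1) * 100
}

pct_change_in_sales(-1.32, 0.10)   # conventional
pct_change_in_sales(-0.767, 0.10)  # organic
```

The gap between the approximate and exact figures grows with the size of the price change, so the shortcut is best kept for small changes.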

Finding 3 · Albany prices are expected to remain stable

The 12-week forecast shows no sharp fluctuations, which supports purchasing and inventory planning in that region. Albany, along with Boston ($1.74), is one of the highest-priced regions.


The conclusion

The analysis yields three commercial implications:

  • High market volatility (massive outliers in price and volume) requires continuous monitoring to anticipate changes.
  • Organic avocados allow higher margins; conventional ones require more rigorous optimisation strategies.
  • The expected stability in Albany favours purchasing and inventory planning.

This project demonstrates how to combine descriptive statistics, log-log regression and time series modelling to generate solid, decision-oriented commercial recommendations.


Stack and methodology

Tool                            Use in the project
R · tidyverse, dplyr, ggplot2   Cleaning, EDA, visualisations
R · lm, log-log                 Elasticity models
R · decompose, forecast         Time series and forecasting
Kaggle (Avocado Prices 2020)    Data source

Dataset: 33,045 records · 2015–2020 · price, volume, region, type

Regression model: log-log · conventional: elasticity = -1.32 · organic: elasticity = -0.767

Forecasting: ARIMA + exponential smoothing · 12-week horizon
