A multivariate regression engine built on 18 months of historical sales data, paired with an interactive KPI dashboard that translates demand signals into inventory decisions — reducing carrying costs and improving delivery performance.
Most small-to-mid operations teams make inventory and staffing decisions reactively — responding to stockouts or backlogs after they've already cost revenue or customer trust. Without a reliable demand signal, even experienced operators overbuy in slow periods and understock when demand spikes.
This project set out to replace that guesswork with a repeatable, data-driven forecasting workflow. The goal: build a model accurate enough to inform proactive inventory positioning, and surface the right operational KPIs so teams can catch issues before they compound.
The result is a two-part Excel system — a validated regression model for demand forecasting, and a live KPI dashboard that turns the model's output into actionable operational metrics.
Without a structured forecasting method, inventory decisions were driven by trailing averages and intuition. Lead times varied, and there was no way to anticipate the demand spikes that drove the largest carrying cost overruns. The team needed a quantitative demand model that could generalize across seasonal variation.
Even with a better demand estimate, operators had no unified view of how the operation was performing day-to-day. Machine utilization, inventory turnover, and order fill rates all lived in separate reports. There was no single dashboard surfacing leading indicators early enough to act on them.
Compiled 18 months of historical sales records across product lines. Removed duplicate entries, corrected date inconsistencies, and flagged outliers using IQR-based thresholding. Transformed raw transaction logs into a structured weekly time series ready for regression.
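The IQR thresholding step can be sketched in Python (the stack flagged for the next version). This is a minimal illustration, not the production logic; the function name and the sample data are invented for the example:

```python
import statistics

def iqr_outlier_flags(values, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR] (Tukey's rule)."""
    q1, _, q3 = statistics.quantiles(values, n=4)  # quartile cut points
    iqr = q3 - q1
    lo, hi = q1 - k * iqr, q3 + k * iqr
    return [v < lo or v > hi for v in values]

weekly_units = [120, 115, 130, 118, 122, 480, 125, 119]  # illustrative weekly totals
flags = iqr_outlier_flags(weekly_units)
# the 480-unit spike is flagged for manual review rather than silently dropped
```

Flagging rather than deleting matters here: a "spike" may be a data-entry error or a genuine demand surge, and only a human can tell which.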
Constructed predictive features from the cleaned dataset: rolling 4-week moving averages, month-of-year seasonality indicators, lagged demand values (t-1, t-2, t-4), and a trend variable capturing longer-term growth. Evaluated correlations with a heatmap before including features in the final model.
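The lag and rolling-average features translate directly into code. A minimal sketch (function name and demand values are illustrative; month-of-year indicators are omitted because they require each week's calendar date):

```python
def build_features(series):
    """Rolling 4-week mean, lags t-1/t-2/t-4, and a linear trend index.
    Rows without a full 4-week history are skipped."""
    rows = []
    for t in range(4, len(series)):
        rows.append({
            "ma_4": sum(series[t - 4:t]) / 4.0,  # trailing 4-week mean
            "lag_1": series[t - 1],
            "lag_2": series[t - 2],
            "lag_4": series[t - 4],
            "trend": t,                          # longer-term growth proxy
            "y": series[t],                      # demand to predict
        })
    return rows

demand = [100, 104, 98, 110, 107, 115, 111, 120]  # illustrative weekly units
feats = build_features(demand)
```

Note that every feature uses only values strictly before week t, so the same construction works at prediction time without leaking the target.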
Built the regression model using Excel's Analysis ToolPak with the engineered feature set. Evaluated R², adjusted R², the F-statistic, and per-predictor p-values. Iteratively removed low-significance features and refit until achieving a model with strong explanatory power and minimal multicollinearity.
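The actual fitting was done in Excel; a Python equivalent of the same diagnostics (coefficients, p-values, adjusted R²) might look like the sketch below. It uses a normal approximation in place of the t distribution to avoid a scipy dependency, so the p-values are illustrative; the function name and test data are assumptions:

```python
import numpy as np
from math import erf, sqrt

def fit_ols(X, y):
    """OLS with intercept; returns coefficients, approximate two-sided
    p-values, and adjusted R-squared -- a stand-in for ToolPak output."""
    n = len(y)
    Xd = np.column_stack([np.ones(n), X])        # design matrix with intercept
    beta, *_ = np.linalg.lstsq(Xd, y, rcond=None)
    resid = y - Xd @ beta
    dof = n - Xd.shape[1]
    sigma2 = (resid @ resid) / dof               # residual variance
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(Xd.T @ Xd)))
    # normal approximation to the t distribution (illustration only)
    p_vals = [2 * (1 - 0.5 * (1 + erf(abs(b / s) / sqrt(2))))
              for b, s in zip(beta, se)]
    r2 = 1 - (resid @ resid) / ((y - y.mean()) @ (y - y.mean()))
    adj_r2 = 1 - (1 - r2) * (n - 1) / dof
    return beta, p_vals, adj_r2

# illustrative check: recover y = 2 + 3x from lightly perturbed data
x = np.arange(20.0)
y = 2 + 3 * x + np.array([0.1, -0.1] * 10)       # small deterministic noise
beta, p_vals, adj_r2 = fit_ols(x, y)
```

Backward elimination then amounts to dropping the predictor with the largest p-value above a threshold (e.g. 0.05) and refitting until all survivors are significant.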
Reserved the final 20% of the time series as a holdout test set — not used during model training. Ran the fitted model against this held-out period and measured mean absolute percentage error (MAPE) and root mean square error (RMSE). Validated that the model generalizes beyond the training window before use in production decisions.
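Both holdout metrics are simple to state in code. A minimal sketch (function names and the sample series are invented; zero-demand weeks are skipped in MAPE since the percentage error is undefined there):

```python
from math import sqrt

def mape(actual, forecast):
    """Mean absolute percentage error, skipping zero-demand weeks."""
    pairs = [(a, f) for a, f in zip(actual, forecast) if a != 0]
    return 100.0 * sum(abs(a - f) / abs(a) for a, f in pairs) / len(pairs)

def rmse(actual, forecast):
    """Root mean square error."""
    n = len(actual)
    return sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / n)

actual   = [100, 110, 120, 130]   # held-out weekly demand (illustrative)
forecast = [ 95, 112, 118, 140]
```

Reporting both is deliberate: MAPE is easy to communicate to operators, while RMSE penalizes the large misses that actually drive stockouts and overstock.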
Designed an interactive Excel dashboard consolidating 15+ operational KPIs into a single view. Core metrics include machine utilization rate, inventory turnover, order fill rate, average lead time, and backlog volume. Used dynamic named ranges and slicers for filtering by product line and time window. Color-coded thresholds flag metrics approaching risk territory.
Translated model forecasts into reorder point calculations using a safety stock formula that accounts for demand variability and lead time uncertainty. Built a decision layer in the dashboard that outputs recommended order quantities per SKU based on current stock levels and the upcoming forecast horizon.
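The standard reorder-point formula with both demand and lead-time variability is ROP = d·L + z·sqrt(L·σ_d² + d²·σ_L²). A sketch of the decision layer, assuming a 95% cycle-service-level target (z ≈ 1.65) and made-up SKU numbers; none of these figures come from the source:

```python
from math import ceil, sqrt

def reorder_point(mean_demand, sd_demand, mean_lead, sd_lead, z=1.65):
    """ROP = expected lead-time demand + safety stock.
    z=1.65 targets roughly a 95% service level (assumed, not from the source)."""
    safety = z * sqrt(mean_lead * sd_demand**2 + mean_demand**2 * sd_lead**2)
    return mean_demand * mean_lead + safety

def recommended_order(rop, on_hand, on_order, lot_size=1):
    """Order enough to reach the reorder point, in whole lot multiples."""
    gap = rop - (on_hand + on_order)
    return 0 if gap <= 0 else ceil(gap / lot_size) * lot_size

# illustrative SKU: 100 units/week mean demand, lead time 2 weeks +/- 0.5
rop = reorder_point(mean_demand=100, sd_demand=20, mean_lead=2, sd_lead=0.5)
qty = recommended_order(rop, on_hand=150, on_order=50, lot_size=25)
```

Counting on-order units against the reorder point prevents the dashboard from recommending a second order for demand that an in-flight shipment already covers.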
The regression model gave the team a 3–4 week forward demand signal, allowing reorder decisions to be made before stockouts occurred — eliminating the emergency procurement costs that drove most carrying cost overruns.
The dashboard revealed that utilization rate at key production stages consistently preceded backlog buildups by 10–14 days. Teams could now intervene (staffing, scheduling) before queues formed.
Adding month-of-year binary indicators reduced MAPE by ~8 percentage points vs. a model without seasonality features — the single highest-impact feature engineering decision.
The reorder point calculation, incorporating demand standard deviation and lead time variance, cut stockout events in the simulated period by over 60% versus a simple moving average policy.
The biggest operational errors came from waiting too long to act on demand signals. A model that's 85% accurate and acted on early outperforms a 94%-accurate model used reactively. Earliness of intervention matters as much as model precision.
The static model was trained once. A production system should retrain on a rolling window — incorporating the last N weeks of actuals — so the model adapts to demand shifts rather than drifting out of calibration over time.
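A rolling retrain is conceptually just "refit on the last N weeks, forecast H weeks ahead." A deliberately simplified sketch using a trend-only fit (the real model would refit the full feature set; function names, window, and horizon are assumptions):

```python
def fit_trend(window):
    """Closed-form least-squares line y = a + b*t over one window."""
    n = len(window)
    t_mean = (n - 1) / 2.0
    y_mean = sum(window) / n
    num = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(window))
    den = sum((t - t_mean) ** 2 for t in range(n))
    b = num / den
    return y_mean - b * t_mean, b

def rolling_forecast(series, window=12, horizon=4):
    """Refit on the most recent `window` actuals, project `horizon` weeks out."""
    a, b = fit_trend(series[-window:])
    return [a + b * (window - 1 + h) for h in range(1, horizon + 1)]

series = [10 + 2 * t for t in range(20)]  # illustrative linear demand history
forecast = rolling_forecast(series)
```

Because each refit discards old weeks, a sustained demand shift enters the coefficients within one window length instead of being diluted by 18 months of stale history.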
Excel is powerful but brittle at scale. The next version would move the regression and KPI logic into Python (pandas, statsmodels, Streamlit), with an automated pipeline that pulls live data, retrains the model, and pushes updated forecasts to a web dashboard.
The stations with highest queue buildup were the ones with least visibility — not because they were inherently slower, but because there was no early warning. Better instrumentation (more KPIs, lower latency) is often worth more than additional capacity alone.