Data Readiness Assessment for AI Demand Forecasting

Most AI demand forecasting projects that stall in pilot or get abandoned within 12 months share a common root cause: the data environment was not ready before the tool was selected. Not underpowered hardware, not change management resistance, not vendor limitations — the data.

This guide is a structured assessment framework. Work through it before you issue an RFP, before you schedule vendor demos, and definitely before you sign a contract. The output is a readiness verdict at each gate — either a clear go, a conditional go with remediation steps, or a stop that tells you what to fix first.

Why Data Readiness Fails to Get Assessed Properly

The typical failure pattern looks like this: a vendor demo shows impressive forecast accuracy on their reference dataset. The buying team is convinced. IT is asked to "connect the ERP." Three months later, the implementation partner is requesting data extracts that nobody prepared for, and the project is in remediation mode.

Vendor-led assessments are structurally biased toward optimism. A vendor's pre-sales team is not incentivized to tell you that your data is too sparse or too dirty to support the product they are selling. Their scoping questionnaires ask whether data exists — not whether it is fit for ML training.

The assessment in this guide is structured around four dimensions that actually determine whether an AI forecasting model can be trained, validated, and maintained in production: history depth, data quality and consistency, feature availability, and integration architecture. Each has specific minimum thresholds and common failure modes.

Dimension 1: History Depth

AI forecasting models learn from patterns in historical demand. Without sufficient history, the model cannot distinguish signal from noise — and more importantly, it cannot learn seasonal patterns, promotional effects, or lifecycle curves.

Minimum History Requirements by Model Type

Minimum demand history thresholds by AI forecasting model type. These are training requirements, not archival requirements.
Forecasting Approach	Minimum History Required	Preferred History	Notes
Statistical baseline (ARIMA, ETS)	12 months	24–36 months	Adequate for stable, low-seasonality SKUs
Gradient boosting (XGBoost, LightGBM)	18–24 months	36+ months	Needs enough seasonal cycles to learn patterns
Deep learning (LSTM, Transformer-based)	24–36 months	48+ months	Data-hungry; sparse SKUs degrade accuracy significantly
Probabilistic / quantile forecasting	24 months minimum	36+ months	Requires enough tail observations to calibrate intervals
Causal / external feature models	Matches longest external signal lag	36+ months	External data (weather, promotions) must align temporally

Assessing Your History Depth

Extract a representative sample of 200–500 active SKU-location combinations from your ERP or order management system.
For each combination, calculate the longest run of months with at least one transaction. A gap of 3 or more months breaks the run; history on either side of such a gap does not count as continuous.
Segment results into buckets: <12 months, 12–24 months, 24–36 months, 36+ months. The distribution matters more than the average.
Flag any SKUs introduced through acquisition, system migration, or re-SKU events — their "history" may not reflect actual demand continuity.
Identify what percentage of your active revenue-generating SKUs fall below the minimum threshold for your target model type.

If more than 30% of your active SKUs fall below the minimum threshold, you have a structural history problem, not an edge case. The remediation options are limited: wait for more history to accumulate before deploying, accept that those SKUs will use statistical fallbacks rather than ML models, or use cross-SKU transfer learning if your vendor supports it (few do this well).
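The assessment steps above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the input shape (a list of year-month pairs per SKU-location), the function names, and the choice to count tolerated gap months as part of a run are all assumptions, and the bucket boundaries are treated as half-open.

```python
from collections import Counter

def longest_continuous_months(active_months, max_gap=2):
    """Longest span (in months) of history with no gap of 3+ months.

    active_months: (year, month) pairs with at least one transaction.
    max_gap: empty months tolerated inside a run (assumption: 2, so a
             gap of 3 or more months breaks continuity).
    """
    idx = sorted({y * 12 + (m - 1) for y, m in active_months})
    if not idx:
        return 0
    best, run_start, prev = 0, idx[0], idx[0]
    for cur in idx[1:]:
        if cur - prev - 1 > max_gap:          # empty months between prev and cur
            best = max(best, prev - run_start + 1)
            run_start = cur
        prev = cur
    return max(best, prev - run_start + 1)

def bucket(months):
    """Assign a history length to one of the four assessment buckets."""
    if months < 12:
        return "<12"
    if months < 24:
        return "12-24"
    if months < 36:
        return "24-36"
    return "36+"

def share_below(history_by_sku, threshold_months):
    """Fraction of SKU-locations with less history than the threshold."""
    below = sum(1 for m in history_by_sku.values() if m < threshold_months)
    return below / len(history_by_sku)

# Illustrative sample (made-up SKU-location keys)
sample = {
    ("SKU-A", "DC1"): [(2022, m) for m in range(1, 13)] + [(2023, m) for m in range(1, 13)],
    ("SKU-B", "DC1"): [(2023, 1), (2023, 2), (2023, 7), (2023, 8), (2023, 9)],  # 4-month gap
    ("SKU-C", "DC2"): [(2023, 1), (2023, 2), (2023, 5), (2023, 6)],              # 2-month gap
}
history = {key: longest_continuous_months(months) for key, months in sample.items()}
buckets = Counter(bucket(m) for m in history.values())
print(history)                    # SKU-A: 24, SKU-B: 3, SKU-C: 6
print(share_below(history, 18))   # 2 of 3 combinations fall below 18 months
```

In practice the share would be computed against the minimum threshold for the target model type, and weighted by revenue if the 30% test is applied to revenue-generating SKUs.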

Dimension 2: Data Quality and Consistency

History depth is a prerequisite. Data quality determines whether that history is usable. The two most common quality failures in demand data are demand signal contamination and structural inconsistency across time periods.

Demand Signal Contamination

Most ERP systems record shipments or orders, not actual end-customer demand. If you are training a model on shipment data, you are training it on a signal that includes stockouts (suppressed demand), forward buying, channel fill, and order batching. The model learns the wrong thing.

Stockout periods: If a SKU was out of stock for 2 weeks in a given month, the recorded shipments understate true demand. Uncorrected, the model learns that demand was lower than it was.
Returns and cancellations: If your data includes gross shipments without netting out returns, the model sees inflated demand in some periods.
Intercompany transfers: Transfers between DCs or between legal entities often appear in the same demand stream as customer orders. They need to be separated before training.
Promotional spikes without flags: A 3x demand spike during a promotion looks like a structural anomaly to a model that has no promotional calendar attached to the data.
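A rough sketch of the first three corrections, assuming transactions arrive as dictionaries with a `type` field. The type codes, field names, and the 30-day period length are hypothetical; real ERP extracts will use their own codes. The stockout step only flags constrained periods; it does not attempt to estimate the lost demand.

```python
from collections import defaultdict

CUSTOMER_TYPES = {"customer_order"}          # hypothetical type codes
RETURN_TYPES = {"return", "cancellation"}    # hypothetical type codes

def net_customer_demand(transactions):
    """Sum customer demand per (sku, period), netting out returns and
    ignoring intercompany transfers and other non-customer types."""
    demand = defaultdict(float)
    for t in transactions:
        key = (t["sku"], t["period"])
        if t["type"] in CUSTOMER_TYPES:
            demand[key] += t["qty"]
        elif t["type"] in RETURN_TYPES:
            demand[key] -= t["qty"]
        # any other type (e.g. intercompany transfer) is excluded
    return dict(demand)

def flag_constrained(demand, stockout_days, days_in_period=30):
    """Attach stockout information so constrained periods can be
    corrected or masked before training."""
    flagged = {}
    for key, qty in demand.items():
        days_out = stockout_days.get(key, 0)
        flagged[key] = {
            "qty": qty,
            "in_stock_share": 1 - days_out / days_in_period,
            "constrained": days_out > 0,
        }
    return flagged

tx = [
    {"sku": "A", "period": "2024-01", "type": "customer_order", "qty": 100},
    {"sku": "A", "period": "2024-01", "type": "return", "qty": 10},
    {"sku": "A", "period": "2024-01", "type": "intercompany_transfer", "qty": 500},
    {"sku": "A", "period": "2024-02", "type": "customer_order", "qty": 40},
]
net = net_customer_demand(tx)
flagged = flag_constrained(net, {("A", "2024-02"): 14})
print(net)   # {('A', '2024-01'): 90.0, ('A', '2024-02'): 40.0}
```

The 500-unit transfer never reaches the demand series, and the February figure carries a flag showing it was recorded while the SKU was out of stock for 14 days.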

Structural Inconsistency

Structural inconsistency occurs when the underlying data collection method or business process changed during the historical period. Common triggers include ERP migrations, acquisitions, channel restructuring, and changes in order capture systems.

A model trained across a structural break will learn a pattern that does not exist. If your company migrated from SAP ECC to S/4HANA 18 months ago and the data mapping was imperfect, the first 18 months of "history" in your current system may not be comparable to the most recent 18 months. Concatenating them and treating the result as 36 months of consistent history is a mistake that will surface as unexplained model degradation.
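To make the arithmetic concrete, the sketch below computes usable history as the months since the most recent known break, and adds a crude before/after level comparison. Both helpers are illustrative assumptions; a ratio far from 1.0 is only a prompt to investigate the data mapping, not a formal break test.

```python
from datetime import date
from statistics import mean

def months_between(start, end):
    """Whole calendar months from start to end (same month = 0)."""
    return (end.year - start.year) * 12 + (end.month - start.month)

def usable_history_months(first_record, last_record, break_dates):
    """Months of history counted from the most recent structural break
    (or from the first record if no break falls inside the window)."""
    relevant = [b for b in break_dates if first_record <= b <= last_record]
    start = max(relevant) if relevant else first_record
    return months_between(start, last_record) + 1

def level_shift_ratio(monthly_qty, break_index):
    """Mean monthly demand after the break divided by the mean before it."""
    before, after = monthly_qty[:break_index], monthly_qty[break_index:]
    return mean(after) / mean(before)

first, last = date(2021, 7, 1), date(2024, 6, 1)
migration = date(2023, 1, 1)   # hypothetical migration cut-over month

nominal = usable_history_months(first, last, [])
usable = usable_history_months(first, last, [migration])
print(nominal, usable)         # 36 18

ratio = level_shift_ratio([100] * 6 + [130] * 6, break_index=6)
print(ratio)                   # 1.3
```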

Quality Assessment Checklist

Stockout identification: Can you identify periods where demand was supply-constrained? Do you have a stockout flag or inventory position history at the SKU-location level?
Transaction type separation: Is customer demand separable from intercompany, sample, and return transactions in your source system?
Promotional calendar: Does a structured promotional calendar exist, and can it be joined to the demand history by SKU and date?
ERP migration history: Has the demand data been validated across any system migrations in the historical period?
SKU master consistency: Have SKUs been re-numbered, split, or merged during the historical period? Is there a crosswalk table?
Granularity consistency: Is the demand data available at the same time and location granularity throughout the entire history?
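For the SKU master item, a crosswalk is only useful if it can be applied mechanically. The sketch below assumes a simple old-to-new mapping (a hypothetical structure) and handles renumbering and merges; splits are not covered, because they need an allocation rule that the crosswalk alone cannot supply.

```python
from collections import defaultdict

def resolve_sku(sku, crosswalk):
    """Follow old -> new mappings until a current SKU is reached."""
    seen = set()
    while sku in crosswalk:
        if sku in seen:
            raise ValueError(f"cycle in crosswalk at {sku}")
        seen.add(sku)
        sku = crosswalk[sku]
    return sku

def remap_history(rows, crosswalk):
    """Re-key (sku, period, qty) rows to current SKU ids and sum merged SKUs."""
    out = defaultdict(float)
    for sku, period, qty in rows:
        out[(resolve_sku(sku, crosswalk), period)] += qty
    return dict(out)

# Hypothetical crosswalk: OLD-1 was renumbered twice, OLD-2 was merged in
crosswalk = {"OLD-1": "MID-1", "MID-1": "NEW-1", "OLD-2": "NEW-1"}
rows = [
    ("OLD-1", "2022-01", 10),
    ("OLD-2", "2022-01", 5),
    ("NEW-1", "2024-01", 20),
]
remapped = remap_history(rows, crosswalk)
print(remapped)   # {('NEW-1', '2022-01'): 15.0, ('NEW-1', '2024-01'): 20.0}
```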

Dimension 3: Feature Availability

AI forecasting models improve over statistical baselines primarily because they can incorporate features beyond historical demand — pricing, promotions, product attributes, external signals. If those features are not available in a structured, joinable form, the model's advantage over a well-tuned statistical model narrows considerably.

Feature availability assessment: what each feature enables and what happens when it is absent.
Feature Category	Examples	Readiness Condition	If Missing
Promotional data	Price changes, trade promotions, display events	Structured calendar joinable by SKU and week	Model cannot learn promotional lift; spikes treated as noise
Product attributes	Category, brand, pack size, lifecycle stage	Available in item master, linkable to demand history	Cross-SKU learning degraded; new product forecasting impaired
Pricing history	List price, net price, competitor price (if available)	Time-stamped price table at SKU level	Demand-price elasticity cannot be modeled
External signals	Weather, economic indices, web traffic	Aligned to demand geography and time granularity	External drivers cannot be modeled; signals that are present but temporally misaligned degrade accuracy
Inventory position	On-hand, in-transit, backorders	Daily or weekly snapshot history	Stockout correction and service-level modeling not possible
Customer segmentation	Channel, customer tier, region	Joinable to order-level demand data	Model cannot learn channel-specific patterns

The practical question is not whether these features would be useful — they almost always are — but whether they exist in a form that can be joined to your demand history at the right granularity and time resolution. A promotional calendar that lives in a spreadsheet, updated manually by the trade marketing team, is not the same as a structured promotional table in your data warehouse.
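A quick way to test "joinable" is to attempt the join. The sketch below assumes weekly demand rows and a promotional table with start and end dates per SKU (both shapes are hypothetical), joins them on SKU and ISO week, and lists weeks that look like spikes but carry no promotional flag. The spike rule (more than twice the SKU's median weekly quantity) is an arbitrary illustration, not a recommended threshold.

```python
from datetime import date, timedelta
from statistics import median

def iso_week(d):
    """(ISO year, ISO week) for a date."""
    iso = d.isocalendar()
    return (iso[0], iso[1])

def promo_week_index(promo_rows):
    """Map (sku, iso_week) -> promo type for every week a promotion touches."""
    index = {}
    for sku, start, end, kind in promo_rows:
        day = start
        while day <= end:
            index[(sku, iso_week(day))] = kind
            day += timedelta(days=1)
    return index

def join_promotions(demand_rows, promo_rows):
    """Left-join weekly demand rows to the promotional calendar."""
    index = promo_week_index(promo_rows)
    return [
        {"sku": sku, "week": iso_week(d), "qty": qty,
         "promo": index.get((sku, iso_week(d)))}
        for sku, d, qty in demand_rows
    ]

def unflagged_spikes(joined, multiple=2.0):
    """Rows above `multiple` x the SKU's median quantity with no promo flag."""
    by_sku = {}
    for row in joined:
        by_sku.setdefault(row["sku"], []).append(row["qty"])
    return [r for r in joined
            if r["promo"] is None and r["qty"] > multiple * median(by_sku[r["sku"]])]

demand = [
    ("A", date(2024, 3, 4), 100), ("A", date(2024, 3, 11), 310),
    ("A", date(2024, 3, 18), 95), ("A", date(2024, 3, 25), 105),
    ("A", date(2024, 4, 1), 290),
]
promos = [("A", date(2024, 3, 11), date(2024, 3, 17), "trade_promo")]

joined = join_promotions(demand, promos)
suspects = unflagged_spikes(joined)
print([r["qty"] for r in suspects])   # [290]
```

A long list of unflagged spikes usually means the promotional calendar is incomplete, which is worth knowing before it is treated as a model feature.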

Dimension 4: Integration Architecture

Even with good data, a deployment can fail at the integration layer. AI forecasting tools need to receive data from your source systems on a defined schedule, and they need to push forecast outputs back into your planning or ERP systems in a format those systems can consume. Both directions matter.

Inbound Data Flow

Most AI demand forecasting tools are SaaS-based and expect data delivered via API, SFTP, or a cloud data warehouse connector. The questions to answer before deployment:

What is the latency between a transaction in your ERP and its availability in the data layer that feeds the forecasting tool? If it is 48+ hours, your demand sensing use cases are effectively off the table.
Who owns the data pipeline? If the answer is "we will figure it out," that is a red flag. Pipeline ownership needs to be assigned before go-live, not after.
What happens to the pipeline when the ERP is patched or upgraded? This is not hypothetical — SAP and Oracle release updates on defined schedules, and data extracts frequently break.
Is the data delivered in the granularity the model expects? Some tools require daily demand at SKU-location; others accept weekly. Aggregation and disaggregation logic needs to be defined.
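The latency question can be answered with a sample of records that carry two timestamps: when the transaction was posted in the ERP and when it became visible in the data layer. The field names `posted_at` and `available_at` are assumptions; the 48-hour cut-off is the one cited above.

```python
from datetime import datetime
from statistics import median

def latency_hours(records):
    """Hours between ERP posting and availability in the data layer."""
    return [
        (r["available_at"] - r["posted_at"]).total_seconds() / 3600
        for r in records
    ]

def share_at_or_above(hours, limit):
    """Fraction of records whose latency is at least `limit` hours."""
    return sum(1 for h in hours if h >= limit) / len(hours)

records = [
    {"posted_at": datetime(2024, 3, 1, 8, 0), "available_at": datetime(2024, 3, 1, 20, 0)},
    {"posted_at": datetime(2024, 3, 1, 9, 0), "available_at": datetime(2024, 3, 3, 9, 0)},
    {"posted_at": datetime(2024, 3, 2, 10, 0), "available_at": datetime(2024, 3, 3, 4, 0)},
]
hours = latency_hours(records)
print(hours)                          # [12.0, 48.0, 18.0]
print(median(hours))                  # 18.0
print(share_at_or_above(hours, 48))   # one record in three hits the 48-hour mark
```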

Outbound Forecast Consumption

The forecast output needs to land somewhere useful. Typical destinations are an S&OP planning tool, an inventory optimization module, or directly into ERP as a planned demand signal. Each has different format requirements and different tolerance for forecast granularity mismatches.

A common integration failure: the AI tool produces forecasts at a daily SKU-DC level, but the downstream planning system only accepts weekly forecasts at the SKU-region level. The aggregation logic needs to be built, tested, and owned — and it is rarely included in a vendor's standard implementation scope.
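The aggregation in that example is simple to write and easy to get wrong at the edges. A minimal sketch, assuming point forecasts, ISO weeks, and a hypothetical DC-to-region mapping; an unmapped DC raises a `KeyError` on purpose so that gaps in the mapping surface during testing rather than silently dropping volume.

```python
from collections import defaultdict
from datetime import date

# Hypothetical mapping; in practice this comes from the location master
DC_TO_REGION = {"DC-East-1": "East", "DC-East-2": "East", "DC-West-1": "West"}

def aggregate_forecast(daily_rows, dc_to_region):
    """Roll daily SKU-DC point forecasts up to weekly SKU-region totals.

    daily_rows: (sku, dc, date, qty) tuples.
    Returns {(sku, region, iso_year, iso_week): qty}.
    """
    weekly = defaultdict(float)
    for sku, dc, day, qty in daily_rows:
        iso = day.isocalendar()
        weekly[(sku, dc_to_region[dc], iso[0], iso[1])] += qty
    return dict(weekly)

daily = [
    ("A", "DC-East-1", date(2024, 3, 4), 10),
    ("A", "DC-East-1", date(2024, 3, 5), 12),
    ("A", "DC-East-2", date(2024, 3, 4), 7),
    ("A", "DC-West-1", date(2024, 3, 4), 5),
    ("A", "DC-East-1", date(2024, 3, 11), 9),   # falls in the following ISO week
]
weekly = aggregate_forecast(daily, DC_TO_REGION)
print(weekly[("A", "East", 2024, 10)])   # 29.0
```

Summation is only valid for point forecasts. Quantile or interval forecasts do not add up across days or locations in general, so a probabilistic tool needs a different roll-up agreed with the vendor.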

Readiness Scoring and Decision Gates

After completing the assessment across all four dimensions, you need a way to translate findings into a deployment decision. The framework below maps assessment outcomes to three possible verdicts.

Data readiness scoring gates. A single Red dimension is sufficient to recommend remediation before proceeding with vendor selection or deployment.
Dimension	Green (Proceed)	Yellow (Conditional)	Red (Stop)
History Depth	≥24 months at SKU-location for >70% of active SKUs	18–24 months for majority; <30% below threshold	<18 months for majority, or >30% of revenue SKUs below threshold
Data Quality	Demand signal is clean, stockouts flagged, no structural breaks	Known issues with documented remediation plan and owner	Contaminated demand signal with no cleaning plan; unresolved structural break
Feature Availability	Promotional calendar and product attributes structured and joinable	Key features exist but require ETL work to make joinable	Critical features (promotions, pricing) exist only in unstructured or manual form
Integration Architecture	Defined pipeline owner, tested extract, documented output format	Pipeline design exists but not tested end-to-end	No pipeline owner assigned; format mismatch between tool output and planning system input

A single Red verdict is a stop condition. Two or more Yellow verdicts without a concrete remediation timeline should also be treated as a stop, because the remediation work will land during the deployment and extend it unpredictably.
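The decision rule can be written down explicitly, which helps when several people score the dimensions independently. This sketch encodes one reading of the rule: any Red stops, and two or more Yellows that lack a remediation timeline also stop. The dimension keys and the `has_timeline` input are assumed names.

```python
def deployment_verdict(ratings, has_timeline=None):
    """Translate per-dimension ratings into a verdict.

    ratings: {dimension: "green" | "yellow" | "red"}
    has_timeline: {dimension: bool} for Yellow dimensions with a concrete
                  remediation timeline (missing entries count as False).
    Returns (verdict, dimensions_driving_it).
    """
    has_timeline = has_timeline or {}
    reds = [d for d, r in ratings.items() if r == "red"]
    if reds:
        return "stop", reds
    yellows = [d for d, r in ratings.items() if r == "yellow"]
    unplanned = [d for d in yellows if not has_timeline.get(d, False)]
    if len(unplanned) >= 2:
        return "stop", unplanned
    if yellows:
        return "conditional go", yellows
    return "go", []

ratings = {
    "history_depth": "green",
    "data_quality": "yellow",
    "feature_availability": "yellow",
    "integration_architecture": "green",
}
print(deployment_verdict(ratings))
# ('stop', ['data_quality', 'feature_availability'])

planned = {"data_quality": True, "feature_availability": True}
print(deployment_verdict(ratings, planned))
# ('conditional go', ['data_quality', 'feature_availability'])
```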

Segment-Specific Considerations

The thresholds above are general. Some business contexts shift the requirements materially.

High SKU Turnover (Fashion, Consumer Electronics)

When a large proportion of your SKUs are new each season, the history depth problem is structural and permanent. AI models for these environments need to rely heavily on product attribute similarity and cross-SKU transfer learning. The data readiness question shifts: do you have rich, structured product attribute data that can serve as a proxy for demand history on new introductions? If your item master is sparse or inconsistently populated, this approach also fails.
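Whether an item master is "sparse" can be measured as a fill rate per attribute. The sketch below assumes the item master is available as a dictionary of attribute dictionaries; the attribute names and what counts as an acceptable fill rate are assumptions to be set per business.

```python
def is_populated(value):
    """True if the attribute holds something other than None or blank text."""
    return value is not None and str(value).strip() != ""

def attribute_fill_rates(item_master, attributes):
    """Share of items with a populated value, per attribute."""
    total = len(item_master)
    return {
        attr: sum(1 for item in item_master.values() if is_populated(item.get(attr))) / total
        for attr in attributes
    }

# Hypothetical item master extract
item_master = {
    "SKU-1": {"category": "Jackets", "brand": "Acme", "pack_size": 1},
    "SKU-2": {"category": "Jackets", "brand": "", "pack_size": 1},
    "SKU-3": {"category": "Shoes", "brand": None},
    "SKU-4": {"category": " ", "brand": "Acme", "pack_size": 6},
}
rates = attribute_fill_rates(item_master, ["category", "brand", "pack_size"])
print(rates)   # {'category': 0.75, 'brand': 0.5, 'pack_size': 0.75}
```

Fill rate says nothing about consistency. Two spellings of the same brand both count as populated, so a check on distinct values per attribute is a sensible follow-up.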

Highly Promoted Categories (CPG, Grocery)

For CPG and grocery, the promotional calendar is not a nice-to-have feature; it is a requirement. Promotions can represent 30–60% of volume in some categories. A model trained without promotional flags will see inexplicably high demand in certain weeks and will try to forecast those spikes as if they were structural. The result is systematic over-forecasting in non-promoted periods.

Post-Acquisition or Post-Migration Environments

If your organization has completed a major acquisition or ERP migration within the last 24 months, treat your history depth as starting from the point where data was consolidated into a single consistent system — regardless of what legacy archives exist. The effort required to reconcile pre- and post-migration data into a training-ready format is almost always underestimated.

Remediation Sequencing

If your assessment produces Yellow or Red verdicts, the remediation sequence matters. Not all fixes take the same time, and some unlock others.

Assign pipeline ownership first. Without a named owner for the data pipeline, no other remediation will stick. This is an organizational decision, not a technical one.
Clean the demand signal before structuring features. Transaction type separation and stockout flagging need to happen first. Adding promotional features to a contaminated demand signal does not help.
Document structural breaks and decide on truncation. If there is an ERP migration break in your history, decide whether to truncate history at that point or invest in reconciling the pre-migration data. Truncation is almost always the faster path.
Structure promotional and attribute data in parallel. This work can proceed alongside demand signal cleaning. Target a joinable, versioned promotional calendar in your data warehouse before the model training phase.
Test the integration end-to-end with synthetic data. Before loading real training data into a vendor environment, validate that the pipeline can deliver data in the required format and that forecast outputs land in your planning system correctly.
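For the final step, a small synthetic extract and a schema check are enough to exercise the pipeline before any real data moves. Everything in this sketch is illustrative: the schema, field names, and `SYN-` SKU ids are invented, and a vendor's actual input specification should replace `REQUIRED_FIELDS`.

```python
import random
from datetime import date, timedelta

# Hypothetical input schema; replace with the vendor's documented format
REQUIRED_FIELDS = {"sku": str, "location": str, "date": date, "qty": float}

def synthetic_rows(n_skus=3, n_days=14, seed=0):
    """Deterministic fake daily demand rows for pipeline testing."""
    rng = random.Random(seed)
    start = date(2024, 1, 1)
    return [
        {"sku": f"SYN-{s:03d}", "location": "DC-TEST",
         "date": start + timedelta(days=d), "qty": float(rng.randint(0, 50))}
        for s in range(n_skus) for d in range(n_days)
    ]

def validate_rows(rows, schema=REQUIRED_FIELDS):
    """Return (row_index, field, problem) tuples; an empty list means clean."""
    errors = []
    for i, row in enumerate(rows):
        for field, expected in schema.items():
            if field not in row:
                errors.append((i, field, "missing"))
            elif not isinstance(row[field], expected):
                errors.append((i, field, "wrong type"))
        if isinstance(row.get("qty"), float) and row["qty"] < 0:
            errors.append((i, "qty", "negative"))
    return errors

rows = synthetic_rows()
print(len(rows), validate_rows(rows))   # 42 []

bad = [{"sku": "X", "location": "DC", "date": "2024-01-01", "qty": -1.0}]
print(validate_rows(bad))
# [(0, 'date', 'wrong type'), (0, 'qty', 'negative')]
```

Running the same validator on the forecast file that comes back from the tool checks the outbound direction with the downstream system's format.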

What This Assessment Does Not Cover

Data readiness is one of four readiness areas for an AI demand forecasting implementation, and a clean data environment does not guarantee a successful deployment. The other three areas each carry their own failure modes: organizational readiness (who owns the forecast and how overrides are governed), model governance (how drift is monitored and retraining is triggered), and change management (how planners interact with AI-generated forecasts).

Teams that complete this data assessment and find themselves at Green across all four dimensions are in a position to move to vendor evaluation with confidence. Those at Yellow have a concrete remediation backlog. Those at Red know what to fix before spending further on the implementation.
