AI 2026 Roadmap — Phase 1 Execution

by | Mar 16, 2026 | Artificial Intelligence, MVNO

Introduction

The previous article, “MVNOs’ Route to Bridging the AI Chasm”, established WHAT MVNO leaders should do in 2026 to turn prior AI pilots into production and achieve real financial outcomes in their operations. The follow-up article, “From AI Pilots to Profits: 2026 Roadmap for MVNOs”, proposed HOW MVNO leaders could define a clear Roadmap to execute it. This new chapter zooms into the execution of the proposed Roadmap’s first phase.

The goal for this first phase should be direct and simple: get the data right. That means completing the data preparation work, getting a Feature Store into production, and laying the foundation for deploying the first wave of machine learning use cases through the following phases.

To achieve this goal, clean data feeds are needed, repeatable checks must be introduced, and a Feature Store must be built that different teams can grow, use, and reuse without long development cycles.

What to execute in Phase 1?

This initial phase should aim at completing three things:

1. Analyze and clean the source data feeds that the MVNO receives from the MNO or builds internally, such as:

    • Call Detail Records (CDRs) for Voice and SMS services, both Mobile Originated (MO) and Mobile Terminated (MT), as well as Data services.
    • Ordering/Sales data to trace service usage to revenues, and
    • Customers data table to link customer identities, service subscriptions, and mobile numbers (MSISDNs).

Data should be analyzed for structural consistency, field completeness, time logic, and practical fitness for billing checks and business use.

2. Stabilize data pipelines ensuring consistent formats, checks, and anomaly alerts.

3. Build up the Feature Store with the first feature groups, ready both for Machine Learning training processes and for live production use cases.

The scope of this initial phase should be kept tight, ensuring a strong foundation to build upon.

What must be achieved in the initial analysis?

The analysis of the raw data has one main purpose: to prove that the data is both correct and useful. Below are examples of validation criteria:

  • Evaluate the consistency of structures and formats across the various data source models. Analyzing the raw CDR information makes it possible to identify specific types of intelligence and extract actionable insights.
  • Run integrity checks to verify that key CDR fields are present and that the associated time logic holds. Test for nulls or invalid data in essential fields such as MSISDN, timestamp fields, duration, charged amounts, and rating groups used. Once these checks pass, basic event integrity is ensured.
  • Link usage to money and context by processing Ordering data to connect usage to charges and fees. The Customers data table provides the lifecycle status for each customer, as well as its status history, which is very valuable for obtaining lifecycle insights.
  • Build usage views to create a daily time series for session count, traffic volume in Gigabytes, and distinct MSISDNs generating events. It is also important to map the behavior of rating packs by day and to document all queries so they can be re-run on demand.
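The usage views described here can be sketched in plain Python. The CDR record layout and the sample values below are illustrative assumptions, not the actual MNO feed:

```python
from datetime import datetime

# Hypothetical CDR rows: (msisdn, timestamp, bytes_used) -- the field layout
# and values are illustrative, not the real MNO feed schema.
cdrs = [
    ("34600000001", "2026-01-05T08:12:00", 120_000_000),
    ("34600000002", "2026-01-05T09:30:00", 450_000_000),
    ("34600000001", "2026-01-06T10:00:00", 300_000_000),
]

def daily_usage(rows):
    """Aggregate session count, traffic in GB, and distinct MSISDNs per day."""
    out = {}
    for msisdn, ts, nbytes in rows:
        day = datetime.fromisoformat(ts).date().isoformat()
        d = out.setdefault(day, {"sessions": 0, "gb": 0.0, "msisdns": set()})
        d["sessions"] += 1
        d["gb"] += nbytes / 1e9
        d["msisdns"].add(msisdn)
    return {day: {"sessions": d["sessions"],
                  "gb": round(d["gb"], 3),
                  "distinct_msisdns": len(d["msisdns"])}
            for day, d in out.items()}
```

A real pipeline would run this aggregation per feed and persist each daily row so the time series can be re-built on demand.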

What to fix and how to clean the raw data flows?

Turning data analysis into action, the goal should be to deliver the data in its best form: meaningful, clean, and ready.

  • Setting data rules and checks: define explicit field lists, formats, and value rules for each data feed. Every file in the feed must pass automated checks; if a check fails, the file must be quarantined and the team that provided it notified, preventing faulty data from being silently used in downstream processes. These checks keep bad data from slipping into live features.
  • Enforcing duplicate controls: define event identity keys and tie-breaker rules for late arrivals and repeats. The rule keeps the best record and drops the rest, preserving truth without inflating usage.
  • Mapping and reclassifying rating packs: align internal system rating packs with commercial service plans, and build rules to correct known mislabels in traffic type. This improves both product analysis and customer care scenarios that rely on traffic class.
  • Monitoring time series: track daily and hourly volumes, distinct MSISDNs, session counts, and average usage per line, with alerts on spikes, drops, and silent periods. This monitoring helps catch both data issues and real business anomalies.
  • Documentation: short guides are important to train the teams that will use the data.

With these actions in place, the data feeds can be assessed as stable and trusted, providing the base needed to switch on the Feature Store.
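The data rules, quarantine flow, and duplicate controls above can be sketched in a few lines of Python. The field list, validation rules, and event identity key are illustrative assumptions, not the actual feed schema:

```python
# Illustrative required-field list for one feed; a real feed contract
# would define formats and value rules per field as well.
REQUIRED = ("msisdn", "timestamp", "duration_s", "charged")

def validate(record):
    """Return the list of rule violations for one CDR record."""
    errors = [f for f in REQUIRED if record.get(f) in (None, "")]
    if not errors and record["duration_s"] < 0:
        errors.append("negative_duration")
    return errors

def ingest(records):
    """Split a feed into accepted records and a quarantine list,
    keeping a single copy of each duplicated event."""
    quarantine, best = [], {}
    for rec in records:
        errs = validate(rec)
        if errs:
            quarantine.append((rec, errs))  # quarantined, never used downstream
            continue
        key = (rec["msisdn"], rec["timestamp"])  # illustrative event identity key
        best.setdefault(key, rec)                # tie-breaker: first valid copy wins
    return list(best.values()), quarantine
```

In production the quarantine list would trigger a notification to the providing team rather than sit in memory, and the tie-breaker would pick the most complete record instead of simply the first.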

What is the Feature Store and how does it work?

Features are engineered variables used in ML models to predict or classify outcomes. For example, when looking at a person, several features can be determined, such as height, weight, facial characteristics, hair color, and so on. Exactly the same applies to telecommunications-specific data: the average duration of calls, the different numbers a person calls or receives calls from, the number of different locations where the subscriber has used mobile services, SIM card related information, and so on. The Feature Store hosts all these Features, making them available to different teams both for training Machine Learning (ML) models and for live production use cases.
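As an illustration, two such telecom features could be computed from voice CDRs like this. The record layout and sample rows are assumptions made for the example:

```python
from statistics import mean

# Illustrative voice CDRs: (caller, callee, duration_s)
calls = [
    ("A", "B", 60), ("A", "C", 120), ("A", "B", 180), ("B", "A", 30),
]

def subscriber_features(msisdn, rows):
    """Two classic telecom features for one subscriber:
    average outgoing call duration and count of distinct contacts."""
    outgoing = [r for r in rows if r[0] == msisdn]
    callees = {r[1] for r in outgoing}
    callers = {r[0] for r in rows if r[1] == msisdn}
    return {
        "avg_call_duration_s": mean(r[2] for r in outgoing) if outgoing else 0.0,
        "distinct_contacts": len(callees | callers),
    }
```

Once a feature like this is defined, the Feature Store is the place where its definition, owner, and refresh schedule live, so every team computes it the same way.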

The design of the Feature generation pipeline should be kept quite simple, with three zones and a set of core entities.

Zones

  • Raw Data zone where source files land exactly as they are received. This is the initial point, with no edits or filters.
  • Clean Data zone (Data Lake), where schema checks, duplicate controls, and stable keys are applied. Also, dates and formats are standardized.
  • Feature Store zone where ready-to-use features are computed and grouped by business themes. Each feature will have a name, a clear description, an owner, a refresh plan, tests, lineage, and sample code for use.

This layout keeps lineage clear and makes it easy to trace any decision back to its inputs and logic.

Every feature will bind to one of the Feature groups below. Features will also keep a consistent event time or snapshot time, allowing safe point-in-time joins and avoiding data leakage into model training.
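A point-in-time lookup of this kind can be sketched as follows: given dated feature snapshots, a training example may only see the latest snapshot at or before its own event date, never a later one. The snapshot layout is an illustrative assumption:

```python
from bisect import bisect_right

# Hypothetical daily snapshots of one feature for one MSISDN: (date, value).
# ISO dates compare correctly as strings, so they can be bisected directly.
snapshots = [("2026-01-01", 1.0), ("2026-01-05", 2.5), ("2026-01-09", 4.0)]

def as_of(snapshots, event_date):
    """Return the latest feature value at or before event_date, so a
    training row never sees data from its own future (no leakage)."""
    dates = [d for d, _ in snapshots]
    i = bisect_right(dates, event_date)
    return snapshots[i - 1][1] if i else None
```

Feature store frameworks perform this "as-of" join at scale when assembling training sets, but the principle is exactly this lookup.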

Examples of Feature groups

Initial Feature groups will be the set that supports the most common commercial and customer care use cases:

  • Usage and consumption: daily, weekly, monthly, and yearly traffic volume; session count; average session size; most active hour; peak/off-peak usage; weekday/weekend usage; MO/MT ratio; among many other features that may be relevant.
  • Products and plans: active plan; add on packs; days since last plan change; count of plan changes in the last period.
  • Spend and revenue: top ups; recurring charges; promotions and refunds; billed value in the current cycle; days since last payment.
  • Network and quality: simple throughput views; session failure rate; drop rate when events allow.
  • Lifecycle: days since activation; days since first event; days since last event; signs of inactivity.
  • Risk and fair use: device swap count; unusual location patterns; simple rules for fraud alerts.

All features will live in the Feature Store and the store will keep the definition, owners, refresh rules, checks, and sample queries. It becomes the single place to learn how a feature is built and how to use it.

The Feature Store makes these features available in two main ways: offline and online. The offline mode is used to train Machine Learning models, while the online mode is used for quick lookups, supporting live actions in apps, web, or care systems.
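The offline/online split can be illustrated with a toy store that keeps the full history for training and only the latest row per key for live lookups. This is a deliberately simplified sketch, not a real Feature Store implementation:

```python
class FeatureStore:
    """Toy store: the offline side keeps full history for model training;
    the online side keeps only the latest features per key for fast lookups."""

    def __init__(self):
        self.offline = []   # list of (msisdn, snapshot_date, features) rows
        self.online = {}    # msisdn -> latest features

    def write(self, msisdn, snapshot_date, features):
        self.offline.append((msisdn, snapshot_date, features))
        self.online[msisdn] = features

    def get_online(self, msisdn):
        return self.online.get(msisdn)
```

In practice the offline side is a warehouse or lake table and the online side a key-value store, but the contract is the same: one write path, two read paths.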

Batch and real time

Two different paths can be run:

  • The batch path updates hourly and daily features that need windows, like seven-day averages and cycle-to-date totals. This path is useful for training and for daily or hourly decisions.
  • The real-time path updates fast counters and last-seen values as new data files arrive, allowing the execution of models that must run immediately, such as anomaly detection scenarios.

This split is convenient to keep costs under control while meeting the needs of most use cases.
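As a minimal sketch of the real-time path, a sliding-window counter can maintain a "sessions in the last hour" feature as events arrive. The window size and the epoch-seconds timestamps are assumptions made for illustration:

```python
from collections import deque

class RollingCounter:
    """Real-time path sketch: count of events seen in the last window_s seconds."""

    def __init__(self, window_s=3600):
        self.window_s = window_s
        self.events = deque()  # timestamps in arrival (ascending) order

    def add(self, ts):
        self.events.append(ts)
        # Evict everything that has fallen out of the window.
        while self.events and self.events[0] <= ts - self.window_s:
            self.events.popleft()

    def count(self):
        return len(self.events)
```

Batch features with longer windows would instead be recomputed on schedule from the Clean Data zone, which is why the split keeps costs under control.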

Trust, tests, and controls

Each feature has:

  • Freshness goals and quality goals that will be tracked.
  • Validation tests that will run at compute time.
  • Drift checks to catch sudden changes in distributions.
  • Lineage and versioning to enable tracing a score to its inputs and code.
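One common way to implement the drift check above is the Population Stability Index (PSI) over binned feature distributions. The binning into shares and the thresholds quoted are conventional practice, not something prescribed by this roadmap:

```python
from math import log

def psi(expected, actual, eps=1e-6):
    """Population Stability Index between two binned distributions
    (lists of bin shares summing to 1). Conventional reading:
    < 0.1 stable, 0.1-0.25 watch, > 0.25 significant drift."""
    return sum((a - e) * log((a + eps) / (e + eps))
               for e, a in zip(expected, actual))
```

Running this per feature at each refresh, against the distribution seen at training time, is enough to catch sudden shifts before they silently degrade a model.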

Finally, it is essential to keep role-based access with clear rules. Product teams should see the plan features. Customer care teams should access churn and risk flags as needed. Finance teams should use revenue audit views. Data scientists and analysts should be granted wider access for machine learning model work.

What should the deliverables be?

The first phase will be considered complete when the Feature Store is live, offering the various MVNO teams:

  • Stable pipelines from CDRs, Ordering/Sales, and Customers data, with structure checks, null checks, time logic checks, and evidence from the analysis work. These steps ensure consistency across the different tables, confirm common formats for Data, Voice, and SMS, and validate key fields and time rules.
  • A Feature Store with the first wave of usage, plan, spend, lifecycle, network, and risk features.
  • Docs and examples in plain words showing the math when needed and explaining how teams can use the features.
  • Ownership and governance for each feature group, approval rules for changes, how to test a new feature, and how to retire one when it no longer adds value.

This initial phase is the crucial step to unlock speed. Teams will not have to build features from scratch for every project; they will be able to pick existing features off the shelf and build faster.

How will this help MVNO leaders?

There are three practical gains:

  • One source of truth: a shared set of features defined across product, customer care, finance, and growth teams.
  • Faster time to value: with the base features ready and tested, new use cases will go from idea to live in a very short time.
  • Better audit and control: with clear links from usage to money and solid checks, operational questions will be answered with confidence and speed.

In simple terms, the Feature Store turns data into a product that teams can pick up and use.

What should be next?

With trusted features in place, phase 2 should be about getting machine learning in production, focusing on use-cases that drive clear outcomes and that can be measured in weeks instead of months. Examples of use-cases are:

  • Churn risk score and retention actions use case: by scoring churn risk, customer care teams can act with the right message and the right incentive at the right time. Implementation should start with a clear control test, measuring retention rate and cost per retention, and expand afterwards by segment.
  • “Plan the right size” use case to suggest plans that fit real usage. The goal is to raise customer satisfaction while protecting margins. Acceptance, churn, and net revenue impact should be measured.
  • Basic fraud and misuse alerts use case: implement simple rules and some high value features to flag risky behavior, measuring precision and time to action.

Each use case will need a clear target metric, a run book for experiments, and a roll out plan starting small and growing with evidence.

What are the risks and how to manage them?

No plan is risk-free, and the required controls must be planned in a few areas:

  • If files arrive late, some features may miss their target update. Alerts are needed and delayed data must be computed as soon as it lands.
  • When usage behavior shifts, a model trained on past patterns can lose accuracy. Drift monitors and fresh training cycles should help to adapt faster.
  • Start with simple, strong baselines before moving to complex models; this keeps explainability and trust high and helps in learning what works.
  • New features and models change how teams work. Change management is essential: keep the documentation simple, host short sessions, and maintain an open feedback loop with customer care, product, and finance teams.

How to measure success?

A few clear metrics can be tracked:

  • Feature adoption: number of teams and use cases using the Feature Store.
  • Time for the first result: days from idea to first live test.
  • Data quality: freshness, completeness, and drift across key features.
  • Business impact: retention rates, revenue per user (ARPU), and customer satisfaction scores.

Conclusion

I said 2026 would be the year MVNOs move from AI pilots to profits. The initial phase is about laying the foundations: analyzing the data feeds, fixing the weak points, building strong data pipelines, and bringing the Feature Store into production. The initial phase is completed once the data is ready and the Feature Store is live, holding reusable features that teams can trust and use right away.

Following the initial phase, the first machine learning models can be built into the flow of the business, starting with simple use cases that ensure visible impact on the MVNO operation. Examples are the churn prediction, plan-the-right-size, and basic fraud alert use cases.

