Change Happens: Adaptability of Machine Learning Models

Authored by Terry Leslie, Vice President of External Research & Development

“There is nothing permanent except change.” – Heraclitus

This observation from the ancient Greek philosopher captures a major challenge companies face today in using machine learning models in their businesses.

The challenge arises when the statistical properties of the target variable, the quantity the model is trying to predict, change over time in unforeseen ways. In machine learning terms, this change is called “concept drift.” Given the dynamic world we live in, especially where human behavior is involved, you might expect this to be the rule rather than the exception.

Concept drift causes problems because predictions become less accurate as time passes.1 Most machine learning models are static artifacts created from historical data, so the reliability of their predictions decreases over time. Many factors drive this change, from shifting customer behaviors to normal economic cycles. Concept drift also results from inadequate data sets being used to create the model: if a long-term trend is not reflected in the training data, the model may not predict true customer behavior over time.
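To make the effect concrete, here is a minimal sketch of a static model whose accuracy decays as the underlying concept drifts. The synthetic data, the rotating decision boundary, and the use of scikit-learn are all assumptions of this example, not anything described above:

    # Minimal sketch of concept drift: a static model's accuracy decays
    # as the relationship between inputs and the target shifts over time.
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)

    def make_batch(n, angle):
        """Labels depend on a decision boundary that rotates as `angle` grows."""
        X = rng.normal(size=(n, 2))
        w = np.array([np.cos(angle), np.sin(angle)])  # drifting "true" concept
        y = (X @ w > 0).astype(int)
        return X, y

    # Train once on historical data (angle = 0), as most deployed models are.
    X_train, y_train = make_batch(5000, angle=0.0)
    model = LogisticRegression().fit(X_train, y_train)

    # Evaluate on later batches as the underlying concept drifts away.
    for month, angle in enumerate(np.linspace(0.0, np.pi / 2, 6)):
        X_t, y_t = make_batch(1000, angle)
        print(f"month {month}: drift={angle:.2f} rad, accuracy={model.score(X_t, y_t):.2f}")

The model is trained once at angle zero; as the true boundary rotates, accuracy falls from near-perfect toward chance, which is exactly the decay pattern concept drift produces in production.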

Concept drift illustrates a weakness of most machine learning models: they are not adaptable to change. As the input data changes, the model’s decisions and predictions accumulate larger errors, which can be very detrimental in business, safety, and healthcare applications. To maintain accuracy, the model must be updated or replaced. This raises some fundamental questions. Can a model be updated, or is a new model required? How large a data set is needed, and what attributes must the data contain to make a valid model? Is the data available to create a new model? Are the resources, time, and skills available to create the data set and the model? There are also many questions about how often models should change and which metrics should drive model change decisions.
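One common starting point for the “which metrics” question is to monitor a rolling error rate in production and flag retraining when it degrades past a tolerance. The sketch below is a hypothetical illustration; the class name, window size, and threshold are all assumptions, not a prescribed method:

    # Hypothetical sketch: compare the recent error rate against the error
    # rate measured at deployment time, and advise retraining on degradation.
    from collections import deque

    class DriftMonitor:
        def __init__(self, baseline_error, window=500, tolerance=0.10):
            self.baseline = baseline_error      # error rate at deployment time
            self.window = deque(maxlen=window)  # recent prediction outcomes
            self.tolerance = tolerance          # allowed absolute degradation

        def update(self, prediction, actual):
            """Record one labeled outcome; return True if retraining is advised."""
            self.window.append(int(prediction != actual))
            if len(self.window) < self.window.maxlen:
                return False  # not enough evidence yet
            rolling_error = sum(self.window) / len(self.window)
            return rolling_error > self.baseline + self.tolerance

    # Usage: feed each (prediction, ground truth) pair as labels arrive.
    monitor = DriftMonitor(baseline_error=0.05)
    # if monitor.update(y_pred, y_true): trigger the retraining pipeline

This only answers the “when” question; as the rest of this post argues, acting on that signal still means rebuilding the model.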

A search on Google Scholar alone demonstrates both the importance of concept drift to the machine learning community and the breadth of the issue. More than 17,000 papers have been published on this and related topics over the last five years, 11,000 of them in the last two years. Let’s examine some of the biggest challenges identified amidst all of this discussion:

  • Today’s machine learning models cannot be updated incrementally. The models are built and optimized by first designing and creating a network, then training, optimizing, and testing the model with large data sets. This is a complex process that requires highly skilled individuals, large data sets, huge computational resources, and a lot of time. Unfortunately, despite this intense investment, current model design and structure mean that neural network models are not adaptable to incremental change. Another factor is that machine learning models are not interpretable: humans cannot reason about why particular inputs produce specific outputs. Even the model’s creator does not fully understand it because of its complexity, and incremental updates and improvements cannot be made to a model that is not understood. When an existing model is inadequate, a new model needs to be built … and in this case, that means starting from scratch. (Read more on this in our recent post on machine learning model interpretability.)
  • The inability of a machine learning model to adapt to change impacts model usability in several ways. Without adaptability, retraining must be performed by submitting new training data sets, where new features can be seen in the context of previously learned information. Training a new model is a massive exercise that takes many hours to complete and requires large-scale computational facilities. As a major user of machine learning models, Google has recognized the inability of a model to react to change. The company has published a paper describing the challenges of managing the creation of new models with continuously changing data in a production environment. They describe “setting up a pipeline that reliably ingests training datasets as input and generates a model as output, in most cases doing so continuously (so that new models are generated as fresh datasets arrive)”.2 (A sketch of what such a loop might look like appears after this list.) Imagine the resources required to collect, curate, and annotate the data sets on a continuous basis along with training the new models to maintain this production flow. In another paper, Google describes its work to improve and manage the huge computational costs of training large-scale deep neural networks by distributing training and inference across large clusters of machines. The authors describe an example of a network used for acoustic processing and speech recognition, stating, “The network was fully-connected layer-to-layer, for a total of approximately 42 million model parameters,” which was “trained on a data set of 1.1 billion weakly labeled examples.”3 The paper reports training time versus computational resources for various innovative training methods: from roughly 75 hours on hundreds of processing cores down to approximately 25 hours on more than 10,000 cores to reach the same 16 percent accuracy milestone. At the time of publication, Google was considering applying an extremely large computational budget of over 30,000 cores in production.4 This example illustrates the challenges companies face when building large machine learning models. What happens when new models must change on a regular basis?
  • If AI machines are neither adaptable nor interpretable, will machine learning models be increasingly ‘controlled’ by those who own the data sets? Will this ‘control’ take AI out of the domain of all, leaving it to the largest companies with the ability to harvest huge training sets? The number and sizes of publicly available data sets are growing, but they are only useful to companies if the data are relevant. This forces companies to create their own proprietary data sets if they can. As the need for larger and larger data sets and frequent model builds grows, what impact will this have on a company’s ability to be competitive? The issue of data set control is becoming part of venture capital firms’ decision making when considering the financing of startup companies. This is reflected in a recent article by MMC Ventures, which lists data access as number six among key factors to evaluate when investing in an AI startup. “For ML to create value, it needs suitable data sets on which to be trained and deployed. We evaluate the extent to which a company can access suitable data. We gauge data suitability in the context of two stages of data manipulation required for ML:
    • selection: data availability; the existence of gaps and duplication in data; quality of data labeling, the existence of bias in data;
    • processing: data fragmentation; data cleaning requirements; a need for data sampling; the need for data transformation, decomposition and aggregation.”5
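As promised above, here is a hedged sketch of the kind of continuous training loop Google describes: ingest fresh data, train a candidate model, validate it, and promote it only if it does not regress. Every function here is an illustrative placeholder supplied by the caller, not a real API:

    # A minimal sketch of a continuous retraining pipeline, assuming the
    # caller supplies the expensive steps as functions. Nothing here is a
    # real library API; it only shows the shape of the loop.
    import time
    from typing import Callable, Optional

    def continuous_training_loop(
        ingest: Callable[[], Optional[object]],  # fresh, curated dataset or None
        train: Callable[[object], object],       # dataset -> candidate model
        evaluate: Callable[[object], float],     # model -> validation score
        deploy: Callable[[object], None],        # push model to production
        poll_seconds: int = 3600,
    ) -> None:
        best_score = float("-inf")
        while True:
            dataset = ingest()                   # collect + curate + annotate
            if dataset is not None:
                candidate = train(dataset)       # the costly training step
                score = evaluate(candidate)      # held-out validation
                if score >= best_score:          # promote only on no regression
                    deploy(candidate)
                    best_score = score
            time.sleep(poll_seconds)             # wait for fresh data

Even in this skeletal form, every pass through the loop hides the full cost of data collection, curation, annotation, and training described above.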

To summarize the current revolution in machine learning based on neural networks: things really took off when it was discovered that GPUs enabled the parallel computation needed to make these networks practical, yet significant issues remain:

  • Training machine learning models is a costly and time-consuming process.
  • Machine learning models are black boxes. Humans don’t understand why a given set of inputs results in specific outputs.
  • Companies that control data have an advantage, and smaller companies will find it difficult to compete.

We at Natural Intelligence Semiconductor (NIS) believe that the Natural Neural Processor (NNP) will inspire and empower the next generation of artificial intelligence models through neuromorphic processing. The NNP is a novel, brain-inspired architecture that enables research and development into new symbolic and graph-based neural networks. Building a neural network on a hierarchy of symbolic patterns makes the model interpretable and allows it to adapt incrementally to change. Adaptability is achieved by changing individual patterns in the hierarchical layers rather than building a new complex model and training it on massive data sets.
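As a purely illustrative toy (this is not the NNP’s actual design, and every name in it is invented for the example), the idea of incremental pattern updates can be pictured as replacing one named entry in a layered structure while the layers above keep referencing it by name:

    # Toy illustration: a hierarchy of named symbolic patterns in which
    # drift is absorbed by replacing the one pattern that changed,
    # leaving the rest of the hierarchy intact.
    patterns = {
        "layer1": {"edge": "low-level pattern A", "curve": "low-level pattern B"},
        "layer2": {"shape": ["edge", "curve"]},   # composed of layer1 names
        "layer3": {"object": ["shape"]},          # composed of layer2 names
    }

    def update_pattern(hierarchy, layer, name, new_definition):
        """Incrementally swap one symbolic pattern without rebuilding the rest."""
        hierarchy[layer][name] = new_definition
        return hierarchy

    # When the concept behind "edge" drifts, only that entry is replaced;
    # "shape" and "object" keep referencing it by name and need no retraining.
    update_pattern(patterns, "layer1", "edge", "revised low-level pattern A")

The contrast with the earlier pipeline sketch is the point: one localized edit versus a full data collection and retraining cycle.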

These new solutions for AI models, which more accurately replicate human intelligence and learning and help solve the challenges that concept drift presents, will soon emerge from our Natural Neural Processor. Keep in touch with us as the new models of artificial intelligence become a reality by subscribing for updates, and learn more about Natural Intelligence Semiconductor via our site and social channels.

  1. Wikipedia, “Concept drift”, https://en.wikipedia.org/wiki/Concept_drift
  2. Polyzotis, N. et al., “Data Management Challenges in Production Machine Learning”, SIGMOD’17 May 14-19, 2017, Chicago, IL, USA.
  3. Dean, J., et al., “Large Scale Distributed Deep Networks”, NIPS 2012: Neural Information Processing Systems, December 2012.
  4. Dean, J., et al., “Large Scale Distributed Deep Networks”, NIPS 2012: Neural Information Processing Systems, December 2012.
  5. Kelnar, D., “The MMC Ventures AI Investment Framework: 17 success factors for the age of AI”, https://medium.com/mmc-writes/the-mmc-ventures-ai-investment-framework-17-success-factors-for-the-age-of-ai-671722e7ecd2