I think the computing capability of the NPU is probably not as important as its power efficiency. AI models are becoming increasingly intelligent: given the same model size (file size), you can pack maybe two, three, or even five times more knowledge into it now, right? You don't really have to run big language models to get that intelligence. You can run the same 3-billion or 4-billion-parameter model and get maybe five times better intelligence than before. So really, the key is not increasing the computing capability to run larger models; it's running the model you have more efficiently. That's very key. And for smaller form factors like smartphones, running 20-30 billion-parameter models probably doesn't make sense, because it's going to drain your battery.
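To make the size argument concrete, here is a rough back-of-envelope sketch in Python. It estimates only the weight-storage footprint at different quantization levels; the function name and the exact figures are illustrative, and activations and KV cache are ignored, but it shows why a 20-30 billion-parameter model strains a phone's memory and power budget while a 3-4 billion one does not.

```python
def model_memory_gb(params_billion, bits_per_weight):
    # Weight storage only; activations and KV cache are ignored.
    return params_billion * 1e9 * bits_per_weight / 8 / 1e9

for size_b in (3, 4, 30):
    for bits in (16, 4):
        gb = model_memory_gb(size_b, bits)
        print(f"{size_b}B params @ {bits}-bit: ~{gb:.1f} GB of weights")
```

Even aggressively quantized to 4 bits, a 30B model needs roughly 15 GB just for weights, while a 3B model fits in about 1.5 GB, which is why efficiency at the small end matters more than raw capacity at the large end.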
I think running the models more efficiently is the key, and for that to happen you sometimes need always-sensing capabilities: very small, tiny models that run continuously, plus some AI training on your phone, so that your data stays on the phone and doesn't go to the cloud.
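A minimal sketch of that always-sensing pattern, with every name hypothetical: a tiny model inspects each sensor frame locally and only a small event signal ever leaves the loop, so raw data never goes to the cloud.

```python
import time

def tiny_wake_model(frame):
    # Stand-in for a small always-on classifier (hypothetical).
    return sum(frame) / len(frame) > 0.5

def sensing_loop(read_sensor, on_event, steps=100):
    """Raw frames are processed and discarded locally; only a tiny
    event signal ever leaves this loop -- no raw data is uploaded."""
    for _ in range(steps):
        frame = read_sensor()       # raw data stays in local memory
        if tiny_wake_model(frame):
            on_event()              # surface only the detection result
        time.sleep(0.05)            # low duty cycle to save power
```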
We are looking into how we can make the smartphone chipset's NPU capabilities available to third-party applications. We want the AI to run on the device. There are still situations where you need to go to the cloud, but we are exploring how to run these models on the device so that they can enable new user scenarios and use cases. For example, when you don't have a connection on a flight, you cannot do much if the application depends on the cloud to do certain things. Can we do that on the airplane, or when you don't have good connectivity?
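One way a third-party app could express that hybrid on-device/cloud split is sketched below. This is an assumption about the routing pattern being described, not any vendor's API; every function name and the routing heuristic are hypothetical.

```python
def answer(prompt, local_model, cloud_model, needs_cloud, is_online):
    """Prefer on-device inference; escalate to the cloud only when the
    task demands it and a connection exists (all names hypothetical)."""
    if needs_cloud(prompt) and is_online():
        return cloud_model(prompt)
    return local_model(prompt)      # still works in airplane mode

# Example wiring with stub functions:
print(answer(
    "summarize my notes",
    local_model=lambda p: f"[on-device] {p}",
    cloud_model=lambda p: f"[cloud] {p}",
    needs_cloud=lambda p: len(p) > 200,   # hypothetical heuristic
    is_online=lambda: False,              # e.g. on a flight
))
```

The point of the design is that the offline branch is the default path, so the same application keeps working on the airplane or under poor connectivity.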