How science is helping shrink AI models
AI models are large and energy-intensive, but they do not have to be.
Welcome to this edition of Over a Cup of Coffee.
AI models are everywhere.
Since ChatGPT arrived, hardly a day has gone by without companies talking up how AI is improving their work and, by extension, our lives.
But with every improvement, AI models are getting bigger, more power-hungry, and harder on the planet. Running them entirely on renewable energy is not feasible right now.
The solution is to make them smaller and more efficient; this is where science helps.
AI answers complex physics questions
Researchers at MIT and the University of Basel in Switzerland have developed a generative AI model that can answer tough questions in physics.
These are not the questions that stump students in exams but the ones that stump physicists: how to quantify a phase change (liquid to solid, or liquid to gas) in a lesser-known system, say, or when a material transitions from a conductor to a superconductor.
An AI-model approach is far more efficient than spending hours running experiments. Better still, the researchers found that their model does not need large datasets for training.
AI models like the ones behind ChatGPT or DALL-E estimate the probability distribution of their training data and then sample from that estimate to generate new output, e.g. using old cat images to generate new cat images.
However, when an AI model is trained on simulations of scientific techniques, the probability distribution is already known, so the model does not have to learn it from sample data. This significantly improves the model's computational efficiency.
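To make the distinction concrete, here is a toy Python sketch, not the MIT and Basel model itself: the first half "learns" a distribution from samples, the way typical generative models do, while the second half samples directly from a distribution that physics already supplies, with no training data at all. The Gaussian and Boltzmann distributions are illustrative assumptions, not details from the paper.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# --- Typical generative model: learn the distribution from data first ---
# Toy stand-in: "learn" a Gaussian by estimating its parameters from samples.
training_data = rng.normal(loc=2.0, scale=0.5, size=10_000)
mu_hat, sigma_hat = training_data.mean(), training_data.std()  # the learning step
generated = rng.normal(mu_hat, sigma_hat, size=5)              # generate new samples

# --- Physics-informed model: the distribution is already known ---
# For a system in thermal equilibrium, statistical mechanics gives the
# Boltzmann distribution p(E) ~ exp(-E / kT) directly, so we can sample
# from it with no training data at all.
kT = 1.0
energies = np.linspace(0.0, 10.0, 1_000)
weights = np.exp(-energies / kT)
weights /= weights.sum()
physics_samples = rng.choice(energies, size=5, p=weights)

print("learned-model samples:", generated)
print("physics-based samples:", physics_samples)
```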
Not only can one such AI model answer tough physics questions, but it can also make future models better by adjusting a few important parameters.
The research study was published in the journal Physical Review Letters.
Shrinking AI models to run on your smartphone
Even though AI models might seem to be the hottest thing on the market right now, OpenAI's and Google's offerings are not where the cutting edge of AI research lies.
Research is underway to make smaller AI models that can run on a laptop or smartphone.
Interestingly, it is Microsoft that is building them.
Last month, Microsoft's research team published a technical report about its Phi-3 multimodal model. It is called multimodal because it can work with text, audio, and video; notably, it also does not need to connect to the cloud.
This is a major development since it allows users to keep their information or queries on their devices and offline.
How is this possible? The logic is the same one the MIT researchers used: instead of training the model on the knowledge of the entire Internet, Microsoft diligently curated the data Phi-3 was trained on.
Microsoft's researchers trained Phi-3 using only 1/17th of the data OpenAI used to train GPT-3.5. Yet when the two models were pitted against each other, Phi-3 showed better programming skills.
When compared on their ability to tell children's stories, Phi-3 produced coherent stories where ChatGPT tended toward gibberish.
This is why the researchers are doubling down on efforts to shrink AI models and run them on smartphones instead of in the cloud.
This could deliver cost and energy savings, and more environmentally friendly AI.
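If you are curious what on-device inference looks like in practice, here is a minimal Python sketch using the open-source Hugging Face transformers library and Microsoft's publicly released Phi-3-mini checkpoint. The model ID is real, but the prompt and settings are illustrative assumptions on my part, not taken from the technical report.

```python
# A minimal sketch of running a small model entirely on your own machine
# with the Hugging Face transformers library. Once the weights are
# downloaded, generation happens locally: no cloud API call is involved.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "microsoft/Phi-3-mini-4k-instruct"  # Microsoft's open-weights release
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

prompt = "Tell a two-sentence children's story about a brave teapot."  # illustrative
inputs = tokenizer(prompt, return_tensors="pt")

outputs = model.generate(**inputs, max_new_tokens=80)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```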
The technical report can be found on arXiv.
If you enjoyed reading today’s newsletter, you will also like The Rundown AI.
Every day, Rowan Cheung brings you the latest developments from the world of AI, and his newsletter has a following of 600,000!
Click on the link below to get it straight to your inbox.
Keep up with AI
How do you keep up with the insane pace of AI? Join The Rundown — the world’s largest AI newsletter that keeps you up-to-date with everything happening in AI, and why it actually matters in just a 5-minute read per day.
Thank you for reading.
Until next time,
Ameya