2024-03-20

TL;DR:

Apple recently revealed research on its own AI model, MM1, which can understand both text and images.

This development suggests Apple is working on more powerful AI capabilities for its products.

The research indicates Apple is playing catch-up and gearing up for a bigger role in the AI race.

MM1 appears similar in design and complexity to recent AI models from other tech giants, such as Google's Gemini and Meta's open-source Llama 2. Research by Apple's competitors and by academics indicates that models of this caliber can power capable chatbots or serve as the basis for "agents" that carry out tasks by writing code and taking actions such as interacting with computer interfaces or websites. This suggests that MM1 might eventually become a key component of Apple's products.



In a thread on X, Brandon McKinzie, an Apple researcher and the lead author of the MM1 paper, commented:



MM1 is a multimodal large language model, or MLLM, which means it is trained on both images and text. This unique training enables the model to respond to text prompts and tackle intricate questions about specific images.



In an example from the Apple research paper, MM1 was given a picture of a restaurant table with beers and a menu. When prompted about the expected cost of "all the beer on the table," the model accurately identified the price and calculated the total expense.



Apple's iPhone already features an AI assistant, Siri. However, with the rapid emergence of competitors like ChatGPT, Siri's once groundbreaking capabilities are starting to feel constrained and outmoded. Both Amazon and Google have announced plans to incorporate large language model (LLM) technology into their respective assistants, Alexa and Google Assistant. Google has even enabled Android phone users to swap out the Assistant for Gemini.




