What is multimodal AI and why should we care?
With the AI race well and truly underway, another contest has opened up for the next frontier in AI. Microsoft-backed OpenAI made ChatGPT multimodal last month, giving the bot the ability to analyse images and speak to users via its mobile app.
OpenAI rushed to release these features after a recent report by The Information stated that Google was already testing its soon-to-be-released multimodal large language model, Gemini, with other companies.
Google is expected to have the upper hand over OpenAI because its search engine and YouTube give it a bank of multimodal content such as images and video, but OpenAI is actively hiring to close the gap. The Information report also said that OpenAI is building a multimodal model from scratch under a project called Gobi.
Popular AI image generators like DALL-E and Midjourney are also examples of multimodal systems, and such systems are expected to become more complex. The idea is that if a machine is to be as intelligent as a human, it should also have the full range of cognitive abilities, which would let it absorb information from more than just text.
Microsoft could debut AI chip next month, report says
Yesterday, a report by The Information stated that Microsoft could unveil its first AI chip at its annual developers’ conference next month. The AI chip is built for Microsoft’s data centers, which train and run the large language models that power conversational bots like ChatGPT.
The move is expected to remove Microsoft’s dependence on Nvidia’s chips, which have seen a boom in demand.
Earlier this month, ChatGPT-maker OpenAI also revealed that it was considering making its own AI chips and was exploring a related acquisition. The shortage of chips has become a cause for concern among AI-focused companies like OpenAI, which has been looking at alternatives to Nvidia for some time now.
X updates policies as Israel-Palestine content floods platform
Microblogging platform X said it has updated its policies to fight a stream of misleading videos and hate speech on the platform since the renewed conflict between Israel and Palestine. Users can now choose not to see graphic content, but X has chosen to keep such content public on the platform. The company has also rolled out Community Notes to counter misinformation, while removing abusive users and Hamas-affiliated accounts. In addition, Musk is testing a feature that will allow only verified users to reply to certain posts.
Meanwhile, WSJ reported today that X CEO Linda Yaccarino will not attend the WSJ Tech Live Conference scheduled for next week, citing the need to focus on platform safety given the global turmoil.