The Associated Press

AI nonprofit CEO says ‘closed nature’ of most artificial intelligence research hinders innovation

A year before Elon Musk helped start OpenAI in San Francisco, philanthropist and Microsoft co-founder Paul Allen already had established his own nonprofit artificial intelligence research laboratory in Seattle.

Their mission was to advance AI for humanity’s benefit.

More than a decade later, the Allen Institute for Artificial Intelligence, or Ai2, isn’t nearly as well-known as the ChatGPT maker but is still pursuing the “high-impact” AI sought by Allen, who died in 2018. One of its latest AI models, Tulu 3 405B, rivals OpenAI and China’s DeepSeek on several benchmarks. But unlike OpenAI, it says it’s developing AI systems that are “truly open” for others to build upon.

The institute’s CEO Ali Farhadi has been running Ai2 since 2023 after a stint at Apple. He spoke with The Associated Press. The interview has been edited for length and clarity.

Why is openness important to your mission?

Our mission is to do AI innovation and AI breakthroughs to solve some of the biggest working problems facing humanity today. The biggest threat to AI innovation is the closed nature of the practice. We have been pushing very, very strongly towards openness. If you think about open-source software, the core essence was, ‘I should be able to understand what you did. I should be able to change it. I should be able to fork from it. I should be able to use part of it, half of it, all of it. And once I build my thing, I put it out there and you should be able to do the same.’

What do you consider an open-source AI model?

It is a really heated topic at the moment. To us, open-source means that you understand what you did. Open weights models (such as Meta’s) are great because people could just grab those weights and follow the rest, but they aren’t open source. Open source is when you actually have access to every part of the puzzle.

Why aren’t more AI developers sharing training data for models they say are open?

If I want to postulate, some of these training data have a little bit of questionable material in them. But also the training data for these models are the actual IP. The data is probably the most sacred part. Many think there’s a lot of value in it. In my opinion, rightfully so. Data plays a significant role in improving your model, changing the behavior of your model. It’s tedious, it’s challenging. Many companies spend a lot of dollars, a lot of investments, in that domain and they don’t like to share it.

What are the AI applications you’re most excited about?

As it matures, I think AI is getting ready to be taken seriously for crucial problem domains such as science discovery. A good part of some disciplines involves a complicated search for a solution -- for a gene structure, a cell structure or specific configurations of elements. Many of those problems can be formulated computationally. There’s only so much you can do by just downloading a model from the web that was trained on text data and fine-tuning it. Our hope is to empower scientists to be able to actually train their own model.