Friday, February 23, 2024
BLOOM is the most important AI model of the last decade!

Table of contents 

  • It’s not DALL-E 2, PaLM, AlphaZero, or even GPT-3 that matters
  • BLOOM and BigScience mark an inflection point for the AI community
    • Intrinsic value
    • Incidental value
  • What makes BLOOM different

It’s not DALL-E 2, PaLM, AlphaZero, or even GPT-3 that matters

You might be wondering: can such a bold headline be true? The answer is yes. Here’s why.

GPT-3 came out in 2020 and set a direction that the entire AI industry has deliberately followed ever since. Tech companies have iteratively built better, larger models. But although they have poured millions into the challenge, none of this has fundamentally changed the leading paradigm and the rules of the game that GPT-3 laid out two years ago.

Gopher, Chinchilla and PaLM (arguably the current top three large language models) are significantly better than GPT-3, but in essence they are similar. Chinchilla showed the value of a slightly different scaling law (*Translation Note 1), but like the other models it remains a large Transformer-based model trained with large amounts of data and compute.

DALL-E 2, Imagen, and Parti convert text to images, and while these models add more technology than the Transformer alone, they are built on much the same technological trends. Flamingo and Gato also depart slightly from GPT-3 to take a more general, multimodal approach to AI, but they are just remixes of GPT-3 that apply the same ideas to new tasks (*Translation Note 2).

But most importantly, all of the AI models above are born from the vast resources of private high-tech companies. That’s the common denominator. These models belong in the same category not only because of their technical specifications, but because a handful of wealthy for-profit labs wield absolute control over AI model development.

That is about to change.

(*Translation Note 1) When DeepMind developed the language model Chinchilla, it showed that, in addition to the conventionally known factor of model size (number of parameters), the amount of training data also affects the performance of a language model. On the factors that determine language model performance, see the section “Model Size: GPT-4 Won’t Be Super Large” in the AINOW translated article “GPT-4 Coming Soon. What we know about it.”
(*Translation Note 2) The visual language model Flamingo, developed by DeepMind, employs an architecture called Perceiver Resampler to process multimodal inputs such as images and videos. This architecture is an evolution of Perceiver, an architecture for multimodal input also developed by DeepMind, and Perceiver itself is based on the Transformer.
Gato achieves extensive multimodality by converting various inputs into sequences of tokens and processing them with a Transformer.
In short, both Flamingo and Gato can be described as multimodal Transformer networks.

BLOOM and BigScience mark an inflection point for the AI community

BLOOM (BigScience Large Open-science Open-access Multilingual Language Model) is not unique because its architecture differs from GPT-3’s. In fact, BLOOM is a Transformer-based model with 176B parameters (GPT-3 has 175B), and of the language models mentioned above (such as Gopher) it is the one most similar to GPT-3. What makes BLOOM unique is that it is the starting point for a social and political paradigm shift that will shape the AI research field in the coming years, because it breaks the control that leading high-tech firms hold over large language model (LLM) research and development.

Meta, Google and OpenAI have all recently open sourced models built on large Transformers (OPT, Switch Transformers and VPT respectively). Is this due to a sudden rise in open source’s reputation among these big companies? Most engineers and researchers at large companies have probably appreciated open source from the beginning. They know the value of the libraries and tools built on open source foundations because they use them every day. But immoral, money-making companies don’t give in so easily to the broader AI community’s preference for open source.

Big tech companies wouldn’t have open sourced their models if a few institutions and research labs hadn’t started exerting tremendous pressure toward open sourcing.

BigScience, Hugging Face, EleutherAI, and others don’t like what big tech companies have brought to the AI R&D space. It’s not morally right to have a monopoly on a technology that could, hopefully, benefit many people in the future. But they could not simply ask Google and OpenAI to share their research and expect a positive response. So some institutions, like BigScience, decided to secure their own funding to build large-scale AI models and open them up freely to researchers who want to explore them. State-of-the-art AI is no longer just a way for big companies to make money.

BLOOM is the culmination of these efforts. After more than a year of collaborative work that began in January 2021 and more than three months of training on the French public supercomputer Jean Zay, BLOOM is finally complete. The model is the result of the BigScience research workshop, which draws on the work of over 1000 researchers worldwide and relies on the collaboration and support of over 250 institutions such as Hugging Face, IDRIS, GENCI, and the Montreal AI Ethics Institute (*3).

Researchers and institutions participating in the BigScience research workshop share the belief that technology, especially AI, should be open, diverse, inclusive, responsible and accessible for the benefit of humanity.

Their impressive collective effort and distinctive stance in the AI industry rest on nothing other than their consideration of the social, cultural, political and environmental contexts underlying the design of AI models (especially BLOOM) and the processes of data selection, curation and governance.

Members of BigScience have released an ethical charter that sets out the values they hold in developing and deploying the technology used in BLOOM. The values fall into two groups: intrinsic values, or “values as ends”, and incidental values, or “values as means”. Below, to convey the unprecedented significance of BigScience and BLOOM, I highlight each group of values and summarize them by quoting the charter (the charter is short, so I recommend reading it).

Intrinsic value

  • Inclusiveness: “…equal access to BigScience artifacts…not just non-discriminatory, but equitable in terms of belonging…”
  • Diversity: “…researchers and communities of over 900 people in 50 countries, over 20 languages…”
  • Reproducibility: “…BigScience aims to ensure the reproducibility of research experiments and scientific conclusions…”
  • Openness: “…AI-related researchers around the world can contribute and participate in this effort…[and] the results…will be shared on an open basis…”
  • Responsibility: “Each contributor has both personal and collective (social and environmental) responsibility for their work within the BigScience project…”

Incidental value

  • Accessibility: “We define this as a means to achieve openness. BigScience makes every effort to ensure that our research and technical output can be easily interpreted and explained by the widest possible audience…”
  • Transparency: “We define this as a means to achieve reproducibility. The work of BigScience is actively promoted through various conferences, webinars, academic studies, and scientific dissemination activities so that it can be seen by others…”
  • Interdisciplinarity: “We define this as a means to achieve inclusivity. To adopt a holistic approach in developing the BigScience deliverables, we are always bridging computer science, linguistics, law, sociology, philosophy, and other related fields.”
  • Multilingualism: “This is defined as a means to achieve diversity. It is achieved by having a system that is multilingual from the conceptual stage, with the target…”

BigScience and BLOOM are arguably the most notable attempt to break down the barriers that big companies have put up (willingly or unwillingly) in the AI space over the past decade. They are also the most sincere and honest effort to build AI (especially large language models) that benefits everyone.

Readers interested in learning more about BigScience’s approach should read this excellent series of three articles on the social context of large-scale language modeling research. BLOOM can be accessed through Hugging Face.
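As a rough illustration of what access through Hugging Face can look like in practice, here is a minimal sketch using the `transformers` library. It assumes the BigScience checkpoints are published on the Hugging Face Hub under repository IDs such as `bigscience/bloom-560m` (small enough for a laptop) and `bigscience/bloom` (the full 176B model). The helper names `checkpoint_id` and `generate` are hypothetical and exist only for this example.

```python
# Assumed Hub repository IDs for a few BLOOM checkpoints (not from the article).
BLOOM_CHECKPOINTS = {
    "560m": "bigscience/bloom-560m",
    "1b7": "bigscience/bloom-1b7",
    "7b1": "bigscience/bloom-7b1",
    "176b": "bigscience/bloom",  # the full model; far too large for a single consumer GPU
}


def checkpoint_id(size: str = "560m") -> str:
    """Map a size label (hypothetical convention) to a Hub repository ID."""
    try:
        return BLOOM_CHECKPOINTS[size.lower()]
    except KeyError:
        known = ", ".join(sorted(BLOOM_CHECKPOINTS))
        raise ValueError(f"unknown size {size!r}; expected one of: {known}") from None


def generate(prompt: str, size: str = "560m", max_new_tokens: int = 20) -> str:
    """Download a checkpoint (on first call) and continue the prompt."""
    # Imported lazily so the lookup helper above works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    repo = checkpoint_id(size)
    tokenizer = AutoTokenizer.from_pretrained(repo)
    model = AutoModelForCausalLM.from_pretrained(repo)

    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate("Open science matters because")` requires `transformers` and PyTorch to be installed and downloads the chosen checkpoint on first use. Starting with the smallest checkpoint is a deliberate choice here: it shows the access pattern without needing the hardware the full 176B model demands.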

A summary of the institutions that participated in the BigScience research workshops mentioned in the article can be found in the table below.

Overview of (some of) institutions participating in BigScience research workshops

  • Hugging Face: A company that advocates the “democratization of machine learning”. It publishes various machine learning models.
  • IDRIS: IDRIS (Institute for Development and Resources in Intensive Scientific Computing) is a French research center specializing in intensive numerical computing. It provided the supercomputer “Jean Zay” on which BLOOM was trained.
  • GENCI: GENCI (Grand équipement national de calcul intensif) is a French research institute that promotes the use of intensive numerical computation in various fields.
  • Montreal AI Ethics Institute: An international NPO that democratizes AI ethics literacy. It runs a newsletter and a learning community on AI ethics.

(*Translation Note 4) On June 17, 2022, the Montreal AI Ethics Institute published an article summarizing its activities during its participation in the BigScience research workshop, focusing on the social, legal and ethical aspects of the workshop. According to the article, these activities fall into three categories:

Three Activities Focusing on Social, Legal and Ethical Aspects in BigScience Research Workshops

  • Clarification of the project’s ethical and legal grounds: Formulating an ethical charter that serves as the basic policy for the BLOOM development project, and creating a playbook that sets out how to operate open source large language models in line with the social contexts of countries around the world.
  • Clarification of data governance: Creating rules and tools for managing the training data of large language models. Deliverables include data governance rules and tools for curating the collected language data.
  • Clarification of model governance: Creating rules for the use of large language models. Deliverables include model cards that set out appropriate and inappropriate uses of BLOOM.

For further details about these activities, see the Montreal AI Ethics Institute blog category “Social Context in Large-Scale Language Modeling Research: A BigScience Approach”.

What makes BLOOM different

As I mentioned earlier in the article, BLOOM is not the first large open source language model. Meta, Google and others have already open-sourced some models. But those certainly aren’t the best these companies can offer. Their primary goal is to make money, not to share cutting-edge research. Under these circumstances, signaling an intent to participate in open science as part of a major tech company’s PR strategy is not enough.

BigScience and BLOOM embody ethical values that companies by definition cannot express. In both cases the visible result is an open-source large language model. But the hidden, and much-needed, foundation that guides BigScience highlights the stark difference between this collective effort and the powerful tech giants.

There’s a big difference between adopting open source methods reluctantly (like a big tech company) and firmly believing it’s the right approach. The members of BigScience believe that AI models should be democratized, that access and results should be open, that ethical issues should be addressed, and that AI should aim to benefit as many people as possible. That is what makes BLOOM unique. And based on this belief, I would unequivocally conclude that BLOOM is the most important AI model of the decade.

BLOOM is at the forefront of a field that is about to change dramatically for the better. It marks a movement beyond current research trends (where big tech companies dominate AI model research). The model marks the dawn of a new era in AI that will not only move the field forward faster, but will also push those who would prefer to proceed otherwise to embrace the new rules that govern the field.

This isn’t the first time open source has triumphed over secrecy and control. There are also examples of open source triumphs in computers, operating systems, browsers and search engines. Recent history is full of clashes between those who want to secure profit for themselves and those who have fought, and won, for everyone else. And we have won these conflicts. Open source and open science are the ultimate stage of technology. And we are about to enter a new era of open source for AI models as well.
