rinna Co., Ltd. has released “Japanese Stable Diffusion”, an image generation model specialized for Japanese, and has started providing image generation services based on it.
Table of Contents
- About rinna Inc.
- Image generation model specialized for Japanese “Japanese Stable Diffusion”
- Development Background and Features of Japanese Stable Diffusion
- Future prospects
About rinna Inc.
rinna Inc. is an AI character development company established in June 2020. With the mission “Bring out your unique creativity with AI characters and make the world colorful”, we propose new ways of connecting and communicating between people, between people and information, and between people and society. We provide “Tamashiru”, which lets you create an AI character that learns the tone of a personality and its topics and speaks naturally; “Coordiru”, which increases the transparency of internal communication and strengthens organizational ties; and the SNS application “Chararu”, in which you interact with AI characters.
rinna Co., Ltd. has a vision of “a co-created world between people and AI”, and aims to realize a society where everyone can demonstrate their own creativity by placing AI between people. As part of these efforts, we have focused on non-verbal communication such as images: AI characters such as AI Rinna have sent generated images, and we have released trained language-image models as research results.
In addition, rinna Co., Ltd. sees the coexistence of humans and AI as “the development of human creativity through AI.” In line with rinna’s mission to “bring out your unique creativity with AI characters and make the world colorful,” the image generation model “Japanese Stable Diffusion” released this time is intended to support the creativity of diverse people.
Image generation model specialized for Japanese “Japanese Stable Diffusion”
Since the spring of 2022, high-precision AI image generation systems such as DALL-E 2, Midjourney, and Stable Diffusion have been attracting attention. Against this backdrop, rinna Co., Ltd. has developed “Japanese Stable Diffusion”, an image generation model specialized for Japanese. The model takes Japanese text prompts as input and generates images that reflect the culture of the Japanese-speaking world, which is difficult to express through translation. By publishing the model on an AI model library and on GitHub, we give back to the language and image research and development community.
The model will also be built into “Chararu”, a service operated by rinna Inc., and into social media, so that you can experience image generation with “Japanese Stable Diffusion” yourself.
In the AI character SNS “Chararu”, when you give an AI character a badge such as “Good drawing”, that character will spontaneously generate images with Japanese Stable Diffusion. In addition, a channel named “#🎨|ai drawing venue” will be opened on the official “Chararu” Discord, where you can call the bot “Tsukuru” to generate an image from a prompt entered in Japanese.
Also, if you reply to a specific tweet by Rinna, an image generated by Japanese Stable Diffusion will be returned.
In addition, “Text To Image API v2”, an API that uses this model, has been released on rinna Developers, an API site open to developers. With this API, the image generation function of “Japanese Stable Diffusion” can be built into applications and other products.
Development Background and Features of Japanese Stable Diffusion
Stable Diffusion, provided by Stability AI, is trained on images with English captions, so generating images from Japanese requires text prompts translated into English. However, expressions unique to Japanese (proper nouns, Japanese-coined English words, onomatopoeia, etc.) are difficult to translate and therefore cannot be reflected in the generated images.
In addition, most of the training data consists of images from English-speaking countries, so the generated images strongly reflect the culture of the English-speaking world. We therefore developed “Japanese Stable Diffusion”, an image generation model specialized for Japanese. This model has the following features:
- Approximately 100 million images with Japanese captions, including the Japanese subset of LAION-5B, are used as training data.
- To handle Japanese text prompts, we froze the parameters of the Stable Diffusion generative model published by Stability AI and further trained only the text encoder on images with Japanese captions.
- After that, further training that updates the parameters of the text encoder and the generative model at the same time optimizes the model further for Japanese image generation.
- The trained “Japanese Stable Diffusion” inherits Stable Diffusion’s license, CreativeML Open RAIL-M, and is published on Hugging Face.
- The model is published early to keep pace with progress in the English-speaking world in the rapidly advancing AI field.
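The two training stages described in the list above can be illustrated with a toy sketch. This is not rinna's training code: each network is reduced to a single scalar weight, the gradients are fake constants, and the names `Component` and `train_stage`, as well as the learning rate and step counts, are hypothetical choices made only for illustration. The sketch shows nothing more than the schedule itself: first only the text encoder is updated while the generative model stays frozen, then both are updated together.

```python
from dataclasses import dataclass


@dataclass
class Component:
    """Toy stand-in for a network: one scalar weight plus a trainable flag."""
    name: str
    weight: float
    trainable: bool = True

    def step(self, grad: float, lr: float) -> None:
        # A frozen component ignores the update entirely.
        if self.trainable:
            self.weight -= lr * grad


def train_stage(components, grads, lr, steps):
    """Apply `steps` plain gradient updates; frozen components stay unchanged."""
    for _ in range(steps):
        for component in components:
            component.step(grads[component.name], lr)


text_encoder = Component("text_encoder", weight=1.0)
generator = Component("generator", weight=1.0)  # stands in for the generative model
model = [text_encoder, generator]
grads = {"text_encoder": 0.5, "generator": 0.5}  # fake constant gradients

# Stage 1: freeze the generative model and train only the text encoder.
generator.trainable = False
train_stage(model, grads, lr=0.1, steps=10)
stage1 = (text_encoder.weight, generator.weight)

# Stage 2: unfreeze the generative model and update both components together.
generator.trainable = True
train_stage(model, grads, lr=0.1, steps=10)
stage2 = (text_encoder.weight, generator.weight)

print("after stage 1:", stage1)
print("after stage 2:", stage2)
```

After stage 1 only the text encoder weight has moved; after stage 2 both weights have moved. In a real setup the same effect is usually obtained by disabling gradient computation for the frozen network's parameters or by leaving them out of the optimizer.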
Below is a sample image generated from a Japanese text prompt.
Future prospects
rinna Co., Ltd. will continue its AI research with the aim of developing high-performance products and services, and will continue to publish its research results and give them back to the research and development community.
In recent years, text prompts have become important for creating attractive content by generating sentences and images with AI models, and demand is growing for specialists who can get the most out of AI.
At rinna, which aims to create a society where both AI and people can play an active role, we have newly established the position of prompt engineer, a specialist in text prompts, and are actively recruiting for it. In promoting the social implementation of AI, rinna will actively create employment opportunities related to AI.