Japanese AI Breakthrough: Shisa V2 405B Open-Source Model Surpasses GPT-4
Japan's AI Reverses the Global Trend! Shisa V2 405B Open Source Release Surpasses GPT-4 in Japanese Tasks!

Recently, AIbase learned from social media about the latest release from Shisa.AI, a Tokyo-based startup that specializes in fine-tuning HuggingFace models for the Japanese market. The company's newly released bilingual model, Shisa V2 405B, has drawn significant attention in the industry. This article looks at Shisa.AI's latest achievements and its progress in the field of Japanese AI.

Shisa V2 405B: Birth of Japan’s Strongest Open Source Model

According to AIbase, Shisa.AI has launched Shisa V2 405B, an open-source model based on Llama 3.1, which is being hailed as the "strongest large language model ever trained in Japan." The model excels at Japanese tasks while retaining strong English capabilities, making it a genuinely bilingual model. Test data shows that Shisa V2 405B outperforms GPT-4 and GPT-4 Turbo on various Japanese benchmarks and rivals even the latest GPT-4o and DeepSeek V3 in Japanese tasks. The result signals the rise of Japanese AI labs in global competition and opens up new possibilities for Japanese AI applications.

Focused on Japanese Optimization with Advanced Fine-Tuning Techniques

Shisa.AI, a startup headquartered in Tokyo, is dedicated to developing and deploying advanced open-source AI language and speech models for the Japanese market. Unlike earlier models, the Shisa V2 series has moved away from expensive continued pre-training and tokenizer expansion, concentrating instead on optimizing the post-training process. By relying on a synthetic-data-driven approach, the company has significantly improved the model’s performance.
The core dataset, ultra-orca-boros-en-ja-v1, has been carefully filtered, regenerated, and resampled, making it one of the most robust bilingual datasets available for improving any base model’s Japanese capabilities. The dataset is freely available under the Apache 2.0 license, giving developers worldwide a valuable resource.

A Versatile Model Family, Ranging from 7B to 405B Parameters

The Shisa V2 series includes models with parameter counts ranging from 7B to 405B, covering needs from lightweight devices to high-performance computing environments. These models excel in tasks such as Japanese grammar, role-playing, and translation. Specifically, they outperform their respective base models in evaluations such as shisa-jp-ifeval (a Japanese instruction-following test), shisa-jp-rp-bench (a Japanese role-playing benchmark), and shisa-jp-tl-bench (a Japanese-English translation benchmark). Notably, Shisa V2 405B incorporates a small amount of Korean and Traditional Chinese data during training, which further strengthens its multilingual capabilities and broadens its use in cross-language scenarios.

Open Source Spirit Fuels Global AI Innovation

Beyond boosting Japanese AI performance, Shisa.AI's commitment to open source contributes to the wider AI community. The training logs for the Shisa V2 series have been shared openly on the Weights and Biases platform. They detail the use of a 4-node H100 cluster on AWS SageMaker, along with tools such as Axolotl, DeepSpeed, and Liger Kernel, to keep model development efficient. Additionally, Shisa.AI plans to open-source its specialized Japanese benchmark tools, supporting research on and evaluation of Japanese large language models by developers worldwide.

Future Outlook: Enhancing Japan's Global AI Competitiveness

Shisa.AI's success demonstrates that even smaller AI labs can make a substantial impact in the global AI race.
By releasing open-source models and datasets, the company is providing strong support for the widespread adoption of Japanese AI applications. AIbase anticipates that as Shisa.AI continues to update its models and resources, Japan's position in the global AI landscape will grow stronger.

For developers dealing with complex Japanese tasks, the Shisa V2 series is a powerful tool worth exploring. AIbase recommends keeping an eye on Shisa.AI’s official website and HuggingFace page for the latest technical details and model usage information.

Through the Shisa V2 series, Shisa.AI has showcased Japan's innovative strength in AI. Whether for academic research or commercial applications, these open-source models lay a solid foundation for the future development of Japanese AI.
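
To put the 7B to 405B parameter range mentioned above in perspective, the sketch below estimates how much memory the model weights alone would occupy at common numeric precisions. This is back-of-the-envelope arithmetic (parameter count multiplied by bytes per parameter), not a figure published by Shisa.AI. The function name weight_memory_gb and the precision table are illustrative assumptions, and activations, KV cache, and runtime overhead are ignored.

```python
# Rough weight-memory estimate: parameters x bytes per parameter.
# Assumption: only the weights are counted; activations, KV cache,
# and framework overhead are deliberately left out.

# Bytes needed to store one parameter at each precision (illustrative table).
BYTES_PER_PARAM = {"fp32": 4.0, "bf16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(num_params: float, dtype: str) -> float:
    """Return approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


if __name__ == "__main__":
    # The two ends of the parameter range cited in the article.
    for label, num_params in [("7B", 7e9), ("405B", 405e9)]:
        for dtype in BYTES_PER_PARAM:
            gb = weight_memory_gb(num_params, dtype)
            print(f"{label:>5} @ {dtype}: ~{gb:,.1f} GB")
```

By this estimate, the 405B model needs roughly 810 GB for bf16 weights while a 7B model needs about 14 GB, which suggests why the smaller members of the family are the practical choice for modest hardware.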