KUALA LUMPUR, April 2 (Bernama) -- Visual Bank Inc, via its subsidiary amanaimages Inc, has announced the expansion of its Qlean Dataset, a premium artificial intelligence (AI) training data solution for developers building high-performance Japanese speech foundation models.
The expansion strengthens the company's position in providing rights-cleared datasets for both research and development (R&D) and large-scale AI applications.
“As demand for culturally contextualised foundation models grows, high-quality, legally compliant Japanese training data is becoming increasingly critical.
“Visual Bank is committed to bridging the gap between raw content and production-ready AI systems through rigorous data preparation and engineering,” said its chief executive officer, Saneyuki Nagai, in a statement.
The datasets are fully rights-cleared for commercial use and aligned with global compliance standards such as the General Data Protection Regulation (GDPR) and the California Consumer Privacy Act (CCPA).
They include high-fidelity audio assets recorded at 48 kilohertz (kHz)/16-bit or higher, enabling capture of both studio-quality speech and diverse acoustic environments.
In addition, the datasets support detection of harmful language, including hate speech and abusive prompts, and include evaluation datasets aligned with international benchmarks such as MMSU to assess reasoning and linguistic nuance in Japanese.
The solution also incorporates Japan-specific audio, including traditional and urban sound environments, to support multimodal and spatial AI applications.
The datasets are available through AI Data Recipe, which offers both ready-to-use datasets and customised data production, including speaker casting, recording and annotation tailored to specific development needs.
-- BERNAMA