A few hours ago, Sony Interactive Entertainment announced the acquisition of iSIZE, a London-based company founded in 2016 focusing on machine learning applications for video processing. Financial terms of the transaction were not disclosed.
Sony says the acquisition will benefit its R&D and its video streaming services. Indeed, iSIZE's website lists three main products, two of which are directly related to streaming: BitClear and BitSave. The former is an AI-based video processing technology that cleans up artifacts such as ringing and blurring in heavily compressed content.
On NVIDIA RTX, T4, or V100 GPUs, BitClear can operate over multiple input video assets in real-time (at 25/30 fps). Allows for video upscaling, all with as little as 5ms processing latency on GPUs or high-performance CPUs.
Processes any highly compressed content and produces a higher-quality output that improves the value of the asset. It operates on all types of content.
Revives to the maximum possible quality without affecting the artistic intent of the original creators. As video encoding noise is applied across the entirety of each video frame, it operates on the entirety of each input frame, and it is not a region-of-interest approach.
Scalable neural network architecture that can scale to high volumes of content and is ideal for cloud as well as on-premise deployment.
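To put the quoted 5ms figure in context, here is a rough back-of-the-envelope sketch in Python. It assumes, purely for illustration, that the 5ms applies once per frame; iSIZE does not say how the latency is measured, and the function names are made up for this example.

```python
# Rough sketch: how much of a frame interval would 5 ms take up?
# Assumption (not from iSIZE): the quoted latency is paid once per frame.

def frame_budget_ms(fps: float) -> float:
    """Time available per frame, in milliseconds, at a given frame rate."""
    return 1000.0 / fps


def latency_share(latency_ms: float, fps: float) -> float:
    """Fraction of the per-frame time budget consumed by the given latency."""
    return latency_ms / frame_budget_ms(fps)


for fps in (25, 30):
    budget = frame_budget_ms(fps)
    share = latency_share(5.0, fps)
    print(f"{fps} fps: {budget:.1f} ms per frame, 5 ms is {share:.1%} of it")
```

Under that assumption, the processing step would leave most of each frame interval free for decoding, network transfer, and display.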
The second iSIZE product is BitSave, which the company describes as an AI-based 'perceptual optimizer' that allows encoders to produce higher quality video at a lower bitrate.
Offers up to 50% reduction in video bitrate over the state-of-the-art at the same or improved visual quality. Boosts the compression efficiency of any video codec and runs on client devices with minimal or no additional overhead.
BitSave technology is compatible with any codec, including AVC, HEVC, VP9, AV1, and even the upcoming AV2 and VVC, with increased gains in bitrate and quality offered for newer standards.
Single pass processing per content for an entire ABR ladder. Requires no change in encoding, delivery or decoding devices. Compatible with any existing video coding infrastructure
Allows simpler encoding recipes, offers significant computational and energy efficiency, allowing for sustainable video streaming at scale.
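For a sense of scale, the sketch below works out what an "up to 50%" bitrate reduction would mean in data per hour of streaming. The ladder bitrates are hypothetical values chosen for illustration, not figures from Sony or iSIZE, and the 50% is the best case the company quotes rather than a guaranteed result.

```python
# Rough sketch: data transferred per hour before and after a bitrate cut.
# The ladder below is hypothetical; 0.5 is iSIZE's quoted best case.

HYPOTHETICAL_LADDER_KBPS = {"1080p": 8000, "720p": 5000, "480p": 2500}


def reduced_kbps(kbps: float, reduction: float) -> float:
    """Bitrate after applying a fractional reduction (0.5 means 50% lower)."""
    return kbps * (1.0 - reduction)


def gb_per_hour(kbps: float) -> float:
    """Gigabytes (10^9 bytes) transferred in one hour at a constant bitrate."""
    bits = kbps * 1000 * 3600
    return bits / 8 / 1e9


for name, kbps in HYPOTHETICAL_LADDER_KBPS.items():
    before = gb_per_hour(kbps)
    after = gb_per_hour(reduced_kbps(kbps, 0.5))
    print(f"{name}: {before:.2f} GB/h -> {after:.2f} GB/h")
```

Real-world savings would depend on the content, the codec, and how close a given stream comes to the quoted maximum.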
Clearly, both of these technologies could be very useful for Sony as it attempts to improve its own streaming technology. Sony launched the PS5 cloud streaming feature earlier this month, just ahead of the release of the streaming-focused PlayStation Portal handheld. In the future, it is conceivable that the new streaming tech (which supports 4K, unlike Microsoft's xCloud) could land on PCs, too.
By the way, iSIZE's third product could very well be of interest to Sony's PlayStation Studios for game development. Called BitGen, it allows live generation of photorealistic 2D/3D avatars.
Avoidance of deep fakes by ensuring faithful rendering of what is captured by the camera. AI-based rendering and structure extraction in latent space; No offline capture and training.
Decreasing bitrate by 3-10x in comparison to state-of-the-art generic video encoders (AV1, HEVC) for the same visual quality.
5x lower video latency. 5x lower transceiver power. Uninterrupted remote presence under poor wireless signal conditions. Significantly better Quality of Experience -> increased user engagement time.
Scale out to millions of users with the use of a single model. Bespoke models can be tailored to content: conversational, IoT, infrared, multispectral, etc. Deployable across devices.