Showing posts with label GPU. Show all posts

Monday, August 03, 2020

Basis Universal : Supercompressed GPU Video Texture Codec


I was poking about inside Mozilla's big WebXR demo to see how it was made.
 https://mixedreality.mozilla.org/hello-webxr/

Most of the assets are .basis files. I had never heard of a .basis file, and not much came up when I searched the internet for it.

Digging deeper, I found a compiled WebAssembly library and a JavaScript wrapper for it.

  60935   basis_transcoder.js
 367268   basis_transcoder.wasm



They also used Draco which is also interesting for 3D object compression. 

After further searching, it's starting to look like this Basis codec is going to be a standard part of three.js, and therefore of aframe.io, for web-based VR development.

Basis Universal GPU Texture Compression

A texture and texture-video compression system that outputs a highly compressed intermediate file format (.basis) that can be quickly transcoded to a wide variety of GPU texture compression formats.


basis_universal

Basis Universal Supercompressed GPU Texture Codec
Basis Universal is a "supercompressed" GPU texture compression system that outputs a highly compressed intermediate file format (.basis) that can be quickly transcoded to a very wide variety of GPU compressed and uncompressed pixel formats: ASTC 4x4 L/LA/RGB/RGBA, PVRTC1 4bpp RGB/RGBA, PVRTC2 RGB/RGBA, BC7 mode 6 RGB, BC7 mode 5 RGB/RGBA, BC1-5 RGB/RGBA/X/XY, ETC1 RGB, ETC2 RGBA, ATC RGB/RGBA, ETC2 EAC R11 and RG11, FXT1 RGB, and uncompressed raster image formats 8888/565/4444.
The system now supports two modes: a high quality mode which is internally based off the UASTC compressed texture format, and the original lower quality mode which is based off a subset of ETC1 called "ETC1S". UASTC is for extremely high quality (similar to BC7 quality) textures, and ETC1S is for very small files. The ETC1S system includes built-in data compression, while the UASTC system includes an optional Rate Distortion Optimization (RDO) post-process stage that conditions the encoded UASTC texture data in the .basis file so it can be more effectively LZ compressed by the end user. More technical details about UASTC integration are here.
Basis files support non-uniform texture arrays, so cubemaps, volume textures, texture arrays, mipmap levels, video sequences, or arbitrary texture "tiles" can be stored in a single file. The compressor is able to exploit color and pattern correlations across the entire file, so multiple images with mipmaps can be stored very efficiently in a single file.
The system's bitrate depends on the quality setting and image content, but common usable ETC1S bitrates are .3-1.25 bits/texel. ETC1S .basis files are typically 10-25% smaller than using RDO texture compression of the internal texture data stored in the .basis file followed by LZMA. For UASTC files, the bitrate is fixed at 8bpp, but with RDO post-processing and user-provided LZ compression on the .basis file the effective bitrate can be as low as 2bpp for video or for individual textures approximately 4-6bpp.
The transcoder has been fuzz tested using zzuf.
So far, we've compiled the code using MSVS 2019, under Ubuntu x64 using cmake with either clang 3.8 or gcc 5.4, and emscripten 1.35 to asm.js. (Be sure to use this version or later of emcc, as earlier versions fail with internal errors/exceptions during compilation.) The compressor is multithreaded by default, but this can be disabled using the -no_multithreading command line option. The transcoder is currently single threaded.
Basis Universal supports "skip blocks" in ETC1S compressed texture arrays, which makes it useful for basic compressed texture video applications. Note that Basis Universal is still at heart a GPU texture compression system, not a video codec, so bitrates will be larger than even MPEG1.
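To put the bitrates quoted above in perspective, here is a minimal sketch that converts bits/texel into a payload size for one image. The 1024x1024 texture size and the function name are assumptions chosen for illustration; real files also carry headers, codebooks, and mipmaps, so treat the numbers as rough estimates only.

```python
def texture_bytes(width, height, bits_per_texel):
    """Approximate payload size in bytes of one image at a given bitrate."""
    return width * height * bits_per_texel / 8


def describe(width, height):
    """Print rough sizes for the bitrates mentioned in the README text."""
    rates = [
        ("ETC1S, low end", 0.3),
        ("ETC1S, high end", 1.25),
        ("UASTC + RDO + LZ (video, best case)", 2.0),
        ("UASTC, fixed rate", 8.0),
        ("uncompressed 8888", 32.0),
    ]
    for label, bpt in rates:
        kib = texture_bytes(width, height, bpt) / 1024
        print(f"{label:38s} {kib:8.1f} KiB")


if __name__ == "__main__":
    describe(1024, 1024)  # hypothetical texture size
```

At these rates, a hypothetical 1024x1024 image is 1 MiB in UASTC before LZ compression, versus a few tens to a couple hundred KiB in ETC1S.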

Important Usage Notes

Probably the most important concept to understand about Basis Universal before using it: The system supports two very different universal texture modes: The original "ETC1S" mode is low/medium quality, but the resulting file sizes are very small because the system has built-in compression for ETC1S texture format files. This is the command line encoding tool's default mode. ETC1S textures work best on images, photos, map data, or albedo/specular/etc. textures, but don't work as well on normal maps. There's the second "UASTC" mode, which is significantly higher quality (near-BC7 grade), and is usable on all texture types including complex normal maps. UASTC mode purposely does not have built-in file compression like ETC1S mode does, so the resulting files are quite large (8-bits/texel - same as BC7) compared to ETC1S mode. The UASTC encoder has an optional Rate Distortion Optimization (RDO) encoding mode (implemented as a post-process over the encoded UASTC texture data), which lowers the output data's entropy in a way that results in better compression when UASTC .basis files are compressed with Deflate/Zstd, etc. In UASTC mode, you must losslessly compress the file yourself.
Basis Universal is not an image compression codec. It's a texture compression codec. It can be used just like an image compression codec, but that's not the only use case. Here's a good intro to GPU texture compression. If you're looking to primarily use the system as an image compression codec on sRGB photographic content, use the default ETC1S mode, because it has built-in compression.
The "-q X" option controls the output quality in ETC1S mode. The default is quality level 128. "-q 255" will increase quality quite a bit. If you want even higher quality, try "-max_selectors 16128 -max_endpoints 16128" instead of -q. -q internally tries to set the codebook sizes (or the # of quantization intervals for endpoints/selectors) for you. You need to experiment with the quality level on your content.
For tangent space normal maps, you should separate X into RGB and Y into Alpha, and provide the compressor with 32-bit/pixel input images. Or use the "-separate_rg_to_color_alpha" command line option which does this for you. The internal texture format that Basis Universal uses (ETC1S) doesn't handle tangent space normal maps encoded into RGB well. You need to separate the channels and recover Z in the pixel shader using z=sqrt(1-x^2-y^2).
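The Z reconstruction mentioned above would normally live in a pixel shader; the following is only a CPU-side sketch of the same arithmetic. It assumes X and Y are stored as 8-bit values mapped linearly from 0..255 to -1..1 (a common convention, not something the text specifies), and it clamps the radicand at zero because quantization error can push x^2 + y^2 slightly above 1. The function names are illustrative.

```python
import math


def recover_z(x, y):
    """Rebuild the Z component of a unit tangent-space normal from X and Y.

    Implements z = sqrt(1 - x^2 - y^2), clamped so rounding error in the
    stored channels cannot produce a negative value under the square root.
    """
    return math.sqrt(max(0.0, 1.0 - x * x - y * y))


def decode_normal(x_byte, y_byte):
    """Decode two 8-bit channel values into an (x, y, z) normal.

    Assumes the 0..255 -> -1..1 mapping described in the lead-in.
    """
    x = x_byte / 255.0 * 2.0 - 1.0
    y = y_byte / 255.0 * 2.0 - 1.0
    return x, y, recover_z(x, y)
```

With this layout, X would be read from a colour channel and Y from alpha, matching the channel separation the usage note recommends.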

3rd party code dependencies

The stand-alone transcoder (in the "transcoder" directory) is a single .cpp source file library which has no 3rd party code dependencies.
The encoder uses lodepng for loading and saving PNG images, which is Copyright (c) 2005-2019 Lode Vandevenne. It uses the zlib license. It also uses apg_bmp for loading BMP images, which is Copyright 2019 Anton Gerdelan. It uses the Apache 2.0 license.
The encoder uses tcuAstcUtil.cpp, from the Android drawElements Quality Program (deqp) Testing Suite, for unpacking the transcoder's ASTC output for testing/validation purposes. This code is Copyright 2016 The Android Open Source Project, and uses the Apache 2.0 license. We have modified the code so it has no external dependencies, and disabled HDR support.



This package uses BASIS and discusses it a little.

Friday, August 04, 2017

JPEG2000 GPU Codec toolkit

http://comprimato.com/


Ultra-high speed compression and life-like viewing experience starts here. JPEG2000 Codec for GPU and CPU. Comprimato's JPEG2000 GPU Codec toolkit helps Media & Entertainment and Geospatial Imaging technology companies keep it real and with more accurate decision-making power.

Saturday, June 06, 2015

VideoCoreIV-AG100-R Raspberry Pi Gpu docs


https://docs.broadcom.com/doc/12358545

and just in case

https://github.com/doe300/VC4C/tree/master/doc

You want:   VideoCoreIV-AG100-R.pdf 


Original, now-dead link:
http://www.broadcom.com/docs/support/videocore/VideoCoreIV-AG100-R.pdf


How to optimize Raspberry Pi code using its GPU « Pete Warden's blog

http://petewarden.com/2014/08/07/how-to-optimize-raspberry-pi-code-using-its-gpu/


OpenMAX

http://en.m.wikipedia.org/wiki/OpenMAX

OpenMAX (Open Media Acceleration), often shortened as "OMX", is a non-proprietary and royalty-free cross-platform set of C-language programming interfaces that provides abstractions for routines especially useful for audio, video, and still images processing. It is intended for low power and embedded system devices (including smartphones, game consoles, digital media players, and set-top boxes) that need to efficiently process large amounts of multimedia data in predictable ways, such as video codecs, graphics libraries, and other functions for video, image, audio, voice and speech.

Thursday, December 01, 2011

NVIDIA's Tegra 3 Outruns Apple's A5 In First Benchmarks

From Slashdot:




NVIDIA's Tegra 3 Outruns Apple's A5 In First Benchmarks



"NVIDIA's new Tegra 3 SoC (System on a Chip) has recently been released for performance reviews in the Asus Eee Pad Transformer Prime Android tablet. Tegra 3 is comprised of a quad-core primary CPU complex with a 5th companion core for lower-end processing requirements and power management. The chip can scale up to 1.4GHz on a single core and 1.3GHz on up to four of its cores, while the companion core operates at 500MHz. It makes for a fairly impressive new tablet platform and offers performance that bests Apple's A5 dual-core processor in more than a few tests. The Asus Eee Pad Transformer Prime with optional keyboard dock and NVIDIA's Tegra 3 is set to be available in volume sometime around December 19th."

More about the actual chip.
http://www.nvidia.com/object/tegra-superchip.html

Tuesday, September 27, 2011

Fwd: AMD Delivers Multi-Display Support with Longevity on Latest Entry-Level Embedded Discrete GPU

---------- Forwarded message ----------
From: "AMD Embedded Solutions" <embedded.solutions@amd.com>
Date: Sep 27, 2011 8:31 AM
Subject: AMD Delivers Multi-Display Support with Longevity on Latest Entry-Level Embedded Discrete GPU




AMD Delivers Multi-Display Support with Longevity on Latest Entry-Level Embedded Discrete GPU
technical updates  September 2011

AMD Radeon™ E6460 GPU brings the latest desktop graphics performance and features to the casino gaming, digital signage, instrumentation and industrial controls markets



[Image: two computer chips plus the AMD Accelerated Parallel Processing Technology, AMD Eyefinity Multi-Display Technology, and AMD HD3D Technology shields]

Yesterday at Embedded Systems Conference East, AMD introduced the AMD Radeon™ E6460 discrete graphics processor as AMD's next generation entry-level embedded graphics processor, which complements the previously announced AMD Radeon™ E6760. With support for up to four simultaneous displays and more than double the 3D graphics performance of the ATI Radeon™ E2400 GPU¹, the AMD Radeon E6460 GPU sets a new bar for features and performance in an entry-level embedded GPU. With five years of planned supply availability and with the graphics memory included in the same package, the AMD Radeon E6460 GPU delivers the longevity, small footprint, and ease of design demanded by embedded system designers.
The AMD Radeon E6460 GPU enables an immersive experience with desktop-level 3D graphics and multimedia features:
  • An advanced 3D graphics engine and programmable shader architecture supports Microsoft DirectX® 11 technology for superior graphics rendering.
  • The third generation unified video decoder enables dual HD decode of H.264, VC-1, MPEG4 and MPEG2 compressed video streams.
  • The 512 MB GDDR5 frame buffer included in the BGA package provides high memory bandwidth while reducing the total footprint of the platform solution and the effort needed to design and maintain the system over time.
  • Targeted at casino gaming, digital signage, instrumentation and industrial control systems, the AMD Radeon E6460 GPU enables system designers to deliver products with a value-conscious mindset.
  • One system design, multiple customer product categories.  The BGA ballout of the AMD Radeon E6460 GPU is a subset of the higher performing AMD Radeon E6760 GPU enabling system designers to develop one system for both GPUs.
  • Featuring multi-display support with AMD Eyefinity technology, the AMD Radeon E6460 GPU supports up to four independent output displays², HDMI 1.4 stereoscopic video and DisplayPort 1.2 for higher link speeds and simplified display connectivity.
  • The AMD Radeon E6460 GPU comes with five years of planned supply availability. Technical support is provided by a dedicated team of application engineering experts.
AMD Radeon E6460 GPU can be paired with select models of AMD's next generation high-performance Accelerated Processing Units (APU) to offer additional graphics capability and additional parallel computing power.



1 System configuration: AMD Athlon™ II X4 620 @ 2.6GHz 4 Core, Gigabyte GA-MA770T-UD3P, Corsair XMS3 4GB (2x2GB) 1333MHz, Windows® 7 64bit Ultimate with 3DMark2006 v 1.2 and 3DMark Vantage v 1.0.2.

2 AMD Eyefinity technology can support multiple displays limited by display output clock dependencies and has certain restrictions on supported display interfaces.  For AMD Radeon E6460, AMD Eyefinity technology supports up to 4 displays.  Microsoft® Windows® 7, Windows Vista®, or Linux® is required in order to support more than two displays.  SLS ("Single Large Surface") functionality requires an identical display resolution on all configured displays.  AMD Eyefinity technology works with applications that support non-standard aspect ratios, which is required for panning across multiple displays.

©2011 Advanced Micro Devices, Inc. All rights reserved. AMD, the AMD Arrow logo, and combinations thereof are trademarks of Advanced Micro Devices, Inc. DirectX is a registered trademark of Microsoft Corporation.  Other names are for information only and may be trademarks of their respective owners.

Monday, August 08, 2011

Khronos Releases OpenGL 4.2 Specification

From Slashdot:

jrepin tips news that the Khronos Group has announced the release of the OpenGL 4.2 specification.
 
Some of the new functionality includes: "Enabling shaders with atomic counters and load/store/atomic read-modify-write operations to a single level of a texture (These capabilities can be combined, for example, to maintain a counter at each pixel in a buffer object for single-rendering-pass order-independent transparency); Capturing GPU-tessellated geometry and drawing multiple instances of the result of a transform feedback to enable complex objects to be efficiently repositioned and replicated; Modifying an arbitrary subset of a compressed texture, without having to re-download the whole texture to the GPU for significant performance improvements; Packing multiple 8 and 16 bit values into a single 32-bit value for efficient shader processing with significantly reduced memory storage and bandwidth, especially useful when transferring data between shader stages."
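The last item, packing several small values into one 32-bit word, can be illustrated off the GPU. This is only a sketch of the general idea in Python, not the GLSL functions themselves; the function names and the choice to put the first value in the lowest byte are assumptions made for the example.

```python
def pack_4x8(a, b, c, d):
    """Pack four 8-bit unsigned values into one 32-bit word.

    Byte order is an assumption for this sketch: `a` lands in the least
    significant byte and `d` in the most significant byte.
    """
    for value in (a, b, c, d):
        if not 0 <= value <= 0xFF:
            raise ValueError(f"value out of 8-bit range: {value}")
    return a | (b << 8) | (c << 16) | (d << 24)


def unpack_4x8(word):
    """Inverse of pack_4x8: split a 32-bit word back into four bytes."""
    return tuple((word >> shift) & 0xFF for shift in (0, 8, 16, 24))
```

Passing one packed word instead of four separate values is what cuts storage and bandwidth between stages; the cost is a few shifts and masks on each side.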

Thursday, September 10, 2009

6 Displays off one graphics card.

Eyefinity pushes over 24 million pixels with one next-gen Radeon

Using one graphics card and GPU, AMD drives six monitors, each at a resolution of 2560x1600, for a total resolution of 7680x3200 (about 24.6 megapixels).

With their driver, it can be configured to appear in Windows as one gigantic 7680x3200 screen, so software will not have to be adapted to support it. Pre-existing games should just come up and support it.
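The pixel count in the headline can be checked with a little arithmetic. A minimal sketch, assuming the six panels are arranged three wide and two high (inferred from the 7680x3200 desktop size, not stated outright); the function name is illustrative.

```python
def wall_resolution(cols, rows, panel_w, panel_h):
    """Combined desktop size of a grid of identical panels, ignoring bezels."""
    return cols * panel_w, rows * panel_h


if __name__ == "__main__":
    width, height = wall_resolution(3, 2, 2560, 1600)
    megapixels = width * height / 1_000_000
    print(f"{width}x{height} = {megapixels:.1f} megapixels")
```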