Releases: zeux/meshoptimizer
v0.12
This release contains a few improvements for various algorithms, introduces support for triangle strips with degenerate triangles and adds gltfpack (alpha).
Interface changes:
- meshopt_stripify and meshopt_unstripify now require an extra argument, restart_index
Improvements:
- Improve meshopt_simplifySloppy performance by up to 10% by using three-point interpolation search
- Improve results of meshopt_optimizeVertexCache by up to 0.5% by using a new data set obtained with differential evolution
- meshopt_stripify now supports stitching strips using degenerate triangles instead of restart indices; this typically results in a 10% larger index buffer compared to restart indices, but on some GPUs it can be substantially faster to render
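To make the difference between restart indices and degenerate triangles concrete, here is a minimal sketch of expanding a triangle strip back into a triangle list. The helper name expandStrip is hypothetical and this is not the library's implementation; it also assumes, for this sketch only, that a restart_index of 0 means "no restart marker, skip degenerate triangles instead".

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper (not a meshoptimizer function): expands a triangle strip
// into a triangle list. Assumption of this sketch: a non-zero restart_index
// marks the end of a strip, while restart_index == 0 means the strips were
// stitched with degenerate triangles, which are simply dropped.
std::vector<unsigned int> expandStrip(const std::vector<unsigned int>& strip, unsigned int restart_index)
{
    std::vector<unsigned int> triangles;
    std::size_t start = 0; // position of the first index of the current strip

    for (std::size_t i = 0; i < strip.size(); ++i)
    {
        if (restart_index && strip[i] == restart_index)
        {
            start = i + 1; // the next index begins a new strip
            continue;
        }

        if (i - start < 2)
            continue; // a triangle needs three indices

        unsigned int a = strip[i - 2], b = strip[i - 1], c = strip[i];

        // Every other triangle in a strip has flipped winding; undo the flip.
        if ((i - start) % 2 == 1)
            std::swap(a, b);

        // Degenerate triangles (repeated indices) cover no area and are skipped.
        if (a != b && a != c && b != c)
        {
            triangles.push_back(a);
            triangles.push_back(b);
            triangles.push_back(c);
        }
    }

    return triangles;
}
```

With a restart marker each strip costs one extra index, whereas stitching repeats indices from neighbouring strips, which is consistent with the larger index buffer mentioned above.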
gltfpack:
This release introduces an alpha version of gltfpack. gltfpack is a command-line tool that converts .obj or .gltf files to glTF files that are optimized for render performance and transmission time. gltfpack merges meshes and materials to reduce draw call count, merges buffers to reduce draw setup cost, quantizes vertex attributes to reduce GPU memory footprint, optimizes vertex and index data for more efficient GPU rendering, resamples and quantizes animation data to reduce memory footprint, and can optionally compress the vertex/index/animation buffers in the output using meshoptimizer codecs to further reduce the file size.
The resulting files rely on two not-yet-standardized extensions; when compression is not used, the resulting files can be loaded using three.js (r107+) and Babylon.js (4.1+) glTF loaders. Loading compressed files requires integrating JavaScript decoders (js/meshopt_decoder.js); demo/GLTFLoader.js contains a custom version of three.js loader that can be used to load them.
v0.11
This release contains a few improvements for simplifier, introduces a new simplification algorithm, adds support for custom allocators and improves performance and code size of JavaScript decoders.
Interface changes:
- meshopt_computeMeshletBounds now passes meshlet parameter by pointer instead of by value.
New algorithms:
- Introduce a new simplification algorithm, meshopt_simplifySloppy, that performs decimation without concerns for topological integrity. The algorithm can and will merge small disjoint features together, and is extremely fast at ~20M triangles/sec on large meshes on modern desktop CPUs.
- Memory allocation can now be configured to use custom allocation callbacks using meshopt_setAllocator.
Improvements:
- Default simplifier now uses normalized error metric, which makes it much easier to consistently configure target_error parameter; it now corresponds to linear error, normalized to mesh radius (0.01 means 1% deviation).
- Fix edge cases when default simplifier could run many passes in vain, resulting in poor performance.
- Improve JavaScript decoder performance: vertex decoding is 17% faster, index decoding is 1.7x faster.
- Improve JavaScript decoder size: decoder.js is now 2.4x smaller (3.5 KB after gzip)
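As an illustration of what a normalized target_error means in practice, the sketch below converts a relative error into object-space units. Both helper names are hypothetical, and the way mesh size is measured here (largest axis-aligned bounding box extent) is an assumption of this sketch; the notes above only say the error is normalized to mesh radius.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper: largest axis-aligned bounding box extent of a mesh whose
// positions are stored as tightly packed xyz floats. Using the box extent as
// the measure of mesh size is an assumption made for illustration.
float meshExtent(const std::vector<float>& positions)
{
    if (positions.size() < 3)
        return 0.f;

    float minv[3] = {positions[0], positions[1], positions[2]};
    float maxv[3] = {positions[0], positions[1], positions[2]};

    for (std::size_t i = 0; i + 2 < positions.size(); i += 3)
    {
        for (int k = 0; k < 3; ++k)
        {
            minv[k] = std::min(minv[k], positions[i + k]);
            maxv[k] = std::max(maxv[k], positions[i + k]);
        }
    }

    return std::max(maxv[0] - minv[0], std::max(maxv[1] - minv[1], maxv[2] - minv[2]));
}

// Hypothetical helper: a relative error of 0.01 allows a deviation of 1% of the mesh size.
float absoluteError(float target_error, float mesh_size)
{
    return target_error * mesh_size;
}
```

For a mesh that spans 10 units along its longest axis, a target_error of 0.01 would then correspond to roughly 0.1 units of allowed deviation, regardless of the units the mesh was authored in.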
Compatibility:
- Fix gcc -Wshadow warnings
- Work around a bug in Edge ChakraCore compiler that could result in indices being incorrectly decoded with decoder.js.
v0.10
This release contains a number of fixes and improvements for vertex codec, substantially improves performance of several algorithms in Debug builds and introduces support for decompressing vertex/index data from JavaScript.
New algorithms:
- Introduce an experimental algorithm, meshopt_generateVertexRemapMulti, that generates the same remap table as meshopt_generateVertexRemap for indexing a mesh, but supports vertex data stored as multiple independent streams (deinterleaved)
- Introduce an experimental algorithm, meshopt_generateShadowIndexBufferMulti, that can generate a second index buffer that shares the vertex data with the original index buffer, but supports vertex data stored as multiple independent streams (deinterleaved)
Improvements:
- Optimize NEON code in meshopt_decodeVertexBuffer, making it 1-2% faster
- Improve compatibility of SIMD code in meshopt_decodeVertexBuffer, fixing compilation issues on ARM64, MSVC ARM, and clang for Windows
- Fix a bug in meshopt_encodeVertexBuffer that resulted in incorrectly encoded data on platforms where char is unsigned (this mostly affected ARM hosts such as Android)
- Substantially improve performance of multiple algorithms in Debug:
  - meshopt_analyzeVertexCache is 6x faster
  - meshopt_optimizeVertexCache is 4.7x faster
  - meshopt_analyzeOverdraw is 3.9x faster
  - meshopt_optimizeOverdraw is 1.4x faster
  - meshopt_simplify is 1.3x faster
JavaScript support:
- Introduce js/decoder.js that contains a WebAssembly version of vertex and index decoders with a JavaScript-friendly interface. The decoders run at 200-400 MB/s on modern desktop CPUs.
- Introduce tools/OptMeshLoader.js that contains an example mesh loader for THREE.js that uses vertex/index codecs for compression and quantizes vertex data for efficient storage; the meshes for this loader can be produced by tools/meshencoder.cpp using .OBJ files as an input.
v0.9
This release substantially improves mesh simplification and introduces experimental algorithms for advanced GPU mesh rendering (cone culling, meshlet construction). The library can also now be used from Rust via https://crates.io/crates/meshopt.
Interface changes:
- meshopt_simplify has an extra argument, target_error, that can be used to limit the geometric error introduced by the simplifier
New algorithms:
- Introduce an experimental algorithm, meshopt_buildMeshlets, that can create meshlet data from index buffer that can be used to efficiently drive the mesh shading pipeline in NVidia RTX GPUs
- Introduce experimental algorithms, meshopt_computeClusterBounds and meshopt_computeMeshletBounds, that can compute bounding sphere and bounding normal cone for use in GPU cluster culling.
- Introduce an experimental algorithm, meshopt_generateShadowIndexBuffer, that can generate a second index buffer that shares the vertex data with the original index buffer, but is more efficient when a subset of vertex attributes is needed.
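The idea behind a shadow index buffer can be sketched as follows: vertices that differ only in attributes irrelevant to a position-only pass (for example a depth or shadow pass) are redirected to a single representative. The Vertex layout and the function name buildShadowIndices are hypothetical, and this is a simplified illustration rather than the library's implementation.

```cpp
#include <array>
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical vertex layout used only for this sketch.
struct Vertex
{
    float px, py, pz; // position
    float u, v;       // texture coordinates
};

// Hypothetical helper: builds a second index buffer over the SAME vertex array,
// in which every index points at the first referenced vertex with an identical
// position. Vertices that differ only in uv collapse to one index.
std::vector<unsigned int> buildShadowIndices(const std::vector<unsigned int>& indices, const std::vector<Vertex>& vertices)
{
    std::map<std::array<float, 3>, unsigned int> first_with_position;
    std::vector<unsigned int> result;
    result.reserve(indices.size());

    for (unsigned int index : indices)
    {
        const Vertex& v = vertices[index];
        std::array<float, 3> key = {v.px, v.py, v.pz};

        // emplace keeps the existing entry if this position was seen before
        auto it = first_with_position.emplace(key, index).first;
        result.push_back(it->second);
    }

    return result;
}
```

Because fewer unique vertices are referenced, a pass that reads only positions can transform fewer vertices while still sharing the original vertex buffer.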
Improvements:
- Significantly rework meshopt_simplify to improve simplification quality, including error metric improvements, attribute-guided collapse that preserves UV seam structure better, and other tweaks
- Significantly rework and optimize meshopt_simplify, making it ~4x faster
- Optimize meshopt_generateVertexRemap, making it 1.25x faster
- Optimize meshopt_decodeVertexBuffer for platforms without SIMD support, making it 1.1x faster
- Fix undefined behavior (left shift of negative integer) in meshopt_encodeVertexBuffer
v0.8
This release introduces vertex buffer encoder and a stable version of index buffer encoder.
New algorithms:
- Introduce vertex encoder that compresses vertex buffers; it can be invoked using meshopt_encodeVertexBuffer and meshopt_decodeVertexBuffer. The algorithm typically provides 1.5-2x compression ratio for quantized vertex data, and the resulting data can be compressed further by a general purpose compressor like zstd. Decoding is highly optimized using SSSE3/NEON and runs at 2 GB/s on a modern desktop CPU.
- Introduce a stable index encoder that compresses index buffers; it can be invoked using meshopt_encodeIndexBuffer and meshopt_decodeIndexBuffer. The algorithm typically encodes index buffers using ~3-4 bits per index, and the resulting data can be compressed further by a general purpose compressor like zstd, yielding ~2-3 bits per index for most meshes. Decoding is highly optimized and runs at 2 GB/s on a modern desktop CPU for 32-bit indices (1 GB/s for 16-bit indices).
- Introduce a new algorithm to optimize for vertex fetch, meshopt_optimizeVertexFetchRemap; it generates a remap table that can be used with meshopt_remapVertexBuffer/meshopt_remapIndexBuffer and helps optimize meshes with several vertex streams.
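A remap table for vertex fetch can be pictured as a renumbering of vertices in the order the index buffer first touches them, so that vertex memory is read roughly front to back. The sketch below shows that idea under the assumption that first-use order is the goal; buildFetchRemap is a hypothetical name, not a library function.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: returns remap[old_index] = new_index, numbering vertices
// in the order they are first referenced by the index buffer. Vertices that are
// never referenced keep the marker value ~0u (an assumption of this sketch).
std::vector<unsigned int> buildFetchRemap(const std::vector<unsigned int>& indices, std::size_t vertex_count)
{
    const unsigned int unused = ~0u;
    std::vector<unsigned int> remap(vertex_count, unused);
    unsigned int next = 0;

    for (unsigned int index : indices)
    {
        if (remap[index] == unused)
            remap[index] = next++; // first time this vertex is referenced
    }

    return remap;
}
```

Since the table depends only on the index buffer, the same table can be applied to each vertex stream in turn, which is what makes a remap-based interface convenient for meshes with several streams.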
Improvements:
- Optimize cluster sorting in meshopt_optimizeOverdraw, making the function 10% faster
- Optimize index decoder, making it 15% faster for 32-bit indices and 40% faster for 16-bit indices
- Fix meshopt_analyzeVertexCache and meshopt_analyzeVertexFetch results for sparse vertex buffers (with unused vertices)
- Support in-place optimization in meshopt_remapVertexBuffer
- Improve CMake build files to make the library easier to integrate
v0.7
This release has large interface changes and introduces several new algorithms and tweaks to existing algorithms.
Interface:
- All C++ function wrappers have been moved out of meshopt namespace and gained meshopt_ prefix to simplify documentation & interface
- All structs used by the interface have been renamed and now also have meshopt_ prefix to avoid name conflicts
- meshopt_quantizeX functions now use function arguments instead of template parameters for better compatibility
- cache_size argument has been removed from meshopt_optimizeVertexCache and meshopt_optimizeOverdraw; to perform optimization for a FIFO cache of a fixed size, use meshopt_optimizeVertexCacheFifo
New algorithms:
- Introduce an algorithm that compresses index buffers; it can be invoked using meshopt_encodeIndexBuffer and meshopt_decodeIndexBuffer. The algorithm typically encodes index buffers using ~3-4 bits per index, and the resulting data can be compressed further by a general purpose compressor like zstd, yielding ~2-3 bits per index for most meshes.
- Introduce an algorithm that can convert an index buffer to a triangle strip that is still reasonably cache efficient; indexed triangle strips are faster to render on some hardware and can reduce the index buffer size. The algorithm can be invoked using meshopt_stripify and typically produces buffers with around 60-65% indices compared to triangle lists, and a 5-10% ACMR penalty on GPUs with small caches.
- Introduce a new quantization function, meshopt_quantizeFloat, that can reduce the precision of a floating-point number while keeping the floating-point representation. This can be useful to generate vertex data that can be compressed more effectively using a general purpose compression algorithm.
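Reducing float precision while keeping the float representation amounts to rounding away low mantissa bits. The following is a minimal sketch of that technique, assuming IEEE 754 single precision; reduceMantissa is a hypothetical name and the handling of special values is simplified compared to what a production implementation might do.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: keeps the top `bits` mantissa bits (1..23) of an IEEE 754
// single-precision value, rounding to nearest (ties away from zero). The result
// is still a regular float, but its low mantissa bits are zero.
float reduceMantissa(float value, int bits)
{
    std::uint32_t u;
    std::memcpy(&u, &value, sizeof(u)); // well-defined way to read the bit pattern

    const std::uint32_t drop = static_cast<std::uint32_t>(23 - bits);
    const std::uint32_t mask = (1u << drop) - 1;   // bits to clear
    const std::uint32_t round = (1u << drop) >> 1; // half of the cleared range

    // Leave infinities and NaNs untouched so rounding cannot change their class.
    if ((u & 0x7f800000u) != 0x7f800000u)
        u = (u + round) & ~mask;

    float result;
    std::memcpy(&result, &u, sizeof(result));
    return result;
}
```

For example, with 8 mantissa bits the value 3.1415927f becomes 3.140625f. The trailing zero bits are what make the data friendlier to a general purpose compressor.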
Improvements:
- Overdraw analyzer (meshopt_analyzeOverdraw) now uses a pixel center fill convention to match hardware rendering more closely.
- Vertex cache analyzer (meshopt_analyzeVertexCache) now models cache that matches real hardware a bit more closely, and requires additional parameters to configure (namely, primitive group size and warp/wavefront size).
- Vertex cache optimizer (meshopt_optimizeVertexCache) has been tuned to generate better output that performs well on real hardware, especially given meshes that have topology similar to that of a uniform grid as an input.
- Various algorithms have been optimized for performance and memory consumption.
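ACMR, the metric quoted for triangle strips in this release, is the average number of vertex cache misses per triangle. A minimal way to estimate it is to simulate a fixed-size FIFO cache, as sketched below. This is a deliberately simple model with a hypothetical function name; it does not include the primitive group and warp/wavefront parameters that the analyzer described above takes into account.

```cpp
#include <algorithm>
#include <cstddef>
#include <deque>
#include <vector>

// Hypothetical helper: simulates a FIFO vertex cache of `cache_size` entries
// over a triangle list and returns cache misses per triangle (ACMR).
double computeAcmrFifo(const std::vector<unsigned int>& indices, std::size_t cache_size)
{
    std::deque<unsigned int> cache;
    std::size_t misses = 0;

    for (unsigned int index : indices)
    {
        // A hit leaves a FIFO cache unchanged (unlike LRU, nothing is reordered).
        if (std::find(cache.begin(), cache.end(), index) != cache.end())
            continue;

        ++misses;
        cache.push_back(index);

        if (cache.size() > cache_size)
            cache.pop_front(); // evict the oldest entry
    }

    const std::size_t triangles = indices.size() / 3;
    return triangles ? double(misses) / double(triangles) : 0.0;
}
```

Two triangles sharing an edge, {0, 1, 2, 2, 1, 3}, give an ACMR of 2.0 with a 16-entry cache: four misses over two triangles. The worst case is 3.0, where every index misses.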
v0.6
This release has significant interface changes and introduces several new algorithms and tweaks to existing algorithms.
Interface:
- The library now has a C89 interface; meshoptimizer.hpp has been renamed to meshoptimizer.h accordingly. Templated functions are still available in namespace meshopt for C++.
- optimizeVertexFetch, optimizeOverdraw and analyzeOverdraw parameter order has changed - make sure to revise existing calls to these functions.
New algorithms:
- Introduce alternative vertex cache optimizer based on Tom Forsyth's algorithm; it can be invoked by setting cache_size parameter of optimizeVertexCache to 0. It generally takes ~3x longer to optimize meshes but usually produces more efficient output with the exception of regular grids.
- Introduce mesh simplification algorithm based on edge collapses, see meshopt::simplify. This is an early version of the algorithm - expect to see performance and quality improvements in future versions.
Fixes:
- remapVertexBuffer now correctly handles indexed vertex buffers where some vertices are not referenced
- optimizeOverdraw now correctly handles index buffers with degenerate triangles
- optimizeOverdraw is able to preserve the vertex cache efficiency much better
v0.5
This is the first release of meshoptimizer library. Features:
- Algorithms to index or reindex meshes and optimize them for vertex cache, vertex fetch and overdraw
- Quantization helper functions to convert vertex attributes to GPU-friendly formats
- Algorithms that analyze efficiency of given meshes for vertex cache/fetch/overdraw
- C++ interface with support for 16-bit/32-bit indices
All algorithms have been optimized for both runtime performance (it's actually practical to run them at load time!) and efficiency of the resulting meshes.
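As an illustration of what a quantization helper of this kind does, the sketch below maps a float in [0, 1] to an N-bit unsigned normalized integer, a format GPUs can expand back to a float when fetching the attribute. The function name is hypothetical and the code is a generic sketch of the technique rather than the library's own helper.

```cpp
#include <algorithm>

// Hypothetical helper: quantizes a value in [0, 1] to an unsigned normalized
// integer with `bits` bits (1..16 in this sketch). Out-of-range inputs are
// clamped, and the result is rounded to the nearest representable step.
int quantizeUnormSketch(float v, int bits)
{
    const float scale = float((1 << bits) - 1); // largest representable integer

    v = std::min(std::max(v, 0.f), 1.f); // clamp to the valid range

    return int(v * scale + 0.5f); // round to nearest
}
```

With 8 bits, 0.0 maps to 0 and 1.0 maps to 255, so a color or UV channel fits in a single byte instead of four.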