Torque3D/Engine/source/math/public/math_backend.cpp

#pragma once
#include "math/public/math_backend.h"
ISA backends float3 and float4 - cleanup history squash working for both neon32 and neon64 Update math_backend.cpp further sse simd additions avx2 float3 added added normalize_magnitude added divide fast to float3 may copy to float4 move static spheremesh to drawSphere (initialize on first use) so platform has a chance to load the math backend all float3 and float4 functions and isas completed all options of float3 and float4 functions in isas and math_c neon still to be done but that will be on mac. Update math_backend.cpp mac isa neon update added float3 restructured the classes to look more like the final version of the x86 classes linux required changes Update build-macos-clang.yml Update build-macos-clang.yml Revert "Update build-macos-clang.yml" This reverts commit 29dfc567f40f20d2400a9967a35bbdb823182e2d. Revert "Update build-macos-clang.yml" This reverts commit 2abad2b4ca4de717c5f4278708f289dd1bb22561. Update CMakeLists.txt fix macs stupid build remove god awful rolling average from frame time tracker.... use intrinsic headers instead each isa implementation now uses a header for that isa's intrinsic functions these are then used in the impl files. This will make it easier for matrix functions when those are implemented. fixed comment saying 256 when it should be 512 for avx512 consolidated initializers for function tables Update neon_intrinsics.h fixes for some neon intrinsics no idea if this is the best way to do these but they work at least v_cross is especially messy at the moment we basically just do it as a c math function need to look into getting this done correctly
2026-02-26 16:45:13 +00:00
namespace math_backend::float4::dispatch
{
   // Single definition of the global dispatch table
   Float4Funcs gFloat4{};
}

namespace math_backend::float3::dispatch
{
   // Single definition of the global dispatch table
   Float3Funcs gFloat3{};
}

namespace math_backend::mat44::dispatch
{
   Mat44Funcs gMat44{};
}

math_backend::backend math_backend::choose_backend(U32 cpu_flags)
{
   // Checks run from the newest instruction set to the oldest, so the first
   // match is the most capable backend the CPU reports.
#if defined(__x86_64__) || defined(_M_X64) || defined(_M_IX86)
   if (cpu_flags & CPU_PROP_AVX2) return backend::avx2;
   if (cpu_flags & CPU_PROP_AVX) return backend::avx;
   if (cpu_flags & CPU_PROP_SSE4_1) return backend::sse41;
   if (cpu_flags & CPU_PROP_SSE2) return backend::sse2;
#elif defined(__aarch64__) || defined(__ARM_NEON)
   if (cpu_flags & CPU_PROP_NEON) return backend::neon;
#endif
   // No supported SIMD extension was reported (or the target is unknown).
   return backend::scalar;
}

void math_backend::install_from_cpu_flags(uint32_t cpu_flags)
{
   g_backend = choose_backend(cpu_flags);

   // Fill the global dispatch tables with the implementations for the chosen backend.
   switch (g_backend)
   {
#if defined(__x86_64__) || defined(_M_X64) || defined(_M_IX86)
   case backend::avx2:
      float4::dispatch::install_avx2();
      float3::dispatch::install_avx2();
      break;

   case backend::avx:
      float4::dispatch::install_avx();
      float3::dispatch::install_avx();
      break;

   case backend::sse41:
      float4::dispatch::install_sse41();
      float3::dispatch::install_sse41();
      break;

   case backend::sse2:
      float4::dispatch::install_sse2();
      float3::dispatch::install_sse2();
      break;
#elif defined(__aarch64__) || defined(__ARM_NEON)
   case backend::neon:
      float4::dispatch::install_neon();
      float3::dispatch::install_neon();
      break;
#endif
   default:
      // Scalar fallback. This is the only case that installs the mat44 table.
      float4::dispatch::install_scalar();
      float3::dispatch::install_scalar();
      mat44::dispatch::install_scalar();
      break;
   }
}