System Name | Good enough |
---|---|
Processor | AMD Ryzen R9 7900 - Alphacool Eisblock XPX Aurora Edge |
Motherboard | ASRock B650 Pro RS |
Cooling | 2x 360mm NexXxoS ST30 X-Flow, 1x 360mm NexXxoS ST30, 1x 240mm NexXxoS ST30 |
Memory | 32GB - FURY Beast RGB 5600 Mhz |
Video Card(s) | Sapphire RX 7900 XT - Alphacool Eisblock Aurora |
Storage | 1x Kingston KC3000 1TB 1x Kingston A2000 1TB, 1x Samsung 850 EVO 250GB , 1x Samsung 860 EVO 500GB |
Display(s) | LG UltraGear 32GN650-B + 4K Samsung TV |
Case | Phanteks NV7 |
Power Supply | GPS-750C |
In light of the Vulkan vs OpenGL comparison posted some time ago, which supposedly showed why developers don't want to use Vulkan, I thought it would make sense to show something similar on the CPU side of things.
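Scalar, non-AVX:
Code:
// Fills each even row i >= 2 (that has a row below it) with the midpoint of rows i-1 and i+1.
void interpolate(vector<vector<int>>& mat)
{
    // i + 1 < size() instead of i < size() - 1: avoids unsigned underflow on an empty matrix.
    for (size_t i = 2; i + 1 < mat.size(); i += 2)
        for (size_t j = 0; j < mat[i].size(); j++)
        {
            // The float result is truncated toward zero when stored back as int.
            mat[i][j] = mat[i-1][j] + 0.5f * (mat[i+1][j] - mat[i-1][j]);
        }
}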
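AVX:
Code:
// Needs <immintrin.h>; build with AVX enabled (e.g. -mavx on GCC/Clang).
void interpolate_avx(vector<vector<int>>& mat)
{
    for (size_t i = 2; i + 1 < mat.size(); i += 2)
    {
        size_t j = 0;
        // Eight 32-bit ints per iteration; cvttps truncates like the scalar float-to-int conversion.
        for (; j + 8 <= mat[i].size(); j += 8)
        {
            _mm256_storeu_si256((__m256i *)&mat[i][j], _mm256_cvttps_epi32(_mm256_add_ps(_mm256_mul_ps(_mm256_sub_ps(_mm256_cvtepi32_ps(_mm256_loadu_si256((__m256i *)&mat[i+1][j])), _mm256_cvtepi32_ps(_mm256_loadu_si256((__m256i *)&mat[i-1][j]))), _mm256_set1_ps(0.5f)), _mm256_cvtepi32_ps(_mm256_loadu_si256((__m256i *)&mat[i-1][j])))));
        }
        // Scalar tail so a column count that is not a multiple of 8 stays in bounds.
        for (; j < mat[i].size(); j++)
            mat[i][j] = mat[i-1][j] + 0.5f * (mat[i+1][j] - mat[i-1][j]);
    }
}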
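For anyone who wants to actually read the AVX version, here is a rough sketch of the same arithmetic split into named steps. The function name `interpolate_avx_readable` and the two `INTERP_*` macros are made up for illustration. It assumes an x86 CPU with AVX support (on other architectures it just falls back to the scalar loop), adds a scalar tail for column counts that are not a multiple of 8, and uses the truncating conversion `_mm256_cvttps_epi32` so the results match what the scalar code's float-to-int conversion produces. Treat those last two points as assumptions about the intended behavior, not as the definitive way to write this.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

#if defined(__x86_64__) || defined(__i386__) || defined(_M_X64) || defined(_M_IX86)
#include <immintrin.h>
#define INTERP_X86 1
#else
#define INTERP_X86 0
#endif

#if INTERP_X86 && (defined(__GNUC__) || defined(__clang__))
// Lets GCC/Clang compile this one function for AVX without a global -mavx flag.
#define INTERP_TARGET_AVX __attribute__((target("avx")))
#else
#define INTERP_TARGET_AVX
#endif

// Hypothetical, more readable variant of the AVX row interpolation.
INTERP_TARGET_AVX
void interpolate_avx_readable(std::vector<std::vector<int>>& mat)
{
    for (std::size_t i = 2; i + 1 < mat.size(); i += 2)
    {
        const std::size_t cols = mat[i].size();
        // All three rows involved must have the same width.
        assert(mat[i - 1].size() == cols && mat[i + 1].size() == cols);

        std::size_t j = 0;
#if INTERP_X86
        const __m256 half = _mm256_set1_ps(0.5f);
        for (; j + 8 <= cols; j += 8)
        {
            // Load eight ints from the rows above and below and convert them to float.
            const __m256 prev = _mm256_cvtepi32_ps(_mm256_loadu_si256(
                reinterpret_cast<const __m256i*>(&mat[i - 1][j])));
            const __m256 next = _mm256_cvtepi32_ps(_mm256_loadu_si256(
                reinterpret_cast<const __m256i*>(&mat[i + 1][j])));

            // prev + 0.5 * (next - prev)
            const __m256 mid = _mm256_add_ps(
                prev, _mm256_mul_ps(_mm256_sub_ps(next, prev), half));

            // Truncate toward zero and store the eight results into row i.
            _mm256_storeu_si256(reinterpret_cast<__m256i*>(&mat[i][j]),
                                _mm256_cvttps_epi32(mid));
        }
#endif
        // Scalar tail (and the whole row on non-x86 builds).
        for (; j < cols; ++j)
            mat[i][j] = static_cast<int>(
                mat[i - 1][j] + 0.5f * (mat[i + 1][j] - mat[i - 1][j]));
    }
}
```

Calling it on a 5x11 matrix, for example, rewrites only row 2: the first 8 columns go through the AVX loop and the last 3 through the scalar tail. A decent compiler should produce much the same instructions as for the one-liner, so the readability mostly costs screen space.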
That's why game developers stay away from optimizing games using AVX, I guess. In all seriousness, there is always a price to pay for better performance: the AVX code above runs about 6.5 times faster than the scalar version even though it looks unintelligible. Same thing with Vulkan vs OpenGL vs DirectX or whatever; it makes no sense to compare things like this.