Many Issues with Mali 400 on A10 (Linux)
#1
Posted 17 August 2012 - 10:19 AM
I am facing many problems with Mali 400. One of the big issues is that the library is closed source, so I can't change anything.
I am at an advanced stage of product development but stuck on many issues.
Point 1.
I was using the Ericsson Texture Compression Mipmap example and trying to make it work with a normal RGBA texture.
Here is my texture loading code.
/* Load just base level texture data. */
GL_CHECK(glGenTextures(1, &textureID));
GL_CHECK(glBindTexture(GL_TEXTURE_2D, textureID));
unsigned char *textureData = NULL;
Texture::loadData(texturePath.c_str(), &textureData);
GL_CHECK(glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, 256, 256, 0, GL_RGBA, GL_UNSIGNED_BYTE, textureData));
Here it just renders a black area.
By referring to the RotoZoom program, and by trial and error, I figured out that it works if I add
/* Set texture mode. */
GL_CHECK(glGenerateMipmap(GL_TEXTURE_2D));
after glTexImage2D.
Is glGenerateMipmap required in order to use a texture?
Besides, there is no sample program that loads a basic RGBA texture.
Point 2.
Besides, two or three of my test systems give segmentation faults continuously, even with a working program.
The same program that was working before stops working, and can work again after a system restart.
Point 3.
eglSwapBuffers is consuming 50% CPU, and I can't find any way to render directly to the framebuffer. The Mali documentation says it's platform specific.
I can't understand this: the FB interface is the same on all Linux systems, so how can it be platform specific?
Everybody knows that most vendors do not provide support for anything like this.
Point 4.
http://www.arm.com/p...mali-400-mp.php
This page mentions 30M tri/s and up to 1.1G pix/s at 275MHz.
Correct me if I am wrong: 1.1G pix/s is the rate at which screen pixels can be filled with color or texture.
1.1 * 1000 *1000 *1000 / 1920 / 1080 / 60 = 8.84 Layers
It means that if I render full HD at 60 FPS, I should be able to fill the full screen 8.84 times per frame.
I am using a simple triangle program which fills the screen and renders it 8 times per frame; performance then drops to 14 fps.
That means it is 4 times slower than spec, even though the GPU is running at 320 MHz, not 275 MHz.
Point 5.
A very simple triangle program is not working in GLES1, so I am forced to use only GLES2.
Many places on the internet mention that GLES1 is faster than GLES2.
There is no sample program for GLES1, and the documentation talks about a GLES1 simulator, but there is no GLES1 simulator on the Mali website.
Point 6.
What is the process to get the Mali 400 DDK source?
Is it possible to get it, even by paying for it?
Any clarification on the above points would help us.
Thanks in Advance.
Regards
Piyush Verma
#2
Posted 17 August 2012 - 10:50 AM
for point 1, as you have discovered, if OpenGL-ES considers the texture "incomplete" it will instead render black texels. In this case, the glTexParameteri() mode for GL_TEXTURE_MIN_FILTER has probably been left on its default of GL_NEAREST_MIPMAP_LINEAR - in which case OpenGL-ES will consider the texture incomplete unless you load *all* mipmap levels, or generate them automatically using the glGenerateMipmap() call as you have done.
Hope this clarifies what was happening. Cheers, Pete
ARM Media Processing Division
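The completeness rule described above can be sketched as a small model in plain C. This is a simplified illustration, not the driver's logic: the `MinFilter` enum and all function names are hypothetical stand-ins (they are not the real GLenum values), and the model ignores other completeness conditions such as non-power-of-two restrictions. It assumes a full chain runs from the base level down to 1x1.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the GL minification filters (not real GLenum values). */
typedef enum {
    MIN_NEAREST,
    MIN_LINEAR,
    MIN_NEAREST_MIPMAP_NEAREST,
    MIN_LINEAR_MIPMAP_NEAREST,
    MIN_NEAREST_MIPMAP_LINEAR, /* the default named in the post above */
    MIN_LINEAR_MIPMAP_LINEAR
} MinFilter;

/* Number of levels in a full mipmap chain, from the base level down to 1x1. */
int full_mip_chain_levels(int width, int height) {
    assert(width > 0 && height > 0);
    int largest = width > height ? width : height;
    int levels = 1;
    while (largest > 1) {
        largest /= 2;
        levels++;
    }
    return levels;
}

/* Only GL_NEAREST and GL_LINEAR sample the base level alone. */
bool filter_needs_mipmaps(MinFilter filter) {
    return filter != MIN_NEAREST && filter != MIN_LINEAR;
}

/* Simplified completeness check: mipmapped filters need the whole chain. */
bool texture_is_complete(MinFilter filter, int width, int height, int levels_loaded) {
    if (levels_loaded < 1) {
        return false;
    }
    if (!filter_needs_mipmaps(filter)) {
        return true;
    }
    return levels_loaded >= full_mip_chain_levels(width, height);
}
```

For the 256x256 texture in the original post, a full chain is 9 levels, so uploading only level 0 with the default filter leaves the texture incomplete in this model, while the same single level is enough once the filter does not use mipmaps.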
#3
Posted 17 August 2012 - 10:52 AM
for point 2 I think more information would be needed to understand the problem. Can you post a stack backtrace of the segmentation fault? Can you reduce the problem to a small code snippet and share the steps to reproduce?
Cheers, Pete
ARM Media Processing Division
#4
Posted 17 August 2012 - 10:54 AM
for point 3 - how are you measuring this? Because the Mali GPU uses a deferred architecture, it is not unusual to see a seemingly large amount of time in eglSwapBuffers(). The way the deferred architecture works, the driver collects all of the draw commands during the frame, but just adds them to a queue - it does not do actual rendering at the time of the glDraw*() call like an immediate mode renderer would. Instead, when the end of the frame is reached as indicated by eglSwapBuffers(), the driver will then work out the required drawing operations and execute them.
Could this be what you are seeing?
HTH, Pete
ARM Media Processing Division
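The queue-then-execute behaviour described above can be illustrated with a toy model in plain C. This is only a sketch of the idea, not how the Mali driver is implemented: `ToyDeferredDriver`, `toy_draw` and `toy_swap_buffers` are hypothetical names, and the integer "cost" of a draw is an invented stand-in for real rendering work.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_QUEUED_DRAWS 64

/* Toy driver state: draws are queued, work is only counted at swap time. */
typedef struct {
    int queued_cost[MAX_QUEUED_DRAWS];
    size_t queued_count;
    long work_done;
} ToyDeferredDriver;

void toy_init(ToyDeferredDriver *driver) {
    driver->queued_count = 0;
    driver->work_done = 0;
}

/* Stand-in for a glDraw*() call: cheap, it only records the command. */
void toy_draw(ToyDeferredDriver *driver, int cost) {
    assert(driver->queued_count < MAX_QUEUED_DRAWS);
    driver->queued_cost[driver->queued_count++] = cost;
}

/* Stand-in for eglSwapBuffers(): the whole frame's work is processed here. */
long toy_swap_buffers(ToyDeferredDriver *driver) {
    long frame_work = 0;
    for (size_t i = 0; i < driver->queued_count; i++) {
        frame_work += driver->queued_cost[i];
    }
    driver->work_done += frame_work;
    driver->queued_count = 0;
    return frame_work;
}
```

In this model a profiler would attribute all of the frame's cost to the swap call, even though the cost was created by the draw calls, which is one way a large share of time can appear under eglSwapBuffers().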
#5
Posted 17 August 2012 - 10:58 AM
regarding point 5 - have you seen the Mali OpenGL-ES 1.1 emulator here?
http://www.malidevel...11-emulator.php
If you can post your OpenGL-ES code that isn't working perhaps we can help you debug it.
I'm not sure the statement about OpenGL-ES 1.1 vs 2.0 being faster is necessarily true - it depends what effects you are trying to achieve. Some effects will be much harder when forced to use the fixed functionality of 1.1 when you could write a much more efficient custom shader yourself.
Can you explain more about what you are trying to achieve?
HTH, Pete
This post has been edited by Pete: 17 August 2012 - 12:36 PM
ARM Media Processing Division
#6
Posted 17 August 2012 - 11:02 AM
for point 6, no - ARM would only expect to license the Mali-400 Driver Development Kit source code to a Mali-400 silicon licensee.
It should not be necessary to have the OpenGL-ES implementation source code in order to write or debug applications using the OpenGL-ES API - this is part of the point of abstracting the graphics operations into a standards-body specified API.
HTH, Pete
ARM Media Processing Division
#7
Posted 17 August 2012 - 11:08 AM
regarding point 4, I expect the numbers are considering the best possible scenario. For instance, they may assume an infinitely fast memory bus which is able to consume the output from the Mali. In the real world, perhaps your device has become memory bandwidth saturated before the Mali has reached peak pixel output? What is the memory bandwidth of your device?
Also, I imagine things like color depth and Z buffer come into play - to achieve the maximum throughput you would probably ensure the depth buffer is disabled, and that no writes are happening to it to consume extra cycles. Similarly, the color depth would be configured to the minimum to maximise output bus efficiency.
What configuration are you using for your depth and color buffers? What method are you using to try rendering 8 layers on top of each other?
Cheers, Pete
ARM Media Processing Division
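The fill-rate arithmetic from point 4 can be checked with a short calculation in plain C. This is a sketch under stated assumptions: the function names are hypothetical, and the 1920x1080 surface size is taken from the "full HD" figure in the question, since the thread does not state the actual resolution of the test.

```c
#include <assert.h>

/* How many times per frame the whole surface could be filled at a given pixel rate. */
double full_screen_layers(double pixels_per_second, int width, int height, double fps) {
    assert(width > 0 && height > 0 && fps > 0.0);
    return pixels_per_second / ((double)width * (double)height * fps);
}

/* Pixel rate actually achieved when drawing `layers` full-screen layers at `fps`. */
double achieved_pixels_per_second(int width, int height, int layers, double fps) {
    assert(width > 0 && height > 0 && layers > 0 && fps > 0.0);
    return (double)width * (double)height * (double)layers * fps;
}
```

With the numbers from the thread, `full_screen_layers(1.1e9, 1920, 1080, 60.0)` gives about 8.84, matching the figure in the question. Eight layers at 14 fps is `achieved_pixels_per_second(1920, 1080, 8, 14.0)`, roughly 0.23 Gpix/s, which is about 4.7 times below the headline 1.1 Gpix/s if the surface really is 1920x1080.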
#8
Posted 17 August 2012 - 11:56 AM
Pete, on 17 August 2012 - 10:50 AM, said:
for point 1, as you have discovered, if OpenGL-ES considers the texture "incomplete" it will instead render black texels. In this case, the glTexParameteri() mode for GL_TEXTURE_MIN_FILTER has probably been left on its default of GL_NEAREST_MIPMAP_LINEAR - in which case OpenGL-ES will consider the texture incomplete unless you load *all* mipmap levels, or generate them automatically using the glGenerateMipmap() call as you have done.
Hope this clarifies what was happening. Cheers, Pete
First, thank you very much Pete for all the quick replies. I was very surprised by such a quick response.
So in conclusion, even if I don't need mipmaps I need to call glGenerateMipmap(), right?
#9
Posted 17 August 2012 - 12:07 PM
Pete, on 17 August 2012 - 10:58 AM, said:
regarding point 5 - have you seen the Mali OpenGL-ES 1.1 emulator here?
http://www.malidevel...11-emulator.php
If you can post your OpenGL-ES code that isn't working perhaps we can help you debug it.
I'm not sure the statement about OpenGL-ES 1.1 vs 2.0 being faster is necessarily true - it depends what effects you are trying to achieve. Some effects will be much harder when forced to use the fixed functionality of 1.1 when you could write a much more efficient custom shader yourself.
Can you explain more about what you are trying to achieve?
HTH, Pete
Thanks Pete,
I don't have the earlier program, but I will write it again and post it here later.
My use case is multiple digital photo frames on a single screen, where no 3D engine is used, similar to cocos2d.
Each photo frame will change pictures with animation effects.
There will also be video playback at the same time in part of the screen, where each decoded frame will be rendered as a texture.
In my case a texture is not going to be rendered again, so it may not need mipmaps.
Thanks & Regards
Piyush Verma
#10
Posted 17 August 2012 - 12:14 PM
Pete, on 17 August 2012 - 10:58 AM, said:
regarding point 5 - have you seen the Mali OpenGL-ES 1.1 emulator here?
http://www.malidevel...11-emulator.php
If you can post your OpenGL-ES code that isn't working perhaps we can help you debug it.
I'm not sure the statement about OpenGL-ES 1.1 vs 2.0 being faster is necessarily true - it depends what effects you are trying to achieve. Some effects will be much harder when forced to use the fixed functionality of 1.1 when you could write a much more efficient custom shader yourself.
Can you explain more about what you are trying to achieve?
HTH, Pete
Got it thanks.
Was it really available there before, or did you just upload it?
Thanks
#11
Posted 17 August 2012 - 12:34 PM
Piyush Verma, on 17 August 2012 - 11:56 AM, said:
Not quite - if you don't want to use mipmaps, you can use glTexParameteri() to set GL_TEXTURE_MIN_FILTER to either GL_NEAREST (no filtering at all, will look low visual quality but be fast) or GL_LINEAR (no mipmap levels, but 4 texels will be sampled and averaged, so higher visual quality).
However, typically using mipmaps is advisable - they mean the GPU can make better use of the limited amount of texture cache available. When a large texture is drawn on a small triangle, only sparse points on the texture will be sampled even though the cache brings in blocks of adjacent texels - so the cache will be filled with data which won't be used again, defeating the point of the cache. When using mipmaps, the GPU selects the most appropriate mipmap level to use, and the texels sampled will be much more likely to be adjacent - leading to cache hits and improved performance, memory bandwidth usage and power consumption.
HTH, Pete
This post has been edited by Pete: 17 August 2012 - 12:34 PM
ARM Media Processing Division
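The cache argument above can be made concrete with a rough integer model in plain C. This is a simplified sketch, assuming a square texture drawn across a square region of the screen; the function names are hypothetical, and real hardware selects the level from per-pixel derivatives rather than from whole-surface sizes.

```c
#include <assert.h>

/* Pick the smallest level that still has at least one texel per screen pixel. */
int select_mip_level(int texture_size, int on_screen_size) {
    assert(texture_size > 0 && on_screen_size > 0);
    int level = 0;
    /* Halve the texture until one texel maps to roughly one pixel. */
    while ((texture_size >> (level + 1)) >= on_screen_size) {
        level++;
    }
    return level;
}

/* Distance in texels between the samples taken for two adjacent screen pixels. */
int texel_stride(int texture_size, int on_screen_size, int level) {
    assert(texture_size > 0 && on_screen_size > 0 && level >= 0 && level < 31);
    int level_size = texture_size >> level;
    if (level_size < 1) {
        level_size = 1;
    }
    int stride = level_size / on_screen_size;
    return stride < 1 ? 1 : stride;
}
```

For a 1024-texel texture drawn 64 pixels wide, sampling level 0 skips 16 texels between neighbouring pixels, so most of each cache line fetched is never used. The model selects level 4, where neighbouring pixels read neighbouring texels.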
#12
Posted 17 August 2012 - 12:36 PM
Pete, on 17 August 2012 - 10:54 AM, said:
for point 3 - how are you measuring this? Because the Mali GPU uses a deferred architecture, it is not unusual to see a seemingly large amount of time in eglSwapBuffers(). The way the deferred architecture works, the driver collects all of the draw commands during the frame, but just adds them to a queue - it does not do actual rendering at the time of the glDraw*() call like an immediate mode renderer would. Instead, when the end of the frame is reached as indicated by eglSwapBuffers(), the driver will then work out the required drawing operations and execute them.
Could this be what you are seeing?
HTH, Pete
Just run the AntiAlias demo as it is, and then replace eglSwapBuffers with glFlush.
If it is rendering at a fixed interval, CPU usage will go down dramatically; if it is running in a tight loop, FPS will increase dramatically.
Thanks & Regards
Piyush Verma
#13
Posted 17 August 2012 - 12:37 PM
Piyush Verma, on 17 August 2012 - 12:14 PM, said:
Was it really available there before, or did you just upload it?
I'm afraid it was there all along :-)
ARM Media Processing Division
#14
Posted 06 September 2012 - 02:40 PM
Quote (Pete @ 17 August 2012 - 10:54 AM)
for point 3 - how are you measuring this? Because the Mali GPU uses a deferred architecture, it is not unusual to see a seemingly large amount of time in eglSwapBuffers(). The way the deferred architecture works, the driver collects all of the draw commands during the frame, but just adds them to a queue - it does not do actual rendering at the time of the glDraw*() call like an immediate mode renderer would. Instead, when the end of the frame is reached as indicated by eglSwapBuffers(), the driver will then work out the required drawing operations and execute them.
Could this be what you are seeing?
HTH, Pete
Hello, regarding your answer to point 3, I have a question. Here is our usual process to render a scene (not speaking about multithreading for now):
while (gameloop)
{
1 - Do all GL commands according to the previous game behavior update
2 - Do game behavior update
3 - Call eglSwapBuffers
};
We call eglSwapBuffers (3) after the game behavior update (2) to let the GPU work during that time (as we have already sent all GL commands in (1)).
Does it mean that on Mali 400 this method is not a good one? What would you advise?
Thank you,
#15
Posted 06 September 2012 - 03:42 PM
The following is far more common
while( true )
{
    Do game behavior update
    Call glClear ( COLOR | DEPTH | STENCIL )
    Do all GL commands according to the game behavior update
    Call eglSwapBuffers
}
Remember eglSwapBuffers is asynchronous and doesn't actually swap anything, it just tells the driver "I'm done with this window surface". The actual window system update happens "later" under driver control.
This post has been edited by isogen74: 06 September 2012 - 03:44 PM
#16
Posted 06 September 2012 - 03:57 PM
Quote (isogen74 @ 06 September 2012 - 03:42 PM)
The following is far more common
while( true )
{
    Do game behavior update
    Call glClear ( COLOR | DEPTH | STENCIL )
    Do all GL commands according to the game behavior update
    Call eglSwapBuffers
}
Remember eglSwapBuffers is asynchronous and doesn't actually swap anything, it just tells the driver "I'm done with this window surface". The actual window system update happens "later" under driver control.
Hello, thank you for your reply.
If I understand correctly: eglSwapBuffers takes the GL commands I issued and passes them to the GPU, which handles them asynchronously. Then, if the GPU has not yet finished drawing the previous list of GL commands the next time I call eglSwapBuffers, eglSwapBuffers will wait until it is finished; this is why a large amount of time can be consumed by eglSwapBuffers.
Is that right?
Regards,
#17
Posted 06 September 2012 - 04:14 PM
The eglSwapBuffers wait time varies - it depends on the level of N-buffering supported by the window system. We only need to wait if we are running ahead of the windowing system and so do not have a buffer to render to (you don't want to run hundreds of frames ahead of the hardware, latency is bad and you just use a lot of memory, so most windowing systems rate limit the driver stack so it is only a few frames ahead of what is on screen); that rate limiting is what causes us to wait in eglSwapBuffers in most cases.
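The rate limiting described above can be sketched as a small simulation in plain C. This is a toy model under stated assumptions, not the behaviour of any particular window system: the function name is hypothetical, one buffer is assumed to be on screen at all times, the display is assumed to consume one queued frame per vsync, and all times are whole milliseconds.

```c
#include <assert.h>

/*
 * Total time the application spends blocked in the swap call over `frames`
 * frames, when each frame takes `render_ms` to submit, the display consumes
 * one frame every `vsync_ms`, and the window system owns `buffer_count` buffers.
 */
long simulate_swap_wait(int render_ms, int vsync_ms, int buffer_count, int frames) {
    assert(render_ms > 0 && vsync_ms > 0 && buffer_count >= 2 && frames >= 0);
    long now = 0;
    long waited = 0;
    for (int k = 0; k < frames; k++) {
        now += render_ms; /* application work for frame k */
        /* Frame k can only be queued once this many earlier frames were displayed. */
        long frames_needed_on_screen = (long)k - (buffer_count - 2);
        if (frames_needed_on_screen > 0) {
            long earliest = frames_needed_on_screen * vsync_ms;
            if (now < earliest) {
                waited += earliest - now; /* blocked: no free buffer yet */
                now = earliest;
            }
        }
    }
    return waited;
}
```

In this model an application that needs 5 ms per frame on a 16 ms display with three buffers settles into waiting 11 ms in every swap, while one that needs 20 ms per frame never waits, because it never gets ahead of the display.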