XBMC - Algorithm to handle the generated dirty regions

Since a single union of the dirty regions might not be enough, I spent the day
trying to create a good algorithm, and since I didn't find much info on
the internet I decided to blog about it. If anyone has done this
before or is interested, here is the code and the blog post :)

http://xbmc.org/topfs2/2010/06/30/what-to-do-when-you-have-the-dirty-regions/

The greedyTest tests it, but the code is a bit ugly since I tried a lot of
different stuff. The only really useful method is the CostReduction one
in greedySDL.

greedySDL.cpp (6.74 KB)

greedyTest.cpp (4.26 KB)

Geometry.h (3.7 KB)

Makefile (204 Bytes)

Tobias,

Sorry I was a bit slow on this, been busy.

Optimisation problems are `a little more complicated than that', but
you often don't really need them either. My initial thought was not
to do unions unless the regions intersect, or if they are close
together/abut. Your heuristic looks reasonable, although I'm not sure
your examples really reflect the type of rendering you're doing either
(so spread out with such an amount of untouched space). At 720p even
a small screen area is quite a lot of pixels to fill, and if for the
most part you're just rendering quads it may not be worth getting too
smart at all (e.g. cost of queueing up more quad renders vs filling
pixels - your examples still have a lot of unneeded fill pixels).
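
The "only union regions that intersect or sit close together" idea above can be sketched roughly as follows. This is a minimal illustration, not code from greedySDL.cpp or XBMC: the `Rect` type, the `gap` threshold, and the names `NearOrIntersecting` and `MergeNearby` are all assumptions made for the example.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical rectangle type; (x2, y2) is the exclusive lower-right corner.
struct Rect {
    int x1, y1, x2, y2;
};

// True if a and b overlap, abut, or lie within `gap` pixels of each other
// on both axes. With gap == 0 only overlapping or touching rects qualify.
bool NearOrIntersecting(const Rect& a, const Rect& b, int gap) {
    return a.x1 <= b.x2 + gap && b.x1 <= a.x2 + gap &&
           a.y1 <= b.y2 + gap && b.y1 <= a.y2 + gap;
}

// Bounding box of two rectangles.
Rect Union(const Rect& a, const Rect& b) {
    return {std::min(a.x1, b.x1), std::min(a.y1, b.y1),
            std::max(a.x2, b.x2), std::max(a.y2, b.y2)};
}

// Keep merging until no two remaining rects are near each other.
// A merge can bring the grown rect close to others, so the scan restarts.
std::vector<Rect> MergeNearby(std::vector<Rect> rects, int gap) {
    bool merged = true;
    while (merged) {
        merged = false;
        for (std::size_t i = 0; i < rects.size() && !merged; ++i) {
            for (std::size_t j = i + 1; j < rects.size(); ++j) {
                if (NearOrIntersecting(rects[i], rects[j], gap)) {
                    rects[i] = Union(rects[i], rects[j]);
                    rects.erase(rects.begin() + static_cast<std::ptrdiff_t>(j));
                    merged = true;
                    break;
                }
            }
        }
    }
    return rects;
}
```

Regions that are far apart stay separate, so no untouched space between them gets filled; how large `gap` should be would have to be measured on the target hardware.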

BTW have you yet managed to try turning off the video plane rendering
(not creation) to see how that affects the framerate?

!Z

On Sat, 2010-07-03 at 10:22 +0930, Michael Zucchi wrote:

> Tobias,
>
> Sorry I was a bit slow on this, been busy.
>
> Optimisation problems are `a little more complicated than that', but
> you often don't really need them either. My initial thought was not
> to do unions unless the regions intersect, or if they are close
> together/abut. Your heuristic looks reasonable, although I'm not sure
> your examples really reflect the type of rendering you're doing either
> (so spread out with such an amount of untouched space). At 720p even
> a small screen area is quite a lot of pixels to fill, and if for the
> most part you're just rendering quads it may not be worth getting too
> smart at all (e.g. cost of queueing up more quad renders vs filling
> pixels - your examples still have a lot of unneeded fill pixels).

Well, the algorithm uses both fixed costs and area costs, so if we don't
want many quads to be rendered we set a higher fixed cost relative to the
area cost, and vice versa. In my examples I have rated the fixed costs very
high, which is why you see it generating so few regions. I rated them
that high because, as it is now, we need to render every control even if it's
outside the given region (this should be very simple to optimize, though),
so I would assume the generated regions would look quite a bit different
once the costs are more accurate. Still, I won't dispute that my
examples are a bit misleading, and I doubt controls would be scattered
in this randomized fashion.
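
To make the fixed cost vs. area cost trade-off concrete, here is a rough sketch of a greedy merge driven by such a cost model. It is not the CostReduction method from greedySDL.cpp: the `CostModel` struct, the `GreedyMerge` function, and the rule of always taking the single most profitable merge are assumptions made for illustration, and overlapping area is simply counted twice.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical rectangle type; (x2, y2) is the exclusive lower-right corner.
struct Rect {
    int x1, y1, x2, y2;
};

long Area(const Rect& r) {
    return static_cast<long>(r.x2 - r.x1) * (r.y2 - r.y1);
}

// Bounding box of two rectangles.
Rect Union(const Rect& a, const Rect& b) {
    return {std::min(a.x1, b.x1), std::min(a.y1, b.y1),
            std::max(a.x2, b.x2), std::max(a.y2, b.y2)};
}

// Assumed cost model: each region pays a fixed cost plus a per-pixel cost.
struct CostModel {
    double fixedCost;  // cost of issuing one more region
    double areaCost;   // cost of filling one pixel
};

double Cost(const Rect& r, const CostModel& m) {
    return m.fixedCost + m.areaCost * static_cast<double>(Area(r));
}

// Repeatedly merge the pair whose union saves the most total cost,
// and stop once no merge saves anything.
std::vector<Rect> GreedyMerge(std::vector<Rect> rects, const CostModel& m) {
    for (;;) {
        double bestSaving = 0.0;
        std::size_t bestI = 0, bestJ = 0;
        for (std::size_t i = 0; i < rects.size(); ++i) {
            for (std::size_t j = i + 1; j < rects.size(); ++j) {
                double saving = Cost(rects[i], m) + Cost(rects[j], m) -
                                Cost(Union(rects[i], rects[j]), m);
                if (saving > bestSaving) {
                    bestSaving = saving;
                    bestI = i;
                    bestJ = j;
                }
            }
        }
        if (bestSaving <= 0.0) break;  // no profitable merge left
        rects[bestI] = Union(rects[bestI], rects[bestJ]);
        rects.erase(rects.begin() + static_cast<std::ptrdiff_t>(bestJ));
    }
    return rects;
}
```

With a high `fixedCost` almost every merge pays off and few large regions remain; with a low one, distant regions stay separate because the extra filled area outweighs the saved fixed cost.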

That being said, I must admit this has mostly been an experiment to see
if there exists a simple and smart algorithm to create multiple quads
that would yield better performance. I am still a bit unsure whether this
algorithm is the one we would want. I think your idea of only taking the
union of the ones that intersect is interesting as well; I
need to do some tests on that one too!

> BTW have you yet managed to try turning off the video plane rendering
> (not creation) to see how that affects the framerate?

This week I have focused almost entirely on getting the dirty region stuff
working properly, with all that has entailed. I have read through
the code, but not much more than that, unfortunately.

I will try commenting it out tonight and see what numbers I come up
with.

Cheers,
Tobias

On Sat, 2010-07-03 at 10:22 +0930, Michael Zucchi wrote:

> Tobias,
>
> Sorry I was a bit slow on this, been busy.
>
> Optimisation problems are `a little more complicated than that', but
> you often don't really need them either. My initial thought was not
> to do unions unless the regions intersect, or if they are close
> together/abut. Your heuristic looks reasonable, although I'm not sure
> your examples really reflect the type of rendering you're doing either
> (so spread out with such an amount of untouched space). At 720p even
> a small screen area is quite a lot of pixels to fill, and if for the
> most part you're just rendering quads it may not be worth getting too
> smart at all (e.g. cost of queueing up more quad renders vs filling
> pixels - your examples still have a lot of unneeded fill pixels).
>
> BTW have you yet managed to try turning off the video plane rendering
> (not creation) to see how that affects the framerate?

To bring an update on this after a long week of oprofiling and trying to
make stuff work on the beagle :)

I tried commenting out everything regarding upload and rendering but
didn't see much of a difference. However, I must have done something
wrong, because if I just remove the yuv2rgb part of the shader (it still
uploads and presents via SGX), 480p bunny is almost watchable.

So omap overlays seem to be the path to follow. Given the patches
I've committed this week, CPU usage has gone from 100% to 10-20% (no video
playback), and with 480p most of the time is still spent idling, so I have
high hopes for omap overlays!

Some binaries for people to try: http://www.angstrom-distribution.org/~koen/xbmc-20100710.tar.bz2

Make sure you have the dependencies listed at http://www.angstrom-distribution.org/repo/?pkgname=xbmc installed. You can opt to install the ipk, but that will give you the trunk version, not the gsoc version.

regards,

Koen