A 2.5D CULLING FOR FORWARD+
Takahiro Harada, AMD

AGENDA
• Forward+
  – Forward, Deferred, Forward+
  – Problem description
• 2.5D culling
• Results

2 | A 2.5D culling for Forward+ | Takahiro Harada

FORWARD+

REAL-TIME SOLUTION COMPARISON
• Rendering equation
• Forward
• Deferred
• Forward+

FORWARD RENDERING PIPELINE
• Depth prepass
  – Fills z buffer
    • Prevents overdraw for shading
• Shading
  – Geometry is rendered
  – Pixel shader
    • Iterates through the light list set for each object
    • Evaluates materials for the lights

FORWARD+ RENDERING PIPELINE
• Depth prepass
  – Fills z buffer
    • Prevents overdraw for shading
    • Used for pixel position reconstruction for light culling
• Light culling
  – Culls lights on a per-tile basis
  – Input: z buffer, light buffer
  – Output: light list per tile
• Shading
  – Geometry is rendered
  – Pixel shader
    • Iterates through the light list calculated in light culling
    • Evaluates materials for the lights

[Figure: lights 1, 2, 3 over screen tiles, with per-tile light lists [1], [1,2,3], [2,3]]

CREATING A FRUSTUM FOR A TILE
• An edge in screen space (SS) == a plane in view space (VS)
• A tile (4 edges) in SS == 4 planes in VS
  – Open frustum (no bound in the Z direction)
• Max and min Z are used to cap the frustum
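The construction of a tile frustum can be sketched in a few lines of Python. This is a minimal sketch and not the presentation's code. It assumes a symmetric perspective projection, a view space that looks down +Z with the eye at the origin, tile bounds given in normalized device coordinates, and lights bounded by spheres. The names `tile_side_planes` and `sphere_overlaps_frustum` are hypothetical.

```python
import math


def tile_side_planes(x0, x1, y0, y1, fov_y, aspect):
    """Inward-facing unit normals of the four side planes of a tile frustum.

    Assumptions (not from the slides): tile bounds are in normalized device
    coordinates in [-1, 1], the projection is symmetric, and view space
    looks down +Z with the eye at the origin, so every plane passes
    through the origin and is fully described by its normal.
    """
    tan_y = math.tan(fov_y / 2.0)
    tan_x = tan_y * aspect
    left, right = x0 * tan_x, x1 * tan_x    # tile edges on the z = 1 plane
    bottom, top = y0 * tan_y, y1 * tan_y
    raw = [
        (1.0, 0.0, -left),    # left edge:   x >= left * z
        (-1.0, 0.0, right),   # right edge:  x <= right * z
        (0.0, 1.0, -bottom),  # bottom edge: y >= bottom * z
        (0.0, -1.0, top),     # top edge:    y <= top * z
    ]
    planes = []
    for nx, ny, nz in raw:
        inv_len = 1.0 / math.sqrt(nx * nx + ny * ny + nz * nz)
        planes.append((nx * inv_len, ny * inv_len, nz * inv_len))
    return planes


def sphere_overlaps_frustum(center, radius, planes, z_min, z_max):
    """Conservative test of a light's bounding sphere against a capped frustum."""
    cx, cy, cz = center
    for nx, ny, nz in planes:
        # Signed distance below -radius: the sphere is fully outside this plane.
        if nx * cx + ny * cy + nz * cz < -radius:
            return False
    # Cap the open frustum with the tile's min and max depth.
    return cz + radius >= z_min and cz - radius <= z_max


planes = tile_side_planes(-0.1, 0.1, -0.1, 0.1, math.radians(90.0), 1.0)
print(sphere_overlaps_frustum((0.0, 0.0, 5.0), 1.0, planes, 1.0, 10.0))
print(sphere_overlaps_frustum((5.0, 0.0, 5.0), 1.0, planes, 1.0, 10.0))
```

The first light sits on the view axis inside the depth range and is kept. The second is far to the right of the tile and is rejected by the right side plane. The sphere-versus-plane test is conservative, so it can still report lights that only come near a frustum corner.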

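LONG FRUSTUM
• Screen-space culling is not always sufficient
  – A frustum is created from the max and min depth values
  – At the edges of objects, a tile contains both near and far depths, so the frustum becomes long
  – A long frustum captures a lot of unnecessary lights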

[Figure: light count per tile; color scale 0, 25, 50 lights]

GETS WORSE IN A COMPLEX SCENE

[Figure: light count per tile; color scale 0, 100, 200 lights]

QUESTION
• Want to reduce false positives
• Can we improve the culling without adding much overhead?
  – Computation time, memory
  – Culling itself is an optimization
  – Spending a lot of resources on it does not make sense
• Using a 3D grid is a natural extension
  – Uses too much memory

2.5D CULLING

2.5D CULLING
• Additional memory usage
  – 0 B global memory
  – 4 B local memory per work group (WG) (can compress more if you want)
• Additional computation complexity
  – A few bit and arithmetic instructions
  – A few lines of code for light culling
  – No changes for other stages
• Additional runtime overhead
  – < 10% compared to the original light culling

IDEA
• Split frustum in z direction
  – Uniform split for a frustum
  – Varying split among frustums

[Figure: panels (a) and (b), frustums split in the z direction]

FRUSTUM CONSTRUCTION
• Calculate depth bound
  – Max and min values of depth
• Split depth direction into 32 cells
  – Min value and cell size
• Flag occupied cells
• A 32-bit depth mask per work group
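The depth mask construction can be illustrated with a short Python sketch. It is a serial stand-in under stated assumptions, not the presentation's GPU code: `cell_index`, `tile_depth_mask`, and `slide_order` are hypothetical names, and bit i of the integer mask is taken to represent cell i, with cell 0 nearest the camera. The slides print masks with cell 0 leftmost, so `slide_order` reverses the usual binary formatting.

```python
NUM_CELLS = 32  # one bit per cell, so the mask fits in a 32-bit integer


def cell_index(z, z_min, z_max, num_cells=NUM_CELLS):
    """Index of the depth cell containing z, clamped to [0, num_cells - 1]."""
    span = z_max - z_min
    if span <= 0.0:
        return 0  # flat tile: everything falls in the first cell
    index = int((z - z_min) / span * num_cells)
    return min(max(index, 0), num_cells - 1)


def tile_depth_mask(pixel_depths, num_cells=NUM_CELLS):
    """Return (z_min, z_max, mask); bit i of mask is set if cell i is occupied."""
    z_min, z_max = min(pixel_depths), max(pixel_depths)
    mask = 0
    for z in pixel_depths:
        mask |= 1 << cell_index(z, z_min, z_max, num_cells)
    return z_min, z_max, mask


def slide_order(mask, num_cells):
    """Format a mask with cell 0 leftmost, the way the slides print it."""
    return format(mask, "0{}b".format(num_cells))[::-1]


# An 8-cell example: the pixels fall in cells 0, 1, 2 and 7.
z_min, z_max, mask = tile_depth_mask([0.0, 1.5, 2.5, 8.0], num_cells=8)
print(slide_order(mask, 8))  # prints 11100001
```

On a GPU, the loop over pixels would presumably be replaced by each thread of the work group OR-ing its own bit into a single shared value, which is consistent with the 4 B of local memory per work group quoted in the slides.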

[Figure: a tile whose pixels are labeled with depth cell indices; the figure uses 8 cells (0 to 7), of which cells 0, 1, 2, and 7 are occupied]

Depth mask = 11100001

LIGHT CULLING
• If a light overlaps the frustum
  – Calculate a depth mask for the light
  – Check overlap using the depth mask of the frustum
• Frustum depth mask & light depth mask
  – 11100001 & 00011000 = 00000000 (no overlap: the light is culled)
  – 11100001 & 00110000 = 00100000 (overlap: the light is kept)
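The mask test for a single light can be sketched as follows. This is a minimal Python sketch, assuming the light is described by its near and far view-space depths (for a sphere, center depth minus and plus radius) and that bit i of a mask stands for cell i. The function names are hypothetical. The slides print masks with cell 0 leftmost, so the binary literals below read mirrored relative to the slide strings.

```python
def cell_index(z, z_min, z_max, num_cells):
    """Index of the depth cell containing z, clamped to the valid range."""
    span = z_max - z_min
    if span <= 0.0:
        return 0
    return min(max(int((z - z_min) / span * num_cells), 0), num_cells - 1)


def light_depth_mask(light_near, light_far, z_min, z_max, num_cells=32):
    """Bits for every cell touched by the light's depth extent."""
    if light_far < z_min or light_near > z_max:
        return 0  # entirely outside the tile's depth bound
    first = cell_index(light_near, z_min, z_max, num_cells)
    last = cell_index(light_far, z_min, z_max, num_cells)
    # A run of (last - first + 1) ones, shifted up to start at bit 'first'.
    return ((1 << (last - first + 1)) - 1) << first


def light_overlaps_tile(tile_mask, light_mask):
    """A light is kept only if it shares at least one occupied cell."""
    return (tile_mask & light_mask) != 0


# 8-cell example: the tile occupies cells 0, 1, 2 and 7 ("11100001" on the slides).
tile_mask = 0b10000111
gap_light = light_depth_mask(3.2, 4.8, 0.0, 8.0, num_cells=8)    # cells 3 and 4
edge_light = light_depth_mask(2.2, 3.8, 0.0, 8.0, num_cells=8)   # cells 2 and 3
print(light_overlaps_tile(tile_mask, gap_light))
print(light_overlaps_tile(tile_mask, edge_light))
```

The first light falls entirely in the empty cells between foreground and background, so the AND is zero and it is rejected. The second touches occupied cell 2 and is kept.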

CODE
• Original
• With 2.5D culling
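As a rough illustration of how the original culling loop and the 2.5D variant differ, the Python sketch below culls lights for one tile both ways. It is not the code shown in the presentation. It assumes lights are given as (center depth, radius) pairs that have already passed the four side-plane tests, and `cull_tile` is a hypothetical name.

```python
def cell_index(z, z_min, z_max, num_cells=32):
    """Index of the depth cell containing z, clamped to the valid range."""
    span = z_max - z_min
    if span <= 0.0:
        return 0
    return min(max(int((z - z_min) / span * num_cells), 0), num_cells - 1)


def cull_tile(pixel_depths, lights, use_depth_mask, num_cells=32):
    """Return the indices of the lights kept for one tile.

    lights: (view-space depth of the center, radius) pairs, assumed to have
    already passed the side-plane tests of the tile frustum.
    """
    z_min, z_max = min(pixel_depths), max(pixel_depths)
    tile_mask = 0
    if use_depth_mask:
        for z in pixel_depths:
            tile_mask |= 1 << cell_index(z, z_min, z_max, num_cells)
    kept = []
    for index, (center_z, radius) in enumerate(lights):
        near, far = center_z - radius, center_z + radius
        if far < z_min or near > z_max:
            continue  # original test: outside the min/max depth cap
        if use_depth_mask:
            first = cell_index(near, z_min, z_max, num_cells)
            last = cell_index(far, z_min, z_max, num_cells)
            light_mask = ((1 << (last - first + 1)) - 1) << first
            if (tile_mask & light_mask) == 0:
                continue  # 2.5D test: the light lies in an empty depth gap
        kept.append(index)
    return kept


# A tile on an object edge: foreground near depth 1, background near depth 31.
depths = [1.0, 1.1, 1.2, 30.0, 31.0, 32.0]
lights = [(1.1, 0.5), (15.0, 2.0), (31.0, 1.0), (50.0, 1.0)]
print(cull_tile(depths, lights, use_depth_mask=False))  # prints [0, 1, 2]
print(cull_tile(depths, lights, use_depth_mask=True))   # prints [0, 2]
```

Light 1 floats in the empty space between foreground and background. The min/max cap alone keeps it as a false positive, while the extra mask test removes it at the cost of a few bit operations per light.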

RESULTS

LIGHT CULLING

LIGHT CULLING + 2.5D CULLING

COMPARISON

[Chart: x axis: Number of lights (x10), 1 to 23; y axis: Number of tiles, logarithmic ticks from 1 to 10000; series: With 2.5D culling, Without 2.5D culling]

220 lights/frustum -> 120 lights/frustum

LIGHT CULLING

LIGHT CULLING + 2.5D CULLING

COMPARISON

[Chart: x axis: Number of lights (x10), 1 to 17; y axis: Number of tiles, logarithmic ticks from 1 to 10000; series: With 2.5D culling, Without 2.5D culling]

PERFORMANCE

[Chart: x axis: Number of lights (1024, 2048, 3072, 4096); y axis: Time (ms), 0 to 6; series: Forward+ w. frustum culling, Forward+ w. 2.5D, Deferred]

CONCLUSION
• Proposed 2.5D culling, which has
  – Additional memory usage
    • 0 B global memory
    • 4 B local memory per work group (WG) (can compress more if you want)
  – Additional compute complexity
    • 3 lines of pseudocode for light culling
    • No changes for other stages
  – Additional runtime overhead
    • < 10% compared to the original light culling
• Showed that 2.5D culling reduces false positives
