Parallel Sequential Scan •

© 2013 EDB All rights reserved.

Robert Haas, Amit Kapila | PGCon 2015 1

Overall Status •





PostgreSQL 9.4 includes the basic facilities that will be used to implement parallel query (dynamic background workers, dynamic shared memory, shared memory message queues). PostgreSQL 9.5 includes most of the plumbing needed for parallel computation (error propagation, parallel mode/contexts). Working patches exist for parallel sequential scan, but were not committed to PostgreSQL 9.5 due to unresolved issues.

© 2015 EDB All rights reserved.

2

New in 9.5: Message Propagation •





Background workers can talk to user backends using the frontend-backend protocol. Protocol messages are sent via a shared memory message queue (shm_mq). In particular, if the background worker does something that generates an ERROR, WARNING, or other message, it can send that message to the master, and the master can receive it.

© 2015 EDB All rights reserved.

3

New in 9.5: Parallel Mode/Contexts •









Using parallel contexts, backend code can launch worker processes. Various pieces of state are synchronized from the parallel group leader to each worker (more on that in a minute). Neither the master nor the workers are permitted to make permanent modifications to any of the synchronized state while parallelism is active. No writes are allowed. Lots of backend code can run just fine in a parallel worker!

© 2015 EDB All rights reserved.

4

What Gets Synchronized? •

Libraries dynamically loaded by PostgreSQL.



Authenticated user ID and current database.



All GUC values.



XID for current and top level transactions.



XIDs that appear as committed.



Combo CID mappings.



Active and transaction snapshots.



Current user ID and security context.

© 2015 EDB All rights reserved.

5

Patches for 9.6 (1 of 2) •



Heavyweight Lock Handling for Parallel Mode/Contexts – Must prevent unprincipled deadlocks between parallel workers. Assessing Parallel Safety – Every function in pg_proc is labelled to indicate whether it can be used in parallel mode. Most can! –

Query planner is modified to search the query tree for unsafe functions, or any operation that writes data.

© 2015 EDB All rights reserved.

6

Patches for 9.6 (2 of 2) •

Parallel Seq Scan – General Executor Support For Parallelism –

New Executor Nodes: Funnel, Partial Seq Scan



Might get split into several smaller patches

© 2015 EDB All rights reserved.

7

Parallel Seq Scan - New Nodes •



Funnel – Has one child, runs multiple copies in parallel. –

Combines the results into a single tuple stream.



Can run the child itself if no workers available.

Partial Seq Scan – Scans part of a relation sequentially. –

Specifically, the part not scanned by any other copy of the same partial seq scan.

© 2015 EDB All rights reserved.

8

Parallel Seq Scan – Example Plan



Funnel Number of Workers: 4 -> Partial Seq Scan on tbl_parallel



Each worker will scan part of the tbl_parallel; together, they will scan the whole thing.

© 2015 EDB All rights reserved.

9

Parallel Seq Scan – Information Sharing •

To perform parallel scan master and worker backend needs to share some information – Planned Stmt which needs to be executed by each worker –

Bind Parameters



PARAMS_EXEC parameters (Execution time params required for evaluation of subselects)



Tuple Queues, to send tuples from worker to master backend



Instrumentation information required by Explain or other stats required by external utilities like pg_stat_statements

© 2015 EDB All rights reserved.

10

Parallel Seq Scan – Tuning Parameters •





parallel_degree - Maximum number of parallel workers that can be allocated to a particular parallel operation cpu_tuple_comm_cost - Cost of CPU time to pass a tuple from worker to master backend. parallel_setup_cost - Cost of setting up shared memory for parallelism, and launching workers.

© 2015 EDB All rights reserved.

11

Parallel Workers •





Parallel workers are launched at the start of funnel node execution Parallel workers will be stopped – As soon as last tuple is retrieved –

During rescan



At end of execution

Parallel workers will execute Partial Seq Scan node and produce tuples which are sent back to master backend

© 2015 EDB All rights reserved.

12

Parallel Workers – Work Allocation •





Two different strategies have been considered to allocate work for backend workers – Block-By-Block and Fixed Chunks Performance measurements didn't show much difference between the approaches, at least on the machines we tested. Preferred Block-By-Block, as that will allow work to be distributed dynamically based on the work finished by individual worker.

© 2015 EDB All rights reserved.

13

Performance Data Common non-default settings shared_buffers=8GB; min_wal_size=5GB; max_wal_size=10GB checkpoint_timeout =30min; max_connections=300; max_worker_processes=100;

Test setup create table tbl_perf(c1 int, c2 char(1000)); insert into tbl_perf values(generate_series(1,30000000),'aaaaa'); Explain analyze select c1 from tbl_perf where c1 > calc_factorial($1,10) and c2 like '%aa%'; teststatement[ntests]="Explain analyze select c1 from tbl_perf where c1 > calc_factorial(22500000,10) and c2 like '%aa%'; " let "ntests++"

Script used to take data is attached. © 2015 EDB All rights reserved.

14

create_calc_factorial="create or replace function calc_factorial(a integer, fact_val integer) returns integer as \$\$ begin

Performance Data Explain analyze select c1 from tbl_perf where c1 > calc_factorial($1,10) and c2 like '%aa%';

median of 3 runs 300000

Time in millisec

250000 200000

1% Rows Qualified 10% Rows Qualified 25% Rows Qualified

150000 100000 50000 0 0

2

4

8

16

32

64

Degree of parallelism



With increase in degree of parallelism (more parallel workers), the time to complete the execution reduces.



Along with workers, master backend also participates in execution due to which you can see more time reduction in some cases.



After certain point, increasing number of workers won't help. © 2015 EDB All rights reserved.

15

Future Work

Wondering why 2 new nodes (Funnel and PartialSeqScan) have been added?

© 2015 EDB All rights reserved.

16

Future Work – Join Pushdown (1 of 2) •





Example for Join Evaluation Nested Loop -> Seq Scan on foo -> Index Scan on bar Index Cond: bar.x = foo.x Now, if a parallel sequential scan is cheaper than a regular sequential scan, we can instead do this: Nested Loop -> Funnel -> Partial Seq Scan on foo -> Index Scan on bar Index Cond: bar.x = foo.x The problem with this is that the nested loop/index scan is happening entirely in the master.

© 2015 EDB All rights reserved.

17

Future Work – Join Pushdown (2 of 2) •



We can transform the plan to Funnel -> Nested Loop -> Partial Seq Scan on foo -> Index Scan on bar Index Cond: bar.x = foo.x The will allow the workers to execute the nested loop/index scan in parallel; we merge the results afterwards.

© 2015 EDB All rights reserved.

18

Future Work – Aggregate Pushdown •



We can push the Aggregates below the Funnel HashAggregate -> Funnel -> Partial Seq Scan on foo Filter: x = 1 Assuming we have infrastructure to push the HashAggregates, we can convert it to HashAggregateFinish -> Funnel -> HashAggregatePartial -> Partial Seq Scan on foo Filter: x = 1

© 2015 EDB All rights reserved.

19

Thanks. •

Any questions?

© 2015 EDB All rights reserved.

20

Parallel Seq Scan

PARAMS_EXEC parameters (Execution time params required for evaluation of subselects). – Tuple Queues, to send tuples from worker to master backend.

193KB Sizes 0 Downloads 276 Views

Recommend Documents

Workflow Demo for Ribo-Seq and polyRibo-Seq - GitHub
Jun 22, 2016 - most important difference among the two is that polyRibo-Seq utilizes polyribosomal .... results are written to a PDF file named fastqReport.png.

Scan Inter
Jul 11, 2017 - commercial operations in 4Q 17-1Q 18 (Figure 3), up 68% from the current capacity. ..... Telephone 852.2878.6888 Facsim ile 852.2878.6800.

SEQ Brochure w testimonials.pdf
Page 1 of 1. “The SEQ is an effective way to deepen your EQ practice. We are by nature social beings and seek connection. SEQ. helped me explore some of the big questions of how I can fully embrace myself and others using emotional. intelligence.â€

Cheap Automatic Rotating Scan Free Scan 3D Scanner .pdf ...
Cheap Automatic Rotating Scan Free Scan 3D Scanner .pdf. Cheap Automatic Rotating Scan Free Scan 3D Scanner .pdf. Open. Extract. Open with. Sign In.

scan electromechanical.pdf
Page 1 of 13. Page 1 of 13. Page 2 of 13. Page 2 of 13. Page 3 of 13. Page 3 of 13. Page 4 of 13. Page 4 of 13. scan electromechanical.pdf. scan electromechanical.pdf. Open. Extract. Open with. Sign In. Details. Comments. General Info. Type. Dimensio

Scan Inter - SETTRADE.COM
Jul 11, 2017 - addition, the company is waiting to acquire one NG V-station license with .... other businesses such as auto dealership, solar farm, and C O 2 ...

Scan 123220001.jpg - Groups
Model RBW -80 Fold-Back Surface Units, Figure 1, are assembled in a 241 wide housing and consist of one (1) 1450 watt unit and one (1) 2050 watt unit ...

Scan 33.pdf
Sign in. Page. 1. /. 3. Loading… Page 1 of 3. Page 1 of 3. Page 2 of 3. Page 2 of 3. Page 3 of 3. Page 3 of 3. Scan 33.pdf. Scan 33.pdf. Open. Extract. Open with.

CT scan - iScan Imaging Centre
Fastest tube rotation speed of 0.33 s with 64 slices per rotation. • High-matrix Multiplanar images ... and coronary calcium burden measurement. • Extremely fast ...

Google Drive - Virus scan warning
Google Drive can't scan this file for viruses. BMMA - Seeking Justice Final.pdf (34M) exceeds the maximum size that Google can scan. Would you still like to ...

Google Drive - Virus scan warning
Google Drive can't scan this file for viruses. We are experiencing technical difficulties. Would you still like to download this file? Mateo-Finn.pdf (319k). Download ...

Pdf scan split
Microsoft office volume. ... propaganda, on the other hand C pdfscan split definitely not propaganda has it is taken in 1988,and unlikea pictureit ... Pdf scan split.

scan 10001.pdf
Sign in. Page. 1. /. 2. Loading… Page 1 of 2. Page 1 of 2. Page 2 of 2. Page 2 of 2. scan 10001.pdf. scan 10001.pdf. Open. Extract. Open with. Sign In.

Google Drive - Virus scan warning
Account Options. Sign in · Settings · Help. Google Drive can't scan this file for viruses. We are experiencing technical difficulties. Would you still like to download this file? LaurResMeHeaProm.pdf (4.0M). Download anyway ...

scan of flower.pdf
Whoops! There was a problem loading this page. Retrying... Page 3 of 8. скачать в конце поста. Апро установку драйверов читайтездесь или скачайте SAMSUNG_Mobile_USB_Driver_1_5_9_0. Вот

Google Drive - Virus scan warning
IPM Strategies for New Warm-Season Cultivars seminar CGCSA 2014.pdf (36M) exceeds the maximum size that Google can scan. Would you still like to ...