AI & TECHNOLOGY

Accelerating synchronous process pipelines with multiprocessing & batching

Discover how parallelism and batching turn slow Python workflows into scalable, high-performance pipelines.

Colombe

March 3, 2026

7 min

read

When your automation workflows start to scale, latency becomes the bottleneck. Traditional synchronous process pipelines like:‍

step_one → step_two → step_three → ... → step_n

‍

often suffer from two key delays:

Record-level latency: Each record waits for the previous one to finish.

Step-level latency: Each step must wait for the prior step to complete.

A simple implementation looks like this:

‍

all_data = get_all_data()
results = []

for item in all_data:	
    a = step_one(item)
    b = step_two(a)
    c = step_three(b)
    results.append(c)

‍

Processing millions of records sequentially like this leads to huge wait times, even if each step is fast.

Why single-process pipelines hurt performance

Python’s GIL: Only one thread executes at a time, limiting CPU-bound threading.
Sequential dependencies: Downstream steps can’t start until upstream steps finish on every record.
Memory pressure: Large datasets consume more RAM as they scale.

‍

Multiprocessing: parallelism by process

Using multiprocessing, each process runs its own Python interpreter, bypassing the GIL:

POOL = create_pool(num_processes)

results = POOL.map(
    lambda x: step_three(step_two(step_one(x))),
    all_data
)

POOL.close()

‍

Pros:

Real parallelism on multi-core CPUs
Minimal code changes for pure Python functions

Cons:

Each process runs the entire pipeline, so slow steps still throttle throughput
Higher per-process memory usage

‍

Pipeline parallelism: an assembly line for your code

Think of your pipeline as a factory line. Each step runs in its own process and passes data via queues. This means:

While Step 1 processes item k+1, Step 2 can finish item k.
Steps run concurrently but maintain order.

Time →
elem k      [Step1]──►[Step2]──►[Step3]
elem k + 1          [Step1]──►[Step2]
elem k + 2                 [Step1]

‍

Queued stage-by-stage pipelines with Python queues

You can implement this with multiprocessing queues:

END = object()

def worker(step_fn, in_q, out_q):
    while True:
        item = in_q.get()
        if item == END:
            out_q.put(END)
            break
        out_q.put(step_fn(item))

# Setup queues and processes for each stage
Q0, Q1, Q2, Q3 = Queue(), Queue(), Queue(), Queue()

P1 = spawn_process(worker, args=(stage_one, Q0, Q1))
P2 = spawn_process(worker, args=(stage_two, Q1, Q2))
P3 = spawn_process(worker, args=(stage_three, Q2, Q3))
P4 = spawn_process(worker, args=(stage_four, Q3, output_sink))

# Feed data
for item in data_source:
    Q0.put(item)
Q0.put(END)

# Drain results
def output_sink():
    out_q = Queue()
    spawn_process(worker, args=(stage_four, Q3, out_q))
    while True:
        result = out_q.get()
        if result == END:
            break
        handle_output(result)

‍

Benefits:

True concurrency between steps
Isolation between heavy CPU or network steps
Back-pressure controls memory usage

‍

Batch processing for massive speedups

Single-record processing wastes overhead. Instead, process data in batches:

BATCH_SIZE = 1000

def chunked(iterable, n):
    batch = []
    for element in iterable:
        batch.append(element)
        if len(batch) == n:
            yield batch
            batch = []
    if batch:
        yield batch

def step_one_batch(batch): return [step_one(x) for x in batch]
# similarly for other steps

# Setup workers like before, but with batch functions

‍

Batching leverages vectorized libraries like Pandas or NumPy, dramatically increasing throughput while reducing queue overhead.

‍

Benchmark: dramatic performance gains

+--------------------------------------+--------------------+------------+
| Technique                            | End to End runtime | CPU        |
|                                      |                    | Utilisation|
+--------------------------------------+--------------------+------------+
| Single-process loop                  | 4 h 17 m           | 15 %       |
| multiprocessing.Pool (8x)            | 52 m               | 85 %       |
| Queued 4-stage pipeline (8x)         | 27 m               | 90 %       |
| Batching + pipeline (8x)             | 11 m               | 95 %       |
+--------------------------------------+--------------------+------------+

‍

Takeaway: Combining pipeline parallelism with batching cuts runtime ~23× and maximizes CPU usage.

‍

Final tips for practitioners

Multiprocessing alone isn’t enough, use queues to avoid slow steps blocking the whole pipeline.
Batch aggressively to leverage vectorized operations.
Use bounded queues or Pub/Sub to control back-pressure and avoid memory spikes.
This approach keeps your pipeline synchronous but fast, perfect for scaling data jobs from hours to minutes.

Adopt this pattern, and your “synchronous” pipeline will feel almost asynchronous, without sacrificing ordering guarantees. Practical guidance on orchestrating with Airflow, Astro, and Pub/Sub is coming soon.

‍

Colombe

Senior Data & AI Scientist

Explore More Articles

Accelerating synchronous process pipelines with multiprocessing & batching

Heading 1

Heading 2

Heading 3

Heading 4

Heading 5

Heading 6

Why single-process pipelines hurt performance

Multiprocessing: parallelism by process

Pros:

Cons:

Pipeline parallelism: an assembly line for your code

Queued stage-by-stage pipelines with Python queues

Benefits:

Batch processing for massive speedups

Benchmark: dramatic performance gains

Final tips for practitioners

LET'S

TALK

CONTACT

US

Accelerating synchronous process pipelines with multiprocessing & batching

Heading 1

Heading 2

Heading 3

Heading 4

Heading 5

Heading 6

Why single-process pipelines hurt performance

Multiprocessing: parallelism by process

Pros:

Cons:

Pipeline parallelism: an assembly line for your code

Queued stage-by-stage pipelines with Python queues

Benefits:

Batch processing for massive speedups

Benchmark: dramatic performance gains

Final tips for practitioners

Sign up to receive insights that matter

LET'S

TALK

CONTACT

US

LET'S TALK

LET'S TALK