
Gunicorn: Complete Definition and Guide

6 min read · Updated 03 Apr 2026

Definition

Gunicorn (Green Unicorn) is a pre-fork Python WSGI application server, designed to serve Django and Flask web applications in production with performance and reliability.

What is Gunicorn?

Gunicorn, short for "Green Unicorn", is a WSGI (Web Server Gateway Interface) application server for Python. Created in 2010 and inspired by Ruby's Unicorn server, it serves as the bridge between the web server (Nginx) and your Python WSGI application (Django, Flask). ASGI frameworks such as FastAPI are not WSGI applications and need an ASGI server such as Uvicorn. Gunicorn's role is to receive HTTP requests forwarded by the reverse proxy, have them processed by your Python code, and return the responses.

WSGI is the standard specification defining how a web server communicates with a Python application. When developing with Django, the built-in development server (manage.py runserver) is well suited to local development, but it is not designed for production: it has not been through security audits or performance testing, and it offers none of the reliability guarantees (process supervision, graceful restarts) that production requires. Gunicorn replaces this development server in production.

Gunicorn's pre-fork architecture means a master process creates multiple worker processes in advance, each capable of handling a request independently. This model offers an excellent balance between performance and simplicity. Unlike more complex asynchronous servers, Gunicorn is extremely simple to configure and maintain while offering performance more than sufficient for the majority of web applications.

Why Gunicorn Matters

Gunicorn is the indispensable link between your web infrastructure (Nginx, Linux) and your application code (Django). Its importance lies in several factors critical for production.

  • Production reliability: Gunicorn has been battle-tested in production since 2010. It correctly handles UNIX signals, graceful worker restarts, and child process supervision. If a worker crashes due to an application bug, the master automatically restarts it without impacting requests handled by other workers.
  • Concurrency management: with its multiple workers, Gunicorn can process several requests simultaneously. The number of workers is configurable and adjusts based on server resources (rule of thumb: 2 * number of CPUs + 1).
  • Configuration simplicity: Gunicorn launches with a single command and a few parameters. This simplicity reduces configuration error risks and facilitates debugging.
  • Universal compatibility: any Python framework implementing WSGI works with Gunicorn: Django, Flask, Pyramid, Falcon. It is a choice that does not lock you into any framework.
  • Natural integration with Nginx: Gunicorn and Nginx form a proven tandem. Nginx handles what it does best (static files, SSL, compression) while Gunicorn focuses on executing Python code.
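
The worker rule of thumb from the list above can be written as a small helper. This is a minimal sketch: the function name recommended_workers is hypothetical, and the result is only a starting point to be adjusted against real load.

```python
import os


def recommended_workers(cpu_count=None):
    """Return the 2 * CPUs + 1 starting point for the worker count."""
    # os.cpu_count() can return None on some platforms, so fall back to 1.
    cores = cpu_count if cpu_count is not None else (os.cpu_count() or 1)
    return 2 * cores + 1


if __name__ == "__main__":
    print(recommended_workers(2))  # 2-core server
    print(recommended_workers())   # this machine
```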

How It Works

At startup, Gunicorn's master process loads the configuration, binds to a socket (Unix or TCP), then creates the configured number of workers via fork(). Each worker is an independent process that inherits the master's socket and enters a request listening loop.
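
The startup sequence described above can be illustrated with a toy pre-fork server. This is a simplified sketch, not Gunicorn's implementation: the name prefork_demo is hypothetical, each worker serves a single connection and exits (real workers loop), and it only runs on POSIX systems where os.fork() is available.

```python
import os
import socket


def prefork_demo(num_workers: int = 3) -> list[int]:
    """Fork workers that share one listening socket; return the pid that served each request."""
    # The master binds once; port 0 asks the OS for any free port.
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(num_workers)
    port = listener.getsockname()[1]

    worker_pids = []
    for _ in range(num_workers):
        pid = os.fork()
        if pid == 0:
            # Worker process: the listening socket was inherited across fork().
            try:
                conn, _addr = listener.accept()
                conn.sendall(str(os.getpid()).encode("ascii"))
                conn.close()
            finally:
                os._exit(0)  # never fall through into the master's code
        worker_pids.append(pid)

    # Act as the client: each connection is picked up by one idle worker.
    served_by = []
    for _ in range(num_workers):
        with socket.create_connection(("127.0.0.1", port), timeout=5) as client:
            data = b""
            while chunk := client.recv(64):
                data += chunk
            served_by.append(int(data.decode("ascii")))

    # Reap the children, as a master process would.
    for pid in worker_pids:
        os.waitpid(pid, 0)
    listener.close()
    return served_by


if __name__ == "__main__":
    print(prefork_demo())
```

Because every worker accepts on the same inherited socket, the kernel hands each incoming connection to one waiting worker, with no dispatching logic in the master.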

When an HTTP request arrives (forwarded by Nginx), an available worker takes it on. It translates the raw HTTP request into a WSGI environ dictionary — a standardised set of variables describing the request — then calls your Django application's WSGI callable. Django processes the request through its middlewares, views, and templates, then returns a response object. Gunicorn translates this response to HTTP and sends it back to Nginx.
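
To make the environ dictionary and the WSGI callable concrete, here is a minimal sketch using only the standard library. The application function stands in for Django's WSGI callable, and call_like_a_server is a hypothetical helper that plays the role of the worker; neither is Gunicorn code.

```python
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    """A minimal WSGI callable: reads the environ, returns an iterable of bytes."""
    body = f"{environ['REQUEST_METHOD']} {environ['PATH_INFO']}\n".encode("utf-8")
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]


def call_like_a_server(app, path="/"):
    """Build an environ, call the app, and collect the status and body."""
    environ = {}
    setup_testing_defaults(environ)  # fills in REQUEST_METHOD, SERVER_NAME, wsgi.* keys
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, response_headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = response_headers

    chunks = app(environ, start_response)
    return captured["status"], b"".join(chunks)


if __name__ == "__main__":
    print(call_like_a_server(application, "/hello"))
```

In a Django project the callable is the application object exposed in wsgi.py; the server side of the exchange is what Gunicorn's workers implement.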

The master process continuously monitors worker status. If a worker stays silent for longer than the configured timeout (30 seconds by default), the master kills it and creates a new one. If a worker crashes, it is automatically replaced. This supervision mechanism ensures system resilience: with sync workers, even if an application bug causes a crash, only the request that worker was handling is affected and the service remains available.
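
The timeout check can be modelled in a few lines. This is a simplified model under stated assumptions, not Gunicorn's actual code: the function name find_timed_out_workers and the dictionary of heartbeat timestamps are hypothetical stand-ins for the master's bookkeeping.

```python
import time


def find_timed_out_workers(last_heartbeat, timeout=30.0, now=None):
    """Return the pids of workers silent for longer than `timeout` seconds.

    `last_heartbeat` maps a worker pid to the monotonic time of its last sign of life.
    """
    now = time.monotonic() if now is None else now
    return [pid for pid, seen in last_heartbeat.items() if now - seen > timeout]


if __name__ == "__main__":
    heartbeats = {101: 0.0, 102: 25.0}
    # At t=40s, worker 101 has been silent for 40s and worker 102 for 15s.
    print(find_timed_out_workers(heartbeats, timeout=30.0, now=40.0))
```

A master loop would kill each returned pid and fork a replacement, which is why a view that legitimately runs longer than the timeout needs a higher setting.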

Gunicorn offers several worker types. "Sync" workers (default) process one request at a time synchronously — the simplest and most suitable choice for standard Django applications. "Gthread" workers add multi-threading to handle more connections per worker. "Gevent" or "eventlet" workers use green threads for asynchronous I/O handling, useful for applications with heavy network waiting.
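
The trade-off between worker types can be sketched as a rough upper bound on requests in flight. The setting names (worker_class, threads, worker_connections) are real Gunicorn settings; the WORKER_PROFILES table, its values, and the max_in_flight helper are illustrative assumptions, and real capacity depends on what the application does while handling a request.

```python
# Illustrative settings per worker type; the values are examples, not recommendations.
WORKER_PROFILES = {
    "sync": {"worker_class": "sync"},
    "gthread": {"worker_class": "gthread", "threads": 4},
    "gevent": {"worker_class": "gevent", "worker_connections": 1000},
}


def max_in_flight(workers, profile):
    """Rough upper bound on simultaneous requests for a given worker profile."""
    settings = WORKER_PROFILES[profile]
    # sync: one request per worker; gthread: one per thread;
    # gevent: up to worker_connections green threads per worker.
    per_worker = settings.get("threads") or settings.get("worker_connections") or 1
    return workers * per_worker


if __name__ == "__main__":
    for name in WORKER_PROFILES:
        print(name, max_in_flight(5, name))
```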

Concrete Example

At Kern-IT, Gunicorn is at the core of our Django deployment stack. For this Wagtail site, our typical Gunicorn configuration uses sync workers with a 30-second timeout, binds to a Unix socket for communication with Nginx, and sets the number of workers based on server resources.

Our Fabric deployment process illustrates how Gunicorn fits into the chain. During deployment, Fabric connects to the Linux server via SSH, updates the code via git pull, installs dependencies in the pyenv environment, runs Django migrations, then reloads Gunicorn via Supervisor with a graceful signal (SIGHUP). This signal tells the master to reload its configuration, start new workers, and gracefully shut down the old ones once they finish their current requests. Unless the application is preloaded (preload_app), the new workers load the new code, which gives a zero-downtime deployment.
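
The deployment steps above might look roughly like this with Fabric 2 or later. This is a sketch under assumptions, not Kern-IT's actual fabfile: the project path, the Supervisor program name gunicorn, the requirements.txt file, and the function names are all hypothetical, and it assumes the remote shell already has the right pyenv environment active.

```python
def build_deploy_commands(project_dir, program="gunicorn"):
    """Return the remote shell commands for one deployment, in order."""
    in_project = f"cd {project_dir} && "
    return [
        in_project + "git pull",
        in_project + "pip install -r requirements.txt",
        in_project + "python manage.py migrate --noinput",
        # Ask Supervisor to send SIGHUP to the Gunicorn master (graceful reload).
        f"supervisorctl signal HUP {program}",
    ]


def deploy(host, project_dir):
    """Run the deployment over SSH; requires the third-party fabric package."""
    from fabric import Connection  # imported lazily so the helper above has no dependency

    with Connection(host) as conn:
        for command in build_deploy_commands(project_dir):
            conn.run(command)  # raises on a non-zero exit status, stopping the deployment


if __name__ == "__main__":
    for line in build_deploy_commands("/srv/app"):
        print(line)
```

Keeping the command list separate from the SSH call makes the sequence easy to review and test without a server.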

Implementation

  1. Install Gunicorn: add gunicorn to the project dependencies (pip install gunicorn) and verify your Django application properly exposes a WSGI callable (in wsgi.py).
  2. Configure the number of workers: apply the formula 2 * CPU + 1 as a starting point. For a 2-core server, 5 workers. Then adjust based on actual load metrics.
  3. Choose the bind: use a Unix socket (--bind unix:/run/gunicorn.sock) rather than a TCP port for Nginx communication on the same server. It avoids TCP overhead, and access can be restricted with file permissions instead of exposing a network port.
  4. Configure the timeout: 30 seconds by default is a good starting point. Increase if your application has legitimately long-running views (PDF generation, exports).
  5. Supervise with Supervisor: create a Supervisor configuration file so Gunicorn automatically restarts in case of a crash or server reboot.
  6. Configure logging: direct access and error logs to dedicated files to facilitate debugging and monitoring.
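
Gunicorn configuration files are plain Python, so the steps above can be collected into one file. The following gunicorn.conf.py is a minimal sketch: the setting names are Gunicorn's own, but the log paths are assumptions to adapt to your server, and the directories must exist and be writable by the Gunicorn user.

```python
# gunicorn.conf.py
import multiprocessing

# Step 3: Unix socket shared with Nginx on the same server.
bind = "unix:/run/gunicorn.sock"

# Step 2: 2 * CPU + 1 as a starting point; adjust based on load metrics.
workers = multiprocessing.cpu_count() * 2 + 1
worker_class = "sync"

# Step 4: seconds of silence before the master kills and replaces a worker.
timeout = 30

# Step 6: dedicated log files (hypothetical paths).
accesslog = "/var/log/gunicorn/access.log"
errorlog = "/var/log/gunicorn/error.log"
loglevel = "info"
```

It would be loaded with gunicorn -c gunicorn.conf.py myproject.wsgi:application, where myproject is a placeholder for your Django project package. Supervisor (step 5) would then run that same command as its managed program.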

Associated Technologies and Tools

  • Nginx: reverse proxy that works in tandem with Gunicorn to serve Django applications.
  • Django: the most popular Python web framework, for which Gunicorn is the reference application server.
  • Supervisor: process manager to keep Gunicorn running permanently.
  • Fabric: SSH deployment tool that automates graceful Gunicorn restarts.
  • uWSGI: alternative to Gunicorn, richer in features but more complex to configure.
  • Uvicorn: ASGI server for asynchronous Python applications (FastAPI, Django Channels).

Conclusion

Gunicorn is the reference WSGI application server for deploying Django applications in production. Its configuration simplicity, proven reliability, and natural integration with Nginx make it the obvious choice for the majority of Python web projects. At Kern-IT, the Nginx/Gunicorn tandem is the deployment standard for all our Django and Wagtail applications. This combination, orchestrated by Supervisor and deployed via Fabric on Linux servers, constitutes a proven, performant, and easy-to-maintain production stack.

Pro Tip

Use the 2 * CPU + 1 formula for worker count as a starting point, but monitor RAM usage. Each Django worker loads the complete application in memory — if your application consumes 200MB per worker, 9 workers on a 2GB RAM server may exhaust memory.
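
One way to apply this tip is to cap the CPU-based count by available memory. This is a sketch with assumed numbers: the function name memory_capped_workers is hypothetical, and the 512 MB reserved for the operating system and other services is an arbitrary illustration, not a measured value.

```python
def memory_capped_workers(cpu_count, ram_mb, worker_mb, reserve_mb=512):
    """Return the worker count allowed by both the CPU formula and a RAM budget."""
    by_cpu = 2 * cpu_count + 1
    # Keep some memory back for the OS, Nginx, and the database client, then divide.
    by_ram = max(1, (ram_mb - reserve_mb) // worker_mb)
    return min(by_cpu, by_ram)


if __name__ == "__main__":
    # 4 CPUs suggest 9 workers, but 2 GB of RAM at 200 MB per worker allows fewer.
    print(memory_capped_workers(cpu_count=4, ram_mb=2048, worker_mb=200))
```

Measure the real per-worker footprint on your server before choosing worker_mb, since it varies with the application and its dependencies.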
