Transparent Squid in a container

This is a trivial Dockerfile to build a proxy container. It will use the famous Squid proxy, configured to work in transparent mode.

Why?

If you build a lot of containers, and have a not-so-fast internet link, you might be spending a lot of time waiting for packages to download. It would be nice if all those downloads could be automatically cached, without tweaking your Dockerfiles, right?

Or maybe your corporate network forbids direct outside access and requires you to use a proxy. In that case you can edit this recipe so that it cascades to the corporate proxy. Your containers will use the transparent proxy, which will itself pass requests along to the corporate proxy.

How?

You can use the Squid proxy directly with docker and iptables rules. For convenience, there is also a docker-compose.yml file, so that you can launch the system with the docker-compose up command. For more information on tuning parameters, see below.

Using Docker and iptables directly

You can run these commands manually:

docker run --net host -d jpetazzo/squid-in-a-can
iptables -t nat -A PREROUTING -p tcp --dport 80 -j REDIRECT --to 3129 -w

After you stop the container, you will need to clean up the iptables rule:

iptables -t nat -D PREROUTING -p tcp --dport 80 -j REDIRECT --to 3129 -w
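If you add and remove the rule often, it can help to wrap the two commands in small functions. The following is a minimal sketch, not part of this repository: it uses the standard iptables -C (check) option so that the rule is appended only when it is not already present. The function names and the IPTABLES override variable are made up for this example; the rule itself is copied from the commands above.

```shell
# Rule specification, copied verbatim from the commands above.
RULE="PREROUTING -p tcp --dport 80 -j REDIRECT --to 3129 -w"

# IPTABLES is a hypothetical override hook (handy for dry runs and tests).
IPTABLES="${IPTABLES:-iptables}"

add_redirect() {
    # -C succeeds when the rule already exists; append only when it does not.
    # $RULE is intentionally unquoted so that it splits into separate arguments.
    $IPTABLES -t nat -C $RULE 2>/dev/null || $IPTABLES -t nat -A $RULE
}

remove_redirect() {
    # Delete one copy of the rule (same command as the manual cleanup).
    $IPTABLES -t nat -D $RULE
}
```

With this, calling add_redirect twice should leave a single copy of the rule, so one remove_redirect is enough to clean up. Both functions need root privileges, like the plain iptables commands.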

Using Compose

There is a docker-compose.yml file that launches the proxy with Docker Compose, along with a separate container that sets up the iptables rules for you. To use it, you need a local checkout of this repo, and you need to have docker and docker-compose installed.

Run the following command in the same directory as the docker-compose.yml file:

docker-compose up

Result

That's it. Now all HTTP requests going through your Docker host will be transparently routed through the proxy running in the container.

If your tproxy instance goes down hard without cleaning up, use the following command to remove the iptables rule:

iptables -t nat -D PREROUTING -p tcp --dport 80 -j REDIRECT --to 3129 -w

Note: it will only affect HTTP traffic on port 80.

Note: traffic originating from the host will not be affected, because the PREROUTING chain is not traversed by packets originating from the host.

Note: if your Docker host is also a router for other things (e.g. if it runs various virtual machines, or is a VPN server, etc), those things will also see their HTTP traffic routed through the proxy. They have to use internal IP addresses, though.

Note: if you plan to run this on EC2 (or any kind of infrastructure where the machine has an internal IP address), you should probably tweak the ACLs, or make sure that outside machines cannot access ports 3128 and 3129 on your host.

Note: the proxy will also be available on port 3128 of your local machine, if you would like to set up local proxies yourself.
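One way to check that container traffic really goes through the proxy is to look at the response headers. Squid normally adds an X-Cache header of the form "HIT from <host>" or "MISS from <host>". The helper below is a small sketch that extracts that verdict from headers read on standard input. The function name cache_status is made up for this example, and it assumes that the Squid configuration in use has not disabled the header.

```shell
# Print the X-Cache verdict (HIT, MISS, ...) found in HTTP response headers
# read from stdin, or NONE when no X-Cache header is present.
cache_status() {
    tr -d '\r' | awk '
        tolower($1) == "x-cache:" { print $2; found = 1; exit }
        END { if (!found) print "NONE" }
    '
}

# Example usage (not run here). The curlimages/curl image and the URL are
# assumptions; any container that has curl installed would do:
#   docker run --rm curlimages/curl -s -o /dev/null -D - http://example.com/ | cache_status
```

Running the usage line twice should typically show a MISS and then a HIT, provided the response is cacheable. A result of NONE suggests that the request did not go through Squid.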

What?

The jpetazzo/squid-in-a-can container runs a really basic Squid3 proxy. Rather than writing my own configuration file, I patch the default Debian configuration. The main change is to enable intercept mode on another port (here, 3129). The command that updates the iptables rules for the intercept needs the --privileged flag.

Then, this container should be started using the network namespace of the host (that's what the --net host option is for). Another strategy would be to start the container with its own namespace and direct the HTTP traffic to it with a DNAT rule. The problem with this approach is that Squid will "see" the traffic as being directed to its own IP address, instead of the IP address of the destination HTTP server; and since Squid 3.3, it refuses to honor such requests.

(The reasoning is that it would then have to trust the HTTP Host: header to know where to send the request. You can check CVE-2009-0801 for details.)

Tuning

The docker image can be tuned using environment variables.

MAX_CACHE_OBJECT

Squid has a maximum size for cached objects. When caching Debian packages rather than standard web content, it is often valuable to increase this size. Use -e MAX_CACHE_OBJECT=1024 to set the maximum object size (in MB).

DISK_CACHE_SIZE

The Squid disk cache size can be tuned. Use -e DISK_CACHE_SIZE=5000 to set the disk cache size (in MB).
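As an illustration, both size settings can be passed in a single docker run command. The sketch below wraps the command from the How? section in a function. The function name run_squid and the DOCKER override variable are made up for this example; the flags, variable names, and default values are the ones shown in this document.

```shell
# Start the proxy with a custom maximum object size and disk cache size.
# $1: maximum cached object size in MB (defaults to 1024)
# $2: disk cache size in MB (defaults to 5000)
run_squid() {
    max_object_mb="${1:-1024}"
    disk_cache_mb="${2:-5000}"
    # DOCKER is a hypothetical override hook; DOCKER=echo prints the arguments.
    ${DOCKER:-docker} run --net host -d \
        -e MAX_CACHE_OBJECT="$max_object_mb" \
        -e DISK_CACHE_SIZE="$disk_cache_mb" \
        jpetazzo/squid-in-a-can
}
```

For example, run_squid 2048 10000 would allow objects of up to 2048 MB in a 10000 MB disk cache.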

SQUID_DIRECTIVES_ONLY

When SQUID_DIRECTIVES_ONLY is set, squid.conf will contain only what is defined in SQUID_DIRECTIVES, giving the user full control of Squid.

SQUID_DIRECTIVES

This appends the contents of the environment variable to squid.conf. You are expected to use a multi-line quoted string for the contents.

Here is an example:

docker run -d \
    -e SQUID_DIRECTIVES="
    # hi ho hi ho
    # we're doing block I/O
    # hi ho hi ho
    " jpetazzo/squid-in-a-can
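SQUID_DIRECTIVES could also carry real directives, for instance to cascade to a corporate proxy as mentioned in the Why? section. The sketch below generates two standard Squid directives, cache_peer and never_direct. It is an untested assumption that these are sufficient with this image, and the function name and the proxy host and port in the usage line are placeholders.

```shell
# Print Squid directives that send all requests to an upstream (parent) proxy.
# $1: upstream proxy host, $2: upstream proxy port (both are placeholders
# supplied by the caller; nothing here comes from the image itself).
upstream_directives() {
    printf 'cache_peer %s parent %s 0 no-query default\n' "$1" "$2"
    printf 'never_direct allow all\n'
}

# Example usage (not run here; proxy.example.com:3128 is hypothetical):
#   docker run --net host -d \
#       -e SQUID_DIRECTIVES="$(upstream_directives proxy.example.com 3128)" \
#       jpetazzo/squid-in-a-can
```

Generating the block with a function avoids quoting mistakes in the multi-line string, and makes it easy to inspect the directives before starting the container.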

Persistent Cache

Because this is Docker, the cached content goes away when the instance exits. To avoid this, you can use a mounted volume. The cache location is /var/cache/squid3, so if you mount that as a volume you get persistent caching. Add -v /home/user/persistent_squid_cache:/var/cache/squid3 to your command line to enable it.

If you do that, make sure that the persistent_squid_cache directory is writable by the right user. As I write these lines, the squid process runs as user and group proxy, whose UID and GID are both 13; so make sure that the directory is writable by UID 13, or by GID 13, or (if you really can't do otherwise) world-writable (but please don't).
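Preparing the directory could look like the sketch below, which creates it and hands it over to UID and GID 13. The function name prepare_cache_dir and the CHOWN override variable are made up for this example, and the numeric IDs are the ones quoted above, which may differ in other versions of the image.

```shell
# Create a host directory for the Squid cache and give it to UID/GID 13.
# $1: path of the directory to mount on /var/cache/squid3
prepare_cache_dir() {
    dir="$1"
    mkdir -p "$dir" || return 1
    # chown to another user normally requires root.
    # CHOWN is a hypothetical override hook for dry runs.
    ${CHOWN:-chown} 13:13 "$dir"
}

# Example usage (not run here):
#   sudo sh -c '. ./prepare.sh; prepare_cache_dir /home/user/persistent_squid_cache'
```

The file name prepare.sh in the usage line is only a placeholder for wherever you keep the function.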

Note that if you're using Docker for Mac, all volume I/O is handled by the Docker for Mac application, which runs as an ordinary process; so you won't have to deal with permissions as long as you have read/write access to the volume.

Notes

Ideas for improvement:

  • easy chaining to an upstream proxy

HTTPS support

It has been asked whether this could support HTTPS. HTTPS is designed to prevent man-in-the-middle attacks, and a transparent proxy is effectively a MITM. If you want to use Squid for transparent HTTPS proxying, you need to set up a private CA certificate and push it to all your users so that they trust the proxy. An example of how to set this up can be found here.

Without a CA certificate configured, the default behavior is to tunnel HTTPS traffic using the CONNECT method. Squid makes the request on behalf of the client but cannot decrypt or cache the requests or responses.
