@mpasternacki
Last active March 23, 2020 14:05
Perl script that builds a Docker image from a Dockerfile as a single layer, without any intermediate images. Needs the JSON CPAN module (available as the `libjson-perl` package on Debian/Ubuntu). Usage: run the script in the directory that contains the Dockerfile. More details in a blog post: http://3ofcoins.net/2013/09/22/flat-docker-images/
#!/usr/bin/env perl
use feature 'switch';
use strict;
use warnings;
use Data::Dumper;
use File::Basename;
use File::Copy;
use File::Path qw/make_path/;
use File::Temp qw/tempdir/;
use JSON;
our ( $from, $author, %metadata, @commands, $tmpdir, $tmpcount, $prefix );
$tmpdir = tempdir(CLEANUP => !$ENV{LEAVE_TMPDIR});
$tmpcount = 0;
$prefix = '';
print "*** Working directory: $tmpdir\n" if $ENV{LEAVE_TMPDIR};
open DOCKERFILE, '<Dockerfile' or die;
while ( <DOCKERFILE> ) {
    chomp;
    # handle long lines
    $_ = "$prefix$_";
    $prefix = '';
    if ( /\\$/ ) {
        s/\\$//;
        $prefix="$_\n";
        next;
    }
    s/^\s*//;
    /^#/ and next;
    /^$/ and next;
    my ($cmd, $args) = split(/\s+/, $_, 2);
    given ( uc $cmd ) {
        # building the image
        when ('FROM') { $from = $args }
        when ('RUN') { push @commands, $args }
        when ('ADD') {
            $tmpcount++;
            my ( $src, $dest ) = split ' ', $args, 2;
            if ( $src =~ /^https?:/ ) {
                my $basename = basename($src);
                my $target = "$tmpdir/dl/$tmpcount/$basename";
                make_path "$tmpdir/dl/$tmpcount";
                system('wget', '-O', $target, $src) == 0 or die;
                $src = $target;
            }
            my $local = "$tmpdir/$tmpcount";
            given ( $src ) {
                when ( /\.(tar(\.(gz|bz2|xz))?|tgz)$/ ) {
                    mkdir $local;
                    system('tar', '-C', $local, '-xf', $_) == 0 or die;
                    push @commands, "mkdir -p '$dest'", "( cd /.data/$tmpcount ; cp -a . '$dest' )";
                }
                when ( -f $_ ) {
                    $dest .= basename($_) if ( $dest =~ /\/$/ );
                    system('cp', '-a', $_, $local) == 0 or die;
                    push @commands, "mkdir -p '".dirname($dest)."'", "cp -a /.data/$tmpcount '$dest'";
                }
                when ( -d $_ ) {
                    # Handle trailing slash combinations properly:
                    # - `$src=/dir, $dest=/foo -> /foo`
                    # - `$src=/dir, $dest=/foo/ -> /foo/dir`
                    # - `$src=/dir/, $dest=/foo -> /foo`
                    # - `$src=/dir/, $dest=/foo/ -> /foo`
                    $dest .= basename($_) if ( $_ !~ /\/$/ && $dest =~ /\/$/ );
                    system('cp', '-a', $_, $local) == 0 or die;
                    push @commands, "mkdir -p '$dest'", "( cd /.data/$tmpcount ; cp -a . '$dest' )";
                }
                default { die }
            }
        }
        # image metadata
        when ('MAINTAINER') { $author = $args }
        when ('CMD') { $metadata{Cmd} = eval { decode_json($args) } || ['sh', '-c', $args] }
        when ('ENTRYPOINT') { $metadata{Entrypoint} = eval { decode_json($args) } || ['sh', '-c', $args] }
        when ('WORKDIR') { $metadata{WorkingDir} = $args }
        when ('USER') { $metadata{User} = $args }
        when ('EXPOSE') { push @{ $metadata{PortSpecs} ||= [] }, split(' ',$args); }
        when ('ENV') {
            # split key and value on whitespace (\s+), not on the letter "s"
            my ( $k, $v ) = split(/\s+/, $args, 2);
            push @commands, "export $k='$v'";
            push @{ $metadata{Env} ||= [] }, "$k=$v";
        }
        when ('VOLUME') {
            # This seems to be a NOP in `docker build`.
            # push @{ $metadata{VolumesFrom} ||= [] }, $args
        }
    }
}
close DOCKERFILE;
open SETUP, ">$tmpdir/setup.sh" or die;
print SETUP join("\n", "#!/bin/sh", "set -e -x", @commands), "\ntouch /.data/FINI\n";
close SETUP;
chmod 0755, "$tmpdir/setup.sh";
our @run = ('docker', 'run', "-cidfile=$tmpdir/CID", '-v', "$tmpdir:/.data", $from, "/.data/setup.sh");
print "*** ", join(' ', @run), "\n";
system(@run) == 0 or die;
die "unfinished, not committing\n" unless -f "$tmpdir/FINI";
sleep 15; # docker container is not always immediately up to a commit, let's give it time to cool off.
open CID, "<$tmpdir/CID" or die;
our $cid = <CID>;
close CID;
our @commit = ( 'docker', 'commit' );
push @commit, "-author=$author" if defined $author;
push @commit, "-run=" . encode_json(\%metadata) if %metadata;
push @commit, $cid;
print "*** ", join(' ', @commit), "\n";
exec(@commit);
@atmoz

atmoz commented Sep 22, 2013

Good stuff! This should be featured in Docker, or otherwise solved in a way that makes it easy to use without wasting unnecessary disk space.

@thaJeztah

@mpasternacki I found some issues in your script:

  • ENV keys and values are not split properly (the pattern splits on the letter "s", not on whitespace).
  • Some docker arguments (cidfile, author, run) cause a deprecation warning because they are passed with a single dash.

I've created a modified version here: https://gist.github.com/thaJeztah/f5064ac9b285ba82b15c (I couldn't find an option to send a pull request for gists; not sure that's even possible) :)
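
To make the two points above concrete, here is a minimal sketch of both changes. It is not taken from either gist: the helper name `parse_env`, the sample ENV value, the `/tmp/example` directory and the `ubuntu:12.04` base image are placeholders chosen for illustration. Only the `--cidfile` spelling of the `docker run` option is shown; the other options would follow the same pattern.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the gist): split an ENV argument
# string into key and value on the first run of whitespace.
sub parse_env {
    my ($args) = @_;
    my ( $k, $v ) = split( /\s+/, $args, 2 );
    return ( $k, $v );
}

# With /s+/ the string is cut at the first literal "s" instead.
my @buggy = split( /s+/, 'PATH /usr/local/bin', 2 );
my @fixed = parse_env('PATH /usr/local/bin');

print "buggy: [$buggy[0]] [$buggy[1]]\n";    # buggy: [PATH /u] [r/local/bin]
print "fixed: [$fixed[0]] [$fixed[1]]\n";    # fixed: [PATH] [/usr/local/bin]

# Placeholder values, assumed for the example only.
my $tmpdir = '/tmp/example';
my $from   = 'ubuntu:12.04';

# Same argument list as in the script, with a double dash on the long option.
my @run = ( 'docker', 'run', "--cidfile=$tmpdir/CID",
            '-v', "$tmpdir:/.data", $from, '/.data/setup.sh' );
print join( ' ', @run ), "\n";
```

The block only builds and prints the argument list; it does not invoke docker.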

@grossws

grossws commented Oct 25, 2014

The VOLUME command in a Dockerfile is not a NOP, since docker writes it to the image metadata. When a container is started from the built image, the volumes declared there are mounted (and they persist as long as at least one container uses them).
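
As a rough sketch of how the script's VOLUME branch could record this, the handler below stores each path under a `Volumes` key, mirroring the path-to-empty-object map that Docker keeps in a container's configuration. The function name `add_volumes` is hypothetical, and whether the `docker commit` invocation used by the script honours this key is an assumption that has not been verified here. `JSON::PP` (bundled with Perl) stands in for the `JSON` module so the example is self-contained.

```perl
use strict;
use warnings;
use JSON::PP;

my %metadata;

# Hypothetical handler (not part of the gist): record VOLUME paths
# in the metadata hash that is later serialized for the commit.
sub add_volumes {
    my ( $metadata, $args ) = @_;

    # VOLUME accepts a JSON array (["/a", "/b"]) or space-separated paths.
    my $paths = eval { decode_json($args) };
    $paths = [ split ' ', $args ] unless ref($paths) eq 'ARRAY';

    # Assumed layout: each path maps to an empty object.
    $metadata->{Volumes}{$_} = {} for @$paths;
    return;
}

add_volumes( \%metadata, '["/data"]' );
add_volumes( \%metadata, '/var/log /var/db' );

# canonical() sorts keys so the output is stable between runs.
print JSON::PP->new->canonical->encode( \%metadata ), "\n";
```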
