This is the final part of an optimization that I've been refactoring towards for a while.
If the Dockerfile contains no RUN commands, or only cached RUN commands, followed by metadata-only
operations, we can skip downloading and unpacking the base image.
When building Docker images, layers were previously stored in memory.
This caused obvious issues when manipulating large layers, which could
cause Kaniko to crash.
* Enable overwriting of links (solves #351)
* add integration test to check extraction of images with replaced hardlinks
* Prevent following symlinks during extracting normal files
This fixes #359, #361, #362.
To add layer caching to kaniko, I added two flags: --cache and
--use-cache.
If --use-cache is set, then the cache will be used, and if --cache is
specified then that repo will be used to store cached layers. If --cache
isn't set, a cache will be inferred from the destination provided.
Currently, caching only works for RUN commands. Before executing the
command, kaniko checks if the cached layer exists. If it does, it pulls
it and extracts it. It then adds those files to the snapshotter and
appends a layer to the config history. If the cached layer does not exist, kaniko executes the command and
pushes the newly created layer to the cache.
All cached layers are tagged with a stable key, which is built
from:
1. The base image digest
2. The current state of the filesystem
3. The current command being run
4. The current config file (to account for metadata changes)
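A minimal sketch of how such a key could be derived from the four inputs above. The helper name `cacheKey`, its string parameters, and the choice of SHA-256 are assumptions made for illustration; this is not kaniko's actual implementation.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
)

// cacheKey is a hypothetical helper. It hashes the base image digest, a
// fingerprint of the filesystem state, the command, and the serialized
// config into a single hex string that can be used as a tag.
func cacheKey(baseDigest, fsState, command, config string) string {
	h := sha256.New()
	for _, part := range []string{baseDigest, fsState, command, config} {
		h.Write([]byte(part))
		// A separator byte keeps ("ab", "c") distinct from ("a", "bc").
		h.Write([]byte{0})
	}
	return hex.EncodeToString(h.Sum(nil))
}
```

Identical inputs always produce the same key, and changing any one input (for example the command) produces a different key, so the cache lookup misses.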
I also added two integration tests to make sure caching works:
1. Dockerfile_test_cache runs 'date', which should be exactly the same
the second time the image is built
2. Dockerfile_test_cache_install makes sure apt-get install can be
reproduced
While looking into #345, we were seeing the error:
Error: error building image: chmod /etc/mtab: operation not permitted
during extraction of `amazonlinux:1`. I looked into why kaniko couldn't
extract this file properly, and found that it already existed as a
symlink pointing to /proc/mounts, which returned an error when we tried
to run chmod on it.
Confusingly, in the image the /etc/mtab is a regular file, not a
symlink.
I can think of two ways to solve this problem:
1. Whitelist /etc/mtab so that whatever already exists in the system
is used
2. Check if a regular file already exists, and hasn't been extracted yet,
before extracting
I went with option 1 because for option 2 we'd have to keep a list of
all files that had been extracted in memory.
The bug in #329 was caused by matchSources comparing a filepath that
wasn't absolute, so the source "/kaniko-bug/*" wasn't being
matched to the file "kaniko-bug/test-file".
To fix this, I added logic for making filepaths absolute and added to
the unit test for the function to test that it works.
Extracting the layers of the filesystem in order will make it easier to
extract cached layers and deal with hardlinks.
This PR implements extracting in order and adds an integration test to
make sure hardlinks are extracted properly.
It also fixes two bugs I found when extracting symlinks:
1. We'd get a "file exists" error when trying to symlink to an existing
file with a whiteout later in the layer tarball
2. We'd get a "file exists" error when trying to create a symlink from a
file that was created in a prior layer (perhaps as a regular file or as
a symlink pointing to something else)
To fix both of these, we resolve all symlinks in a layer at the end. I
also added logic to delete any existing paths before creating the
symlink.
I changed UnpackLocalTarArchive to return a list of files that were
extracted, so that the list of snapshotted files for ADD is more
accurate. Previously, we used to add all files in the extracted dir to
be snapshotted, but this could result in preexisting files being
snapshotted again.
Before #289 was merged, when copying over directories for COPY, kaniko
would get a list of all files at the destination specified and add them
to the list of files to be snapshotted. If the destination was root it
would add all files. This worked because the snapshotter made sure the
file had been changed before adding it to the layer.
After #289, we changed the logic to add all files snapshotted to a layer
without checking if the files had been changed. This created the bug:
kaniko got all the files at root and added them to the layer without
checking if they had been changed.
This change should fix this bug. Now, the CopyDir function returns a
list of files it copied over and only those files are added to the list
of files to be snapshotted.
Should fix #314
Issue 291 pointed out that symlink "../proc/self/mounts" in the fedora image wasn't being extracted properly and kaniko was erroring out.
This is because the file path wasn't absolute so kaniko wasn't recognizing it as a whitelisted path.
With this change, we first resolve a path to its absolute path before checking the whitelist.
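A small sketch of that check. The whitelist entries, the helper names, and the choice to resolve a relative link target against the directory of the link are all assumptions for illustration.

```go
package main

import (
	"path/filepath"
	"strings"
)

// whitelist holds illustrative entries only; the real list is not shown here.
var whitelist = []string{"/proc", "/sys"}

// resolveLinkTarget is a hypothetical helper that turns a possibly relative
// symlink target into an absolute path, relative to the link's directory.
func resolveLinkTarget(linkPath, target string) string {
	if filepath.IsAbs(target) {
		return filepath.Clean(target)
	}
	return filepath.Join(filepath.Dir(linkPath), target)
}

// isWhitelisted reports whether p is a whitelisted path or lies beneath one.
func isWhitelisted(p string) bool {
	for _, w := range whitelist {
		if p == w || strings.HasPrefix(p, w+"/") {
			return true
		}
	}
	return false
}
```

For a link at `/etc/mtab` with target `../proc/self/mounts`, the resolved path is `/proc/self/mounts`, which the prefix check recognizes; the raw relative string would not be recognized.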
* Vendor changes for go-containerregistry switch.
* Manual changes for go-containerregistry switch.
The biggest change is refactoring the tarball unpacking.
* Pull more of container-diff out.
* More vendor removals.
* More unit tests.
* adding VOLUME command
* proper test project
* general fixes
* fixing project name
* fixing volume unit test
* fixing integration test
* adding tests
* adding util test
* fixing test
* actually create the volume mounted directory
* fix test