I added a KanikoStage to hold each stage of the Dockerfile along with
information about each stage that would be useful later on.
The new KanikoStage type holds the stage itself, along with some
additional information:
1. FinalStage -- whether the current stage is the final stage
2. BaseImageStoredLocally/BaseImageIndex -- whether the base image for
this stage is stored locally, and if so what the index of the base image
is
3. SaveStage -- whether this stage needs to be saved for use in a future
stage
This is the first part of a larger refactor for building stages, which
will later make it easier to add layer caching.
I changed UnpackLocalTarArchive to return a list of files that were
extracted, so that the list of snapshotted files for ADD is more
accurate. Previously, we used to add all files in the extracted dir to
be snapshotted, but this could result in preexisting files being
snapshotted again.
Before #289 was merged, when copying over directories for COPY kaniko
would get a list of all files at the destination specified and add them
to the list of files to be snapshotted. If the destination was root it
would add all files. This worked because the snapshotter made sure the
file had been changed before adding it to the layer.
After #289, we changed the logic to add all files snapshotted to a layer
without checking if the files had been changed. This created the bug in
got all the files at root and added them to the layer without checking
if they had been changed.
This change should fix this bug. Now, the CopyDir function returns a
list of files it copied over and only those files are added to the list
of files to be snapshotted.
Should fix#314
To make the logic a bit more clear, when snapshotting files, the
parent dirs are now snapshotted in a different loop from the files we
are actually trying to snapshot. Unfortunately this loop is nearly
duplicated but I did managed to group some fo the related logic
together:
- A function to check if the file should be snapshotted (e.g. isn't
whitelisted, etc.)
- Created a `Tar` type to handle some of the logic around tar-ing, e.g.
tracking hardlinks and stat-ing files before adding them
One side effect of this is that now when snapshoting the file system,
files will be stat-ed twice.
This test had previously (before #231) been making a change to a file in
the kaniko dir, then checking that it isn't being snapshotted. This was
to test the whitelisting logic, which makes sure that changes to /kaniko
aren't included in images. However the test creates a temporary dir, so
the kaniko dir is actually in /tmp/<some temp dir>/kaniko, and
in #231 the logic was simplified to no longer have a special case for
tests. The test continued to pass because `MaybeAdd` noticed that the
kaniko file wasn't changing, and didn't add it. After changing this to
always add the files, it revealed that this was left behind by accident.
I also opened #307 to add integration test coverage for this logic.
I also marked `CheckErrorAndDeepEqual` as a helper function so that when
it fails, the line number reported is where that was called.
Although we were able to reproduce this with the previous behaviour of
the COPY and ADD commands, we have fixed that issue and our attempts to
cause the issue to occur with RUN did not succeed, so it may be that in
practice this will never happen.
Kaniko uses mtime (as well as file contents and other attributes) to
determine if files have changed. COPY and ADD commands should _always_
update the mtime, because they actually overwrite the files. However it
turns out that the mtime can lag, so kaniko would sometimes add a new
layer when using COPY or ADD on a file, and sometimes would not. This
leads to a non-deterministic number of layers.
To fix this, we have updated the kaniko commands to be more
authoritative in declaring when they have changed a file (e.g. WORKDIR
will now only create the directory when it doesn't exist) and we will
trust those files and _always_ add them, instead of only adding them if
they haven't changed.
It is possible for RUN commands to also change the filesystem, in which
case kaniko has no choice but to look at the filesystem to determine
what has changed. For this case we have added a call to `sync` however
we still cannot guarantee that sometimes the mtime will not lag, causing the
number of layers to be non-deterministic. However when I tried to cause
this behaviour with the RUN command, I couldn't.
This changes the snapshotting logic a bit; before this change, the last
command of the last stage in a Dockerfile would always scan the whole
file system and ignore the files returned by the kaniko command. Instead
we will now trust those files and assume that the snapshotting
performed by previous commands will be adequate.
Docker itself seems to rely on the storage driver to determine when
files have changed and so doesn't have to deal with these problems
directly.
An alternative implementation would use `inotify` to track which files
have changed. However that would mean watching every file in the
filesystem, and adding new watches as files are added. Not only is there
a limit on the number of files that can be watched, but according to the
man pages a) this can take a significant amount of time b) there is
complication around when events arrive (e.g. by the time they arrive,
the files may have changed) and lastly c) events can be lost, which
would mean we'd run into this non-deterministic behaviour again anyway.
Fixes#251
In this refactor I:
1. Created KanikoOptions to make it easier to pass around arguments
passed in through the command line
2. Reorganized executor.go by putting the logic for pushing the image in
a new file push.go
3. Made some error messages clearer
4. Fixed a mistake in the README for pushing to AWS
5. Marked the --bucket flag as hidden since we want people to use
--context instead, and marked an aws flag as hidden which is set in a
vendored directorya
This change should fix the bug in #294, where kaniko wasn't recognizing
that a stage would be used in a later build and so wasn't saving it as a
tarball.
Each stage of the Dockerfile has a Name and a BaseName (FROM BaseName as
Name), but if a Name isn't specified then it's set to the same value as
BaseName. Our test cases weren't complete enough to catch this
distinction, which is why this bug occurred.
I added more test cases to the unit tests to make sure this fix works.
Issue 291 pointed out that symlink "../proc/self/mounts" in the fedora image wasn't being extracted properly and kaniko was erroring out.
This is because the file path wasn't absolute so kaniko wasn't recognizing it as a whitelisted path.
With this change, we first resolve a path to it's absolute path before checking the whitelist.
The flag, `--no-push`, is added to allow building a container image
without pushing to a container registry. It can be common, especially
with multi-stage builds and `--target`, to build enough to run the tests,
and then perform a push in a separate CI step. This will facilitate these
workflows.
In #251 we are investigating test flakes due to layer offsets not
matching, this change will give us a bit more context so we can be sure
which image has which number of layers, and it will also include the
digest of the image, since kaniko always pushes images to a remote repo,
so if the test fails we can pull the digest and see what is up.
Also updated reproducible Dockerfile to be built with reproducible flag,
which I think was the original intent (without this change, there is no
difference between how `kaniko-dockerfile_test_copy_reproducible` and
`kaniko-dockerfile_test_copy` are built.
To allow contributors to run the integration tests with their own GCS
buckets and image repos (since not all contributors will have accesss to
the projects used by the kaniko maintainers) this updates the
integration tests so that these can be provided on the command line.
This allows tests to be run individually, without using `make
integration-test`. Previously, part of the test setup was done
in the shell script (creating the context tarball that is required
for the tests that build images with context). Instead it will be
done in the test iself, so we can use `go test` to run tests
individually if we want to.
If we are running only one individual test, we don't want to build
all of the images, so this commit creates a builder which tracks which
images it has built and can be used by a tests to check if it should
build an image before running, or it will use the images that have
already been built by a previous test.
The name of the context tarball has also been made unique (it includes
the unix timestamp) to avoid potential test flakes if two tests using
the same GCS bucket run simultaneously.