Currently, kaniko can only build a single image per container run, because the filesystem is full of the content of the first image.
When running kaniko in Jenkins, we need to start the container "doing nothing" first (using the debug kaniko container) and then exec /kaniko/executor. This is a limitation because it means that if we want to build multiple images, we need to start multiple containers - see https://groups.google.com/forum/#!topic/kaniko-users/_7LivHdMdy0 for more details.
A solution to this issue is to add a new flag to clean up the filesystem at the end - the same way it is done between stages when building a multi-stage image. This way, the same (debug) container can be used to build multiple images.
* set default HOME env properly
* set HOME to / if user is set by uid
* fix test
* continue to skip user_run test
* fix unit test to match new functionality
* Enable overwriting of links (solves #351)
* add integration test to check extraction of images with replaced hardlinks
* Prevent following symlinks during extracting normal files
This fixes #359, #361, #362.
I changed RunE to Run so that usage wouldn't show upon error. Usage will
still show if PersistentPreRunE fails, which makes sense since those
functions check to make sure arguments passed in are valid.
Also changed logging of multi-arg flags to Debugf so that output would
be cleaner.
To add layer caching to kaniko, I added two flags: --cache and
--use-cache.
If --use-cache is set, then the cache will be used, and if --cache is
specified then that repo will be used to store cached layers. If --cache
isn't set, a cache will be inferred from the destination provided.
Currently, caching only works for RUN commands. Before executing the
command, kaniko checks if the cached layer exists. If it does, it pulls
it and extracts it. It then adds those files to the snapshotter and
appends a layer to the config history. If the cached layer does not exist, kaniko executes the command and
pushes the newly created layer to the cache.
All cached layers are tagged with a stable key (sketched below), which is
built from:
1. The base image digest
2. The current state of the filesystem
3. The current command being run
4. The current config file (to account for metadata changes)
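For illustration, a composite key along these lines could be computed roughly as follows (the type and function names here are made up for the sketch, not kaniko's actual API):

```go
package cache

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// compositeKey gathers the four inputs listed above that determine
// whether a cached layer can be reused.
type compositeKey struct {
	baseImageDigest string // 1. digest of the base image
	filesystemKey   string // 2. key describing the current filesystem state
	command         string // 3. the command being run
	configFile      string // 4. serialized config file (to catch metadata changes)
}

// Hash returns a deterministic sha256 over all components, so identical
// inputs always map to the same cache tag.
func (k compositeKey) Hash() string {
	h := sha256.New()
	fmt.Fprintf(h, "%s|%s|%s|%s", k.baseImageDigest, k.filesystemKey, k.command, k.configFile)
	return hex.EncodeToString(h.Sum(nil))
}
```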
I also added two integration tests to make sure caching works:
1. Dockerfile_test_cache runs 'date', whose output should be exactly the
same the second time the image is built
2. Dockerfile_test_cache_install makes sure apt-get install can be
reproduced
gometalinter is broken @ HEAD, and I looked into why that was. During
that process, I remembered that we took the linting scripts from
skaffold, and found that in skaffold gometalinter was replaced with
GolangCI-Lint:
https://github.com/GoogleContainerTools/skaffold/pull/619
The change made linting in skaffold faster, so I figured instead of
fixing gometalinter it made more sense to remove it and replace it with
GolangCI-Lint for kaniko as well.
As mentioned in #346, if only ENTRYPOINT is set in a stage then any
CMD inherited from a parent should be cleared.
If both ENTRYPOINT and CMD are set, then nothing should change.
I added a function (and unit test) that reviews the config file after building a stage
and clears out config.Cmd if ENTRYPOINT was declared but CMD wasn't.
I also added an integration test to make sure this works, which should
be tested by the preexisting container-diff --metadata test.
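A minimal sketch of that check, assuming the relevant inputs are simply "did the stage declare ENTRYPOINT/CMD" plus the resulting go-containerregistry v1.Config (kaniko's real function signature may differ):

```go
package main

import (
	"fmt"

	v1 "github.com/google/go-containerregistry/pkg/v1"
)

// reviewConfig clears an inherited CMD when the stage declared ENTRYPOINT
// but not CMD, matching Docker's behaviour.
func reviewConfig(declaredEntrypoint, declaredCmd bool, config *v1.Config) {
	if declaredEntrypoint && !declaredCmd {
		config.Cmd = nil
	}
}

func main() {
	cfg := &v1.Config{Entrypoint: []string{"/entry.sh"}, Cmd: []string{"inherited-cmd"}}
	reviewConfig(true, false, cfg)
	fmt.Println(cfg.Cmd) // [] -- the inherited CMD has been cleared
}
```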
While looking into #345, we were seeing the error:
Error: error building image: chmod /etc/mtab: operation not permitted
during extraction of `amazonlinux:1`. I looked into why kaniko couldn't
extract this file properly, and found that it already existed as a
symlink pointing to /proc/mounts, which returned an error when we tried
to run chmod on it.
Confusingly, in the image /etc/mtab is a regular file, not a
symlink.
I can think of two ways to solve this problem:
1. Whitelist /etc/mtab so that whatever already exists in the system
is used
2. Check if a regular file already exists, and hasn't been extracted yet,
before extracting
I went with option 1 because for option 2 we'd have to keep a list of
all files that had been extracted in memory.
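For reference, a prefix-based whitelist check along these lines might look like the following (the variable name and the exact entries are illustrative, not kaniko's actual list):

```go
package util

import "strings"

// whitelist holds paths that are never extracted or snapshotted; the
// entries here are examples only.
var whitelist = []string{"/kaniko", "/proc", "/etc/mtab"}

// isWhitelisted reports whether path is one of the whitelisted paths or
// lives underneath one of them.
func isWhitelisted(path string) bool {
	for _, w := range whitelist {
		if path == w || strings.HasPrefix(path, w+"/") {
			return true
		}
	}
	return false
}
```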
Refactoring builds by stage will make it easier to generate cache keys
for layers, since the stageBuilder type will contain everything required
to generate the key:
1. Base image with digest
2. Config file
3. Snapshotter (which will provide a key for the filesystem)
4. The current command (which will be passed in)
This will return a string representation of the current filesystem to be
used with caching.
Whenever a file is explicitly added (via ADD or COPY), it will be stored
in "added" in the LayeredMap. The file will map to a hash created by
CacheHasher (which doesn't take into account mtime, since that will be
different with every build, making the cache useless).
Key() returns a sha of the added files, which will be used in
determining the overall cache key for a command.
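A sketch of what this could look like, assuming `added` is the path -> hash map kept by the LayeredMap and that the hash deliberately skips mtime (all names here are illustrative, not kaniko's exact API):

```go
package snapshot

import (
	"crypto/sha256"
	"encoding/hex"
	"io"
	"os"
	"sort"
)

// cacheHash hashes a file's mode and contents but deliberately ignores
// mtime, so the result is stable across builds.
func cacheHash(path string) (string, error) {
	h := sha256.New()
	fi, err := os.Lstat(path)
	if err != nil {
		return "", err
	}
	h.Write([]byte(fi.Mode().String()))
	if fi.Mode().IsRegular() {
		f, err := os.Open(path)
		if err != nil {
			return "", err
		}
		defer f.Close()
		if _, err := io.Copy(h, f); err != nil {
			return "", err
		}
	}
	return hex.EncodeToString(h.Sum(nil)), nil
}

// key reduces the added files (path -> cacheHash result) to a single sha,
// iterating in sorted order so the output is deterministic.
func key(added map[string]string) string {
	paths := make([]string, 0, len(added))
	for p := range added {
		paths = append(paths, p)
	}
	sort.Strings(paths)
	h := sha256.New()
	for _, p := range paths {
		h.Write([]byte(p + "=" + added[p] + "\n"))
	}
	return hex.EncodeToString(h.Sum(nil))
}
```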
CacheCommand returns true if the command should be cached. Currently,
it's only true for RUN but can be added to ADD/COPY later on (these are
different since the contents of files for ADD/COPY need to be included
in the cache key generation).
I also changed CreatedBy to String so that we can log each command
before cache extraction or regular execution takes place.
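Read together, the relevant slice of the command interface looks roughly like this (a sketch based on the description above; the exact definitions in the code may differ):

```go
package commands

// DockerCommand is the per-instruction abstraction; only the two methods
// discussed above are shown here.
type DockerCommand interface {
	// String describes the command so it can be logged before either
	// cache extraction or normal execution.
	String() string
	// CacheCommand reports whether the command's resulting layer can be
	// cached; currently only RUN returns true.
	CacheCommand() bool
}
```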
The bug in #329 occurred because of a bug in matchSources, where the
filepath wasn't absolute, so the source "/kaniko-bug/*" wasn't being
matched to the file "kaniko-bug/test-file".
To fix this, I added logic for making filepaths absolute and added to
the unit test for the function to test that it works.
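A minimal sketch of the fix, assuming matchSources compares glob sources against candidate file paths; making both sides absolute is what lets "/kaniko-bug/*" match "kaniko-bug/test-file" (names are illustrative):

```go
package util

import "path/filepath"

// matchSource reports whether file matches the source glob, after making
// both absolute so patterns like "/kaniko-bug/*" behave consistently.
func matchSource(src, file string) (bool, error) {
	if !filepath.IsAbs(src) {
		src = filepath.Join("/", src)
	}
	if !filepath.IsAbs(file) {
		file = filepath.Join("/", file)
	}
	return filepath.Match(src, file)
}
```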
Extracting the layers of the filesystem in order will make it easier to
extract cached layers and deal with hardlinks.
This PR implements extracting in order and adds an integration test to
make sure hardlinks are extracted properly.
It also fixes two bugs I found when extracting symlinks:
1. We'd get a "file exists" error when trying to symlink to an existing
file with a whiteout later in the layer tarball
2. We'd get a "file exists" error when trying to create a symlink from a
file that was created in a prior layer (perhaps as a regular file or as
a symlink pointing to something else)
To fix both of these, we resolve all symlinks in a layer at the end. I
also added logic to delete any existing paths before creating the
symlink.
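The delete-before-link part of that fix, roughly (illustrative, not the exact kaniko code):

```go
package util

import "os"

// createSymlink removes whatever currently exists at linkPath (a file,
// link, or directory left over from an earlier layer) and then creates
// the symlink to target, avoiding "file exists" errors.
func createSymlink(target, linkPath string) error {
	if _, err := os.Lstat(linkPath); err == nil {
		if err := os.RemoveAll(linkPath); err != nil {
			return err
		}
	}
	return os.Symlink(target, linkPath)
}
```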
I added a KanikoStage to hold each stage of the Dockerfile along with
information about each stage that would be useful later on.
The new KanikoStage type holds the stage itself, along with some
additional information (a rough sketch follows the list):
1. FinalStage -- whether the current stage is the final stage
2. BaseImageStoredLocally/BaseImageIndex -- whether the base image for
this stage is stored locally, and if so what the index of the base image
is
3. SaveStage -- whether this stage needs to be saved for use in a future
stage
This is the first part of a larger refactor for building stages, which
will later make it easier to add layer caching.
I changed UnpackLocalTarArchive to return a list of files that were
extracted, so that the list of snapshotted files for ADD is more
accurate. Previously, we used to add all files in the extracted dir to
be snapshotted, but this could result in preexisting files being
snapshotted again.
Before #289 was merged, when copying over directories for COPY kaniko
would get a list of all files at the destination specified and add them
to the list of files to be snapshotted. If the destination was root it
would add all files. This worked because the snapshotter made sure the
file had been changed before adding it to the layer.
After #289, we changed the logic to add all files snapshotted to a layer
without checking if the files had been changed. This created the bug in
#314: COPY got all the files at root and added them to the layer without
checking if they had been changed.
This change should fix this bug. Now, the CopyDir function returns a
list of files it copied over and only those files are added to the list
of files to be snapshotted.
Should fix #314.
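A sketch of the changed contract (the body here is illustrative): CopyDir reports exactly which destination paths it wrote, and only those get snapshotted.

```go
package util

import (
	"io"
	"os"
	"path/filepath"
)

// CopyDir copies src into dest and returns the destination paths it wrote,
// so that only those paths are queued for snapshotting.
func CopyDir(src, dest string) ([]string, error) {
	var copied []string
	err := filepath.Walk(src, func(path string, info os.FileInfo, walkErr error) error {
		if walkErr != nil {
			return walkErr
		}
		rel, err := filepath.Rel(src, path)
		if err != nil {
			return err
		}
		target := filepath.Join(dest, rel)
		switch {
		case info.IsDir():
			if err := os.MkdirAll(target, info.Mode()); err != nil {
				return err
			}
		case info.Mode().IsRegular():
			in, err := os.Open(path)
			if err != nil {
				return err
			}
			defer in.Close()
			out, err := os.OpenFile(target, os.O_CREATE|os.O_WRONLY|os.O_TRUNC, info.Mode())
			if err != nil {
				return err
			}
			defer out.Close()
			if _, err := io.Copy(out, in); err != nil {
				return err
			}
		}
		copied = append(copied, target)
		return nil
	})
	return copied, err
}
```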
To make the logic a bit more clear, when snapshotting files, the
parent dirs are now snapshotted in a different loop from the files we
are actually trying to snapshot. Unfortunately this loop is nearly
duplicated, but I did manage to group some of the related logic
together:
- A function to check if the file should be snapshotted (e.g. isn't
whitelisted, etc.)
- Created a `Tar` type to handle some of the logic around tar-ing, e.g.
tracking hardlinks and stat-ing files before adding them
One side effect of this is that now, when snapshotting the file system,
files will be stat-ed twice.
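The rough shape of the new `Tar` helper (names and details are illustrative): it wraps a tar.Writer and remembers inodes it has already written, so later paths with the same inode become hardlink entries.

```go
package snapshot

import (
	"archive/tar"
	"io"
	"os"
	"syscall"
)

// Tar wraps a tar.Writer and tracks hardlinks across AddFileToTar calls.
type Tar struct {
	w         *tar.Writer
	hardlinks map[uint64]string // inode -> first path written for it
}

func NewTar(f io.Writer) Tar {
	return Tar{w: tar.NewWriter(f), hardlinks: map[uint64]string{}}
}

// AddFileToTar stats the file once and writes either a regular entry or,
// if the inode was seen before, a hardlink entry pointing at the first path.
func (t Tar) AddFileToTar(path string) error {
	fi, err := os.Lstat(path)
	if err != nil {
		return err
	}
	hdr, err := tar.FileInfoHeader(fi, "")
	if err != nil {
		return err
	}
	hdr.Name = path
	if stat, ok := fi.Sys().(*syscall.Stat_t); ok && stat.Nlink > 1 {
		if orig, seen := t.hardlinks[stat.Ino]; seen {
			hdr.Typeflag = tar.TypeLink
			hdr.Linkname = orig
			hdr.Size = 0
		} else {
			t.hardlinks[stat.Ino] = path
		}
	}
	if err := t.w.WriteHeader(hdr); err != nil {
		return err
	}
	if hdr.Typeflag == tar.TypeReg && hdr.Size > 0 {
		f, err := os.Open(path)
		if err != nil {
			return err
		}
		defer f.Close()
		if _, err := io.Copy(t.w, f); err != nil {
			return err
		}
	}
	return nil
}
```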
This test had previously (before #231) been making a change to a file in
the kaniko dir, then checking that it isn't being snapshotted. This was
to test the whitelisting logic, which makes sure that changes to /kaniko
aren't included in images. However the test creates a temporary dir, so
the kaniko dir is actually in /tmp/<some temp dir>/kaniko, and
in #231 the logic was simplified to no longer have a special case for
tests. The test continued to pass because `MaybeAdd` noticed that the
kaniko file wasn't changing, and didn't add it. After changing this to
always add the files, it revealed that this was left behind by accident.
I also opened #307 to add integration test coverage for this logic.
I also marked `CheckErrorAndDeepEqual` as a helper function so that when
it fails, the line number reported is where that was called.
Kaniko uses mtime (as well as file contents and other attributes) to
determine if files have changed. COPY and ADD commands should _always_
update the mtime, because they actually overwrite the files. However it
turns out that the mtime can lag, so kaniko would sometimes add a new
layer when using COPY or ADD on a file, and sometimes would not. This
leads to a non-deterministic number of layers.
To fix this, we have updated the kaniko commands to be more
authoritative in declaring when they have changed a file (e.g. WORKDIR
will now only create the directory when it doesn't exist) and we will
trust those files and _always_ add them, instead of only adding them if
they haven't changed.
It is possible for RUN commands to also change the filesystem, in which
case kaniko has no choice but to look at the filesystem to determine
what has changed. For this case we have added a call to `sync`; however, we
still cannot guarantee that the mtime will never lag, so the number of
layers could still be non-deterministic. That said, when I tried to cause
this behaviour with the RUN command, I couldn't.
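The RUN-command path, as a very small sketch (the real snapshot code is of course more involved; the function name here is a placeholder):

```go
package snapshot

import "syscall"

// fullSnapshot is the RUN-command fallback described above: flush pending
// writes first, then walk and hash the whole filesystem as before.
// sync(2) reduces, but cannot eliminate, the chance that a file's mtime
// has not yet settled when we stat it.
func fullSnapshot(takeSnapshot func() error) error {
	syscall.Sync()
	return takeSnapshot()
}
```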
This changes the snapshotting logic a bit; before this change, the last
command of the last stage in a Dockerfile would always scan the whole
file system and ignore the files returned by the kaniko command. Instead
we will now trust those files and assume that the snapshotting
performed by previous commands will be adequate.
Docker itself seems to rely on the storage driver to determine when
files have changed and so doesn't have to deal with these problems
directly.
An alternative implementation would use `inotify` to track which files
have changed. However that would mean watching every file in the
filesystem, and adding new watches as files are added. Not only is there
a limit on the number of files that can be watched, but according to the
man pages a) this can take a significant amount of time b) there is
complication around when events arrive (e.g. by the time they arrive,
the files may have changed) and lastly c) events can be lost, which
would mean we'd run into this non-deterministic behaviour again anyway.
Fixes #251
In this refactor I:
1. Created KanikoOptions to make it easier to pass around arguments
passed in through the command line
2. Reorganized executor.go by putting the logic for pushing the image in
a new file push.go
3. Made some error messages clearer
4. Fixed a mistake in the README for pushing to AWS
5. Marked the --bucket flag as hidden since we want people to use
--context instead, and marked an aws flag as hidden which is set in a
vendored directory
This change should fix the bug in #294, where kaniko wasn't recognizing
that a stage would be used in a later build and so wasn't saving it as a
tarball.
Each stage of the Dockerfile has a Name and a BaseName (FROM BaseName AS
Name), but if a Name isn't specified then it's set to the same value as
BaseName. Our test cases weren't complete enough to catch this
distinction, which is why this bug occurred.
I added more test cases to the unit tests to make sure this fix works.
Issue 291 pointed out that the symlink "../proc/self/mounts" in the fedora image wasn't being extracted properly and kaniko was erroring out.
This is because the file path wasn't absolute, so kaniko wasn't recognizing it as a whitelisted path.
With this change, we first resolve a path to its absolute path before checking the whitelist.
* dep ensure and use k8schain
* checkpoint
* fix vendoring, stuff builds
* Use k8schain for pushes too
* Use NewNoClient
* update ggcr dep
* Move k8schain usage to image_util.go
When this test was originally created, an HTTP get to
`https://url.com/something/not/real` probably failed, but now it
will return a `503`, i.e. the `http.Get` call will succeed.
This test will now use a URL which should never reasonably
succeed (famous last words). Alternatively we could use dependency
injection and mock `http.Get`, but it doesn't seem worth it.
This commit also updates the test to use `Run` to run each test
in the table test as a separate test, so we can get a clear indication of
which cases fail and which succeed.
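The `Run` pattern referred to here, as a generic sketch (the test name, case struct, and URL are placeholders, not the actual test):

```go
package buildcontext

import "testing"

func TestRemoteDownload(t *testing.T) {
	testCases := []struct {
		name    string
		url     string
		wantErr bool
	}{
		// ".invalid" is a reserved TLD, so this should never resolve.
		{name: "unreachable url", url: "https://host.invalid/something/not/real", wantErr: true},
	}
	for _, tc := range testCases {
		tc := tc // capture the range variable for the subtest closure
		t.Run(tc.name, func(t *testing.T) {
			// ... call the function under test with tc.url and assert
			// that (err != nil) == tc.wantErr ...
		})
	}
}
```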
* added switch to extract different sources as build context
* first rough implementation of aws s3
* added buildcontext package and interface (rough shape sketched after this list)
* added GetBuildContext func to buildcontext.go
added fallback to gcs
renamed GC struct to GCS
* improved the default behavior of build context retrieval
* renamed gc:// to gs:// in order to follow common standards
* renamed struct File to Dir and some cleanup work
* moved context.tar suffix to the buildcontext processors where it is needed
* added buildcontext retrieval as struct variable
added fallback if prefix in bucket specifier is present
* cleanup of if structures
* added prefix to s3
* WIP
* Fixed build context bugs
* refactored build context
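Putting those bullets together, the abstraction looks roughly like this (names follow the bullets where given; the method name and the dir:// prefix are assumptions):

```go
package buildcontext

import "strings"

// BuildContext is implemented by every supported context source (S3, GCS,
// local directory); each knows how to unpack itself for the build.
type BuildContext interface {
	UnpackTarFromBuildContext() (string, error)
}

// Placeholder implementations for the three sources mentioned above.
type S3 struct{ context string }
type GCS struct{ context string }
type Dir struct{ context string }

func (b *S3) UnpackTarFromBuildContext() (string, error)  { /* download from S3 */ return b.context, nil }
func (b *GCS) UnpackTarFromBuildContext() (string, error) { /* download from GCS */ return b.context, nil }
func (b *Dir) UnpackTarFromBuildContext() (string, error) { /* use the local dir */ return b.context, nil }

// GetBuildContext picks the implementation from the context string's
// prefix; per the bullets above, GCS is the fallback.
func GetBuildContext(srcContext string) BuildContext {
	switch {
	case strings.HasPrefix(srcContext, "s3://"):
		return &S3{context: strings.TrimPrefix(srcContext, "s3://")}
	case strings.HasPrefix(srcContext, "dir://"): // hypothetical prefix for a local directory
		return &Dir{context: strings.TrimPrefix(srcContext, "dir://")}
	default:
		return &GCS{context: strings.TrimPrefix(srcContext, "gs://")}
	}
}
```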
* adding reproducible test
* newer version of go-containerregistry
* new ImageOptions
* switch reproducible flag to default to false
* small fixes
* update dep
* adding metadata tests back to integration tests and fixing resulting bugs
* fix onbuild and default env
* removing old test files
* adding the ArgsEscaped boolean on CMD commands
* fix onbuild test
* ignore failing test until container-diff is fixed
* code comments
* adding todo to uncomment failing test
* Vendor changes for go-containerregistry switch.
* Manual changes for go-containerregistry switch.
The biggest change is refactoring the tarball unpacking.
* Pull more of container-diff out.
* More vendor removals.
* More unit tests.
* adding VOLUME command (a rough sketch follows this list)
* proper test project
* general fixes
* fixing project name
* fixing volume unit test
* fixing integration test
* adding tests
* adding util test
* fixing test
* actually create the volume mounted directory
* fix test
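For context, the core of a VOLUME implementation along the lines of these bullets might look like this (a sketch; the last bullet corresponds to the os.MkdirAll step, and the function name is illustrative):

```go
package commands

import (
	"os"

	v1 "github.com/google/go-containerregistry/pkg/v1"
)

// executeVolume records each declared volume in the image config and
// actually creates the mounted directory in the filesystem.
func executeVolume(paths []string, config *v1.Config) error {
	if config.Volumes == nil {
		config.Volumes = map[string]struct{}{}
	}
	for _, p := range paths {
		config.Volumes[p] = struct{}{}
		if err := os.MkdirAll(p, 0755); err != nil {
			return err
		}
	}
	return nil
}
```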