Commit Graph

25 Commits

Author SHA1 Message Date
Sharif Elgamal effac9dfc3
Persistent volume caching for base images (#383)
* comments

* initial commit for persisent volume caching

* cache warmer works

* general cleanup

* adding some debugging

* adding missing files

* Fixing up cache retrieval and cleanup

* fix tests

* removing auth since we only cache public images

* simplifying the caching logic

* fixing logic

* adding volume cache to integration tests. remove auth from cache warmer image.

* add building warmer to integration-test

* move sample yaml files to examples dir

* small test fix
2018-10-11 13:38:05 -07:00
Priya Wadhwa 589197b416 Fix travis @ HEAD
I merged a contributor's PR which modifed the HasFilepathPrefix function
to take an additional argument, but the PR hadn't been rebased. One of the
liting tests in Travis caught this bug.
2018-10-01 14:56:46 -07:00
priyawadhwa 8f0d257134
Merge pull request #334 from peter-evans/fix-volume-cmd
Fix handling of the volume directive
2018-10-01 14:49:33 -07:00
dlorenc e1b0f7732e
Fixes a whitelist issue when untarring files in ADD commands. (#371)
* Fixes a whitelist issue when untarring files in ADD commands.

* Add go-cmp test tool.

* Make the integration test tolerate some file differences.
2018-09-28 11:42:07 -07:00
peter-evans b1e28ddb4f Fix handling of volume directive 2018-09-28 11:16:25 +09:00
Priya Wadhwa eb7194a165 Fix linting errors 2018-09-13 22:06:38 -07:00
Priya Wadhwa c216fbf91b Add layer caching to kaniko
To add layer caching to kaniko, I added two flags: --cache and
--use-cache.

If --use-cache is set, then the cache will be used, and if --cache is
specified then that repo will be used to store cached layers. If --cache
isn't set, a cache will be inferred from the destination provided.

Currently, caching only works for RUN commands. Before executing the
command, kaniko checks if the cached layer exists. If it does, it pulls
it and extracts it. It then adds those files to the snapshotter and
append a layer to the config history.  If the cached layer does not exist, kaniko executes the command and
pushes the newly created layer to the cache.

All cached layers are tagged with a stable key, which is built based off
of:

1. The base image digest
2. The current state of the filesystem
3. The current command being run
4. The current config file (to account for metadata changes)

I also added two integration tests to make sure caching works

1. Dockerfile_test_cache runs 'date', which should be exactly the same
the second time the image is built
2. Dockerfile_test_cache_install makes sure apt-get install can be
reproduced
2018-09-13 18:32:53 -07:00
Priya Wadhwa 7635421ae9 Add t.Helper() call to checkLayers for better error logging 2018-09-11 16:24:21 -07:00
Priya Wadhwa 99ab68e7f4 Replace gometalinter with GolangCI-Lint
gometalinter is broken @ HEAD, and I looked into why that was. During
that process, I remembered that we took the linting scripts from
skaffold, and found that in skaffold gometalinter was replaced with
GolangCI-Lint:

https://github.com/GoogleContainerTools/skaffold/pull/619

The change made linting in skaffold faster, so I figured instead of
fixing gometalinter it made more sense to remove it and replace it with
GolangCI-Lint for kaniko as well.
2018-09-11 13:30:42 -07:00
Priya Wadhwa 5bdb87e0e7 Extract filesystem in order rather than in reverse
Extracting the layers of the filesystem in order will make it easier to
extract cached layers and deal with hardlinks.

This PR implements extracting in order and adds an integration tests to
make sure hardlinks are extracted properly.

It also fixes two bugs I found when extracting symlinks:

1. We'd get a "file exists" error when trying to symlink to an existing
file with a whiteout later in the layer tarball
2. We'd get a "file exists" error when trying to create a symlink from a
file that was created in a prior layer (perhaps as a regular file or as
a symlink pointing to someting else)

To fix both of these, we resolve all symlinks in a layer at the end. I
also added logic to delete any existing paths before creating the
symlink.
2018-08-29 15:44:38 -07:00
Christie Wilson 7f64037a8c Separate snapshotting of parent dirs from files
To make the logic a bit more clear, when snapshotting files, the
parent dirs are now snapshotted in a different loop from the files we
are actually trying to snapshot. Unfortunately this loop is nearly
duplicated but I did managed to group some fo the related logic
together:
- A function to check if the file should be snapshotted (e.g. isn't
whitelisted, etc.)
- Created a `Tar` type to handle some of the logic around tar-ing, e.g.
tracking hardlinks and stat-ing files before adding them

One side effect of this is that now when snapshoting the file system,
files will be stat-ed twice.
2018-08-24 16:34:59 -07:00
Christie Wilson 607af5f7a6 Always snapshot files in COPY and RUN commands
Kaniko uses mtime (as well as file contents and other attributes) to
determine if files have changed. COPY and ADD commands should _always_
update the mtime, because they actually overwrite the files. However it
turns out that the mtime can lag, so kaniko would sometimes add a new
layer when using COPY or ADD on a file, and sometimes would not. This
leads to a non-deterministic number of layers.

To fix this, we have updated the kaniko commands to be more
authoritative in declaring when they have changed a file (e.g. WORKDIR
will now only create the directory when it doesn't exist) and we will
trust those files and _always_ add them, instead of only adding them if
they haven't changed.

It is possible for RUN commands to also change the filesystem, in which
case kaniko has no choice but to look at the filesystem to determine
what has changed. For this case we have added a call to `sync` however
we still cannot guarantee that sometimes the mtime will not lag, causing the
number of layers to be non-deterministic. However when I tried to cause
this behaviour with the RUN command, I couldn't.

This changes the snapshotting logic a bit; before this change, the last
command of the last stage in a Dockerfile would always scan the whole
file system and ignore the files returned by the kaniko command. Instead
we will now trust those files and assume that the snapshotting
performed by previous commands will be adequate.

Docker itself seems to rely on the storage driver to determine when
files have changed and so doesn't have to deal with these problems
directly.

An alternative implementation would use `inotify` to track which files
have changed. However that would mean watching every file in the
filesystem, and adding new watches as files are added. Not only is there
a limit on the number of files that can be watched, but according to the
man pages a) this can take a significant amount of time b) there is
complication around when events arrive (e.g. by the time they arrive,
the files may have changed) and lastly c) events can be lost, which
would mean we'd run into this non-deterministic behaviour again anyway.

Fixes #251
2018-08-23 18:23:39 -07:00
priyawadhwa 52e9863810
fix add command bug when adding remote URLs (#277) 2018-08-07 17:10:27 -07:00
Christie Wilson 57b1159951 Add a bit more context to layer offset failures
In #251 we are investigating test flakes due to layer offsets not
matching, this change will give us a bit more context so we can be sure
which image has which number of layers, and it will also include the
digest of the image, since kaniko always pushes images to a remote repo,
so if the test fails we can pull the digest and see what is up.

Also updated reproducible Dockerfile to be built with reproducible flag,
which I think was the original intent (without this change, there is no
difference between how `kaniko-dockerfile_test_copy_reproducible` and
`kaniko-dockerfile_test_copy` are built.
2018-07-31 15:33:47 -07:00
Christie Wilson b5a4d7636f Pass bucket and repo as args to tests
To allow contributors to run the integration tests with their own GCS
buckets and image repos (since not all contributors will have accesss to
the projects used by the kaniko maintainers) this updates the
integration tests so that these can be provided on the command line.

This allows tests to be run individually, without using `make
integration-test`. Previously, part of the test setup was done
in the shell script (creating the context tarball that is required
for the tests that build images with context). Instead it will be
done in the test iself, so we can use `go test` to run tests
individually if we want to.

If we are running only one individual test, we don't want to build
all of the images, so this commit creates a builder which tracks which
images it has built and can be used by a tests to check if it should
build an image before running, or it will use the images that have
already been built by a previous test.

The name of the context tarball has also been made unique (it includes
the unix timestamp) to avoid potential test flakes if two tests using
the same GCS bucket run simultaneously.
2018-07-31 09:53:59 -07:00
priyawadhwa cac00b9cb2
Add --target flag for multistage builds (#255)
* Add --target flag for multistage builds

* change validate to validateTarget
2018-07-30 09:43:23 -07:00
Christian Jantz 65d7b0a9aa Feature/contextsources (#195)
* added switch to extract different sources as build context

* first rough implementation of aws s3

* added buildcontext package and interface

* added GetBuildContext func to buildcontext.go
added fallback to gcs
renamed GC struct to GCS

* improved the default behavior of build context retrieval

* renamed gc:// to gs:// in order to follow common standards

* renamed struct File to Dir and some cleanup work

* moved context.tar suffix to the buildcontext processors where it is needed

* added buildcontext retrieval as struct variable

added fallback if prefix in bucket specifier is present

* cleanup if structures

* added prefix to s3

* WIP

* Fixed build context bugs

* refactored build context
2018-07-06 06:24:50 -07:00
Priya Wadhwa 28f3ec30c2
update vendored go-containerregistry 2018-06-22 14:45:46 -07:00
Sharif Elgamal a7c82cf6f6
adding reproducible flag (#205)
* adding reproducible test

* newer version of go-containerregistry

* new ImageOptions

* switch reproducible flag to default to false

* small fixes

* update dep
2018-06-22 12:00:44 -07:00
Priya Wadhwa 2adf82cd66
update integration test 2018-06-21 15:27:05 -07:00
Priya Wadhwa b09526bc62 added integration test to check layers 2018-06-14 08:44:49 -07:00
Priya Wadhwa 89c9f15bde Add --single-snapshot flag to snapshot once after the build 2018-06-13 11:22:12 -07:00
Priya Wadhwa 44d7266058
Resolve env replacement for FROM command 2018-06-04 11:51:33 -07:00
Sharif Elgamal 5e6b60f46e
adding metadata tests back to integration tests (#185)
* adding metadata tests back to integration tests and fixing resulting bugs

* fix onbuild and default env

* removing old test files

* adding the ArgsEscaped boolean on CMD commands

* fix onbuild test

* ignore failing test until container-diff is fixed

* code comments

* adding todo to remove uncomment failing test
2018-05-24 11:28:32 -07:00
Sharif Elgamal f8aa88b119
Integration test refactoring (#126)
* integration test refactoring

* config file cleanup

* more test refactoring

* remove debug file

* moving around more files

* fixing up integration tests

* integration tests work

* some housekeeping

* fixing tests

* addressing comments

* debugging

* debugging

* actual debugging

* skip integration tests for travis

* install container-diff before integration tests

* syntax

* make test failures less noisy

* fixing tests

* hopefully fixing CI?

* fixes

* more fixes

* let's actually fix CI

* more testing

* testing

* proper auth

* typos

* adding support for args in integration tests

* formatting

* formatting

* adding support for testing bucket context

* adding bucket test dockerfile

* addressing comments

* syntax
2018-05-15 13:42:35 -07:00