Commit Graph

143 Commits

Author SHA1 Message Date
Priya Wadhwa b0b36ed85a Re-add support for .dockerignore file
This PR adds support for the dockerignore file. Previously when kaniko
had support for the dockerignore file, kaniko first went through the
build context and deleted files that were meant to be ignored. This
resulted in a really bad bug where files in user mounted volumes would
be deleted (my bad).

This time around, instead of modifying the build context at all, kaniko
will check if a file should be excluded when executing ADD/COPY
commands. If a file should be excluded (based on the .dockerignore) it
won't be copied over from the buildcontext and shouldn't end up in the
final image.

I also added a .dockerignore file and Dockerfile as an integration test,
which should fail if the dockerignore is not being processed correctly or if files aren't being excluded correctly.
Also, I removed all the integration testing from the previous version of the
dockerignore support.
2018-12-10 15:20:25 -08:00
priyawadhwa 0b7fa58ca2
Merge pull request #459 from aduong/symlinks
Overwrite existing dest when copying symlink and preserve link target
2018-11-30 09:28:26 -08:00
Priya Wadhwa edd873d1fe remove unnecessary filepath.Join since dest is always an absolute path 2018-11-19 15:07:56 -05:00
Priya Wadhwa 2cf6f52517 Fix hardlink unit test
Now that hardlink destinations take into account the directory that they are being extracted to, the unit test had to be updated to make sure that two hardlinks were extracted to /tmp/hardlink correctly.
2018-11-19 15:03:31 -05:00
Adrian Duong f23cc32c42 Overwrite existing dest when copying symlink and preserve link target 2018-11-17 23:06:34 -08:00
Adrian Duong 3367268ef9 Test for fs_util.CopySymlink 2018-11-17 23:05:52 -08:00
Priya Wadhwa 9d67953ed3 Fix bug in extracting hardlinks
When we execute multistage builds, we store the fs of each intermediate
stage at /kaniko/<stage number> if it's used later in the build. This
created a bug when extracting hardlinks, because we weren't appending
the new directory to the link path.

So, if `/tmp/file1` and `/tmp/file2` were hardlinked, kaniko was trying
to link `/kaniko/0/tmp/file1` to `/tmp/file2` instead of
`/kaniko/0/tmp/file2`. This change will append the correct directory to
the link, and fixes #437 #362 #352 #342.
2018-11-16 16:18:49 -08:00
dlorenc 0c294138b8
Make snapshotting faster by using filepath.SkipDir. (#451)
filepath.Walk has a special error you can return from your walkFn
indicating it should skip directories. This change makes use of that
to skip whitelisted directories.
2018-11-14 17:44:38 -06:00
dlorenc 063663e17b
Skip unpacking the base FS if there are no run commands (or only cached ones). (#440)
This is the final part of an optimization that I've been refactoring towards for awhile.
If the Dockerfile consists of no RUN commands, or cached RUN commands, followed by metadata-only
operations, we can skip downloading and unpacking the base image.
2018-11-12 12:51:45 -06:00
Sharif Elgamal 224b7e2b41
parse arg commands at the top of dockerfiles (#404)
* parse arg commands at the top of dockerfiles

* fix pointer reference bug and remove debugging

* fixing tests

* account for meta args with no value

* don't take fs snapshot if / is the only changed path

* move metaArgs inside KanikoStage

* removing unused property

* check for any directory instead of just /

* remove unnecessary check
2018-11-06 15:27:09 -08:00
dlorenc fc43e218f0
Buffer layers to disk as they are created. (#428)
When building Docker images, layers were previously stored in memory.
This caused obvious issues when manipulating large layers, which could
cause Kaniko to crash.
2018-11-06 09:26:54 -06:00
priyawadhwa 632bedf75c
Merge pull request #413 from priyawadhwa/auth
Use remoteImage function when getting digest for cache
2018-10-29 10:39:58 -07:00
Priya Wadhwa 9908eeb30a Use remoteImage function when getting digest for cache
Issue #410 experienced an error with base image caching where they were
"Not Authorized" to get information for a remote image, but later were
able to download and extract the base image.

To fix this, we can switch to using the remoteImage function for getting
information about the digest, which is the same function used for
downloading base images. This way we can also take advantage of the
--insecure and --skip-tls-verify flags if users pass those in when
trying to get digests for the cache as well.
2018-10-26 11:38:32 -07:00
Daisuke Taniwaki e04a922dc3
Separate insecure pull options 2018-10-25 06:33:58 +09:00
Daisuke Taniwaki 05e3250043 Support insecure pull (#401) 2018-10-22 14:33:41 -07:00
priyawadhwa f4612404c4
Merge pull request #389 from peter-evans/fix-symlink-extraction
Fix symlink extraction
2018-10-18 10:21:23 -07:00
Deniz Zoeteman 129eb9b8a8
Change loglevel for copying files to debug (#303) 2018-10-12 16:16:48 +02:00
priyawadhwa aabb97b944
Merge pull request #390 from Zetten/enhance-is-dest-dir
Improve IsDestDir functionality with filesystem info
2018-10-11 18:15:59 -07:00
Sharif Elgamal effac9dfc3
Persistent volume caching for base images (#383)
* comments

* initial commit for persisent volume caching

* cache warmer works

* general cleanup

* adding some debugging

* adding missing files

* Fixing up cache retrieval and cleanup

* fix tests

* removing auth since we only cache public images

* simplifying the caching logic

* fixing logic

* adding volume cache to integration tests. remove auth from cache warmer image.

* add building warmer to integration-test

* move sample yaml files to examples dir

* small test fix
2018-10-11 13:38:05 -07:00
Peter van Zetten 073abff176 Improve IsDestDir functionality with filesystem info
Add a check for FileInfo to determine whether a given string is a
directory path. If any error occurs, fall back to the naive string
check.

Fixes #365
2018-10-11 11:11:12 +01:00
peter-evans 38e8dc2cdd Remove all at path to make way for new reg files and links 2018-10-11 15:33:25 +09:00
peter-evans 5695ebc3d5 Remove all at path to make way for new symlink 2018-10-11 09:28:55 +09:00
priyawadhwa 8f0d257134
Merge pull request #334 from peter-evans/fix-volume-cmd
Fix handling of the volume directive
2018-10-01 14:49:33 -07:00
dlorenc e1b0f7732e
Fixes a whitelist issue when untarring files in ADD commands. (#371)
* Fixes a whitelist issue when untarring files in ADD commands.

* Add go-cmp test tool.

* Make the integration test tolerate some file differences.
2018-09-28 11:42:07 -07:00
peter-evans b1e28ddb4f Fix handling of volume directive 2018-09-28 11:16:25 +09:00
xanonid 59cb0ebec9 Enable overwriting of links (solves #351) (#360)
* Enable overwriting of links (solves #351)

* add integration test to check extraction of images with replaced hardlinks

* Prevent following symlinks during extracting normal files

This fixes #359, #361, #362.
2018-09-26 07:14:35 -07:00
Priya Wadhwa c216fbf91b Add layer caching to kaniko
To add layer caching to kaniko, I added two flags: --cache and
--use-cache.

If --use-cache is set, then the cache will be used, and if --cache is
specified then that repo will be used to store cached layers. If --cache
isn't set, a cache will be inferred from the destination provided.

Currently, caching only works for RUN commands. Before executing the
command, kaniko checks if the cached layer exists. If it does, it pulls
it and extracts it. It then adds those files to the snapshotter and
append a layer to the config history.  If the cached layer does not exist, kaniko executes the command and
pushes the newly created layer to the cache.

All cached layers are tagged with a stable key, which is built based off
of:

1. The base image digest
2. The current state of the filesystem
3. The current command being run
4. The current config file (to account for metadata changes)

I also added two integration tests to make sure caching works

1. Dockerfile_test_cache runs 'date', which should be exactly the same
the second time the image is built
2. Dockerfile_test_cache_install makes sure apt-get install can be
reproduced
2018-09-13 18:32:53 -07:00
priyawadhwa c814466e15
Merge pull request #347 from priyawadhwa/amazon
Whitelist /etc/mtab
2018-09-12 16:08:12 -07:00
Priya Wadhwa ccb6259b06 More linting errors 2018-09-11 14:58:25 -07:00
Priya Wadhwa 99ab68e7f4 Replace gometalinter with GolangCI-Lint
gometalinter is broken @ HEAD, and I looked into why that was. During
that process, I remembered that we took the linting scripts from
skaffold, and found that in skaffold gometalinter was replaced with
GolangCI-Lint:

https://github.com/GoogleContainerTools/skaffold/pull/619

The change made linting in skaffold faster, so I figured instead of
fixing gometalinter it made more sense to remove it and replace it with
GolangCI-Lint for kaniko as well.
2018-09-11 13:30:42 -07:00
Tejal Desai 06defa6552
Merge pull request #337 from priyawadhwa/hasher
Add Key() to LayeredMap and Snapshotter
2018-09-11 09:29:50 -07:00
Priya Wadhwa c13f6e84ed Fixed unit test 2018-09-10 18:20:00 -07:00
Priya Wadhwa 63cecbff74 Whitelist /etc/mtab
While looking into #345, we were seeing the error:

Error: error building image: chmod /etc/mtab: operation not permitted

during extraction of `amazonlinux:1`. I looked into why kaniko couldn't
extract this file properly, and found that it already existed as a
symlink pointing to /proc/mounts, which returned an error when we tried
to run chmod on it.

Confusingly, in the image the /etc/mtab is a regular file, not a
symlink.

I can think of two ways to solve this problem:
  1. Whitelist /etc/mtab so that whatever already exists in the system
  is used
  2. Check if a regular file already exists, and hasn't been extracted yet,
  before extracting

I went with option 1 because for option 2 we'd have to keep a list of
all files that had been extracted in memory.
2018-09-10 17:06:09 -07:00
priyawadhwa 4dc34343b6
Merge pull request #320 from priyawadhwa/stages
Added a KanikoStage type for each stage of a Dockerfile
2018-09-07 16:19:40 -07:00
Priya Wadhwa 80a449f541 code review comments 2018-09-07 16:03:56 -07:00
Priya Wadhwa 13accbaf32 Add Key() to LayeredMap and Snapshotter
This will return a string representaiton of the current filesystem to be
used with caching.

Whenever a file is explictly added (via ADD or COPY), it will be stored
in "added" in the LayeredMap. The file will map to a hash created by
CacheHasher (which doesn't take into account mtime, since that will be
different with every build, making the cache useless)

Key() will returns a sha of the added files which will be used in
determining the overall cache key for a command.
2018-09-04 13:42:33 -07:00
Priya Wadhwa 1513295103 Make sure paths are absolute before matching files to wildcard sources
The bug in #329 occurred because of a bug in matchSources, where the
filepath wasn't absolute, so the source "/kaniko-bug/*" wasn't being
matched to the file "kaniko-bug/test-file"

To fix this, I added logic for making filepaths absolute and added to
the unit test for the function to test that it works.
2018-08-31 11:57:20 -07:00
Priya Wadhwa 0636fe6040 Merge branch 'master' of github.com:GoogleContainerTools/kaniko into stages 2018-08-30 16:17:44 -07:00
priyawadhwa deda0ea04d
Merge pull request #326 from priyawadhwa/fs-forward
Extract filesystem in order rather than in reverse
2018-08-30 10:49:13 -07:00
Priya Wadhwa 1db7fc2a61 Rebased 2018-08-30 10:16:08 -07:00
Priya Wadhwa e0130c5942 Remove extra symlink check 2018-08-30 10:13:06 -07:00
Priya Wadhwa 85393a60c2 Fixed unit tests 2018-08-29 16:11:03 -07:00
Priya Wadhwa 15db85e36a Configure logs to show colors 2018-08-29 16:08:09 -07:00
Priya Wadhwa 5bdb87e0e7 Extract filesystem in order rather than in reverse
Extracting the layers of the filesystem in order will make it easier to
extract cached layers and deal with hardlinks.

This PR implements extracting in order and adds an integration tests to
make sure hardlinks are extracted properly.

It also fixes two bugs I found when extracting symlinks:

1. We'd get a "file exists" error when trying to symlink to an existing
file with a whiteout later in the layer tarball
2. We'd get a "file exists" error when trying to create a symlink from a
file that was created in a prior layer (perhaps as a regular file or as
a symlink pointing to someting else)

To fix both of these, we resolve all symlinks in a layer at the end. I
also added logic to delete any existing paths before creating the
symlink.
2018-08-29 15:44:38 -07:00
Priya Wadhwa 935d322f1d Rebased on master 2018-08-27 14:18:24 -07:00
Priya Wadhwa 64a0b1d75f Added a KanikoStage type for each stage of a Dockerfile
I added a KanikoStage to hold each stage of the Dockerfile along with
information about each stage that would be useful later on.

The new KanikoStage type holds the stage itself, along with some
additional information:

1. FinalStage -- whether the current stage is the final stage
2. BaseImageStoredLocally/BaseImageIndex -- whether the base image for
this stage is stored locally, and if so what the index of the base image
is
3. SaveStage -- whether this stage needs to be saved for use in a future
stage

This is the first part of a larger refactor for building stages, which
will later make it easier to add layer caching.
2018-08-27 14:15:04 -07:00
Priya Wadhwa 7080a8dd69 Add specific files from tar archives to list of snapshotted filesa
I changed UnpackLocalTarArchive to return a list of files that were
extracted, so that the list of snapshotted files for ADD is more
accurate. Previously, we used to add all files in the extracted dir to
be snapshotted, but this could result in preexisting files being
snapshotted again.
2018-08-27 13:44:39 -07:00
Priya Wadhwa 9a93f5bad9 Snapshot only specific files for COPY
Before #289 was merged, when copying over directories for COPY kaniko
would get a list of all files at the destination specified and add them
to the list of files to be snapshotted. If the destination was root it
would add all files. This worked because the snapshotter made sure the
file had been changed before adding it to the layer.

After #289, we changed the logic to add all files snapshotted to a layer
without checking if the files had been changed. This created the bug in
got all the files at root and added them to the layer without checking
if they had been changed.

This change should fix this bug. Now, the CopyDir function returns a
list of files it copied over and only those files are added to the list
of files to be snapshotted.

Should fix #314
2018-08-27 11:39:00 -07:00
Christie Wilson 7f64037a8c Separate snapshotting of parent dirs from files
To make the logic a bit more clear, when snapshotting files, the
parent dirs are now snapshotted in a different loop from the files we
are actually trying to snapshot. Unfortunately this loop is nearly
duplicated but I did managed to group some fo the related logic
together:
- A function to check if the file should be snapshotted (e.g. isn't
whitelisted, etc.)
- Created a `Tar` type to handle some of the logic around tar-ing, e.g.
tracking hardlinks and stat-ing files before adding them

One side effect of this is that now when snapshoting the file system,
files will be stat-ed twice.
2018-08-24 16:34:59 -07:00
Christie Wilson 607af5f7a6 Always snapshot files in COPY and RUN commands
Kaniko uses mtime (as well as file contents and other attributes) to
determine if files have changed. COPY and ADD commands should _always_
update the mtime, because they actually overwrite the files. However it
turns out that the mtime can lag, so kaniko would sometimes add a new
layer when using COPY or ADD on a file, and sometimes would not. This
leads to a non-deterministic number of layers.

To fix this, we have updated the kaniko commands to be more
authoritative in declaring when they have changed a file (e.g. WORKDIR
will now only create the directory when it doesn't exist) and we will
trust those files and _always_ add them, instead of only adding them if
they haven't changed.

It is possible for RUN commands to also change the filesystem, in which
case kaniko has no choice but to look at the filesystem to determine
what has changed. For this case we have added a call to `sync` however
we still cannot guarantee that sometimes the mtime will not lag, causing the
number of layers to be non-deterministic. However when I tried to cause
this behaviour with the RUN command, I couldn't.

This changes the snapshotting logic a bit; before this change, the last
command of the last stage in a Dockerfile would always scan the whole
file system and ignore the files returned by the kaniko command. Instead
we will now trust those files and assume that the snapshotting
performed by previous commands will be adequate.

Docker itself seems to rely on the storage driver to determine when
files have changed and so doesn't have to deal with these problems
directly.

An alternative implementation would use `inotify` to track which files
have changed. However that would mean watching every file in the
filesystem, and adding new watches as files are added. Not only is there
a limit on the number of files that can be watched, but according to the
man pages a) this can take a significant amount of time b) there is
complication around when events arrive (e.g. by the time they arrive,
the files may have changed) and lastly c) events can be lost, which
would mean we'd run into this non-deterministic behaviour again anyway.

Fixes #251
2018-08-23 18:23:39 -07:00