Commit Graph

54 Commits

Author SHA1 Message Date
Tejal Desai cbf3073fda rename whitelist to ignorelist 2020-06-02 15:56:27 -07:00
Tejal Desai 41bc04dc12 add timings for resolving paths 2020-05-23 14:29:46 -07:00
Gilbert Gilb's fd8a2d6dd8 Merge branch 'master' into snapshot-directories 2020-03-31 14:25:04 +02:00
Gilbert Gilb's e5585fded8 Always add parent directories of files to snapshots.
During a snapshot, when a file changed and not its parent directories,
the parent directories weren't added to the layer. This is inconsistent
with Docker's behavior which always adds parent directories to the layer.
In some edge-cases, it could lead to problems with docker considering
that parent directories were owned by root in forthcoming layers
although they shouldn't (see #1163).

Also, Docker seems to be POSIX compliant regarding the name of
directories in the archive, which always have a slash appended. This
commit also fixes this.

Fixes #1163
2020-03-29 18:25:37 +02:00
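
The behavior described in e5585fded8 can be illustrated with a minimal sketch. It is not kaniko's code: `parentDirs` is a hypothetical helper, and it only shows the idea of listing every ancestor of a changed file, each named with a trailing slash as directory entries are in a tar archive.

```go
package main

import (
	"fmt"
	"path"
)

// parentDirs (hypothetical helper) returns every ancestor directory of file,
// outermost first, each with a trailing slash. The root itself ("/" or ".")
// is not included.
func parentDirs(file string) []string {
	var dirs []string
	dir := path.Dir(path.Clean(file))
	for dir != "/" && dir != "." {
		// Prepend so that parents come before their children.
		dirs = append([]string{dir + "/"}, dirs...)
		dir = path.Dir(dir)
	}
	return dirs
}

func main() {
	// prints [/usr/ /usr/local/ /usr/local/bin/]
	fmt.Println(parentDirs("/usr/local/bin/tool"))
}
```

Adding these entries before the file itself means a later layer never has to guess the ownership or mode of a directory it did not see.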
Tejal Desai ffc372a63b refactor to add unit tests 2020-03-23 17:48:49 -07:00
Tejal Desai 6c14d202a3 better error wrapping and add more tests for copy 2020-03-06 17:18:36 -08:00
cvgw 965b606720 remove cruft and unneeded loop 2020-02-23 13:38:08 -08:00
cvgw a675ad998a Resolve filepaths before scanning for changes 2020-02-20 09:45:44 -08:00
tinkerborg cc2c9a0663 sort filesToAdd in TakeSnapshot
filesToAdd is sorted in TakeSnapshotFS, but not here. This makes ordering unpredictable within the layer's tarball,
causing the SHA to differ even if layer contents haven't changed
2020-02-06 13:36:21 -05:00
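
Why sorting matters for the layer SHA can be sketched as follows. This is an illustrative example, not kaniko's implementation: `layerDigest` is a hypothetical function that writes files into a tarball in sorted order and hashes the result. Go map iteration order is randomized, so without the `sort.Strings` call the digest could change between runs even though the contents are identical.

```go
package main

import (
	"archive/tar"
	"bytes"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sort"
)

// layerDigest (hypothetical) tars the given name -> content map in sorted
// name order and returns the SHA-256 of the resulting archive.
func layerDigest(files map[string]string) (string, error) {
	names := make([]string, 0, len(files))
	for name := range files {
		names = append(names, name)
	}
	sort.Strings(names) // fixed order, independent of map iteration

	var buf bytes.Buffer
	tw := tar.NewWriter(&buf)
	for _, name := range names {
		content := files[name]
		hdr := &tar.Header{
			Name:     name,
			Mode:     0o644,
			Size:     int64(len(content)),
			Typeflag: tar.TypeReg,
		}
		if err := tw.WriteHeader(hdr); err != nil {
			return "", err
		}
		if _, err := tw.Write([]byte(content)); err != nil {
			return "", err
		}
	}
	if err := tw.Close(); err != nil {
		return "", err
	}
	sum := sha256.Sum256(buf.Bytes())
	return hex.EncodeToString(sum[:]), nil
}

func main() {
	files := map[string]string{"b.txt": "two", "a.txt": "one", "c.txt": "three"}
	first, _ := layerDigest(files)
	second, _ := layerDigest(files)
	fmt.Println(first == second)
}
```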
Thomas Bonfort 6b6742fd9d Change log level for whiteouts to debug 2020-01-31 12:02:32 +01:00
Tejal Desai 3e5d0a6334 add unit tests 2020-01-23 11:12:54 -08:00
Tejal Desai bb129e9c88 code review comments 2020-01-22 16:27:06 -08:00
Tejal Desai 478205e5ca fix adding symlinks to FS which do not exist 2020-01-22 15:27:01 -08:00
Tejal Desai f1f7297478 fix tests 2020-01-22 11:47:10 -08:00
Tejal Desai da7e9928e4 Fix Symlinks not being copied across stages 2020-01-22 11:47:10 -08:00
Prashant Arya 85f1a5db00 Merge branch 'master' of https://github.com/GoogleContainerTools/kaniko into log 2019-12-19 03:20:56 +00:00
Cole Wippern 697037cbcf Add unit tests for compositecache and stagebuilder
* add mock types for testing
* enhance error messaging
* add tests
2019-11-27 21:47:00 -08:00
Prashant Arya 857715012f changing log level 2019-11-26 17:52:11 +00:00
priyawadhwa 7adf2fcb50
Merge pull request #714 from MJDSys/reproducible_add
Make container layers captured using FS snapshots reproducible
2019-08-19 14:39:18 -07:00
Taylor Barrella 3422d5572a Misc. small changes/refactoring (#712) 2019-07-23 15:10:22 -07:00
Matthew Dawson 619fc5e59b Make container layers captured using FS snapshots reproducible
When a Dockerfile command requires using the TakeSnapshotFS function,
the resulting layer has a random ordering of files.  This causes the
layer to have a non-deterministic hash defeating the reproducible flag.
Issue #710 appears to document this issue as well.

To fix, always sort the list of files to be added in scanFullFilesystem.
This avoids trying to sort the file list during execution, and takes
almost no time to complete.
2019-07-11 21:58:42 -04:00
Dirk Gustke dd9d081447 this is quite spammy in my multistage build (#640)
.. and as i am surely not the only one, move it down to debug.
2019-04-15 13:22:46 -07:00
Daisuke Taniwaki 1bf4421047 Fix parent directory permissions (#619)
* Add parent directories of adding files

* Add integration Dockerfile to test parent directory permissions

* Remove unnecessary helper method

* Use a file on the internet for integration Dockerfile
2019-03-19 12:40:15 -05:00
Jason Hall faadb2af86 Log "Skipping paths under..." to debug (#571)
This reduces noise in the log output, since it isn't terribly useful to most end users.
2019-02-19 13:54:26 -06:00
dlorenc 8d78db4842
Refactor snapshotting (#561) 2019-02-14 12:14:28 -06:00
dlorenc 877abd30ed
Refactor whitelist handling. (#559)
Also speed up stage deletion.
2019-02-13 11:17:56 -06:00
dlorenc 170e0a2d94
Add a lot more timing data. (#518) 2019-01-10 13:27:55 -07:00
dlorenc 9ab66560db
Simplify snapshotting. (#517) 2019-01-09 15:31:02 -07:00
dlorenc a044e2b6e4
Even faster snapshotting with godirwalk. (#504)
This switches from filepath.Walk to godirwalk.Walk for even faster snapshotting.
A quick test shows a 40% improvement on the dockerfile_mv_add build.
2019-01-03 13:10:18 -06:00
Priya Wadhwa 2a359f547c Only return filepath.SkipDir for directories
From the docs on filepath.SkipDir:

> If the function returns SkipDir when invoked on a non-directory file, Walk skips the remaining files in the containing directory

This was causing the bug in #457. Since the file `/etc/hosts` was in the whitelist, when filepath.SkipDir was called the entire etc directory was skipped.

This change only returns filepath.SkipDir on directories.
2018-11-19 15:56:11 -05:00
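
The `filepath.SkipDir` pitfall from 2a359f547c can be reproduced with a small, self-contained sketch. The function names and the temporary tree are assumptions made for illustration; only `filepath.Walk` and `filepath.SkipDir` come from the commit message. The walk returns `SkipDir` only when the ignored entry is a directory, so an ignored file such as `etc/hosts` does not hide its siblings.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// walkSkippingIgnored (hypothetical) lists the files under root, leaving out
// ignored paths. Paths in ignored are slash-separated and relative to root.
func walkSkippingIgnored(root string, ignored map[string]bool) ([]string, error) {
	var kept []string
	err := filepath.Walk(root, func(p string, info os.FileInfo, err error) error {
		if err != nil {
			return err
		}
		rel, err := filepath.Rel(root, p)
		if err != nil {
			return err
		}
		rel = filepath.ToSlash(rel)
		if ignored[rel] {
			if info.IsDir() {
				return filepath.SkipDir // prune the whole ignored directory
			}
			return nil // ignored file: skip it, keep walking its siblings
		}
		if !info.IsDir() {
			kept = append(kept, rel)
		}
		return nil
	})
	return kept, err
}

// demoWalk builds a throwaway tree and walks it with etc/hosts and proc ignored.
func demoWalk() ([]string, error) {
	root, err := os.MkdirTemp("", "snapshot-walk")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(root)
	for _, f := range []string{"etc/hosts", "etc/passwd", "proc/mounts"} {
		full := filepath.Join(root, filepath.FromSlash(f))
		if err := os.MkdirAll(filepath.Dir(full), 0o755); err != nil {
			return nil, err
		}
		if err := os.WriteFile(full, []byte("x"), 0o644); err != nil {
			return nil, err
		}
	}
	return walkSkippingIgnored(root, map[string]bool{"etc/hosts": true, "proc": true})
}

func main() {
	kept, err := demoWalk()
	fmt.Println(kept, err)
}
```

Had the callback returned `SkipDir` for `etc/hosts` as well, `etc/passwd` would have been dropped from the result, which matches the symptom described for #457.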
dlorenc 0c294138b8
Make snapshotting faster by using filepath.SkipDir. (#451)
filepath.Walk has a special error you can return from your walkFn
indicating it should skip directories. This change makes use of that
to skip whitelisted directories.
2018-11-14 17:44:38 -06:00
dlorenc fc43e218f0
Buffer layers to disk as they are created. (#428)
When building Docker images, layers were previously stored in memory.
This caused obvious issues when manipulating large layers, which could
cause Kaniko to crash.
2018-11-06 09:26:54 -06:00
peter-evans b1e28ddb4f Fix handling of volume directive 2018-09-28 11:16:25 +09:00
Priya Wadhwa 13accbaf32 Add Key() to LayeredMap and Snapshotter
This will return a string representation of the current filesystem to be
used with caching.

Whenever a file is explicitly added (via ADD or COPY), it will be stored
in "added" in the LayeredMap. The file will map to a hash created by
CacheHasher (which doesn't take into account mtime, since that will be
different with every build, making the cache useless)

Key() will return a SHA of the added files which will be used in
determining the overall cache key for a command.
2018-09-04 13:42:33 -07:00
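
One way to derive such a key is sketched below. This is an assumption-laden illustration rather than the code from 13accbaf32: `cacheKey` is a hypothetical function, and the map stands in for the "added" entries, with each path mapped to a content hash that ignores mtime. The sketch relies on `encoding/json` emitting map keys in sorted order, so the same set of files always yields the same key.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"encoding/json"
	"fmt"
)

// cacheKey (hypothetical) returns a SHA-256 over a path -> content-hash map.
func cacheKey(added map[string]string) (string, error) {
	// json.Marshal sorts map keys, which makes the encoding deterministic.
	b, err := json.Marshal(added)
	if err != nil {
		return "", err
	}
	sum := sha256.Sum256(b)
	return hex.EncodeToString(sum[:]), nil
}

func main() {
	key, err := cacheKey(map[string]string{
		"/app/main.go": "hash-of-main", // placeholder hashes for illustration
		"/app/go.mod":  "hash-of-mod",
	})
	fmt.Println(key, err)
}
```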
Christie Wilson 7f64037a8c Separate snapshotting of parent dirs from files
To make the logic a bit more clear, when snapshotting files, the
parent dirs are now snapshotted in a different loop from the files we
are actually trying to snapshot. Unfortunately this loop is nearly
duplicated but I did manage to group some of the related logic
together:
- A function to check if the file should be snapshotted (e.g. isn't
whitelisted, etc.)
- Created a `Tar` type to handle some of the logic around tar-ing, e.g.
tracking hardlinks and stat-ing files before adding them

One side effect of this is that now when snapshotting the file system,
files will be stat-ed twice.
2018-08-24 16:34:59 -07:00
Christie Wilson 607af5f7a6 Always snapshot files in COPY and RUN commands
Kaniko uses mtime (as well as file contents and other attributes) to
determine if files have changed. COPY and ADD commands should _always_
update the mtime, because they actually overwrite the files. However it
turns out that the mtime can lag, so kaniko would sometimes add a new
layer when using COPY or ADD on a file, and sometimes would not. This
leads to a non-deterministic number of layers.

To fix this, we have updated the kaniko commands to be more
authoritative in declaring when they have changed a file (e.g. WORKDIR
will now only create the directory when it doesn't exist) and we will
trust those files and _always_ add them, instead of only adding them if
they have changed.

It is possible for RUN commands to also change the filesystem, in which
case kaniko has no choice but to look at the filesystem to determine
what has changed. For this case we have added a call to `sync`; however,
we still cannot guarantee that the mtime will never lag, which would make the
number of layers non-deterministic. However when I tried to cause
this behaviour with the RUN command, I couldn't.

This changes the snapshotting logic a bit; before this change, the last
command of the last stage in a Dockerfile would always scan the whole
file system and ignore the files returned by the kaniko command. Instead
we will now trust those files and assume that the snapshotting
performed by previous commands will be adequate.

Docker itself seems to rely on the storage driver to determine when
files have changed and so doesn't have to deal with these problems
directly.

An alternative implementation would use `inotify` to track which files
have changed. However that would mean watching every file in the
filesystem, and adding new watches as files are added. Not only is there
a limit on the number of files that can be watched, but according to the
man pages a) this can take a significant amount of time b) there is
complication around when events arrive (e.g. by the time they arrive,
the files may have changed) and lastly c) events can be lost, which
would mean we'd run into this non-deterministic behaviour again anyway.

Fixes #251
2018-08-23 18:23:39 -07:00
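
The scan-based change detection that 607af5f7a6 describes for RUN commands can be sketched as a per-file fingerprint. This is a simplified assumption, not kaniko's hasher: `fingerprint` is a hypothetical function that mixes mode, mtime, size and contents into one hash, and a file counts as changed when its fingerprint differs from the one recorded earlier.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"io"
	"os"
	"path/filepath"
)

// fingerprint (hypothetical) hashes a file's mode, mtime, size and contents.
func fingerprint(path string) (string, error) {
	info, err := os.Lstat(path)
	if err != nil {
		return "", err
	}
	h := sha256.New()
	fmt.Fprintf(h, "%s|%d|%d|", info.Mode(), info.ModTime().UnixNano(), info.Size())
	if info.Mode().IsRegular() {
		f, err := os.Open(path)
		if err != nil {
			return "", err
		}
		defer f.Close()
		if _, err := io.Copy(h, f); err != nil {
			return "", err
		}
	}
	return hex.EncodeToString(h.Sum(nil)), nil
}

// rewriteChangesFingerprint writes a temp file twice with different contents
// and reports whether the fingerprint changed.
func rewriteChangesFingerprint() (bool, error) {
	dir, err := os.MkdirTemp("", "snapshot-fp")
	if err != nil {
		return false, err
	}
	defer os.RemoveAll(dir)
	p := filepath.Join(dir, "file")
	if err := os.WriteFile(p, []byte("v1"), 0o644); err != nil {
		return false, err
	}
	before, err := fingerprint(p)
	if err != nil {
		return false, err
	}
	if err := os.WriteFile(p, []byte("v2 with more bytes"), 0o644); err != nil {
		return false, err
	}
	after, err := fingerprint(p)
	if err != nil {
		return false, err
	}
	return before != after, nil
}

func main() {
	changed, err := rewriteChangesFingerprint()
	fmt.Println(changed, err)
}
```

A scan like this only notices a rewrite whose contents and size are identical if the mtime has moved, which is why the commit has COPY and ADD report their files explicitly instead of relying on the scan.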
Priya Wadhwa d8ae5618af Get absolute path of file before checking whitelist
Issue 291 pointed out that symlink "../proc/self/mounts" in the fedora image wasn't being extracted properly and kaniko was erroring out.
This is because the file path wasn't absolute so kaniko wasn't recognizing it as a whitelisted path.
With this change, we first resolve a path to its absolute path before checking the whitelist.
2018-08-17 18:29:11 -04:00
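
The idea behind d8ae5618af can be sketched as: resolve a relative link target against the link's own directory, then compare the result with the whitelist. The helpers `resolveTarget` and `isIgnored` and the link location `/etc/mtab` are assumptions for illustration; the commit itself only names the target `../proc/self/mounts`.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// resolveTarget (hypothetical) turns a symlink target into an absolute,
// cleaned path, interpreting relative targets against the link's directory.
func resolveTarget(link, target string) string {
	if path.IsAbs(target) {
		return path.Clean(target)
	}
	return path.Join(path.Dir(link), target)
}

// isIgnored (hypothetical) reports whether p equals or lies under any
// directory in ignoreList.
func isIgnored(p string, ignoreList []string) bool {
	for _, dir := range ignoreList {
		if p == dir || strings.HasPrefix(p, strings.TrimSuffix(dir, "/")+"/") {
			return true
		}
	}
	return false
}

func main() {
	ignoreList := []string{"/proc", "/sys"}
	raw := "../proc/self/mounts"
	abs := resolveTarget("/etc/mtab", raw) // link location is an assumption

	fmt.Println(isIgnored(raw, ignoreList)) // false: the relative form never matches
	fmt.Println(abs, isIgnored(abs, ignoreList)) // /proc/self/mounts true
}
```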
priyawadhwa 52e9863810
fix add command bug when adding remote URLs (#277) 2018-08-07 17:10:27 -07:00
priyawadhwa 71c83e369c
Only add whiteout files once (#270)
* Only add whiteout files once

* Updated vars
2018-08-01 17:27:20 -07:00
priyawadhwa 31b7cd3732
Fix bug in copy command by refactoring whitelist checks (#231)
* Fixed bug

* WIP

* fix unit tests
2018-07-10 08:23:35 -07:00
Priya Wadhwa 7fbc21ec73
Merged master, fixed merge conflict 2018-05-07 09:14:17 -07:00
dlorenc cd5b744904
Switch from containers/image to go-containerregistry (#140)
* Vendor changes for go-containerregistry switch.

* Manual changes for go-containerregistry switch.

The biggest change is refactoring the tarball unpacking.

* Pull more of container-diff out.

* More vendor removals.

* More unit tests.
2018-04-25 19:21:05 -07:00
Priya Wadhwa a211c1ec71
Make sure to snapshot parent directories of specific files for add/copy 2018-04-24 16:22:37 -07:00
dlorenc 844d9ef0d9
Add whiteout handling by switching to a two-phase approach. (#139)
* Add whiteout handling by switching to a two-phase approach.

Also only handle hardlinks within one layer

* Simplify the run test.
2018-04-23 12:50:21 -07:00
Matt Rickard cff201dee6 org rename from GoogleCloudPlatform to GoogleContainerTools 2018-04-17 11:45:39 -07:00
priyawadhwa 0ddc2115a5
Merge pull request #78 from priyawadhwa/trigger
kaniko build trigger
2018-04-16 10:21:21 -07:00
priyawadhwa cebb4031b3 copy symlinks (#90) 2018-04-14 08:00:20 -07:00
Priya Wadhwa ec510a161b
change imports from k8s-container-builder to kaniko 2018-04-12 15:35:54 -07:00
Priya Wadhwa 954b1382d2
change k8s to kaniko 2018-04-12 15:30:32 -07:00
Priya Wadhwa 50ef6fe9c1
Build trigger for building kaniko executor image 2018-04-12 15:25:40 -07:00