CacheCommand returns true if the command should be cached. Currently,
it's only true for RUN but can be added to ADD/COPY later on (these are
different since the contents of files for ADD/COPY need to be included
in the cache key generation).
I also changed CreatedBy to String so that we can log each command
before cache extraction or regular execution takes place.
I changed UnpackLocalTarArchive to return a list of files that were
extracted, so that the list of snapshotted files for ADD is more
accurate. Previously, we used to add all files in the extracted dir to
be snapshotted, but this could result in preexisting files being
snapshotted again.
Before #289 was merged, when copying over directories for COPY kaniko
would get a list of all files at the destination specified and add them
to the list of files to be snapshotted. If the destination was root it
would add all files. This worked because the snapshotter made sure the
file had been changed before adding it to the layer.
After #289, we changed the logic to add all files snapshotted to a layer
without checking if the files had been changed. This created the bug in
got all the files at root and added them to the layer without checking
if they had been changed.
This change should fix this bug. Now, the CopyDir function returns a
list of files it copied over and only those files are added to the list
of files to be snapshotted.
Should fix#314
Kaniko uses mtime (as well as file contents and other attributes) to
determine if files have changed. COPY and ADD commands should _always_
update the mtime, because they actually overwrite the files. However it
turns out that the mtime can lag, so kaniko would sometimes add a new
layer when using COPY or ADD on a file, and sometimes would not. This
leads to a non-deterministic number of layers.
To fix this, we have updated the kaniko commands to be more
authoritative in declaring when they have changed a file (e.g. WORKDIR
will now only create the directory when it doesn't exist) and we will
trust those files and _always_ add them, instead of only adding them if
they haven't changed.
It is possible for RUN commands to also change the filesystem, in which
case kaniko has no choice but to look at the filesystem to determine
what has changed. For this case we have added a call to `sync` however
we still cannot guarantee that sometimes the mtime will not lag, causing the
number of layers to be non-deterministic. However when I tried to cause
this behaviour with the RUN command, I couldn't.
This changes the snapshotting logic a bit; before this change, the last
command of the last stage in a Dockerfile would always scan the whole
file system and ignore the files returned by the kaniko command. Instead
we will now trust those files and assume that the snapshotting
performed by previous commands will be adequate.
Docker itself seems to rely on the storage driver to determine when
files have changed and so doesn't have to deal with these problems
directly.
An alternative implementation would use `inotify` to track which files
have changed. However that would mean watching every file in the
filesystem, and adding new watches as files are added. Not only is there
a limit on the number of files that can be watched, but according to the
man pages a) this can take a significant amount of time b) there is
complication around when events arrive (e.g. by the time they arrive,
the files may have changed) and lastly c) events can be lost, which
would mean we'd run into this non-deterministic behaviour again anyway.
Fixes#251
* adding metadata tests back to integration tests and fixing resulting bugs
* fix onbuild and default env
* removing old test files
* adding the ArgsEscaped boolean on CMD commands
* fix onbuild test
* ignore failing test until container-diff is fixed
* code comments
* adding todo to remove uncomment failing test
* Vendor changes for go-containerregistry switch.
* Manual changes for go-containerregistry switch.
The biggest change is refactoring the tarball unpacking.
* Pull more of container-diff out.
* More vendor removals.
* More unit tests.
* adding VOLUME command
* proper test project
* general fixes
* fixing project name
* fixing volume unit test
* fixing integration test
* adding tests
* adding util test
* fixing test
* actually create the volume mounted directory
* fix test