The zero value of time.Time is not a valid argument to os.Chtimes
because of the syscall that os.Chtimes calls. Instead we update the zero
value of time.Time to the zero value of Unix Epoch
When using cache, the rootfs may not have been extracted. This prevents uname/gname from resolving
as there is no /etc/password or /etc/group. This makes this layer unnecessarily differ from
a cached layer which does contain this information. Omitting these should be consistent with Docker's
behavior.
filesToAdd is sorted in TakeSnapshotFS, but not here. This makes ordering unpredictable within the layer's tarball,
causing the SHA to differ even if layer contents haven't changed
Add a new option additonal-whitelist which defaults
to a single entry, "/var/run". This will allow users to
remove "/var/run" from the whitelist or retain the current
behavior with no change.
Add unit tests that check the behaviour of CompositeCache
on adding filesytem resources. It checks that
* 2 identical directory trees produces the same hash
* an extra file produces a different hash
* an extra directry produces a different hash
* an extra file that is excluded does not alter the hash
* an extra directory that is excluded does not alter the hash
Previously kaniko would compute the cache key for any copy command by computing
the combined hash of all files in a directory, even if they were listed
as ignored.
With this change, the cache key creation was updated to be aware of ignored
files.
Related issues:
* https://github.com/GoogleContainerTools/kaniko/issues/594
update stage code so that when comparing the BaseName of
a stage against the recorded, lowercase version of a Stage name
the BaseName is also lowercased.
Certain utilities like Apt depend on modtime
for certain files. Kaniko was not setting modtime when
extracting files and so this broke things like apt.
Kaniko now sets the file mod time to the value from the tar
header.
I needed this for my arm64 k8s cluster. I have zero Go experience but
enough experience with other things to fix the rebase (I think!). This
patch is working fine on my cluster.
Update caching run and copy commands to use the new
GetFSFromLayers method and include the whiteout option so that
whiteout files are extracted and included in extractedFiles
* add util.GetFSFromLayers
* GetFSFromImage delegates to GetFSFromLayers
* add FSOpts and FSConfig for GetFSFromLayers
* add tests for GetFSFromLayers
* add gomock for test support
* add mock_v1 for layers
ArgsEscaped according to Docker docs should only be set in Windows
environments: https://docs.docker.com/engine/api/v1.30/
It was causing integration test to fail with following message:
```
FAIL: TestRun/test_Dockerfile_test_metadata (8.48s)
"Diff": {
"Adds": [
"ArgsEscaped: true"
],
"Dels": [
"ArgsEscaped: false"
]
```
However docker 18.xx returns ArgsEscaped: true
whereas docker 19.xx returns ArgsEscaped: false
Hence this patch also adds the docker version check to the integration
to ignore ArgsEscaped being different when 18.xx is used.
* Update cached copy command to return the same result for
files used from context so that cached and uncached copy
commands produce the same cache key
* Update tests for fix
* Add test for cached run command key consistency
* use the cachekey of the src stage rather than the digest
for COPY --from commands as they are reproducible unlike digests
* track digest to cache keys and stage indexes to digest
* add extra debug logging for troubleshooting cachekey building issues
* convert Sha256 hashes to hex encoded strings rather than plain strings
for easier human reading
When using the COPY command, if the source and destination have the same
the file should be skipped rather than copied. This is to prevent the
file from being overwritten and therefore producing an empty file.
fixes#904
The third library moby/buildkit lowers the image alias used in 'FROM .. AS' instruction.
This change fixes this issue by making the resolve of dependencies agnostic to case.
Fixes#592Fixes#770
Store the last cachekey generated for each stage
If the base image for a stage is present in the map of digest
and cachekeys use the retrieved cachekey instead of the base image
digest in the compositecache
Previously we returned a low level file system error when checking for
a cached image. By adding a more human friendly log message and explicit
error handling we improve upon the user experience.
Fixed#296.
The output manifests may have `application/vnd.docker.distribution.manifest.v2+json`
as their media types instead of `application/vnd.oci.image.manifest.v1+json`.
This changes allow to use kaniko-warmer multiple times without unnecessary docker image downloads.
To check image presence in cache directory I'm using existing cache function that is used by kaniko-executor.
I've considered building separate function to only check image presence, but it will have pretty much the same code.
Questionable decision is to embed CacheOptions type to KanikoOptions and WarmerOptions. Probably this should be resolved by creating interface providing needed options and implement it both mentioned structs. But I've struggled to get a meaningfull name to it.
To replicate previous behaviour of downloading regardless of cache state I've added --force(-f) option.
This changes provides crucial speed-up when downloading images from remote registry is slow.
Closes#722
When a Dockerfile command requires using the TakeSnapshotFS function,
the resulting layer has a random ordering of files. This causes the
layer to have a non-deterministic hash defeating the reproducible flag.
Issue #710 appears to document this issue as well.
To fix, always sort the list of files to be added in scanFullFilesystem.
This avoids trying to sort the file list during execution, and takes
almost no time to complete.
Resolves#607
* Deleted a duplicate Gopkg.lock block for github.com/otiai10/copy to
prevent `dep ensure` from deleting it from vendor/
* Searched for breaking changes. Only found ones for
remote.Delete/List/Write/WriteIndex. Searched for those and fixed
* Noticed that NewInsecureRegistry was deprecated and replaced it
* Revert "Change cache key calculation to be more reproducible. (#525)"
This reverts commit 1ffae47fdd.
* Add logging of composition key back
* Do not include build args in cache key
This should be save, given that the commands will have the args included
when the cache key gets built.
This flag, when set, takes a file in the container and writes the image digest to it. This can be used to extract the exact digest of the built image by surrounding tooling without having to parse the logs from Kaniko, for example by pointing the file to a mounted volume or to a file used durint exit status, such as with Kubernetes' [Termination message policy](https://kubernetes.io/docs/tasks/debug-application-cluster/determine-reason-pod-failure/)]
When the flag is not set, the digest is not written to file and the executor behaves as before. The digest is also written to file in case of a tarball or a `--no-push`.
Closes#654
* Add parent directories of adding files
* Add integration Dockerfile to test parent directory permissions
* Remove unnecessary helper method
* Use a file on the internet for integration Dockerfile
This change calculates the exact files and directories needed between
stages used in the COPY command. Instead of saving the entire
stage as a tarball, we now save only the necessary files.
- We were validating usernames/groupnames existed in etc/passwd. Docker does not do this
- We were incorrectly caching USER commands. This was fixed automatically by fixing the first part.
Calculating a manifest from a v1.tarball is very expensive. We can
store those locally as well, and use them if they exist.
This should eventually be replaced with oci layout support once that exists
in ggcr.
and our snapshot optimizations.
If a previous base image has a volume, the directory is added to the
list of files to snapshot. That directory may not actually exist in the image.
* Set TarPath to empty when pushing a layer
* Fix issues with layer caching, noPush and tarPath.
- Layer caching should work even when tarPath is specified, so this
commit changes the value of tarPath to empty when caching layers.
- When an image is built with just the tarPath and noPush
is true, we should still create the tarBall (which wasn't happening
before this commit).
* Set no-push to false for cache layers
* Remove extra log
* go-imports fix
We previously had an optimization that would skip snapshotting mutli-stage images
when in an intermediate stage, until the very end.
This conflicted with another optimization to avoid snapshotting when no files had changed.
Before we were using the full image digest, but that contains a timestamp. Now
we only use the layers themselves and the image config (env vars, etc.).
Also fix a bug in unpacking the layers themselves. mtimes can change during unpacking,
so set them all once at the end.
Right now when we find a v1.Tarball in the local disk cache, we
recompute the digest. This is very expensive and redundant, because
we store tarballs by their digest and use that as a key to look them up.
* Adds COPY --from=previous stage name/number validation
This fixes an issue in which COPY --from=previous-stage-name would try to download docker image previous-stage-name instead of checking that previous-stage-name could be a named stage.
* Fix linting issues
goimports is implemented as 'gofmt + extras', so this should fix import warnings as well.
* Fix linting issues
Fixes linting issues introduced in the merge
* Fix linting issues.
This PR adds support for the dockerignore file. Previously when kaniko
had support for the dockerignore file, kaniko first went through the
build context and deleted files that were meant to be ignored. This
resulted in a really bad bug where files in user mounted volumes would
be deleted (my bad).
This time around, instead of modifying the build context at all, kaniko
will check if a file should be excluded when executing ADD/COPY
commands. If a file should be excluded (based on the .dockerignore) it
won't be copied over from the buildcontext and shouldn't end up in the
final image.
I also added a .dockerignore file and Dockerfile as an integration test,
which should fail if the dockerignore is not being processed correctly or if files aren't being excluded correctly.
Also, I removed all the integration testing from the previous version of the
dockerignore support.
Right now kaniko only supports COPY --from=<another stage>.
This commit adds support for the case where the referenced image is a remote image
in a registry that has not been used as a stage yet in the build.
* adding benchmarking code
* enable writing to file
* fix build
* time more stuff
* adding benchmarking to integration tests
* compare docker and kaniko times in integration tests
* Switch to setting benchmark file with an env var
* close file at the right time
* fix integration test with environment variables
* fix integration tests
* Adding benchmarking documentation to DEVELOPEMENT.md
* human readable benchmarking steps
From the docs on filepath.SkipDir:
> If the function returns SkipDir when invoked on a non-directory file, Walk skips the remaining files in the containing directory
This was causing the bug in #457. Since the file `/etc/hosts` was in the whitelist, when filepath.SkipDir was called the entire etc directory was skipped.
This change only returns filepath.SkipDir on directories.
Now that hardlink destinations take into account the directory that they are being extracted to, the unit test had to be updated to make sure that two hardlinks were extracted to /tmp/hardlink correctly.
When we execute multistage builds, we store the fs of each intermediate
stage at /kaniko/<stage number> if it's used later in the build. This
created a bug when extracting hardlinks, because we weren't appending
the new directory to the link path.
So, if `/tmp/file1` and `/tmp/file2` were hardlinked, kaniko was trying
to link `/kaniko/0/tmp/file1` to `/tmp/file2` instead of
`/kaniko/0/tmp/file2`. This change will append the correct directory to
the link, and fixes#437#362#352#342.
filepath.Walk has a special error you can return from your walkFn
indicating it should skip directories. This change makes use of that
to skip whitelisted directories.
This change only uploads layers that were created from cache misses on RUN commands.
It also improves the cache-checking logic to handle this case.
Finally, it makes cache layer uploads happen in parallel with the rest of the build, logging
a warning if any fail.
This is the final part of an optimization that I've been refactoring towards for awhile.
If the Dockerfile consists of no RUN commands, or cached RUN commands, followed by metadata-only
operations, we can skip downloading and unpacking the base image.
This change fixes that by properly "replaying" the Dockerfile and mutating the config when
calculating cache keys. Previously we were looking at the wrong cache key for each command
when there was more than one.
* parse arg commands at the top of dockerfiles
* fix pointer reference bug and remove debugging
* fixing tests
* account for meta args with no value
* don't take fs snapshot if / is the only changed path
* move metaArgs inside KanikoStage
* removing unused property
* check for any directory instead of just /
* remove unnecessary check
When building Docker images, layers were previously stored in memory.
This caused obvious issues when manipulating large layers, which could
cause Kaniko to crash.
I improved handling of the .dockerignore file by:
1. Using docker's parser to parse the .dockerignore and using their
helper functions to determine if a file should be deleted
2. Copying the Dockerfile we are building to /kaniko/Dockerfile so that
if the Dockerfile is specified in .dockerignore it won't be deleted, and
if it is specified in the .dockerignore it won't end up in the final
image
3. I also improved the integration test to create a temp directory with
files to ignore, and updated the .dockerignore to include exclusions (!)
Issue #410 experienced an error with base image caching where they were
"Not Authorized" to get information for a remote image, but later were
able to download and extract the base image.
To fix this, we can switch to using the remoteImage function for getting
information about the digest, which is the same function used for
downloading base images. This way we can also take advantage of the
--insecure and --skip-tls-verify flags if users pass those in when
trying to get digests for the cache as well.
Added a --ignore flag to ignore packages and files in the build context.
This should mimic the .dockerignore file. Before starting the build, we
go through and delete ignored files from the build context.
* comments
* initial commit for persisent volume caching
* cache warmer works
* general cleanup
* adding some debugging
* adding missing files
* Fixing up cache retrieval and cleanup
* fix tests
* removing auth since we only cache public images
* simplifying the caching logic
* fixing logic
* adding volume cache to integration tests. remove auth from cache warmer image.
* add building warmer to integration-test
* move sample yaml files to examples dir
* small test fix
This change refactors the build loop a bit to make cache optimization easier in the future. Some notable changes:
The special casing around volume snapshots is removed. Every volume is added to the snapshotFiles list for every command that will snapshot anyway.
Snapshot saving was extracted to a sub-function
The decision on whether or not to snapshot was extracted
* Rework cache key generation a bit.
Cache keys are now based on the previous commands, rather than the previous state
of the filesystem.
* Refactor command interface a bit, only cache the context for commands that use it.
Currently, kaniko can only build a single image per container run, because the filesystem is full of the content of the first image.
When running kaniko in Jenkins, where we need to start the container "doing nothing" first (using the debug kaniko container), and then exec /kaniko/executor, this is a limitation because it means that if we want to build multiple images, we need to start multiple containers - see https://groups.google.com/forum/#!topic/kaniko-users/_7LivHdMdy0 for more details
A solution to fix this issue is to add a new flag to cleanup the filesystem at the end - the same way it is done between stages when building a multi-stages image. This way, the same (debug) container can be used to build multiple images.
* set default HOME env properly
* set HOME to / if user is set by uid
* fix test
* continue to skip user_run test
* fix unit test to match new functionality
* Enable overwriting of links (solves #351)
* add integration test to check extraction of images with replaced hardlinks
* Prevent following symlinks during extracting normal files
This fixes#359, #361, #362.
I changed RunE to Run so that usage wouldn't show upon error. Usage will
still show if PersistentPreRunE fails, which makes sense since those
functions check to make sure arguments passed in are valid.
Also changed logging of multi arg flags to Debugf so that output would
be cleaner.
To add layer caching to kaniko, I added two flags: --cache and
--use-cache.
If --use-cache is set, then the cache will be used, and if --cache is
specified then that repo will be used to store cached layers. If --cache
isn't set, a cache will be inferred from the destination provided.
Currently, caching only works for RUN commands. Before executing the
command, kaniko checks if the cached layer exists. If it does, it pulls
it and extracts it. It then adds those files to the snapshotter and
append a layer to the config history. If the cached layer does not exist, kaniko executes the command and
pushes the newly created layer to the cache.
All cached layers are tagged with a stable key, which is built based off
of:
1. The base image digest
2. The current state of the filesystem
3. The current command being run
4. The current config file (to account for metadata changes)
I also added two integration tests to make sure caching works
1. Dockerfile_test_cache runs 'date', which should be exactly the same
the second time the image is built
2. Dockerfile_test_cache_install makes sure apt-get install can be
reproduced
gometalinter is broken @ HEAD, and I looked into why that was. During
that process, I remembered that we took the linting scripts from
skaffold, and found that in skaffold gometalinter was replaced with
GolangCI-Lint:
https://github.com/GoogleContainerTools/skaffold/pull/619
The change made linting in skaffold faster, so I figured instead of
fixing gometalinter it made more sense to remove it and replace it with
GolangCI-Lint for kaniko as well.
As mentioned in #346, if only ENTRYPOINT is set in a stage then any
CMD inherited from a parent should be cleared.
If both entrypoint and cmd are set then nothing should change.
I added a function and unit test to review the config file after building a stage
which clears out config.Cmd if ENTRYPOINT was declared but CMD wasn't.
I also added an integration test to make sure this works, which should
be tested by the preexisting container-diff --metadata test.
While looking into #345, we were seeing the error:
Error: error building image: chmod /etc/mtab: operation not permitted
during extraction of `amazonlinux:1`. I looked into why kaniko couldn't
extract this file properly, and found that it already existed as a
symlink pointing to /proc/mounts, which returned an error when we tried
to run chmod on it.
Confusingly, in the image the /etc/mtab is a regular file, not a
symlink.
I can think of two ways to solve this problem:
1. Whitelist /etc/mtab so that whatever already exists in the system
is used
2. Check if a regular file already exists, and hasn't been extracted yet,
before extracting
I went with option 1 because for option 2 we'd have to keep a list of
all files that had been extracted in memory.
Refactoring builds by stage will make it easier to generate cache keys
for layers, since the stageBuilder type will contain everything required
to generate the key:
1. Base image with digest
2. Config file
3. Snapshotter (which will provide a key for the filesystem)
4. The current command (which will be passed in)
This will return a string representaiton of the current filesystem to be
used with caching.
Whenever a file is explictly added (via ADD or COPY), it will be stored
in "added" in the LayeredMap. The file will map to a hash created by
CacheHasher (which doesn't take into account mtime, since that will be
different with every build, making the cache useless)
Key() will returns a sha of the added files which will be used in
determining the overall cache key for a command.
CacheCommand returns true if the command should be cached. Currently,
it's only true for RUN but can be added to ADD/COPY later on (these are
different since the contents of files for ADD/COPY need to be included
in the cache key generation).
I also changed CreatedBy to String so that we can log each command
before cache extraction or regular execution takes place.
The bug in #329 occurred because of a bug in matchSources, where the
filepath wasn't absolute, so the source "/kaniko-bug/*" wasn't being
matched to the file "kaniko-bug/test-file"
To fix this, I added logic for making filepaths absolute and added to
the unit test for the function to test that it works.