Add localdir buildcontext to kaniko builder #983
dgageot merged 7 commits into GoogleContainerTools:master from
Conversation
The localdir buildcontext creates a configMap of all sources and mounts it into the kaniko pod. It points the kaniko context to the mounted directory.
pkg/skaffold/build/kaniko/run.go
Outdated
for _, path := range paths {
	files = append(files, []string{"--from-file", path}...)
}
cmd := exec.Command("kubectl", append([]string{"create", "configmap", configMapName(tag), "-n", ns}, files...)...)
I'm not super familiar with this strategy, but this just stores the actual files in configmaps and mounts them? AFAIK the size limit for configmaps is the size limit for keys in etcd, which is 1.5 MB by default.
Have you tried this with larger repositories?
I looked into it and you're right, this probably won't work with larger repositories.
What if we created a tarball of the files and mounted that in via configMap? Would require adding a --context=tar:// option to kaniko but that should be pretty easy.
Is that a per-file limit or a limit on the overall size of the repo?
I think it's the overall size
What if we use Persistent Volumes? It would be a bit of a pain to maintain them - we'd have to take care of creating and cleaning up the PVs, and potentially figure out how to interact across namespaces as well, but it would free users from having to depend on any external storage solutions. If we figure this out, this might be useful for BuildCRD too.
WDYT of using an emptyDir to hold the build context? We could have an init container that runs while we kubectl cp the buildcontext into the emptyDir volume, and then mount that into the kaniko container.
I think the emptyDir can use up to however much memory is allocated to the cluster.
It would allow for easier cleanup than a PV/PVC, although it would be slower since we have to copy the entire buildcontext over for every dev loop.
It could be a good place to start, and if it isn't working for people we can switch to the PV/PVC?
That's assuming that the entire configmap fits in a 1.5 MB key.
If it doesn't:
- If you don't shard, then a very small number of repos are going to be able to fit in 1.5 MB.
- If you shard, you end up with a key per 1.5 MB, which for the skaffold repo would be 110 keys (164 MB / 1.5 MB). The storage limit for etcd v3 is 2 GB by default but it supports up to 8 GB. I'm not sure what the performance would be on that, but using 8% of your etcd for a build context could hit a performance limit pretty quickly on a shared cluster.
I like @priyawadhwa's idea
- use the skaffold docker context generation to create a tarball as usual
- run the kaniko pod with an init container that waits for a special file to be written to signify completion
- kubectl cp the tarball over to the emptyDir
- kubectl cp the special completion file
- init container completes
- kaniko runs
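The flow above could look roughly like the following pod spec. This is a hypothetical sketch, not the merged implementation: the init container name, mount paths, and marker-file location are illustrative, and it assumes skaffold runs `kubectl cp` to place `context.tar.gz` and then the `/tmp/complete` marker into the init container before the marker check passes.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: kaniko
spec:
  initContainers:
  - name: kaniko-init
    image: alpine
    # Block until skaffold has copied the build context and written the
    # completion marker, then unpack the tarball into the shared volume.
    command:
    - sh
    - -c
    - until [ -f /tmp/complete ]; do sleep 1; done && tar -xzf /tmp/context.tar.gz -C /kaniko/buildcontext
    volumeMounts:
    - name: buildcontext
      mountPath: /kaniko/buildcontext
  containers:
  - name: kaniko
    image: gcr.io/kaniko-project/executor
    args: ["--context=dir:///kaniko/buildcontext"]
    volumeMounts:
    - name: buildcontext
      mountPath: /kaniko/buildcontext
  volumes:
  - name: buildcontext
    emptyDir: {}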
On the other hand, if we reuse a PV to store the build context, we can do incremental build context uploads, which will be very fast.
Cool, I think we might have to extract the tarball within the init container as well, since that's what kaniko expects, but that shouldn't be difficult (or we could add a tar:// build context to kaniko, I'm good with that as well).
@balopat are you good with trying this out?
yeah, if 1.5MB is not good for most of the repos, then configmap doesn't work.
+100 on @priyawadhwa's idea - and switch over to PVs as a follow-up feature for incremental builds later down the line if there's enough usage to warrant it
pkg/skaffold/build/kaniko/run.go
Outdated
}}

context := ""
if cfg.BuildContext.GCSBucket != "" {
I would change this to a switch statement, and maybe break out the implementations into their own functions
The function could be something like func(ctx, artifact, cfg.BuildContext, cfg.Namespace, initialTag) (context, func())
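A minimal sketch of what that switch-based refactor might look like. All names here (`KanikoBuildContext`, `buildContext`, the `dir://` mount path) are illustrative assumptions for the discussion, not the actual skaffold code; the GCS branch elides the real upload, and the returned func stands in for cleanup.

```go
package main

import (
	"errors"
	"fmt"
)

// KanikoBuildContext mirrors the config struct under discussion.
type KanikoBuildContext struct {
	GCSBucket string
	LocalDir  bool
}

// buildContext picks a strategy per context type and returns the kaniko
// --context value plus a cleanup func, as suggested in the review.
func buildContext(cfg KanikoBuildContext, initialTag string) (string, func(), error) {
	switch {
	case cfg.GCSBucket != "":
		tarName := fmt.Sprintf("context-%s.tar.gz", initialTag)
		// the real code would upload the tarball here; cleanup would delete it
		return fmt.Sprintf("gs://%s/%s", cfg.GCSBucket, tarName), func() {}, nil
	case cfg.LocalDir:
		// sources get copied into the pod; context points at the mounted dir
		return "dir:///kaniko/buildcontext", func() {}, nil
	default:
		return "", nil, errors.New("no kaniko build context specified")
	}
}

func main() {
	ctx, cleanup, err := buildContext(KanikoBuildContext{GCSBucket: "my-bucket"}, "v1")
	if err != nil {
		panic(err)
	}
	defer cleanup()
	fmt.Println(ctx) // gs://my-bucket/context-v1.tar.gz
}
```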
examples/kaniko-local/skaffold.yaml
Outdated
- imageName: gcr.io/k8s-skaffold/skaffold-example
  kaniko:
    buildContext:
      localDir: true
maybe this should just be a struct? localDir: {}?
specifying localDir: false doesn't really make sense
+1, if you make these pointers to structs in the config, the oneOf yamltag will do implicit validation for you.
// a kaniko build context
type KanikoBuildContext struct {
	GCSBucket string `yaml:"gcsBucket,omitempty" yamltags:"oneOf=buildContext"`
	LocalDir  bool   `yaml:"localDir,omitempty" yamltags:"oneOf=buildContext"`
I don't think using the oneOf tag with different types will work. IIRC the type needs to be a shared pointer type; I don't think they can be primitives even if they're the same, since the "nil" value for e.g. a string isn't actually nil, it's "".
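The point about pointers can be shown with a small sketch: a `*struct{}` field is nil when omitted from the YAML, so "was it set?" is unambiguous, whereas a bool or string zero value is indistinguishable from "not set". The struct and counting helper here are illustrative, not skaffold's actual validation code.

```go
package main

import "fmt"

// KanikoBuildContext with LocalDir as a pointer, per the review suggestion.
type KanikoBuildContext struct {
	GCSBucket string    `yaml:"gcsBucket,omitempty"`
	LocalDir  *struct{} `yaml:"localDir,omitempty"`
}

// fieldsSet counts how many oneOf fields were provided; a oneOf validator
// would reject anything other than exactly 1.
func fieldsSet(c KanikoBuildContext) int {
	n := 0
	if c.GCSBucket != "" {
		n++
	}
	if c.LocalDir != nil { // nil means "not present in the YAML"
		n++
	}
	return n
}

func main() {
	fmt.Println(fieldsSet(KanikoBuildContext{LocalDir: &struct{}{}})) // 1
	fmt.Println(fieldsSet(KanikoBuildContext{}))                      // 0
}
```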
Any update on this?

Hey @shashankkoppar we've been discussing an alternate solution in the comments on this PR, I should have something ready for review in the next day or two.

@priyawadhwa thanks, currently working on making kaniko build and push part of our CI/CD, so this will really help. For now, we have been using a Jenkins PodTemplate to spin up the kaniko pod.
I created a BuildContextSource interface to interact with setup and cleanup for kaniko build contexts. I transferred the code for GCS buckets there, and added the LocalDir build context. This should hopefully make it easier to add more build contexts (such as S3 buckets) in the future. I also moved the integration test from examples into integration/examples.
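The `BuildContextSource` idea described above can be sketched as an interface with one implementation per context type. The method names and signatures here are assumptions for illustration (the merged skaffold API differs, e.g. it takes the artifact and config); the bodies elide the real upload and `kubectl cp` work.

```go
package main

import "fmt"

// BuildContextSource abstracts setup and cleanup of a kaniko build context,
// so new sources (e.g. S3 buckets) can be added without touching the runner.
type BuildContextSource interface {
	// Setup makes the context available and returns the kaniko --context value.
	Setup(initialTag string) (string, error)
	// Cleanup removes whatever Setup created.
	Cleanup() error
}

// GCSBucket uploads a tarball of the context to a GCS bucket.
type GCSBucket struct{ Bucket, tarName string }

func (g *GCSBucket) Setup(initialTag string) (string, error) {
	g.tarName = fmt.Sprintf("context-%s.tar.gz", initialTag)
	// real code uploads the tarball to the bucket here
	return fmt.Sprintf("gs://%s/%s", g.Bucket, g.tarName), nil
}
func (g *GCSBucket) Cleanup() error { return nil } // would delete the object

// LocalDir copies sources into the kaniko pod's shared volume.
type LocalDir struct{}

func (l *LocalDir) Setup(initialTag string) (string, error) {
	// real code kubectl-cp's the context into the pod's emptyDir
	return "dir:///kaniko/buildcontext", nil
}
func (l *LocalDir) Cleanup() error { return nil }

func main() {
	var src BuildContextSource = &GCSBucket{Bucket: "my-bucket"}
	ctx, _ := src.Setup("v1")
	fmt.Println(ctx) // gs://my-bucket/context-v1.tar.gz
}
```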
Codecov Report
@@            Coverage Diff             @@
##           master     #983      +/-   ##
==========================================
- Coverage   42.96%   42.81%   -0.15%
==========================================
  Files          74       74
  Lines        3403     3419      +16
==========================================
+ Hits         1462     1464       +2
- Misses       1801     1815      +14
  Partials      140      140
Continue to review full report at Codecov.
r2d4 left a comment:
Nice! A few comments. I'll give it a go tomorrow. Looks good though.
// Setup uploads the context to the provided GCS bucket
func (g *GCSBucket) Setup(ctx context.Context, artifact *latest.Artifact, cfg *latest.KanikoBuild, initialTag string) (string, error) {
	g.tarName = fmt.Sprintf("context-%s.tar.gz", initialTag)
	if err := docker.UploadContextToGCS(ctx, artifact.Workspace, artifact.DockerArtifact, cfg.BuildContext.GCSBucket, g.tarName); err != nil {
What if these returned an io.WriteCloser and the actual context generation and writing happened outside each source's setup?
}

// waitForInitContainer tries to exec into the init container until it doesn't fail
func waitForInitContainer(p *v1.Pod) error {
Query the apiserver for init container status instead?
There are some examples of pod watchers in the code.
We should remove all the PollImmediates when we get a chance. I added those a long time ago in minikube before I did it right.
yah that's way better, added it in, thanks!
// Generate the init container, which will run until the /tmp/complete file is created
ic := v1.Container{
	Name:  initContainer,
	Image: constants.DefaultDebianImage,
Can we use something smaller like alpine? Debian seems a bit heavy. The only requirement is the tar binary, right?
pkg/skaffold/schema/latest/config.go
Outdated
type KanikoBuildContext struct {
	GCSBucket string    `yaml:"gcsBucket,omitempty" yamltags:"oneOf=buildContext"`
	LocalDir  *struct{} `yaml:"localDir,omitempty" yamltags:"oneOf=buildContext"`
Might as well make this its own type
Query apiserver while waiting for init container to start; use alpine image since it's more lightweight than debian.
@priyawadhwa can you please rebase? We should get this in soon then. Thank you!

Done!

Thanks @dgageot !
The localdir buildcontext creates a configMap of all sources and mounts it into the kaniko pod. It points the kaniko context to the mounted directory.
Begins to fix #950