Add localdir buildcontext to kaniko builder #983

Merged
dgageot merged 7 commits into GoogleContainerTools:master from priyawadhwa:localdir
Oct 10, 2018

Conversation

@priyawadhwa
Contributor

The localdir buildcontext creates a configMap of all sources and mounts
it into the kaniko pod. It points the kaniko context to the mounted
directory.

Begins to fix #950

Priya Wadhwa added 2 commits September 13, 2018 17:20
The localdir buildcontext creates a configMap of all sources and mounts
it into the kaniko pod. It points the kaniko context to the mounted
directory.
@priyawadhwa priyawadhwa changed the title Add localdir buildcontext to skaffold builder Add localdir buildcontext to kaniko builder Sep 14, 2018
for _, path := range paths {
	files = append(files, []string{"--from-file", path}...)
}
cmd := exec.Command("kubectl", append([]string{"create", "configmap", configMapName(tag), "-n", ns}, files...)...)
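A minimal, self-contained sketch of how this argument list gets assembled; the configmap name, namespace, and file paths below are invented for illustration, not taken from the PR:

```go
package main

import "fmt"

// configMapArgs mirrors the flag assembly above: a base "create configmap"
// invocation followed by one --from-file flag per source file.
func configMapArgs(name, ns string, paths []string) []string {
	args := []string{"create", "configmap", name, "-n", ns}
	for _, path := range paths {
		args = append(args, "--from-file", path)
	}
	return args
}

func main() {
	// Illustrative inputs; the real code derives the name from the image tag.
	fmt.Println(configMapArgs("kaniko-context", "default", []string{"Dockerfile", "main.go"}))
}
```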
Contributor

I'm not super familiar with this strategy, but does this just store the actual files in configmaps and mount them? AFAIK the size limit for configmaps is the size limit for keys in etcd, which is 1.5 MB by default.

Have you tried this with larger repositories?

Contributor Author

I looked into it and you're right, this probably won't work with larger repositories.

What if we created a tarball of the files and mounted that in via configMap? Would require adding a --context=tar:// option to kaniko but that should be pretty easy.
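The tarball idea can be sketched with only the Go standard library; the file names and contents below are stand-ins, not real sources:

```go
package main

import (
	"archive/tar"
	"bytes"
	"compress/gzip"
	"fmt"
)

// tarball bundles the given files into a single tar.gz blob, which could
// then be stored under one configmap key instead of one key per file.
func tarball(files map[string]string) ([]byte, error) {
	var buf bytes.Buffer
	gz := gzip.NewWriter(&buf)
	tw := tar.NewWriter(gz)
	for name, body := range files {
		hdr := &tar.Header{Name: name, Mode: 0644, Size: int64(len(body))}
		if err := tw.WriteHeader(hdr); err != nil {
			return nil, err
		}
		if _, err := tw.Write([]byte(body)); err != nil {
			return nil, err
		}
	}
	// Close tar first, then gzip, so both footers are flushed.
	if err := tw.Close(); err != nil {
		return nil, err
	}
	if err := gz.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

func main() {
	b, err := tarball(map[string]string{"Dockerfile": "FROM alpine\n"})
	fmt.Println(err == nil, len(b) > 0)
}
```

Note this only helps with per-file overhead; the overall configmap is still subject to the same size limit, which is the concern raised below.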

Contributor

Is that a per-file limit or a limit on the overall size of the repo?

Contributor Author

I think it's the overall size

Contributor

What if we use Persistent Volumes? It would be a bit of a pain to maintain them - we'd have to take care of creating and cleaning up the PVs, and potentially figure out how to interact across namespaces as well - but it would free users from having to depend on any external storage solution. If we figure this out, this might be useful for BuildCRD too.

Contributor Author

@priyawadhwa Oct 1, 2018

WDYT of using an emptyDir to hold the build context? We could have an init container that runs while we kubectl cp the buildcontext into the emptyDir volume, and then mount that into the kaniko container.

I think the emptyDir can use up to however much memory is allocated to the cluster.

It would allow for easier cleanup than a PV/PVC, although would be slower since we have to copy the entire buildcontext over for every dev loop.

It could be a good place to start, and if it isn't working for people we can switch to the PV/PVC?

Contributor

That's assuming the entire configmap fits in a 1.5 MB key.

If it doesn't:

  • If you don't shard, only a very small number of repos are going to fit in 1.5 MB.
  • If you shard, you end up with a key per 1.5 MB, which for the skaffold repo would be 110 keys (164M/1.5M). The default storage limit for etcd v3 is 2 GB, configurable up to 8 GB. I'm not sure what the performance would be on that, but using 8% of your etcd for a build context could easily hit a performance limit on a shared cluster.

Contributor

I like @priyawadhwa's idea

  • use the skaffold docker context generation to create a tarball as usual
  • run the kaniko pod with an init container that waits for a special file to be written to signify completion
  • kubectl cp the tarball over to the emptyDir
  • kubectl cp the special completion file
  • init container completes
  • kaniko runs

On the other hand, if we reuse a PV to store the build context, we can do incremental build context uploads, which will be very fast.
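A sketch of the wait-then-extract step an init container in the flow above could run. The paths here are illustrative; /tmp/complete is the marker file mentioned later in this PR:

```go
package main

import "fmt"

// initContainerCommand builds the shell command for the init container:
// block until the completion marker lands, then unpack the copied
// build context tarball into the shared emptyDir.
func initContainerCommand(marker, tarball, dest string) []string {
	script := fmt.Sprintf(
		"while [ ! -f %s ]; do sleep 1; done && tar -xf %s -C %s",
		marker, tarball, dest)
	return []string{"sh", "-c", script}
}

func main() {
	// Hypothetical paths for illustration only.
	fmt.Println(initContainerCommand("/tmp/complete", "/kaniko/context.tar.gz", "/kaniko/buildcontext"))
}
```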

Contributor Author

Cool, I think we might have to extract the tarball within the init container as well, since that's what kaniko expects, but that shouldn't be difficult (or we could add a tar:// build context to kaniko; I'm good with that as well).

@balopat are you good with trying this out?

Contributor

Yeah, if 1.5 MB isn't enough for most repos, then configmaps don't work.
+100 on @priyawadhwa's idea - and switch over to PVs as a follow-up feature for incremental builds later down the line, if there's enough usage to warrant it.

}}

context := ""
if cfg.BuildContext.GCSBucket != "" {
Contributor

I would change this to a switch statement, and maybe break out the implementations into their own functions

Contributor

The function could be something like func(ctx, artifact, cfg.BuildContext, cfg.Namespace, initialTag) (context, func())

- imageName: gcr.io/k8s-skaffold/skaffold-example
  kaniko:
    buildContext:
      localDir: true
Contributor

maybe this should just be a struct? localDir: {}?

specifying localDir: false doesn't really make sense

Contributor

+1, if you make these pointers to structs in the config, the oneOf yamltag will do implicit validation for you.

// a kaniko build context
type KanikoBuildContext struct {
	GCSBucket string `yaml:"gcsBucket,omitempty" yamltags:"oneOf=buildContext"`
	LocalDir bool `yaml:"localDir,omitempty" yamltags:"oneOf=buildContext"`
Contributor

I don't think using the oneOf tag with different types will work. IIRC the type needs to be a shared pointer type; I don't think they can be primitives even if they're the same, since the "nil" value for e.g. a string isn't actually nil, it's "".
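A sketch of the pointer-to-struct shape being suggested; the validation helper below is hypothetical, standing in for what the oneOf yamltag would enforce:

```go
package main

import "fmt"

// LocalDir as an empty struct: presence is signaled by a non-nil pointer,
// so `localDir: {}` in the YAML means "set" and absence means "unset".
type LocalDir struct{}

type KanikoBuildContext struct {
	GCSBucket string
	LocalDir  *LocalDir
}

// exactlyOne reports whether exactly one of the oneOf fields is set;
// XOR of the two "is set" conditions covers the two-field case.
func exactlyOne(c KanikoBuildContext) bool {
	return (c.GCSBucket != "") != (c.LocalDir != nil)
}

func main() {
	fmt.Println(exactlyOne(KanikoBuildContext{LocalDir: &LocalDir{}})) // true
	fmt.Println(exactlyOne(KanikoBuildContext{}))                      // false
}
```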

@shashankkoppar

Any update on this?

@priyawadhwa
Contributor Author

Hey @shashankkoppar we've been discussing an alternate solution in the comments on this PR, I should have something ready for review in the next day or two.

@shashankkoppar

@priyawadhwa thanks, we're currently working on including kaniko build and push as part of our CI/CD, so this will really help. For now, we have been using a Jenkins PodTemplate to spin up the kaniko pod.

Priya Wadhwa added 2 commits October 2, 2018 16:48
I created a BuildContextSource interface to interact with setup and
cleanup for kaniko build contexts. I transferred the code for GCS
buckets there, and added the LocalDir build context. This should
hopefully make it easier to add more build contexts (such as S3 buckets)
in the future. I also moved the integration test from examples into
integration/examples.
@codecov-io

codecov-io commented Oct 3, 2018

Codecov Report

Merging #983 into master will decrease coverage by 0.14%.
The diff coverage is 16.66%.


@@            Coverage Diff             @@
##           master     #983      +/-   ##
==========================================
- Coverage   42.96%   42.81%   -0.15%     
==========================================
  Files          74       74              
  Lines        3403     3419      +16     
==========================================
+ Hits         1462     1464       +2     
- Misses       1801     1815      +14     
  Partials      140      140
Impacted Files Coverage Δ
pkg/skaffold/kubernetes/wait.go 26.19% <16.66%> (-3.23%) ⬇️

Continue to review full report at Codecov.

Legend
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update b809bd5...a87a459.

Contributor

@r2d4 left a comment


Nice! A few comments. I’ll give it a go tomorrow. Looks good though

// Setup uploads the context to the provided GCS bucket
func (g *GCSBucket) Setup(ctx context.Context, artifact *latest.Artifact, cfg *latest.KanikoBuild, initialTag string) (string, error) {
	g.tarName = fmt.Sprintf("context-%s.tar.gz", initialTag)
	if err := docker.UploadContextToGCS(ctx, artifact.Workspace, artifact.DockerArtifact, cfg.BuildContext.GCSBucket, g.tarName); err != nil {
Contributor

What if these returned an io.WriteCloser and the actual context generation and writing happened outside each source's setup?

}

// waitForInitContainer tries to exec into the init container until it doesn't fail
func waitForInitContainer(p *v1.Pod) error {
Contributor

Query the apiserver for init container status instead?

Contributor

There’s some example of pod watchers in the code.

We should remove all the PollImmediates when we get a chance. I added those a long time ago in minikube before I did it right.

Contributor Author

yah that's way better, added it in, thanks!

// Generate the init container, which will run until the /tmp/complete file is created
ic := v1.Container{
	Name:  initContainer,
	Image: constants.DefaultDebianImage,
Contributor

Can we use something smaller like alpine? Debian seems a bit heavy. The only requirement is the tar binary right?

Contributor Author

fixed

type KanikoBuildContext struct {
	GCSBucket string `yaml:"gcsBucket,omitempty" yamltags:"oneOf=buildContext"`
	LocalDir *struct{} `yaml:"localDir,omitempty" yamltags:"oneOf=buildContext"`
Contributor

Might as well make this its own type

Query apiserver while waiting for init container to start; use alpine
image since it's more lightweight than debian.
@balopat
Contributor

balopat commented Oct 8, 2018

@priyawadhwa can you please rebase? we should get this in soon then. Thank you!

@priyawadhwa
Contributor Author

Done!

@balopat balopat mentioned this pull request Oct 8, 2018
@priyawadhwa
Contributor Author

Hey @nkubala @dgageot, any chance you could take a look at this? I think it's almost there, just wanted to get it in before the release Thursday :)

@dgageot dgageot merged commit dd7f219 into GoogleContainerTools:master Oct 10, 2018
@priyawadhwa
Contributor Author

Thanks @dgageot !

@priyawadhwa priyawadhwa deleted the localdir branch October 10, 2018 17:36

Development

Successfully merging this pull request may close these issues.

Using skaffold + kaniko without GCS

8 participants