File Sync for skaffold dev#1039
Conversation
Codecov Report
@@ Coverage Diff @@
## master #1039 +/- ##
==========================================
+ Coverage 42.32% 43.16% +0.83%
==========================================
Files 72 74 +2
Lines 3303 3408 +105
==========================================
+ Hits 1398 1471 +73
- Misses 1771 1799 +28
- Partials 134 138 +4
Continue to review full report at Codecov.
|
pkg/skaffold/kubernetes/sync.go
Outdated
| return perform(image, syncMap, deleteFileFn) | ||
| } | ||
|
|
||
| // TODO(r2d4): kubectl exec doesn't seem to take a namespace flag? |
There was a problem hiding this comment.
it's not documented, but you can pass --namespace to kubectl exec
➜ ~ kubectl exec getting-started --namespace=default -- "ls"
app
bin
...
➜ ~ kubectl exec getting-started --namespace=asdf -- "ls"
Error from server (NotFound): pods "getting-started" not found
There was a problem hiding this comment.
Good catch, I'll update
There was a problem hiding this comment.
Do you mind opening up an issue on kubectl to document that? That might be a good contribution to upstream kubectl.
pkg/skaffold/kubernetes/sync.go
Outdated
| if err != nil { | ||
| return errors.Wrap(err, "getting k8s client") | ||
| } | ||
| pods, err := client.CoreV1().Pods("").List(meta_v1.ListOptions{}) |
There was a problem hiding this comment.
I think we want to plumb through the namespace to these functions and pass it in here. As per my comment above this should work with kubectl exec
There was a problem hiding this comment.
I don't think we actually know what namespace we're looking for? We don't actually parse the namespace anywhere, and it can be different for kubectl and helm.
|
|
||
| // HasMeta reports whether path contains any of the magic characters | ||
| // recognized by filepath.Match. | ||
| // This is a copy of filepath/match.go's hasMeta |
There was a problem hiding this comment.
the original implementation also includes the filepath separator. why not just use that?
There was a problem hiding this comment.
pkg/skaffold/watch/watch_test.go
Outdated
| } | ||
|
|
||
| func (c *callback) call() { | ||
| func (c *callback) call(e Events) error { |
There was a problem hiding this comment.
callback.call() has to implement ChangeFn which is func(Events) error
pkg/skaffold/kubernetes/sync.go
Outdated
| for _, c := range p.Spec.Containers { | ||
| if strings.HasPrefix(c.Image, image) { | ||
| for src, dst := range files { | ||
| cmd := cmdFn(p, c, src, dst) |
There was a problem hiding this comment.
could we call these in parallel?
There was a problem hiding this comment.
Good idea, I'll make them run in parallel
pkg/skaffold/runner/runner.go
Outdated
| if !sync { | ||
| changed.Add(artifact) | ||
| } else { | ||
| color.Default.Fprintln(out, "Synced:", "copied", append(e.Added, e.Modified...), "deleted", e.Deleted) |
There was a problem hiding this comment.
I think this is the case, but just to be sure: if we get here, is it always true that we've performed a sync?
There was a problem hiding this comment.
Yep, shouldSync should probably be renamed, since it actually syncs and returns true or doesn't sync and returns false
|
@r2d4 when I tried, it would sync files when I don't have a |
|
@r2d4 to make the review easier, could you create a PR that introduces watch events? |
pkg/skaffold/runner/runner.go
Outdated
| return true, nil | ||
| } | ||
|
|
||
| func intersect(syncMap map[string]string, files []string) (map[string]string, error) { |
There was a problem hiding this comment.
This doesn't work if the artifact is not in . but in a sub-folder. sync destinations are relative to the workspace but file paths are relative to the root of the project
There was a problem hiding this comment.
Also if I want to copy all static/*.html files to /html, I end up copying them to /html/static/. It should not work that way.
There was a problem hiding this comment.
intersect function would benefit from a good test coverage
There was a problem hiding this comment.
I rewrote it to fix those issues and added another suite of tests to cover it. I still might rewrite both these functions after I split out watch, they're doing a little too much.
|
I’ll split out the watch changes into a separate PR |
|
@r2d4 it's possible that you could use similar logic to keep the build context up to date to make container builds on knative really fast |
|
Yes @ajbouh we can use the same information from the watcher to only update the file deltas on remote contexts 👍 |
90d2f35 to
0a206c2
Compare
|
I added a |
dgageot
left a comment
There was a problem hiding this comment.
I'll test it tomorrow. Just one question
| if err != nil { | ||
| return errors.Wrap(err, "getting k8s client") | ||
| } | ||
| pods, err := client.CoreV1().Pods("").List(meta_v1.ListOptions{ |
There was a problem hiding this comment.
Should you provide the ns here?
There was a problem hiding this comment.
I don't think we have the namespace. In port-forward and log, we have to watch all pods.
There was a problem hiding this comment.
We do have a namespace sometimes but it's ok, your proposal of filtering on label is equally good.
|
Sorry, @r2d4 you need to rebase once again... |
pkg/skaffold/kubernetes/sync.go
Outdated
| }) | ||
| } | ||
| if err := e.Wait(); err != nil { | ||
| return errors.Wrap(err, "syncing files:") |
| case len(changed.needsResync) > 0: | ||
| for _, s := range changed.needsResync { | ||
| if err := r.Syncer.Sync(s); err != nil { | ||
| logrus.Warnln("Skipping build and deploy due to sync error:", err) |
There was a problem hiding this comment.
When the sync fails I don't see the warning.
There was a problem hiding this comment.
In fact, sometimes I see it...
| } | ||
|
|
||
| func deleteFileFn(pod v1.Pod, container v1.Container, src, dst string) *exec.Cmd { | ||
| return exec.Command("kubectl", "exec", pod.Name, "--namespace", pod.Namespace, "-c", container.Name, "--", "rm", "-rf", dst) |
There was a problem hiding this comment.
A debug log would be useful here.
|
|
||
| func copyFileFn(pod v1.Pod, container v1.Container, src, dst string) *exec.Cmd { | ||
| return exec.Command("kubectl", "cp", src, fmt.Sprintf("%s/%s:%s", pod.Namespace, pod.Name, dst), "-c", container.Name) | ||
| } |
There was a problem hiding this comment.
A debug log would be useful here.
There was a problem hiding this comment.
We get a debug log with util.RunCmd. Does that work?
pkg/skaffold/runner/runner.go
Outdated
| logrus.Warnln("Skipping build and deploy due to sync error:", err) | ||
| return nil | ||
| } | ||
| color.Default.Fprintf(out, "Synced files for %s...\nCopied: %s\nDeleted: %s\n", s.Image, s.Copy, s.Delete) |
There was a problem hiding this comment.
Maybe print the number of files in info and the names in debug
There was a problem hiding this comment.
Should there be any message in stdout?
stdout: ?
info: # of files synced
debug: filenames
There was a problem hiding this comment.
I like info: # of files synced
debug: filenames
d344b05 to
5da3ccd
Compare
These run every second, so its difficult to read other debug statements
|
final rebase :) |
| - image: gcr.io/k8s-skaffold/node-example | ||
| context: node | ||
| sync: | ||
| '*.js': . |
There was a problem hiding this comment.
Shouldn't we avoid user data as keys ?
There was a problem hiding this comment.
I should have justified my comment indeed.
Coming from k8s it is unusual to have user data as keys; it surprised me.
Digging into k8s design doc, we have:
https://github.com/kubernetes/community/blob/master/contributors/devel/api-conventions.md#lists-of-named-subobjects-preferred-over-maps
There are whole discussions linked from there, notably about
- readability (mostly in case of subobjects though)
- merging definitions, which may or may not be of concert for skaffold
It lists exceptions to this though: "pure maps [...] as opposed to sets of subobjects".
I suppose if we later want to enrich the format with subobjects it would make things harder with the current format.
I am not (yet) familiar with all the skaffold format, I don't know if there are other cases like that, I couln't find any.
So this is just to open the discussion. Should I move that into an issue?
There was a problem hiding this comment.
Thanks for the link. That actually makes a lot of sense. Please open an issue! I think it’s a good suggestion
This PR introduces file sync for skaffold dev. Each artifact config now has an optional sync map. If the changes for the dev cycle are contained entirely in the sync map, then a rebuild and redeploy will be skipped, and those files will be synced to the running container. If only some of the changes are contained in the sync map, a full rebuild and redeploy will happen as normal.
That can optionally take glob patterns in the "key" part of the map (if the "key" is a pattern, the "value" destination must be a directory).
This PR changes the watcher slightly to compute WatchEvents, which are files that are added, modified, deleted, rather than an aggregate yes/no to whether things have changed. We were already computing this information and throwing it away, so it shouldn't be any slower.
It also adds a "webserver" example
Caveats: Files are copied with
kubectl cpandkubectl exec rm.kubectl cprequires thetarandcpbinaries to be present in the container.kubectl exec rmwill requirermto be in the container. This feature will not work for scratch containers.Future work for another PR will directly infer the "destination" from the Dockerfile if it not present.