Add samples for the Cloud ML Engine by elibixby · Pull Request #824 · GoogleCloudPlatform/python-docs-samples

elibixby · 2017-02-24T22:42:19Z

Add samples for triggering online prediction from code.

@nikhilk @brandondutra @JayLoomis since I can't make you reviewers on this repo.

Tests will follow; this is to solicit initial feedback while I write them. I don't think we should highlight predict_from_files in the docs: it is redundant with batch prediction and less efficient. It's mainly there for testing and to make the file runnable (a repository policy).

brandondutra · 2017-02-24T22:56:15Z

ml_engine/online_prediction/predict.py

+                       version=None,
+                       force_tfrecord=False):
+    import json
+    import itertools


why the local imports? This is not a DF job.

This way the necessary inputs will show up in snippets in the docs. The [START foo] and [END foo] blocks indicate a displayable chunk for the docs

elibixby · 2017-02-24T23:00:33Z

Note. On discussing with @jonparrott I'm going to remove predict_from_files and write a short webapp.

brandondutra · 2017-02-24T23:00:53Z

ml_engine/online_prediction/predict.py

+
+    # Requests to online prediction
+    # can have at most 100 instances
+    args = [instances] * 100


Why are we making 100 copies of this tuple? This looks wrong.

It's 100 copies of the generator, not of a tuple. This is how you batch generators in Python (it's weird). But I'm deleting this code anyway in favor of a webapp.
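A minimal sketch of the batching idiom under discussion, written for Python 3 (`zip` in place of `itertools.izip`). The helper name `batch_iterator` is hypothetical and not part of the PR. The sketch also shows one property of the original idiom worth knowing: zipping repeated references to one iterator silently drops a final partial batch, whereas an `islice`-based loop keeps it.

```python
import itertools


def batch_iterator(iterable, batch_size=100):
    """Yield lists of up to batch_size items, including a final short batch."""
    iterator = iter(iterable)
    while True:
        batch = list(itertools.islice(iterator, batch_size))
        if not batch:
            return
        yield batch


# The idiom from the diff: the same iterator object repeated n times, so each
# output tuple pulls n consecutive items from it. Leftover items that do not
# fill a whole tuple are consumed and discarded.
shared = iter(range(7))
full_batches_only = list(zip(*[shared] * 3))

# The islice-based helper keeps the trailing partial batch.
all_batches = list(batch_iterator(range(7), batch_size=3))
```

Note that the idiom only batches when the repeated object is an iterator; repeating a list or tuple would pair each element with copies of itself instead.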

brandondutra · 2017-02-24T23:03:06Z

ml_engine/online_prediction/predict.py

+                batch,
+                version=version
+            ))
+    return results


where are the results saved or printed?

brandondutra · 2017-02-24T23:03:59Z

ml_engine/online_prediction/predict.py

+    args = [instances] * 100
+    instance_batches = itertools.izip(*args)
+
+    results = []


So the input data could need batching (be large), but the results don't need batching? I'm OK with this script not doing batching. Depends on what others say.

brandondutra · 2017-02-24T23:11:57Z

ml_engine/online_prediction/predict.py

+# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+# License for the specific language governing permissions and limitations under
+# the License.
+"""Examples of using the Cloud ML Engine's online prediction service."""


Add comments on authentication. When should this work, or what needs to be true for the script to work.

in get_ml_engine_service, can you add a link to the doc page describing how I can download a service account file?

brandondutra · 2017-02-24T23:13:03Z

ml_engine/online_prediction/resources/test.json

@@ -0,0 +1 @@
+{"age": 25, "workclass": " Private", "education": " 11th", "education_num": 7, "marital_status": " Never-married", "occupation": " Machine-op-inspct", "relationship": " Own-child", "race": " Black", "gender": " Male", "capital_gain": 0, "capital_loss": 0, "hours_per_week": 40, "native_country": " United-States"}


Does your batching code really work? It's hard to tell with just one prediction row.

elibixby · 2017-02-24T23:15:06Z

@brandondutra Switched to a user input stream to avoid the problems of batching and file reading, which don't really belong in an online prediction sample.

elibixby · 2017-02-24T23:21:57Z

@xlcheng

elibixby · 2017-02-25T01:31:01Z

@brandondutra PTAL

elibixby · 2017-02-25T02:45:56Z

@jonparrott Installing TensorFlow appears to be broken... @jonparrott can you PTAL?

brandondutra · 2017-02-27T18:42:33Z

ml_engine/online_prediction/predict.py

+    import json
+    while True:
+        try:
+            user_input = json.loads(raw_input("Valid JSON >>>"))


I'm not a fan of raw_input (I cannot re-run this quickly, and typing valid JSON is a pain). But this allows both interactive input and "python predict.py < my_data.json". Maybe add file-level comments on these two ways of using this script?

The main reason I wanted to do it this way is that we already have a solution for batch prediction (via the API), and a limit of 100 instances per request seems really bad if the use case we are highlighting is predicting from files.
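A minimal sketch of how one reader function could serve both usages mentioned above (interactive typing and redirected input), assuming one JSON value per line. The name `read_json_instances` is hypothetical; the PR's own loop uses `raw_input` and is not reproduced here.

```python
import io
import json


def read_json_instances(stream):
    """Yield one parsed JSON value per non-empty line of a text stream.

    Lines that are not valid JSON are skipped rather than ending the loop.
    """
    for line in stream:
        line = line.strip()
        if not line:
            continue
        try:
            yield json.loads(line)
        except ValueError:
            # json.JSONDecodeError is a subclass of ValueError.
            continue


# An in-memory stream stands in for sys.stdin here.
sample = io.StringIO('{"age": 25}\nnot json\n\n{"age": 40}\n')
parsed = list(read_json_instances(sample))
```

Passing `sys.stdin` as the stream would cover both a terminal session and a shell redirect such as `python predict.py < my_data.json`.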

brandondutra · 2017-02-27T19:02:21Z

ml_engine/online_prediction/predict.py

+
+
+# [START census_to_example_bytes]
+def census_to_example_bytes(json_instance):


I was expecting the file path to be in the census example (in cloudml-samples). Is this file part of the census sample or a more generic 'calling online prediction' sample? If the latter, we need better warnings that this will not work with every model, and we need to describe what the model is expecting.

If this is not part of the census sample, a s/census/json/g is needed.

Sorry if this is a bad question; I'm not familiar with python-docs-samples.

I was thinking we have sort of a hard separation between "things run as part of training" and "code run in your own client to send requests to the prediction service". The former lives in cloudml-samples (and in the future tf/garden) and the latter in python-docs-samples.

I will definitely add some better prose around this in the form of a docstring, and we'll also make it clear in docs.

brandondutra · 2017-02-27T19:06:43Z

ml_engine/online_prediction/predict_test.py

+from predict import census_to_example_bytes, predict_json
+
+
+MODEL = 'census'


I'm starting to think you want these files in GoogleCloudPlatform/cloudml-samples/census/something

theacodes · 2017-02-27T19:21:07Z

ml_engine/online_prediction/predict.py

@@ -0,0 +1,173 @@
+# Copyright 2016 Google Inc. All Rights Reserved. Licensed under the Apache


This license header looks weird, copy it from elsewhere?

s/2016/2017/g

theacodes · 2017-02-27T19:21:35Z

ml_engine/online_prediction/predict.py

+# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+# License for the specific language governing permissions and limitations under
+# the License.
+"""Examples of using the Cloud ML Engine's online prediction service."""


Nit: blank line between license and docstring.

theacodes · 2017-02-27T19:21:42Z

ml_engine/online_prediction/predict.py

@@ -0,0 +1,173 @@
+# Copyright 2016 Google Inc. All Rights Reserved. Licensed under the Apache


Needs a shebang

theacodes · 2017-02-27T19:22:20Z

ml_engine/online_prediction/predict.py

+# [END import_libraries]
+
+
+# [START authenticating]


We generally show constructing the service in each snippet instead of centralizing it. Every indirection adds cognitive load to the users.

theacodes · 2017-02-27T19:22:46Z

ml_engine/online_prediction/predict.py

+# [START predict_json]
+def predict_json(project, model, instances, version=None):
+    """Send data instances to a deployed model for prediction
+    Args:


blank newline above here.

theacodes · 2017-02-27T19:25:22Z

ml_engine/online_prediction/predict.py

+        to data.
+        version: [optional] str, version of the model to target.
+    Returns:
+        A dictionary of prediction results defined by the model.


We generally encourage snippets to be simple enough not to require this, but I understand if that's not reasonable here. If you're going to go full docstring, follow Napoleon style:

Args:
    project (str): ...
    model (str): ...
    instances (Mapping[str, dict]): ...
    version (str): optional ...

Returns:
    Mapping[str, ...]: ...
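A sketch of what a Napoleon-style docstring could look like on a small helper. The function `build_predict_request` is hypothetical and not part of the PR; the `projects/.../models/.../versions/...` resource-name layout is an assumption based on the ML Engine v1 REST naming and should be checked against the API reference.

```python
def build_predict_request(project, model, instances, version=None):
    """Build the resource name and body for an online prediction request.

    Args:
        project (str): Project where the model is deployed.
        model (str): Model name.
        instances (List[Mapping[str, Any]]): One mapping per instance, with
            keys and value types determined by the deployed model.
        version (str): Optional. Model version to target; the service
            default is used when omitted.

    Returns:
        Tuple[str, Mapping[str, list]]: The resource name and the JSON-ready
        request body.
    """
    # Assumed resource-name layout; not confirmed by the diff excerpts.
    name = 'projects/{}/models/{}'.format(project, model)

    if version is not None:
        name += '/versions/{}'.format(version)

    return name, {'instances': list(instances)}


name, body = build_predict_request(
    'my-project', 'census', [{'age': 25}], version='v1')
```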

theacodes · 2017-02-27T19:25:50Z

ml_engine/online_prediction/predict.py

+    Returns:
+        A dictionary of prediction results defined by the model.
+    """
+    import base64


Don't import here, import at the top.

How do you highlight that this import is only necessary for this snippet?

Is that not important?
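For context on why the snippet needs `base64` at all: raw bytes cannot be placed in a JSON body directly, so serialized examples have to be text-encoded first. A minimal sketch, assuming the `{"b64": ...}` wrapper that the prediction service documents for binary data; the helper name `encode_binary_instances` is hypothetical and the wrapper key is not visible in the diff excerpts here.

```python
import base64
import json


def encode_binary_instances(example_bytes_list):
    """Wrap each bytes value as {'b64': <ASCII text>} so it is JSON-safe."""
    return {
        'instances': [
            # b64encode returns bytes; decode so json.dumps accepts it.
            {'b64': base64.b64encode(example_bytes).decode('ascii')}
            for example_bytes in example_bytes_list
        ]
    }


body = encode_binary_instances([b'\x00\x01serialized-example'])
payload = json.dumps(body)
```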

theacodes · 2017-02-27T19:26:13Z

ml_engine/online_prediction/predict.py

+            for example_bytes in example_bytes_list
+        ]}
+    ).execute()
+    if 'error' in response:


Blank new line to separate control statements.

theacodes · 2017-02-27T19:26:56Z

ml_engine/online_prediction/predict.py

+
+def main(project, model, version=None, force_tfrecord=False):
+    """Send user input to the prediction service."""
+    import json


Don't import here.

theacodes · 2017-02-27T19:28:45Z

ml_engine/online_prediction/predict.py

+    import json
+    while True:
+        try:
+            user_input = json.loads(raw_input("Valid JSON >>>"))


Where do the users find out what kind of json to send here?

It depends on their model. This snippet will be part of a docs page that is attempting to explain just that. This will be at the end "now that you know what the prediction service does, here's how you call it".

theacodes · 2017-02-28T00:19:47Z

ml_engine/online_prediction/README.md

@@ -0,0 +1,38 @@
+# Online Prediction with the Cloud Machine Learning Engine


We don't have hand-written readmes in here any more. Please move all of this to the documentation and just link to the docs from here. I can add an auto-generated readme later.

theacodes · 2017-02-28T00:19:57Z

ml_engine/online_prediction/predict.py

+# the License.
+
+"""Examples of using the Cloud ML Engine's online prediction service."""
+from __future__ import print_function


This isn't necessary.

theacodes · 2017-02-28T00:20:44Z

ml_engine/online_prediction/predict.py

+        model (str): model name.
+        instances ([Mapping[str: any]]): dictionaries from string keys
+            defined by the model deployment, to data with types that match
+            expected tensors


Period? Also, maybe it's just my unfamiliarity with tensorflow, but this reads like gibberish.

Yeah, it doesn't make much sense without context. But it could also use some rewording.

theacodes · 2017-02-28T00:21:20Z

ml_engine/online_prediction/predict.py

+    Args:
+        project (str): project where the Cloud ML Engine Model is deployed.
+        model (str): model name.
+        instances ([Mapping[str: any]]): dictionaries from string keys


Any is capital. What's the key and value here?

theacodes · 2017-02-28T00:21:27Z

ml_engine/online_prediction/predict.py

+            expected tensors
+        version: str, version of the model to target.
+    Returns:
+        Mapping[str: any]: dictionary of prediction results defined by the


What's the key and value here?

theacodes · 2017-02-28T00:25:43Z

ml_engine/online_prediction/predict_test.py

+# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+# License for the specific language governing permissions and limitations under
+# the License.
+"""Tests for predict.py ."""


blank newline both above and below this.

theacodes · 2017-02-28T00:26:14Z

ml_engine/online_prediction/predict_test.py

@@ -0,0 +1,68 @@
+# Copyright 2016 Google Inc. All Rights Reserved. Licensed under the Apache


2017. Also, these headers still seem different from the ones in the rest of the repo.

theacodes · 2017-02-28T00:26:31Z

ml_engine/online_prediction/predict_test.py

+
+import pytest
+
+from predict import census_to_example_bytes, predict_json


just import predict, please don't import individual members.

theacodes · 2017-02-28T00:27:21Z

ml_engine/online_prediction/predict_test.py

+        predict_json(PROJECT, MODEL, [{"foo": "bar"}], version=VERSION)
+
+
+# TODO(elibixby) Run on Travis when TensorFlow PyPi package supports


Don't put todos in code, just file an issue or bug to track it.

theacodes · 2017-02-28T00:27:37Z

ml_engine/online_prediction/requirements.txt

@@ -0,0 +1 @@
+tensorflow>=1.0.0


Don't use ranges, pin the version and dpebot will handle updating it.

theacodes

LGTM after final nits, pending Travis.

theacodes · 2017-02-28T01:50:14Z

ml_engine/online_prediction/predict.py

+import googleapiclient.discovery
+# [END import_libraries]
+
+import six


this goes in the same section as import googleapiclient.discovery

theacodes · 2017-02-28T01:51:14Z

ml_engine/online_prediction/predict_test.py

+    assert base64.b64encode(b) is not None
+
+
+def test_predict_tfrecord():


Why not write a real test and mark it with pytest.mark.xfail('reason')?

elibixby added 2 commits February 24, 2017 10:22

Working commit

5dd9b31

Initial commit of prediction examples

42645c7

elibixby requested review from puneith and theacodes February 24, 2017 22:42

googlebot added the cla: yes This human has signed the Contributor License Agreement. label Feb 24, 2017

brandondutra reviewed Feb 24, 2017

Switch to user input stream

52e37a5

elibixby added 5 commits February 24, 2017 15:39

Fix user input loop

71daf1f

Add tests and requirements

11f076d

Add tfrecord stub

125dda0

Add authentication instructions

c43b65d

Small fixes

ad349a5

Fix tests and lint

7df6c4f

brandondutra reviewed Feb 27, 2017

theacodes suggested changes Feb 27, 2017

elibixby added 3 commits February 27, 2017 12:54

Run census_example_to_bytes in Jenkins only

d5a0e18

Fix some review comments

6bd101a

Add README

a241ff2

Move imports

5ca6773

theacodes suggested changes Feb 28, 2017

elibixby added 2 commits February 27, 2017 17:38

Fix review comments

8c3da38

Fix License Headers

90bfa5d

theacodes approved these changes Feb 28, 2017

Fix style. Add tfrecords test

2d1726a

elibixby merged commit b0417c9 into master Feb 28, 2017

elibixby deleted the mlengine branch February 28, 2017 02:12

chalmerlowe mentioned this pull request Apr 7, 2026

migrate code from googleapis/python-storage #13989
