Lately I have been ‘playing’ with Google Cloud Datastore to check how useful it could be.
The first thing I have to say is that it is in beta. But this is not like Gmail, which stayed in beta for 5 years despite being a stable product with a lot of users. In this case it really is at an early stage, and they cannot guarantee that there won’t be breaking changes.
One of those recent changes was the protocol used by the Ruby and Node.js clients, which made the existing emulator that ships with the gcloud SDK incompatible with them, as explained in the documentation:
As of this release, the Datastore emulator that is part of the gcloud SDK is not compatible with gcloud-node. We use gRPC as our transport layer, while the gcloud SDK’s Datastore emulator does not support gRPC.
Use gcd.sh directly
For now, you must use the gcd.sh script.
So I started using it. I required gcloud, passing the projectId and apiEndpoint params, and after creating a project with the same project id in my emulator, I could start running queries.
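The snippet from this step is not reproduced here, so the following is only a minimal sketch of what such a setup might look like. It assumes the `gcloud.datastore(options)` factory from the gcloud-node releases of that period; the project id and the emulator address are placeholder values, and `createHardcodedDatastore` is a hypothetical helper name.

```javascript
// Placeholder values: they must match the project created in the emulator
// and the address the emulator is listening on.
const HARDCODED_OPTIONS = {
  projectId: 'testing-project',
  apiEndpoint: 'localhost:8080'
};

// Hypothetical helper. `gcloud` is the module returned by require('gcloud');
// it is passed in so the sketch does not depend on the package being installed.
function createHardcodedDatastore(gcloud) {
  return gcloud.datastore(HARDCODED_OPTIONS);
}

// Usage (assumes the gcloud package is installed):
// const datastore = createHardcodedDatastore(require('gcloud'));
```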
But having those params hardcoded there doesn’t look too nice, so after checking the documentation I saw that I could replace those params with environment variables. I first removed the apiEndpoint param and executed:
export DATASTORE_EMULATOR_HOST=localhost:8080
And with that set, everything kept working as expected. Be careful when you remove apiEndpoint: without it and without DATASTORE_EMULATOR_HOST, gcloud will try to connect using the default credentials defined in your gcloud installation.
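Because a missing emulator address means falling back to real credentials, one defensive option is to check the variable before creating the client. This is a minimal sketch: `requireEmulatorHost` is a hypothetical helper, not part of gcloud-node, and it only inspects the `DATASTORE_EMULATOR_HOST` variable mentioned above.

```javascript
// Hypothetical guard: refuse to continue unless the emulator address is set,
// so a local test run cannot reach a real project by accident.
function requireEmulatorHost(env) {
  const host = (env.DATASTORE_EMULATOR_HOST || '').trim();
  if (host === '') {
    throw new Error(
      'DATASTORE_EMULATOR_HOST is not set; refusing to use default credentials'
    );
  }
  return host;
}

// Usage: call it once, before the Datastore client is created.
// const emulatorHost = requireEmulatorHost(process.env);
```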
So after that, I removed projectId and exported the environment variable defined in the documentation:
export DATASTORE_PROJECT_ID=testing-project
But… oops! Something went wrong!
{ Error: Bad Request
at /Users/laura/testing-project/node_modules/grpc/src/node/src/client.js:417:17 code: 400, metadata: Metadata { _internal_repr: {} } }
So it looks like the variable from the documentation is not recognized yet, or maybe it is an old one that has since been replaced. I kept looking for more examples and found another documentation page that used a different environment variable, GCLOUD_PROJECT, so this time I exported that one:
export GCLOUD_PROJECT=testing-project
And after that, voilà, everything worked again. The documentation where I found that variable was still setting projectId: process.env.GCLOUD_PROJECT explicitly, so it could be that the environment variable wasn’t loaded automatically when that documentation was written, or that it may stop being supported, which would explain why they suggest reading the projectId from the environment variable yourself.
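Reading the project id explicitly, as that documentation does, could look like the sketch below. `resolveProjectId` is a hypothetical helper; by default it only checks `GCLOUD_PROJECT`, the variable that worked in this test, and it fails loudly instead of letting the request end in a `Bad Request`.

```javascript
// Hypothetical helper: return the first non-empty project id found in `env`.
// Only GCLOUD_PROJECT is checked by default because it is the variable that
// worked in this test; pass other names if your client version differs.
function resolveProjectId(env, names = ['GCLOUD_PROJECT']) {
  for (const name of names) {
    const value = (env[name] || '').trim();
    if (value !== '') {
      return value;
    }
  }
  throw new Error('No project id found in: ' + names.join(', '));
}

// Usage: pass the result explicitly instead of relying on automatic loading.
// const options = { projectId: resolveProjectId(process.env) };
```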
With everything running, I decided to create a more complex query with an equality and an inequality. In production, according to the documentation, we first need to create an index containing those two fields (indexes for simpler queries, such as equalities only or a single inequality, are created automatically). The emulator, however, should not fail when we run the query; instead it creates an index.yaml file defining the index, which helps us define it later in production.
So I wrote some code to read the users with a given name and over a certain age.
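The original snippet is not reproduced here. As a rough sketch, such a query could be built as follows, assuming the `createQuery(kind)` and three-argument `filter(property, operator, value)` methods of the gcloud-node Datastore client. The `User` kind, the `name` and `age` properties, and the helper name `buildUsersQuery` are all assumptions made for illustration.

```javascript
// Hypothetical helper: build a query with one equality filter (name) and one
// inequality filter (age). `datastore` is a client created beforehand.
function buildUsersQuery(datastore, name, minAge) {
  return datastore
    .createQuery('User')         // assumed kind name
    .filter('name', '=', name)   // equality filter
    .filter('age', '>', minAge); // inequality filter; the pair needs a composite index
}

// Usage (assumes a configured client and the callback-style runQuery):
// datastore.runQuery(buildUsersQuery(datastore, 'Laura', 18), (err, users) => {
//   if (err) { console.error(err); return; }
//   console.log(users);
// });
```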
When I executed it, I ran into yet another error!
{ Error: Precondition Failed at /Users/laura/testing-project/node_modules/grpc/src/node/src/client.js:417:17 code: 412, metadata: Metadata { _internal_repr: {} } }
So, bad news: index generation seems to fail in the emulator version they provide. I investigated for a while and realized that if you install gcloud on your computer and list all the components, you can find two emulators: Cloud Datastore Emulator (Legacy) and Cloud Datastore Emulator. To run either of them you will also need the beta component. You must also define the project id, either with gcloud config set project VALUE or by setting the environment variable CLOUDSDK_CORE_PROJECT. To run the emulator I needed to execute it like this:
gcloud beta emulators datastore start --host-port=localhost:8888 --no-legacy
The --no-legacy param was really important, as it is how we tell gcloud that we want to use the new emulator instead of the old one. I couldn’t find any param to provide only the port (I wanted to ensure that it was always running on a known port), so I needed to use --host-port.
After running the query again, I could finally find the index.yaml file in my datastore configuration directory (in my case /Users/laura/.config/gcloud/emulators/datastore/WEB-INF/index.yaml).
And after all those tests, I managed to get an environment working with an emulator! The next step will be a real environment!
My conclusion is that it is a promising tool, but maybe it is a bit early for it!
Thank you for posting this! Crazy to think it’s been so long and still in beta.
For anyone running across this article in 2019 and trying to figure out which env variable the datastore client reads the project from, it is GOOGLE_CLOUD_PROJECT.
As with most AppEngine things, the best documentation is the source code: https://github.com/googleapis/google-cloud-python/blob/master/datastore/google/cloud/datastore/client.py#L54
I’m putting together a lessons learned article as I start migrating from python 2.7 to python 3.7 standard environment.