This repository has been archived by the owner on Aug 10, 2023. It is now read-only.

Bazel Build Issues #6

Open
rathjo14 opened this issue Nov 13, 2019 · 25 comments

Comments

@rathjo14

Following the AstroNet README as closely as possible, I have run into major problems in the Bazel build phase.

Bazel Version: 0.24.1
TensorFlow Version: 1.14.0
When running: bazel test astronet/... astrowavenet/... light_curve/... tf_util/... third_party/...

ERROR: /private/var/tmp/_bazel_rathjo14/d5d70ed4975039d87f5635d66a43ed87/external/com_google_protobuf/protobuf_deps.bzl:18:9: no such package '': BUILD file not found in any of the following directories.

 - /Users/rathjo14/exoplanet-ml/exoplanet-ml and referenced by '//external:six'
ERROR: Analysis of target '//light_curve:light_curve_py_pb2' failed; build aborted: Analysis failed
INFO: Elapsed time: 8.122s
INFO: 0 processes.
FAILED: Build did NOT complete successfully (23 packages loaded, 158 targets configured)
FAILED: Build did NOT complete successfully (23 packages loaded, 158 targets configured)
    Fetching @local_config_cc_toolchains; fetching

Looking into the file mentioned in the error, here is what I see (lines 17-23):

if not native.existing_rule("six"):
    http_archive(
        name = "six",
        build_file = "@//:six.BUILD",
        sha256 = "105f8d68616f8248e24bf0e9372ef04d3cc10104f1980f54d57b2ce73a5ad56a",
        urls = ["https://pypi.python.org/packages/source/s/six/six-1.10.0.tar.gz#md5=34eed507548117b2ab523ab14b2f8b55"],
    )
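
For anyone hitting the same message: the label `@//:six.BUILD` names a file called `six.BUILD` in the root package of the main workspace, and `no such package ''` suggests that Bazel found no BUILD file in that root directory. Below is a minimal Python sketch for checking this, assuming the workspace root is the directory that contains the WORKSPACE file. The helper name `check_workspace_root` is hypothetical and not part of Bazel.

```python
import os


def check_workspace_root(root):
    """Return the files that the label @//:six.BUILD relies on but that are missing.

    Assumption: `root` is the directory that contains the WORKSPACE file.
    """
    missing = []
    # The root directory must itself be a package (contain BUILD or BUILD.bazel).
    has_build = any(
        os.path.isfile(os.path.join(root, name)) for name in ("BUILD", "BUILD.bazel")
    )
    if not has_build:
        missing.append("BUILD")
    # The label @//:six.BUILD refers to this file in the root package.
    if not os.path.isfile(os.path.join(root, "six.BUILD")):
        missing.append("six.BUILD")
    return missing
```

An empty list means both files are present; whether that alone resolves the error depends on the protobuf version in use.
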
@adi-panda

Hello,
Did you find a solution to the issue?

@jalalirs

I am facing the same problem. Any solution?

@jalalirs

OK. I am not familiar with Bazel syntax at all, but after a lot of searching and reading, the following solved the problem.

Modify the last part of the BUILD file in the light_curve directory:

load("@com_google_protobuf//:protobuf.bzl", "py_proto_library")

py_proto_library(
    name = "light_curve_py_pb2",
    srcs_version = "PY2AND3",
    srcs = glob(["proto/*.proto"]),
    deps = [
        "@com_google_protobuf//:protobuf_python",
    ],
)

Also, in the WORKSPACE file, I updated the protobuf archive at the end of the file:

http_archive(
    name = "com_google_protobuf",
    sha256 = "60d2012e3922e429294d3a4ac31f336016514a91e5a63fd33f35743ccfe1bd7d",
    strip_prefix = "protobuf-3.11.0",
    urls = ["https://github.com/protocolbuffers/protobuf/archive/v3.11.0.zip"],
)

load("@com_google_protobuf//:protobuf_deps.bzl", "protobuf_deps")

protobuf_deps()
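
Bazel's `http_archive` compares the downloaded archive against the `sha256` attribute, so the hash has to change together with the URL when switching protobuf versions. Here is a hedged Python sketch for computing the value from a locally downloaded copy of the archive. The function names are hypothetical, and the path passed in is whatever file you downloaded yourself.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read fixed-size chunks so large archives do not need to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_pinned_hash(path, expected_hex):
    """True if the file's digest equals the value pinned in WORKSPACE."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

Calling `matches_pinned_hash` with the downloaded zip and the `sha256` string from the WORKSPACE entry shows whether the two agree before Bazel is run.
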

@ritwik12

ritwik12 commented Feb 3, 2020

@jalalirs The above solution worked for py_proto_library, but now proto_library gives an error saying no such attribute 'cc_api_version' in 'proto_library' rule.
Has anyone faced this?

@jalalirs

jalalirs commented Feb 3, 2020

> @jalalirs The above solution worked for py_proto_library, but now proto_library gives an error saying no such attribute 'cc_api_version' in 'proto_library' rule.
> Has anyone faced this?

Just remove cc_api_version
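
If several BUILD files carry the attribute, the removal can be scripted. This is only a sketch, assuming the attribute sits on its own line in the form `cc_api_version = 2,`. The function name `strip_cc_api_version` is hypothetical, and the function works on text, so reading and writing the files is left to the caller.

```python
import re

# Matches one whole line such as `    cc_api_version = 2,` including its newline.
_CC_API_VERSION_LINE = re.compile(
    r"^[ \t]*cc_api_version[ \t]*=[ \t]*\d+[ \t]*,?[ \t]*\r?\n", re.MULTILINE
)


def strip_cc_api_version(build_text):
    """Return `build_text` with every cc_api_version attribute line removed."""
    return _CC_API_VERSION_LINE.sub("", build_text)
```

Attributes written on the same line as other attributes are not handled by this pattern and would need editing by hand.
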

@ritwik12

ritwik12 commented Feb 3, 2020

@jalalirs I did. Then it gave numerous other errors:

//astronet/astro_cnn_model:astro_cnn_model_test                          FAILED in 6.2s
  /private/var/tmp/_bazel_ritsharm/6c31e64f0da40b5f15aa6c8979a9a35d/execroot/__main__/bazel-out/darwin-fastbuild/testlogs/astronet/astro_cnn_model/astro_cnn_model_test/test.log
//astronet/astro_fc_model:astro_fc_model_test                            FAILED in 6.1s
  /private/var/tmp/_bazel_ritsharm/6c31e64f0da40b5f15aa6c8979a9a35d/execroot/__main__/bazel-out/darwin-fastbuild/testlogs/astronet/astro_fc_model/astro_fc_model_test/test.log
//astronet/astro_model:astro_model_test                                  FAILED in 6.1s
  /private/var/tmp/_bazel_ritsharm/6c31e64f0da40b5f15aa6c8979a9a35d/execroot/__main__/bazel-out/darwin-fastbuild/testlogs/astronet/astro_model/astro_model_test/test.log
//astronet/ops:dataset_ops_test                                          FAILED in 6.2s
  /private/var/tmp/_bazel_ritsharm/6c31e64f0da40b5f15aa6c8979a9a35d/execroot/__main__/bazel-out/darwin-fastbuild/testlogs/astronet/ops/dataset_ops_test/test.log
//astronet/ops:input_ops_test                                            FAILED in 2.9s
  /private/var/tmp/_bazel_ritsharm/6c31e64f0da40b5f15aa6c8979a9a35d/execroot/__main__/bazel-out/darwin-fastbuild/testlogs/astronet/ops/input_ops_test/test.log
//astronet/ops:metrics_test                                              FAILED in 6.1s
  /private/var/tmp/_bazel_ritsharm/6c31e64f0da40b5f15aa6c8979a9a35d/execroot/__main__/bazel-out/darwin-fastbuild/testlogs/astronet/ops/metrics_test/test.log
//astrowavenet:astrowavenet_model_test                                   FAILED in 6.1s
  /private/var/tmp/_bazel_ritsharm/6c31e64f0da40b5f15aa6c8979a9a35d/execroot/__main__/bazel-out/darwin-fastbuild/testlogs/astrowavenet/astrowavenet_model_test/test.log
//astrowavenet/data:base_test                                            FAILED in 6.2s
  /private/var/tmp/_bazel_ritsharm/6c31e64f0da40b5f15aa6c8979a9a35d/execroot/__main__/bazel-out/darwin-fastbuild/testlogs/astrowavenet/data/base_test/test.log
//light_curve:kepler_io_test                                             FAILED in 6.2s
  /private/var/tmp/_bazel_ritsharm/6c31e64f0da40b5f15aa6c8979a9a35d/execroot/__main__/bazel-out/darwin-fastbuild/testlogs/light_curve/kepler_io_test/test.log

Executed 9 out of 23 tests: 14 tests pass and 9 fail locally.
There were tests whose specified size is too big. Use the --test_verbose_timeout
INFO: Build completed, 9 tests FAILED, 10 total actions

@zoe4cs

zoe4cs commented Feb 5, 2020

> @jalalirs The above solution worked for py_proto_library, but now proto_library gives an error saying no such attribute 'cc_api_version' in 'proto_library' rule.
> Has anyone faced this?

I am facing the same problem. What versions of the packages are you using?

@ritwik12

ritwik12 commented Feb 5, 2020

@zoe4cs bazel 2.0.0

@ritwik12

@zoe4cs Any luck here?

@jalalirs

I will fork the project tonight and commit my changes. I don't remember all the modifications I made, but let's see if my version works for you.
Wait for my reply.

@ritwik12

@jalalirs Ohk sure, thanks :)

@zoe4cs

zoe4cs commented Feb 11, 2020

> @zoe4cs Any luck here?

I guess the versions of Bazel and TensorFlow are causing the problem, but I haven't found a solution.

@jalalirs

So here is what I did to make it run.

First, I ran it in a TensorFlow image from Docker Hub. I used the tag 2.0.1-gpu-py3-jupyter:

https://hub.docker.com/r/tensorflow/tensorflow

In the container, I installed Bazel, cloned this repository, and made the following modifications.

Modify the last part of the BUILD file in the light_curve directory:

load("@com_google_protobuf//:protobuf.bzl", "py_proto_library")

py_proto_library(
    name = "light_curve_py_pb2",
    srcs_version = "PY2AND3",
    srcs = glob(["proto/*.proto"]),
    deps = [
        "@com_google_protobuf//:protobuf_python",
    ],
)

Also, in the WORKSPACE file, I updated the protobuf archive at the end of the file:

http_archive(
    name = "com_google_protobuf",
    sha256 = "60d2012e3922e429294d3a4ac31f336016514a91e5a63fd33f35743ccfe1bd7d",
    strip_prefix = "protobuf-3.11.0",
    urls = ["https://github.com/protocolbuffers/protobuf/archive/v3.11.0.zip"],
)

load("@com_google_protobuf//:protobuf_deps.bzl", "protobuf_deps")

protobuf_deps()

I ran the tests with the following command:

bazel test astronet/... astrowavenet/... light_curve/... tf_util/... third_party/... --test_arg=--test_srcdir=/home/exoplanet-ml/exoplanet-ml/

https://pbs.twimg.com/media/EOGoWSOXUAUy0Yj?format=jpg&name=large

@ritwik12

@jalalirs They were all version issues with tensorflow and tensorflow_probability.
Working versions:

tensorboard            1.13.1    
tensorflow             1.13.2    
tensorflow-estimator   1.13.0    
tensorflow-probability 0.6.0 

Two test cases are still failing, as shown below, and I don't know why. From the logs I can see:

======================================================================
ERROR: testBadLabelIdsRaisesValueError (__main__.BuildDatasetTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/private/var/tmp/_bazel_ritsharm/6c31e64f0da40b5f15aa6c8979a9a35d/sandbox/darwin-sandbox/91/execroot/__main__/bazel-out/darwin-fastbuild/bin/astronet/ops/dataset_ops_test.runfiles/__main__/astronet/ops/dataset_ops_test.py", line 231, in setUp
    self._file_pattern = os.path.join(FLAGS.test_srcdir, _TEST_TFRECORD_FILE)
  File "/Users/ritsharm/git/google-research/lib/python3.7/site-packages/absl/flags/_flagvalues.py", line 473, in __getattr__
    raise AttributeError(name)
AttributeError: test_srcdir

@jalalirs

You need to pass the data source directory by adding the following parameter to the test command:

--test_arg=--test_srcdir=
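
To make the full command line explicit, here is a small Python sketch that assembles it. Bazel's `--test_arg` forwards its value to each test binary, and the value after `--test_srcdir=` is assumed to be the path of your local checkout, as in the command shown earlier in this thread. The helper names `bazel_test_command` and `as_shell_line` are hypothetical.

```python
import shlex

# Target patterns copied from the test command used in this thread.
TARGETS = [
    "astronet/...",
    "astrowavenet/...",
    "light_curve/...",
    "tf_util/...",
    "third_party/...",
]


def bazel_test_command(test_srcdir, targets=TARGETS):
    """Build the argument list for `bazel test`, forwarding --test_srcdir to the tests."""
    return ["bazel", "test", *targets, "--test_arg=--test_srcdir=" + test_srcdir]


def as_shell_line(args):
    """Join the arguments into a single shell-safe command line."""
    return " ".join(shlex.quote(arg) for arg in args)
```

The list form can be passed directly to `subprocess.run`, while `as_shell_line` gives a string to paste into a terminal.
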

@ritwik12

@jalalirs Thanks a lot for that, but even after using

bazel test astronet/... astrowavenet/... light_curve/... tf_util/... third_party/... --test_arg=--test_srcdir=/Users/ritsharm/git/exoplanet-ml/exoplanet-ml/

it gives this error:

usage: astro_cnn_model_test.py [-h] [-v] [-q] [--locals] [-f] [-c] [-b]
                               [-k TESTNAMEPATTERNS]
                               [tests [tests ...]]
astro_cnn_model_test.py: error: unrecognized arguments: --test_srcdir=/Users/ritsharm/git/exoplanet-ml/exoplanet-ml

@jalalirs

You probably need TensorFlow 2.

@ritwik12

@jalalirs But with TensorFlow 2 lots of other things are breaking :(

@ritwik12

@jalalirs TensorFlow 2.0 is not supported, because this project's code uses tf.contrib:

tf.contrib.data.parallel_interleave(
AttributeError: module 'tensorflow' has no attribute 'contrib'

and tf.contrib was removed in TF 2.

Can you please check which version of TensorFlow you are using?
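
Since tf.contrib exists only in releases before TensorFlow 2.0, a version guard can fail early instead of surfacing as an AttributeError deep inside a test. This is a minimal sketch: the function name `supports_tf_contrib` is hypothetical, and it takes a version string (for example the value of `tf.__version__`) so that it runs without TensorFlow installed.

```python
def supports_tf_contrib(version_string):
    """Return True for TensorFlow releases before 2.0, which still ship tf.contrib."""
    # Only the major version matters: "1.15.0" -> 1, "2.0.1" -> 2.
    major = int(version_string.split(".")[0])
    return major < 2
```
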

@jalalirs

jalalirs commented Feb 12, 2020

You are actually right, I am using 1.15:

>>> import tensorflow as tf
>>> tf.__version__
'1.15.0'

@ritwik12

@jalalirs

I got it working. It was all version issues.

tensorboard            1.15.0    
tensorflow             1.15.0    
tensorflow-estimator   1.15.1    
tensorflow-probability 0.8.0  

The above versions pass all tests.
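
To compare a local environment against the versions listed above, something like the following Python sketch could be used. It assumes Python 3.8 or newer for `importlib.metadata`. The names `REPORTED_WORKING`, `installed_version`, and `find_mismatches` are hypothetical, and the pinned versions are simply the ones reported in this thread, not an official requirement list.

```python
from importlib import metadata  # available from Python 3.8

# Versions reported in this thread as passing all tests.
REPORTED_WORKING = {
    "tensorboard": "1.15.0",
    "tensorflow": "1.15.0",
    "tensorflow-estimator": "1.15.1",
    "tensorflow-probability": "0.8.0",
}


def installed_version(package):
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(expected=REPORTED_WORKING, lookup=installed_version):
    """Map each package whose version differs to a (wanted, found) pair."""
    mismatches = {}
    for package, wanted in expected.items():
        found = lookup(package)
        if found != wanted:
            mismatches[package] = (wanted, found)
    return mismatches
```

An empty result from `find_mismatches()` means the environment matches the reported versions; the `lookup` parameter exists so the check can be exercised without the packages installed.
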

@ritwik12

@jalalirs Did the steps work for you all the way to the end, as mentioned in this

For me, it throws lots of exceptions in the prediction step, which is the last step:

# Generate a prediction for a new TCE.
bazel-bin/astronet/predict \
  --model=AstroCNNModel \
  --config_name=local_global \
  --model_dir=${MODEL_DIR} \
  --kepler_data_dir=${KEPLER_DATA_DIR} \
  --kepler_id=11442793 \
  --period=14.44912 \
  --t0=2.2 \
  --duration=0.11267 \
  --output_image_file="${HOME}/astronet/kepler-90i.png"

Is there any code change?

@jalalirs

@ritwik12 No, I just ran the test command. After that I started using some of the modules directly. I am working on it intermittently, so I haven't done any training yet.

I am an amateur in the astronomy field and am just starting to get my hands dirty with its data. For this specific project, though, I am planning to skip Bazel entirely and run the code using direct Python calls.

@ritwik12

ritwik12 commented Feb 12, 2020

Ohk got it. Thanks a lot :) @jalalirs

@muHashh

muHashh commented Feb 14, 2021

Leaving a modified version here for people who happen to stumble upon this thread. At the top of the README I've linked the Docker image that I used to get it to work with my AMD Vega 56 and ROCm. Make sure to also follow the ROCm Docker install guide. If you have issues with rocm-dkms installing, switch to an older kernel version. I was running 5.8 (on Ubuntu 20 LTS, which is the recommended distro), and installing 5.6 fixed the issue.


6 participants