New to Bazel, so please bear with me :) I have a genrule which basically downloads and unpacks a package:

genrule(
    name = "extract_pkg",
    srcs = ["@deb_pkg//file:pkg.deb"],
    outs = ["pkg_dir"],
    cmd = "dpkg-deb --extract $< $(@D)/pkg_dir",
)

Naturally, pkg_dir here is a directory. Another rule uses this rule as input to create an executable, but the main point is that I now need to add a rule (or something similar) that lets me use some headers from that package. This rule would be an input to a cc_library, which other parts of the repository then depend on to get access to the headers. I tried this:

genrule(
    name = "pkg_headers",
    srcs = [":extract_pkg"],
    outs = [
        "pkg_dir/usr/include/pkg/h1.h",
        "pkg_dir/usr/include/pkg/h2.h"
    ]
)

But it seems Bazel doesn't like the fact that both rules use the same directory as output, even though the second one doesn't do anything (?):

output file 'pkg_dir' of rule 'extract_pkg' conflicts with output file 'pkg_dir/usr/include/pkg/h1.h' of rule 'pkg_headers'

It works fine if I use a different "root" directory for each rule, but I think there must be a better way to do this.

EDIT: I tried to use declare_directory as follows (pieced together from different sources):

unpack_deb.bzl:

def _unpack_deb_impl(ctx):
  input_deb_file = ctx.file.deb
  output_dir = ctx.actions.declare_directory(ctx.attr.name + ".cc")
  print(input_deb_file.path)
  print(output_dir.path)
  ctx.actions.run_shell(
    inputs = [ input_deb_file ],
    outputs = [ output_dir ],
    arguments = [ input_deb_file.path, output_dir.path ],
    progress_message = "Unpacking %s to %s" % (input_deb_file.path, output_dir.path),
    command = "dpkg-deb --extract \"$1\" \"$2\"",
  )
  return [DefaultInfo(files = depset([output_dir]))]

unpack_deb = rule(
  implementation = _unpack_deb_impl,
  attrs = {
    "deb": attr.label(
      mandatory = True,
      allow_single_file = True,
      doc = "The .deb file to be unpacked",
    ),
  },
  doc = """
Unpacks a .deb file and returns a directory.
""",
)

BUILD.bazel:

load(":unpack_deb.bzl", "unpack_deb")

unpack_deb(
  name = "pkg_dir",
  deb = "@deb_pkg//file:pkg.deb"
)

cc_library(
  name = "headers",
  linkstatic = True,
  srcs = [ "pkg_dir" ],
  hdrs = ["pkg_dir.cc/usr/include/pkg/h1.h", 
          "pkg_dir.cc/usr/include/pkg/h2.h"],
  strip_include_prefix = "pkg_dir.cc/usr/include",
)

The trick of adding .cc so that the input is accepted by cc_library was stolen from this answer. However, the build fails with:

ERROR: missing input file 'blah/blah/pkg_dir.cc/usr/include/pkg/h1.h'

The error is reported for the cc_library target.

When I run with debug, I can see the command being "executed" (strange thing is that I don't always see this printout):

SUBCOMMAND: # //blah/pkg:pkg_dir [action 'Unpacking tmp/deb_pkg/file/pkg.deb to blah/pkg/pkg_dir.cc', configuration: xxxx]
(cd /home/user/.../execroot/src && \
  exec env - \
  /bin/bash -c 'dpkg-deb --extract "$1" "$2"' '' tmp/deb_pkg/file/pkg.deb bazel-out/.../pkg/pkg_dir.cc)

After execution, bazel-out/.../pkg/pkg_dir.cc exists but is empty. If I run the command manually it extracts files correctly. What might be the reason? Also, is it correct that there's an empty string directly after bash command line string?

xba

1 Answer

Bazel's genrule doesn't work very well with directory outputs. See https://docs.bazel.build/versions/master/be/general.html#general-advice

Bazel mostly works with individual files, although there's some support for working with directories in Starlark rules with https://docs.bazel.build/versions/master/skylark/lib/actions.html#declare_directory
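
As a rough illustration of the declare_directory route, a custom rule can unpack the archive into a single directory output and pass the include path to C++ targets through CcInfo, so individual headers never have to be listed. This is only a sketch under assumptions: the file name deb_headers.bzl and the rule name deb_headers are made up, the headers are assumed to live under usr/include inside the package, and cc_common and CcInfo are assumed to be available as built-in globals (newer Bazel releases may require loading them from rules_cc).

```starlark
# deb_headers.bzl (hypothetical file name)

def _deb_headers_impl(ctx):
    # One directory output ("tree artifact") holding the unpacked package.
    out = ctx.actions.declare_directory(ctx.attr.name)
    ctx.actions.run_shell(
        inputs = [ctx.file.deb],
        outputs = [out],
        arguments = [ctx.file.deb.path, out.path],
        command = 'dpkg-deb --extract "$1" "$2"',
    )

    # Make the whole directory an input of dependent compile actions and
    # add usr/include inside it to the include search path.
    # Assumption: the package installs its headers under usr/include.
    compilation_context = cc_common.create_compilation_context(
        headers = depset([out]),
        system_includes = depset([out.path + "/usr/include"]),
    )
    return [
        DefaultInfo(files = depset([out])),
        CcInfo(compilation_context = compilation_context),
    ]

deb_headers = rule(
    implementation = _deb_headers_impl,
    attrs = {
        "deb": attr.label(mandatory = True, allow_single_file = True),
    },
)
```

A cc_library or cc_binary could then list a deb_headers target in its deps and include pkg/h1.h directly, without naming files inside the directory in any BUILD file.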

Your best bet is probably to extract all the files you're interested in in the genrule, then create filegroups for the different groups of files:

genrule(
    name = "extract_pkg",
    srcs = ["@deb_pkg//file:pkg.deb"],
    outs = [
        "pkg_dir/usr/include/pkg/h1.h",
        "pkg_dir/usr/include/pkg/h2.h",
        "pkg_dir/other_files/file1",
        "pkg_dir/other_files/file2",
    ],
    cmd = "dpkg-deb --extract $< $(@D)/pkg_dir",
)

filegroup(
    name = "pkg_headers",
    srcs = [
        ":pkg_dir/usr/include/pkg/h1.h",
        ":pkg_dir/usr/include/pkg/h2.h",
    ],
)

filegroup(
    name = "pkg_other_files",
    srcs = [
        ":pkg_dir/other_files/file1",
        ":pkg_dir/other_files/file2",
    ],
)
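
To expose those headers to C++ code, the filegroup can be passed to a cc_library. A minimal sketch, assuming the target name headers from the question and that sources should include the files as pkg/h1.h:

```starlark
cc_library(
    name = "headers",
    hdrs = [":pkg_headers"],
    # Relative to this package: drops the leading directories so that
    # dependents can write #include "pkg/h1.h".
    strip_include_prefix = "pkg_dir/usr/include",
)
```

Other targets would then add ":headers" to their deps.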

If you've seen glob, you might be tempted to use glob(["pkg_dir/usr/include/pkg/*.h"]) or similar for the srcs of the filegroup, but note that glob works only with "source files", which means files already on disk, not with the outputs of other rules.

There are rules for creating debs, but I'm not aware of rules for importing them. It's possible to write such rules using Starlark: https://docs.bazel.build/versions/master/skylark/repository_rules.html

With repository rules, it's possible to avoid having to explicitly write out all the files you want to extract, among other things. Might be more work than you want to do though.
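
A repository rule along these lines might look like the sketch below. Everything named here is hypothetical (the file deb_repo.bzl, the rule deb_repository, the generated target names, the example.com URL), and it assumes dpkg-deb is installed on the host and that headers sit under usr/include. Because the files are unpacked while the external repository is being set up, they count as source files there, so glob can be used in the generated BUILD file.

```starlark
# deb_repo.bzl (hypothetical file name)

_BUILD_FILE = """
package(default_visibility = ["//visibility:public"])

cc_library(
    name = "headers",
    hdrs = glob(["usr/include/**/*.h"]),
    strip_include_prefix = "usr/include",
)

filegroup(
    name = "all_files",
    srcs = glob(["**"], exclude = ["BUILD.bazel", "WORKSPACE", "pkg.deb"]),
)
"""

def _deb_repository_impl(repository_ctx):
    # Fetch the .deb into the root of the new external repository.
    repository_ctx.download(
        url = repository_ctx.attr.urls,
        output = "pkg.deb",
        sha256 = repository_ctx.attr.sha256,
    )

    # Unpack with the host's dpkg-deb (assumed to be on PATH).
    result = repository_ctx.execute(["dpkg-deb", "--extract", "pkg.deb", "."])
    if result.return_code != 0:
        fail("dpkg-deb failed: " + result.stderr)

    repository_ctx.file("BUILD.bazel", _BUILD_FILE)

deb_repository = repository_rule(
    implementation = _deb_repository_impl,
    attrs = {
        "urls": attr.string_list(mandatory = True),
        "sha256": attr.string(),
    },
)
```

In the WORKSPACE file this could be used roughly as follows, after which targets can depend on @pkg//:headers or @pkg//:all_files:

```starlark
load("//:deb_repo.bzl", "deb_repository")

deb_repository(
    name = "pkg",
    # Placeholder URL; add sha256 for a real package.
    urls = ["https://example.com/pkg.deb"],
)
```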

ahumesky
  • This is a nice solution, except that I have to list every single file I need. If I want, e.g., a runnable binary from the package, I have to list all the files in its lib directory even though I don't care about any of them in particular, because the dependency is the contents of the directory itself... – xba Jul 13 '20 at 09:26