
My development instance of Accumulo has become quite messy, with a lot of tables created for testing.

I would like to bulk delete a large number of tables.

Is there a way to do it other than deleting the entire instance?

BTW - If it's of any relevance, this instance is just a single machine "cluster".

daphshez

3 Answers


In the Accumulo shell, you can specify a regular expression for table names to delete by using the -p option of the deletetable command.

billie
  • What Christopher says is still true; the tables matching the regex are deleted serially rather than in bulk. It does provide a convenience for deleting multiple tables at once, though. – billie Feb 05 '16 at 14:16
  • That's good enough for me in a development environment. I also used -f so I don't have to confirm each table, so deletetable -f -p – daphshez Feb 08 '16 at 11:51

I would have commented on the original answer, but I lack the reputation (first contribution right here).

It would have been helpful to provide a legal regex example.

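The Accumulo shell can only escape certain characters, so a regex shorthand such as \w inside brackets [] is rejected by the shell itself. If you want to remove every table starting with the string "mytable", the following otherwise legal regex commands produce a warning or an error. In the first, [.] matches only a literal dot, so no table names match; in the second, the shell cannot parse the \w escape.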

user@instance> deletetable -p mytable[.]*

2016-02-18 10:21:04,704 [shell.Shell] WARN : No tables found that match your criteria

user@instance> deletetable -p mytable[\w]*

2016-02-18 10:21:49,041 [shell.Shell] ERROR: org.apache.accumulo.core.util.BadArgumentException: can only escape single quotes, double quotes, the space character, the backslash, and hex input near index 19 deletetable -p mytable[\w]*

A working shell command would be:

user@instance> deletetable -p mytable.*
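
To see why the first pattern finds nothing while the last one works, here is a small sketch in plain Java. It assumes the shell requires the pattern to match the whole table name using java.util.regex, which is consistent with the warning above but is not confirmed in this thread. The class, method and table names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

class TablePatternCheck {
    // Returns the names that the regex matches in full.
    // Assumption: the shell applies the pattern to the whole table name.
    static List<String> matching(String regex, List<String> tableNames) {
        Pattern pattern = Pattern.compile(regex);
        List<String> result = new ArrayList<>();
        for (String name : tableNames) {
            if (pattern.matcher(name).matches()) {
                result.add(name);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical table names, for illustration only.
        List<String> tables = Arrays.asList("mytable1", "mytable_test", "othertable");

        // [.] is a character class holding a literal dot, so only
        // "mytable" followed by zero or more dots would match.
        System.out.println(matching("mytable[.]*", tables)); // prints []

        // An unbracketed dot matches any character.
        System.out.println(matching("mytable.*", tables)); // prints [mytable1, mytable_test]
    }
}
```

Running a candidate pattern through a check like this before passing it to deletetable -f -p is a cheap way to avoid deleting more than intended.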
Junior
  • I just wish you had put the correct way to do this at the top so I didn't have to actually read the answer to figure out how to do it. :) – Mark_Eng Jun 28 '18 at 13:24

There is not currently (as of version 1.7.0) a way to bulk delete many tables in a single call.

Table deletion is actually done in an asynchronous way. The client submits a request to delete the table, and that table will be deleted at some point in the near future. The problem is that after the call to delete the table is performed, the client then waits until the table is deleted. This blocking is entirely artificial and unnecessary, but unfortunately that's how it currently works.

Because each individual table deletion appears to block, a simple loop over the table names to delete them serially is not going to finish quickly. Instead, you should use a thread pool, and issue delete table requests in parallel.
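
A minimal sketch of the thread pool approach follows. To keep it self-contained it does not depend on the Accumulo client library: TableDeleter is a hypothetical interface standing in for the real delete call, which with the 1.x client API would typically be something like name -> connector.tableOperations().delete(name). The class name, pool size and table names are illustrative assumptions, not part of Accumulo.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ParallelTableDeleter {
    // Hypothetical stand-in for the real (blocking) delete call.
    interface TableDeleter {
        void delete(String tableName) throws Exception;
    }

    // Submits one delete per table to a fixed-size pool, waits for all of
    // them, and returns the names whose deletion threw an exception.
    static List<String> deleteAll(List<String> tableNames, TableDeleter deleter, int threads)
            throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<?>> futures = new ArrayList<>();
            for (String name : tableNames) {
                Callable<Void> task = () -> {
                    deleter.delete(name);
                    return null;
                };
                futures.add(pool.submit(task));
            }
            List<String> failed = new ArrayList<>();
            for (int i = 0; i < futures.size(); i++) {
                try {
                    futures.get(i).get(); // blocks until this delete finishes
                } catch (ExecutionException e) {
                    failed.add(tableNames.get(i));
                }
            }
            return failed;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        // Fake deleter that only logs; swap in the real client call here.
        TableDeleter fake = name -> System.out.println("deleting " + name);
        List<String> failed = deleteAll(Arrays.asList("test1", "test2", "test3"), fake, 4);
        System.out.println("failed: " + failed);
    }
}
```

Because each task spends its time waiting on the server rather than using the CPU, the pool can reasonably be larger than the number of cores; the right size for a given instance is something to experiment with.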

A bulk delete table command would be very useful, though. Accumulo is an open source project, so a feature request on its issue tracker would be most welcome, and a contribution implementing it even more so.

Christopher