
I've been experimenting with the new Android Nearby Connections v2.0 API. Most of my devices can now talk to each other most of the time, but I also get a lot of error codes back when trying to connect. Checking status.getStatusCode() inside my program, I can see the following return codes:

  • STATUS_ALREADY_CONNECTED_TO_ENDPOINT (8003)
  • STATUS_BLUETOOTH_ERROR (8007)
  • STATUS_ENDPOINT_IO_ERROR (8012)
  • STATUS_ERROR (13)

I'm having a hard time making sense of these. The first error code seems self-explanatory, except that I see it in cases when I haven't hit the onConnectionResult callback with a "SUCCESS" return code on either side of the alleged connection. My current code is full of trace statements, and I'd see logging entries if those callbacks had been reached. So maybe the devices are connected at some lower level, but if so, the higher-level code doesn't always hear about it.

I'm guessing that STATUS_BLUETOOTH_ERROR indicates a Bluetooth error on the side that logs it, while STATUS_ENDPOINT_IO_ERROR indicates an error (probably involving Bluetooth) on the other end? Is it possible to get any more details? The STATUS_ERROR (13) status that I see once in a while sounds like the sort of error code a programmer would use for those "WTF, we should never get here" moments, but without access to the source code, I can only guess.

Note that I see these errors between devices that talk to each other beautifully at other times, using the same code. Sometimes if the code retries enough times, it eventually gets a stable connection. Sometimes it connects and gets instantly disconnected from the other end. Sometimes I just get an endless stream of repeated error messages (STATUS_BLUETOOTH_ERROR and/or STATUS_ENDPOINT_IO_ERROR).

I'm using Nearby Connections with the connection strategy P2P_CLUSTER. These problems seem to happen most often when both sides do both advertising and discovery. However, I wrote two smaller programs that specialize in either advertising or discovery, and they sometimes get these errors too (but less often).

In the trace messages, I've also noticed lots of warning messages from Nearby Connections that look like this:

09-04 22:54:40.070 3866-3924/? W/NearbyConnections: Cannot deserialize BluetoothDeviceName: expecting min 16 raw bytes, got 6

I'm guessing that this is because Nearby Connections uses its own short tokens (like ZGbx) instead of the device Bluetooth name? I'm not at all sure about that, though. And anyway, if these are Nearby Connections' own special tokens, then why would it be issuing warning messages about it?

Rapunzel Van Winkle

2 Answers


[Disclaimer: I work on Nearby Connections] I can try and help out.

STATUS_ALREADY_CONNECTED_TO_ENDPOINT: This occurs if you call 'requestConnection' while you have any pending (onConnectionInitiated) or established (onConnectionResult) connections to the given endpoint. Move your log statements earlier, to onConnectionInitiated, and you should see why we throw this error.
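One way to avoid tripping this error is to remember which endpoints are pending or established and skip requestConnection for those. The following is only a rough sketch: EndpointTracker and its methods are hypothetical names (not part of Nearby Connections), and it assumes you call them from your own onConnectionInitiated, onConnectionResult and onDisconnected callbacks.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical helper; not part of the Nearby Connections API. */
class EndpointTracker {
    // Endpoints for which onConnectionInitiated fired but no result has arrived yet.
    private final Set<String> pending = new HashSet<>();
    // Endpoints for which onConnectionResult reported success.
    private final Set<String> established = new HashSet<>();

    /** Call from your onConnectionInitiated callback. */
    void onInitiated(String endpointId) {
        pending.add(endpointId);
    }

    /** Call from your onConnectionResult callback. */
    void onResult(String endpointId, boolean success) {
        pending.remove(endpointId);
        if (success) {
            established.add(endpointId);
        }
    }

    /** Call from your onDisconnected callback. */
    void onDisconnected(String endpointId) {
        pending.remove(endpointId);
        established.remove(endpointId);
    }

    /** True if a new requestConnection would target a pending or established endpoint. */
    boolean shouldSkipRequest(String endpointId) {
        return pending.contains(endpointId) || established.contains(endpointId);
    }
}
```

Checking shouldSkipRequest(endpointId) before each requestConnection keeps the bookkeeping on your side, so a retry loop does not fire while a connection to the same endpoint is still being negotiated.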

STATUS_BLUETOOTH_ERROR: Something went wrong with Bluetooth. The phone is probably in a bad state. This (hopefully) shouldn't happen too often. But if you really want a fix, stop advertising & discovery before reattempting requestConnection. Nearby Connections will toggle Bluetooth when it detects this error, but only if nothing else is going on.

STATUS_ENDPOINT_IO_ERROR: We lost connection to the other device. This can happen for a variety of reasons (they could have walked too far away, Bluetooth was flaky, the device stopped responding, etc). If you're discovering while you have connections, avoid that. Discovery can be hard on the phone: at best it reduces bandwidth, and at worst it causes dropped connections.

STATUS_ERROR: Something went wrong that didn't fit well in the other error codes. It's a catch-all. This is most often returned in onConnectionResult(FAILED), to notify you that something went wrong in between onConnectionInitiated and waiting for both sides to accept the connection.
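To make the four descriptions above easier to act on, here is a small sketch that maps each status code to a suggested reaction. StatusPolicy and the Action names are hypothetical, and the constants are redeclared locally (with the numeric values from the question) so the snippet compiles without Google Play services. Treating STATUS_ERROR as "retry with backoff" is an assumption on top of the descriptions, not documented behavior.

```java
/** Hypothetical policy table; numeric values are the ones listed in the question. */
class StatusPolicy {
    static final int STATUS_ERROR = 13;
    static final int STATUS_ALREADY_CONNECTED_TO_ENDPOINT = 8003;
    static final int STATUS_BLUETOOTH_ERROR = 8007;
    static final int STATUS_ENDPOINT_IO_ERROR = 8012;

    enum Action {
        WAIT_FOR_CALLBACKS,   // a connection is already pending or established
        STOP_ALL_THEN_RETRY,  // stop advertising and discovery, then retry
        RETRY_WITH_BACKOFF,   // connection was lost or never completed
        UNKNOWN               // code not covered by this sketch
    }

    static Action actionFor(int statusCode) {
        switch (statusCode) {
            case STATUS_ALREADY_CONNECTED_TO_ENDPOINT:
                return Action.WAIT_FOR_CALLBACKS;
            case STATUS_BLUETOOTH_ERROR:
                return Action.STOP_ALL_THEN_RETRY;
            case STATUS_ENDPOINT_IO_ERROR:
            case STATUS_ERROR: // assumption: handled like a lost connection
                return Action.RETRY_WITH_BACKOFF;
            default:
                return Action.UNKNOWN;
        }
    }
}
```

In an app you would call something like StatusPolicy.actionFor(status.getStatusCode()) from the failure path and branch on the result.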

We've also lowered the log severity of "Cannot deserialize BluetoothDeviceName" in an upcoming release, since it's not really a warning. It's like you said: it's expected behavior when we see non-Nearby Connections devices while discovering.

If you continue to see problems, let us know what devices you're using and we'll be sure to add them to our test suite.

Xlythe
  • Thanks very much! I currently have logging statements all over my code, so I'll re-examine my logs in light of this new information, and see if a pattern emerges. From memory, one very common pattern is a STATUS_BLUETOOTH_ERROR followed immediately (on the next retry) with STATUS_ALREADY_CONNECTED_TO_ENDPOINT (without ever hitting onConnectionResult successfully). Adding a delay between retries didn't help. Are you suggesting that I stop discovery and/or advertising on every STATUS_BLUETOOTH_ERROR? If so, should I delay before restarting? – Rapunzel Van Winkle Sep 21 '17 at 03:01
  • Hmm. That could be a simultaneous connection clash; if you have 2 devices connecting to each other at the same time, we will (randomly) fail one of the devices' requestConnection() calls first, before the second device triggers onConnectionInitiated() on both devices. We use IO_ERROR for that one, though, not BLUETOOTH_ERROR... Yes, a valid fix to BLUETOOTH_ERROR is to stop discovery / advertising and then retry the requestConnection(). – Xlythe Sep 22 '17 at 15:46
  • Do I need to stop/start both advertising and discovery in order to fix it? I changed the code to just stop/start discovery after Bluetooth errors, and it didn't work. I think I also might be seeing simultaneous connection clashes, which are probably especially likely when trying fresh code on two test devices. I try to be careful: After discovery, when I send out a connection request, I ignore any new discoveries, and only respond to the onConnectionInitiated that we requested. (Should I be responding to those others?) I'm trying to create a simple repeatable test case (difficult as you know). – Rapunzel Van Winkle Sep 23 '17 at 02:13
  • Stop both. The less that's going on, the more we can do to try and fix it. That said, do give us the model of your phones. We'd rather look into fixing the root cause than have you jump through hoops like this. – Xlythe Sep 25 '17 at 19:54
  • Other than that, for simultaneous connections, we purposefully fail one connection so those exceptions you're seeing are expected. But the other connection should be succeeding... If it's not, I need to look into that. Best bet is to keep trying that retry strategy (+ maybe a random backoff). – Xlythe Sep 29 '17 at 07:22
  • You ask great questions, but I think they can be broken up even more. Let's try and keep each question/answer to a single paragraph (maybe 2, as needed). What about... "When is simultaneous advertising/discovery possible?" "Can I mix and match strategies?" "What happens to my connections when I switch strategies?" "Successful connection, immediately followed by disconnection" "Both sides request connections, but don't successfully connect" etc – Xlythe Oct 02 '17 at 20:16
  • Ask them all :) Especially the edge cases, because those are the hardest to add to the documentation (adding them to the docs makes it hard to read, because then everything comes with an asterisk). – Xlythe Oct 03 '17 at 00:18
  • ERROR generally means the connection was broken between onConnectionInitiated and while waiting for both sides to accept, e.g. the devices were too far away or the connection had a hiccup. It's a catch-all for anything except accept/reject. As for the follow up test, that's likely because the Bluetooth MAC address does not rotate. So you saw the old advertisement, and attempted to connect a few times (with several internal retries). When the device started advertising again, Bluetooth became connectable again, and the Discoverer happily connected, and the Advertiser happily accepted. – Xlythe Oct 06 '17 at 00:55
  • Ah, no, I meant Nearby Connections will retry internally. The following is a small part about how Connections works: The endpoint id is sent as a part of the advertisement, and it forms an id-MAC address pair on the Discoverer side. The Discoverer then tries to connect to the MAC address associated with the endpoint id. And it'll fail, because you stopped advertising. So it retries, and if you start advertising soon enough, one of those retries may succeed. Even though you advertised a different endpoint id/name, the MAC address is the same. – Xlythe Oct 06 '17 at 01:57
  • I'll file a bug to fix this. Is this a blocking bug for you? Or just a testing issue? – Xlythe Oct 06 '17 at 05:25
  • It's not really intended; just a side effect of Bluetooth. I'll file a bug. It's fixable without breaking backwards compatibility. – Xlythe Oct 06 '17 at 05:53
  • Oh, thanks for adding the new questions at the bottom here. I missed a few of them. I'll try and respond by late Monday. – Xlythe Oct 07 '17 at 00:33
  • Yup, I can definitely do that. – Xlythe Oct 07 '17 at 00:49
  • I believe I answered everything. Let me know if there are any more questions (or if any of my answers were insufficient). – Xlythe Oct 12 '17 at 19:28
  • I've accepted this very useful answer, removed the temporary section at the end of my question, and cleaned up most of my comments here. (I left my first couple of comments, because I may write another question based on them later.) I didn't flag any of your comments for deletion, since you'd be a better judge about that. I think there's still some useful info in your comments that could be worked into your answers to help other developers? And are you interested in more questions? I'm planning to write some! Meanwhile, feel free to upvote any of these questions that you find interesting :) – Rapunzel Van Winkle Oct 13 '17 at 04:31
  • I find STATUS_ENDPOINT_IO_ERROR to occur very frequently when two devices try to connect to each other at the same time. In this case, the two devices never end up connecting to each other. Should I restart Discovery and Advertising on these devices in the hope they'll connect successfully again? Or will the API attempt to reconnect the devices? – shortstheory Feb 21 '18 at 21:12
  • 2 devices attempting to connect at the same time is a special case -- 1 device is expected to error out while the other side is expected to succeed. So IO_ERROR is an expected case, but only for one side... Unfortunately, the radios don't handle simultaneous connections very well and can jam up. Internally, we do some randomized backoff and retry, but if you got an error from requestConnection(), that means all of our retries have failed. – Xlythe Feb 21 '18 at 23:25
  • As a developer, the best that you can do is either... (a) Restart advertising / discovery. Eventually they'll connect successfully. This is what the WalkieTalkie Automatic sample does. (b) Build in your own randomized backoff and retry requestConnection() until it works (or until you receive onConnectionInitiated from the other side). – Xlythe Feb 21 '18 at 23:34
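Option (b) from the comments above could look roughly like this. It is a minimal sketch of a randomized backoff delay, assuming an exponentially growing ceiling with a uniformly random delay below it; the comments only say "randomized backoff", so the exponential part, the RetryBackoff name, and the parameters are all assumptions.

```java
import java.util.Random;

/** Hypothetical helper; computes how long to wait before retrying requestConnection(). */
class RetryBackoff {
    private final Random random;
    private final long baseMillis; // ceiling for the first retry
    private final long maxMillis;  // upper bound for any retry

    RetryBackoff(Random random, long baseMillis, long maxMillis) {
        this.random = random;
        this.baseMillis = baseMillis;
        this.maxMillis = maxMillis;
    }

    /**
     * Delay before retry number `attempt` (0-based): uniformly random in
     * [0, min(maxMillis, baseMillis * 2^attempt)].
     */
    long delayMillis(int attempt) {
        int shift = Math.min(attempt, 20); // cap the exponent so the shift cannot overflow for modest bases
        long ceiling = Math.min(maxMillis, baseMillis << shift);
        return (long) (random.nextDouble() * (ceiling + 1));
    }
}
```

Because each device draws its own random delay, two devices that clashed once are unlikely to retry at exactly the same moment again. You would stop retrying as soon as onConnectionInitiated arrives from the other side.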

I just want to add that it may be necessary to have a short client name string when calling the API.

E.g., Nearby.Connections.requestConnection(googleApiClient, shortNameHere,....)

I had been generating my own client name with UUID.randomUUID().toString() and that seemed to cause the STATUS_BLUETOOTH_ERROR. All I did was change the code sample to use a UUID name and to use P2P_CLUSTER and I got that error.

This was the solution for me regarding the STATUS_BLUETOOTH_ERROR.
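For illustration, a short name could be generated along these lines. This is only a sketch: ShortName is a hypothetical helper, the length of five digits is an assumption based on what the Google sample is reported to use, and whether name length really influences STATUS_BLUETOOTH_ERROR is unconfirmed.

```java
import java.util.Random;

/** Hypothetical helper for producing a short, numeric client name. */
class ShortName {
    /** Returns a string of `length` random decimal digits. */
    static String generate(Random random, int length) {
        StringBuilder sb = new StringBuilder(length);
        for (int i = 0; i < length; i++) {
            sb.append((char) ('0' + random.nextInt(10)));
        }
        return sb.toString();
    }
}
```

Calling ShortName.generate(new Random(), 5) once and storing the result means the device keeps the same name across launches.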

Markymark
  • That's interesting! I've got a bunch of questions, though. Are you using a 4-character string for your short name, or some other length? How do you generate it? Is it the same name that you give when you start advertising or discovering? Did it really get rid of all of your STATUS_BLUETOOTH_ERROR? I'm still seeing this intermittently, and sometimes it gets into a state where it never comes out of it (even after stopping/starting advertising and discovery). I'm willing to try your idea, but need to understand it better first. – Rapunzel Van Winkle Nov 14 '17 at 04:43
  • I'm now using a 5 character string of numbers like the one in the sample (Just using the java.util.Random class). Yes, I was/am using SharedPreferences to store the generated ID. So I've been playing with it for about another hour and I did get another `STATUS_BLUETOOTH_ERROR` intermittently. Previously I would get that error every time. This is the sample I'm talking about https://github.com/googlesamples/android-nearby – Markymark Nov 14 '17 at 05:18
  • Right now I'm facing a problem where I try to send over bytes and a file as described here https://developers.google.com/nearby/connections/android/exchange-data but immediately when I attempt to send, the connection is lost. I'm still investigating it. – Markymark Nov 14 '17 at 05:21
  • Well, intermittent errors are better than errors every time, so I guess that's progress. I'm currently using names that are 17-31 characters (16 pseudo-random characters stored in Shared Preferences, concatenated with a user-defined screen name that's 1-15 characters). – Rapunzel Van Winkle Nov 14 '17 at 06:36
  • Good to know. I just realized that the problem from my previous comment was caused by the main activity being paused when I started an intent to choose an image (to send a file to the other device). Apparently, everything disconnects when the activity goes into the background. I'm sure the reason is battery drain. I was really hoping that an IntentService with a foreground notification could also utilize the Nearby API. Not sure that's possible though. – Markymark Nov 14 '17 at 06:40
  • With regard to your lost connections, I seem to get onEndpointLost a lot more often with the most recent Google Play Services. I did have code to downgrade the endpoint's status to "unknown" and try to reconnect, but seemed to always get "ALREADY CONNECTED" back. So I'm currently just logging onEndpointLost and otherwise just ignoring it (basically treating that callback as noise). I'm not sure what I'd do in a production app! I prefer to have robust error handling, but according to my current observations, it seems to just spontaneously recover from onEndpointLost anyway. – Rapunzel Van Winkle Nov 14 '17 at 06:43