Broadcast Messages with Source set

Hey Karel,

What source port are you sending on, and what port are you listening on? When you send a message with the source set to a non-zero value, the response is returned to the source port of the original message. That's the most common issue people hit when trying to get responses with the source set.

The other thing to check would be the res_required and the ack_required fields.

Failing both of those, give me the hex of the packet you are sending and I'll take a look.

Well, the source port is a locally bound broadcast address that I listen on using Selectors in Java NIO. So, anything sent back to the source should be picked up. The message in question is GetService, to detect the device in the first place. It is set tagged as per the specification; however, the documentation does not state that res_required and ack_required should be set. Should they be set for this message type?

Karel

I did some testing and verification via Wireshark. Doing a GetService with source set will not yield a reply from the bulb. They only respond to source = 0 for this kind of message.

Works fine here. Do you have the header frame ‘tagged’ set to 1, and ‘target’ zero-filled? Do you have any other broadcast messages working?

Yep - broadcasts, as long as source = 0, are working fine. Once I set it to a non-zero value, there is no response.

Must be something else wrong with your packets, I’m definitely using source with no problem in my client lib.

Can we get a hex dump of the packet you are sending so that I can test it here locally?

Will provide that asap. What email address can I send this to? (Is a Wireshark dump OK as well?)

Daniel,

Wireshark is not really convenient for copy-pasting data, but here is some feedback. For your info, I have 3 bulbs in place. On the client side, there is one thread per bulb, and each thread does the same thing: broadcast to .255 with a Source set. Below, .181 is the client host. Packets are in sequential order as sent on the LAN; the Offset and Bytes columns show the LIFX payload.

From .181 -> .255

0000 24 00 00 34 1b 8b c3 f6 00 00 00 00 00 00 00 00
0010 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00 00
0020 02 00 00 00

From .181 -> .255

0000 24 00 00 34 e8 de dd 78 00 00 00 00 00 00 00 00
0010 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00 00
0020 02 00 00 00

From .181 -> .255

0000 24 00 00 34 22 dd c5 63 00 00 00 00 00 00 00 00
0010 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00 00
0020 02 00 00 00

As you can see, the packets are very similar, except for the Source field (bytes 4-7 of each dump).
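For reference, here is a minimal Java sketch of how such a 36-byte GetService packet can be assembled. The field layout is inferred from the dumps above and the LIFX LAN header (size, protocol/tagged bits, source, target, flags, sequence, message type); the class and method names are made up for illustration:

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class GetServicePacket {
    // Builds a 36-byte GetService message like the dumps above.
    // All multi-byte fields are little-endian on the wire.
    static byte[] build(int source, byte sequence) {
        ByteBuffer buf = ByteBuffer.allocate(36).order(ByteOrder.LITTLE_ENDIAN);
        buf.putShort((short) 36);          // size: whole message is 36 bytes
        buf.putShort((short) 0x3400);      // protocol 1024, addressable=1, tagged=1
        buf.putInt(source);                // client-chosen source identifier
        buf.position(buf.position() + 8);  // target: all zeros for broadcast
        buf.position(buf.position() + 6);  // reserved
        buf.put((byte) 0);                 // flags: res_required=0, ack_required=0
        buf.put(sequence);                 // sequence number
        buf.position(buf.position() + 8);  // reserved
        buf.putShort((short) 2);           // message type 2 = GetService
        // the remaining 2 reserved bytes stay zero
        return buf.array();
    }

    public static void main(String[] args) {
        // source bytes 1b 8b c3 f6 on the wire, as in the first dump
        byte[] pkt = build(0xF6C38B1B, (byte) 1);
        StringBuilder hex = new StringBuilder();
        for (byte b : pkt) hex.append(String.format("%02x ", b));
        System.out.println(hex.toString().trim());
    }
}
```

Running `main` prints the same 36 bytes as the first dump above, which makes it easy to diff your own encoder against a known-good capture.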

From .197 (bulb) -> .181

0000 29 00 00 54 22 dd c5 63 d0 73 d5 01 c5 93 00 00
0010 4c 49 46 58 56 32 00 01 44 a7 cc 31 1e ac 0e 14
0020 03 00 00 00 01 7c dd 00 00

From .180 (bulb) -> .181

0000 29 00 00 54 22 dd c5 63 d0 73 d5 01 eb 9f 00 00
0010 4c 49 46 58 56 32 00 01 04 50 87 45 1e ac 0e 14
0020 03 00 00 00 01 7c dd 00 00

From .187 (bulb) -> .181

0000 29 00 00 54 22 dd c5 63 d0 73 d5 02 0b 98 00 00
0010 4c 49 46 58 56 32 00 01 84 6e 4a 57 1e ac 0e 14
0020 03 00 00 00 01 7c dd 00 00

From .187 -> .181

0000 29 00 00 54 22 dd c5 63 d0 73 d5 02 0b 98 00 00
0010 4c 49 46 58 56 32 00 01 c4 d4 4d 58 1e ac 0e 14
0020 03 00 00 00 05 7c dd 00 00

After this, all bulbs send a bunch of StateService responses to .255 using some other Source value (i.e. not one of the ones I used initially).

Now, apart from the fact that I have all threads listening on port 56700 (which is wrong, as each thread may in turn read data from that socket and thus end up with the wrong UDP packet in the wrong thread), it is quite strange to notice that in a series of broadcasts, the bulbs only respond with the Source set to the latest Source value they received. Should the bulbs therefore be given more time between UDP broadcasts, so they can respond with the proper Source value set?

Karel

I’m getting some feedback from the firmware team before I answer this one. It’s a little bit out of my depth. Stay tuned…

I was stuck for a while with sockets in Java before, too.

My final solution is to open a DatagramSocket on port 56700 (it can be opened only once per machine).
You also have to call setBroadcast(true) on the socket when sending a packet to 255.255.255.255.
I send a single broadcast message and receive all responses until a timeout (there are multiple packets, so we have to filter by message ID, source and other fields).

Here are discovery packets for the reference:
Send >
24 00 00 34 BE BA FE C0 00 00 00 00 00 00 00 00
00 00 00 00 00 00 01 00 00 00 00 00 00 00 00 00
02 00 00 00
Receive <
29 00 00 54 BE BA FE C0 D0 73 D5 03 49 27 00 00
4C 49 46 58 56 32 00 00 C4 D2 CB FB 67 4C 0F 14
03 00 00 00 01 7C DD 00 00
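To sanity-check those bytes, here is a small Java sketch (the class and helper names are mine) that decodes the interesting fields of that 41-byte StateService reply, using the same offsets as the request header: source at bytes 4-7, target MAC at 8-13, message type at 32-33, and the payload (service byte plus a uint32 port) from byte 36:

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class StateServiceParser {
    // Decodes the fields a discovery client actually needs from a
    // StateService (type 3) reply. All integers are little-endian.
    static String parse(byte[] pkt) {
        ByteBuffer buf = ByteBuffer.wrap(pkt).order(ByteOrder.LITTLE_ENDIAN);
        int size = buf.getShort(0) & 0xffff;      // 41 for StateService
        int source = buf.getInt(4);               // echoed client source
        StringBuilder mac = new StringBuilder();  // bulb MAC, bytes 8-13
        for (int i = 8; i < 14; i++) mac.append(String.format("%02x", pkt[i]));
        int type = buf.getShort(32) & 0xffff;     // 3 = StateService
        int service = pkt[36] & 0xff;             // 1 = UDP
        long port = buf.getInt(37) & 0xffffffffL; // 56700
        return String.format("size=%d source=%08x mac=%s type=%d service=%d port=%d",
                size, source, mac, type, service, port);
    }

    static byte[] hexToBytes(String s) {
        String[] parts = s.trim().split("\\s+");
        byte[] out = new byte[parts.length];
        for (int i = 0; i < parts.length; i++) out[i] = (byte) Integer.parseInt(parts[i], 16);
        return out;
    }

    public static void main(String[] args) {
        byte[] reply = hexToBytes(
            "29 00 00 54 be ba fe c0 d0 73 d5 03 49 27 00 00" +
            " 4c 49 46 58 56 32 00 00 c4 d2 cb fb 67 4c 0f 14" +
            " 03 00 00 00 01 7c dd 00 00");
        System.out.println(parse(reply));
        // size=41 source=c0febabe mac=d073d5034927 type=3 service=1 port=56700
    }
}
```

Note how the reply echoes the source from the request (`be ba fe c0` reads back as 0xC0FEBABE) and advertises UDP service on port 56700, which is what Daniel's filtering step relies on.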

Part of my client is a separate thread that does discovery, and I “reserved” port 56700 for that thread, setting all the broadcast flags and so on. For the regular thread-per-bulb handlers, I stepped away from V1 compatibility and used a different port number (56701, 56702, …) per bulb.

Karel

Yeah, discovery is a special case; you can’t do it in a multi-threaded way.
Just to be sure: you sent a single GetService with a defined Source, listened on the same port, and got a response with a different Source?

Yep, because I mistakenly assumed that when you bind the same port in different threads, each thread would receive copies of the same datagram, but it doesn’t work that way. In Java NIO it is first come, first served.

To get back on-topic, I did some more testing today, and indeed, when you fire off too many broadcast messages at once, the previously “set” source field values are discarded.

It would be great if the firmware team could specify the minimum time between broadcast messages, e.g. how much time a bulb needs to properly respond to a GetService before the next one can be sent.

Hi Karel,

Protocol v1 relied on distributed state synchronization and used broadcast responses exclusively, but we realized that does not scale well with the way 802.11 works.

With the Protocol v2 changes, we ensured that you do not need to use broadcast except for initial discovery.

Let’s assume you have LIFX bulbs running the latest firmware (or at least a version that supports Protocol v2); discovery then works as below:

You should:

  1. Send a broadcast packet with a client specific source set in the header.
  2. Receive a unicast response back from all the LIFX bulbs to the port that your UDP datagram originated from.
  3. Keep track of the LIFX bulbs and their IPv4, send subsequent messages unicast directly to the bulbs.

You should see a request / response sequence like below:

14:15:31.584364 IP (tos 0x0, ttl 64, id 35626, offset 0, flags [DF], proto UDP (17), length 64)
    tardis.lan.40418 > 192.168.1.255.56700: UDP, length 36
14:15:31.599144 IP (tos 0x0, ttl 64, id 426, offset 0, flags [none], proto UDP (17), length 69)
    192.168.1.239.56700 > tardis.lan.40418: UDP, length 41

You should be able to use a single UDP socket that is used to send and receive datagrams, let us know if you’re having trouble with this.
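The three steps above can be sketched in Java as follows. This is a loopback self-test rather than a definitive implementation: a fake “bulb” thread stands in for real hardware so the sketch runs anywhere, and on a real LAN the destination would be x.x.x.255:56700 instead. All class and method names are made up:

```java
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.SocketTimeoutException;
import java.util.HashMap;
import java.util.Map;

public class DiscoverySketch {
    // One socket both sends the GetService broadcast and receives the
    // unicast replies; bulbs are tracked by MAC so that all later
    // messages can be sent unicast. The source is read from bytes 4-7
    // (little-endian) and the MAC from bytes 8-13 of each reply.
    static Map<String, InetSocketAddress> discover(DatagramSocket sock, byte[] request,
                                                   InetSocketAddress dest, int source,
                                                   int timeoutMs) throws Exception {
        Map<String, InetSocketAddress> bulbs = new HashMap<>();
        sock.setBroadcast(true);
        sock.setSoTimeout(timeoutMs);
        sock.send(new DatagramPacket(request, request.length, dest));
        byte[] buf = new byte[128];
        try {
            while (true) {
                DatagramPacket p = new DatagramPacket(buf, buf.length);
                sock.receive(p);
                byte[] d = p.getData();
                int replySource = (d[4] & 0xff) | (d[5] & 0xff) << 8
                                | (d[6] & 0xff) << 16 | (d[7] & 0xff) << 24;
                if (replySource != source) continue;          // someone else's reply
                StringBuilder mac = new StringBuilder();
                for (int i = 8; i < 14; i++) mac.append(String.format("%02x", d[i]));
                bulbs.put(mac.toString(), (InetSocketAddress) p.getSocketAddress());
            }
        } catch (SocketTimeoutException endOfWindow) { /* discovery window closed */ }
        return bulbs;
    }

    public static void main(String[] args) throws Exception {
        InetAddress lo = InetAddress.getLoopbackAddress();
        DatagramSocket bulb = new DatagramSocket(0, lo);   // fake bulb
        new Thread(() -> {
            try {
                DatagramPacket req = new DatagramPacket(new byte[64], 64);
                bulb.receive(req);
                byte[] reply = new byte[41];
                System.arraycopy(req.getData(), 4, reply, 4, 4);   // echo the source
                byte[] mac = {(byte) 0xd0, 0x73, (byte) 0xd5, 0x03, 0x49, 0x27};
                System.arraycopy(mac, 0, reply, 8, 6);
                bulb.send(new DatagramPacket(reply, reply.length, req.getSocketAddress()));
            } catch (Exception ignored) { }
        }).start();

        DatagramSocket client = new DatagramSocket(0, lo);
        byte[] getService = new byte[36];
        int source = 0xC0FEBABE;
        for (int i = 0; i < 4; i++) getService[4 + i] = (byte) (source >> (8 * i));
        Map<String, InetSocketAddress> found = discover(client, getService,
                new InetSocketAddress(lo, bulb.getLocalPort()), source, 500);
        System.out.println(found.keySet());   // [d073d5034927]
    }
}
```

The key point matches the advice in this thread: a single socket handles both the broadcast send and the unicast receives, and the reply's destination is whatever ephemeral port the request went out on.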


Thanks, what you describe is exactly what I do, but with a twist. The ecosystem I add the bulbs to has a dynamic configuration capability driven by device detection (not specific to LIFX) and configuration by a user through a GUI. However, the configuration of the ecosystem can also be provided by a Domain Specific Language, in which, for LIFX, devices are identified by their MAC address.

So, one part of the code does a general discovery of devices, using the V1 compatibility mode by binding port 56700 and reading back all the responses, and converts that into what we call discovery results in a kind of inbox, from which the user can select and configure devices.

For each bulb in the system, whether added through the GUI or defined in the DSL, there is a separate thread running that does all the management and communications. Since bulbs can be defined in a DSL, where there is no discovery as such, each thread (or handler, as we call them) needs to discover its bulb on the LAN, because we only have the bulb’s MAC address as a configuration parameter. It has to be done per bulb, as each bulb (or device in general) can be put online, offline, and so on in the ecosystem. In essence, at startup each thread broadcasts using a specifically bound port (56701, 56702, and so forth), matches the returned responses against the MAC address it is managing, configures the bulb, and continues all further communication using unicast. Exactly as you describe. (Though I agree that with 3 bulbs, you initially get back 3 × 3 broadcast responses.)

The problem is that when these broadcasts are sent at (almost) the same time — it depends on other code execution, but it is pretty close — the bulbs seem to discard the earlier broadcast messages and only take into consideration the last Source they see.

I could imagine the solution lies in pacing the broadcasts on the LAN, allowing the bulbs time to respond. But how long should that interval be?

Hi,

I understand. Let me clarify a bit more how we keep track of the packets (source, sender IP and port), which might help you with the problem.

  1. We have about 16 slots in our table (this might increase or decrease in future firmware revisions).
  2. Each slot is keyed using a hash of the source, sender IP and port.
  3. Requests are tracked using this key.
  4. Responses are then sent unicast by looking up the source, IP and port information in this table.

If you’re sending multiple requests using the same source, IP and port, the responses will go to the right destination. We advise you to use a wrap-around sequence number to differentiate the individual request packets.

If you are sending more than 16 concurrent requests with different source, IP and port combinations, then there is a chance that LRU eviction will kick in; this means a few of the responses will end up being broadcast to the LIFX UDP port 56700.
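The eviction behaviour described above explains why Karel's bulbs only honoured the latest Source. Here is a toy Java model of such a table, built on `LinkedHashMap`'s eldest-entry eviction; the slot count and key format come from the description in this thread, while the real data structure is of course firmware-internal:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ResponseTable {
    // Hypothetical model of the bulb's response table: about 16 slots,
    // keyed on (source, sender IP, sender port), least recently used
    // entry evicted when a 17th distinct combination arrives.
    static final int SLOTS = 16;

    static Map<String, String> newTable() {
        // accessOrder=true so lookups for outgoing responses also
        // refresh an entry's position, mimicking LRU behaviour.
        return new LinkedHashMap<String, String>(SLOTS, 0.75f, true) {
            @Override protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
                return size() > SLOTS;
            }
        };
    }

    public static void main(String[] args) {
        Map<String, String> table = newTable();
        // 17 distinct (source, ip, port) combinations: the first falls out.
        for (int i = 0; i < 17; i++) {
            table.put("source" + i + "/192.168.1.181/5670" + (i % 10), "pending");
        }
        System.out.println(table.size());                                    // 16
        System.out.println(table.containsKey("source0/192.168.1.181/56700")); // false: evicted
    }
}
```

With many threads each broadcasting under its own (source, port) pair at once, it is easy to overflow 16 slots, and the evicted requests' responses fall back to broadcast, which is exactly the behaviour Karel observed.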

If you think pacing the requests is what you have to do, I recommend looking at the RTT and keeping the number of concurrent requests below 16.

A better design would be to have a single sender thread that handles all the network transport, so you can effectively control the rate at which broadcasts and unicasts are sent; this gives you control over the QoS.

The protocol documentation specifically addresses this, and recommends a maximum rate of 20 packets per second per device. This does mean that for accurate rate limiting you need to track both broadcast and unicast packets per device.
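A single sender thread enforcing that 20-packets-per-second-per-device cap could look like the sketch below. This is simple minimum-interval pacing (one packet per 50 ms per device) rather than a full token bucket, and the class name is made up; time is passed in explicitly so the logic can be exercised without sleeping. A broadcast would be charged against every known device, per the note above:

```java
import java.util.HashMap;
import java.util.Map;

public class RateLimiter {
    // 20 packets/second per device => at least 50 ms between packets
    // to the same device.
    static final long MIN_INTERVAL_MS = 1000 / 20;

    private final Map<String, Long> lastSent = new HashMap<>();

    // Returns true if a packet to `device` may be sent at `nowMs`;
    // the sender thread should queue the packet and retry otherwise.
    boolean trySend(String device, long nowMs) {
        Long last = lastSent.get(device);
        if (last != null && nowMs - last < MIN_INTERVAL_MS) return false;
        lastSent.put(device, nowMs);
        return true;
    }

    public static void main(String[] args) {
        RateLimiter limiter = new RateLimiter();
        System.out.println(limiter.trySend("d073d5034927", 0));   // true: first packet
        System.out.println(limiter.trySend("d073d5034927", 30));  // false: only 30 ms elapsed
        System.out.println(limiter.trySend("d073d5034927", 50));  // true: 50 ms elapsed
        System.out.println(limiter.trySend("d073d5012345", 30));  // true: different device
    }
}
```

In a real client, the sender thread would call `trySend` with `System.currentTimeMillis()` before each transmit, keyed by the bulb's MAC.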