The following is an overview of the Constrained Application Protocol (CoAP) and how Itron is using it to strengthen its presence in the Internet of Things (IoT) world. CoAP is a simple, RESTful web transfer protocol designed specifically for IoT and resource constrained equipment. It is for machine-to-machine applications such as smart energy and building automation.
CoAP has the advantage of being low cost and secure. It can work with 10KiB of RAM and 100 KiB of code space. At the same time its ease of use and platform independence make CoAP more attractive to users. CoAP supports different data types like text, HTML, XML, JSON among others. It can ask for messages to be returned in different formats too.
Itron’s platform supports the implementation of the CoAP protocol end to end through the NIC, or the CoAP library support on the Core Snappy Linux environment on the IoT Edge Router. The CoAP implementation includes a CoAP Gateway software that serves as a proxy between the client and the end device. A CoAP API completes the solution, providing CoAP interface for client applications.
One use case has a NIC as the CoAP proxy for sensors that are attached to it. The NIC also acts as a CoAP server for data that it provides, such as date, time, battery level, signal strength, or any other sensor data built into it. The intelligent endpoints and sensors are implemented as CoAP servers.
Sensor devices communicate using CoAP through the NIC based developer kits. The NIC routes CoAP messages through the mesh via an access point (AP) and a CoAP proxy that is Gateway. A sensor device acts as a CoAP server and the application that communicates with it is a CoAP Client.
A CoAP Client can reach the sensor by sending CoAP requests to the CoAP Gateway using CoAP Gateway APIs. The NIC contains a CoAP proxy that communicates to an attached sensor via a UART interface using CoAP over HDLC. Embedded in the HDLC are CoAP calls that are handled by the sensor attached to the NIC.
A user CoAP client is required to communicate with an end device. A CoAP protocol command is sent by the client to the Gateway which acts as a CoAP proxy. This communication with end devices is facilitated through the CoAP Gateway API.
The CoAP Gateway’s (GW) job is to control traffic flow from Itron's back office to the mesh network. Some of that traffic is CoAP, so GW acts as a CoAP proxy. Users establish a session with GW and can send proxy requests to GW. GW converts these requests and sends them out to the mesh so that they can reach devices. When a device responds, the response comes back through GW and is passed back to the original client.
Gateway as a CoAP Proxy
Since the Gateway functions as a standard CoAP-to-CoAP proxy, usage mostly follows the CoAP standard.
Gateway's CoAP Proxy supports the following features:
- Mesh-side security - CoAP Gateway encrypts traffic bound for Itron NIC-equipped devices using our proprietary encryption.
- Address resolution - clients access devices by host name, abstracting away the dynamic routing on Itron's mesh
- Proxy of basic CON/NON CoAP requests using the Proxy-Uri and Proxy-Scheme options - responses to these requests will be proxied back through Gateway to the client that made the request.
- Reliable retransmission for CON messages - gateway tracks outbound (sent to the client) confirmable messages and expects to receive an ACK/RST response from the client. If the ACK/RST is not received Gateway will attempt to resend the confirmable message as described in the CoAP RFC.
- Proxy CoAP observations of mesh devices that expose observable resources - Gateway allows the client to initiate an observation via a proxy request and will forward subsequent update notifications from the mesh CoAP server to the observing clients.
- Aggregation of CoAP observations as described in Section 5 of the CoAP observation RFC - Gateway handles observe requests by starting an observation "on behalf of" a client, but Gateway "owns" this observation. This allows Gateway to use a single observation of a mesh resource to service multiple observation requests for the same device/resource combination. This reduces redundant network traffic and prevents burdening constrained network devices with multiple observers.
- Response caching based on the indicated Max-Age option in the CoAP response.
Proxy features NOT supported by Gateway:
- Proxied CoAP ping requests to mesh devices - this is actually not possible based on the CoAP spec. A "CoAP ping" is achieved by sending an empty CON message to a server, which is an invalid combination and should result in an RST message from the server that indicates the server received the message. However, it is not possible to proxy an empty message because it cannot include the Proxy-Uri or Proxy-Scheme options (because it is empty). Instead, clients should send a proxied GET request for a mesh device's /.well-known/core resource to determine if it is responsive to CoAP requests.
- Block-wise transfer of CoAP proxy messages - at the time of this writing no Itron mesh devices support CoAP blockwise transfer. Block-wise responses may also not work correctly with the Gateway's response caching mechanism.
- DTLS security - Gateway currently does not support DTLS either between the client and Gateway or between Gateway and the mesh device.
At this time no Itron mesh devices support CoAP blockwise transfer. Gateway does support block-wise transfer for its local resources.
Client applications must delegate address resolution to the Gateway for devices on the Itron mesh, hence device-CoAP server URIs must specify only the device MAC ID.Gateway exposes an API that supports the CoAP-to-CoAP forward proxy for access to devices on the Itron mesh. Its features include:
- Transparent forwarding of CoAP protocol messages to devices.
- Address resolution - clients access devices by host name, abstracting away the dynamic routing on Itron's mesh.
- Mesh-side security - CoAP Gateway encrypts traffic bound for Itron's NIC-equipped devices using our proprietary encryption. It is planned to support end-to-end DTLS in a future release.
Gateway Local CoAP Resources
Gateway directly exposes a few local CoAP resources:
- /.well-known/core - Allows discovery of local resources.
- /echo - Gateway will send a response containing the same body as the request.
- /help - Provides some information about using the Gateway's CoAP proxy, particularly how session management works.
- /sessions - Used by Gateway CoAP Proxy clients to initiate an authenticated session with the Gateway. A session must be established prior to sending proxy requests.
Immediate ACK Note
On the mesh side ("downstream”) of Itron's proxy implementation of CoAP, which is in the Gateway:
- 5683 is the unsecured port – data is exchanged in clear text on the mesh
- 4849 is the secure port - encrypted data is exchanged on the mesh
On the interface between the CoAP client and the Gateway (“upstream”)
- Firewall port 5683 is open as a UDP port because the CoAP proxy servers always listen on port 5683
Gateway uses the concept of sessions to track client connections under a particular account. The CoAP standard has no concept of sessions, so this is an additional requirement specific to Itron's CoAP Gateway. A CoAP session has the following lifecycle:
- Create a session
- Do proxy requests that keep the session active
- Close the session
The following diagram shows the expected CoAP session lifecycle.
Users can capture this CoAP usage pattern in a shell script, allowing them to build complex functionality. The following resembles a typical pattern:
- Get JWT token
- Establish a Gateway session with JWT token
- One or more CoAP commands (such as a GET on a device)
- Delete the Gateway session
Gateway requires CoAP clients to establish a Gateway session before sending any proxy requests to mesh devices. Clients manage their sessions by posting account credentials, in the form of a JWT token, to the Gatewayʹs sessions resource. Refer to the Security section of the API Overview for security, authentication, and how to obtain a token.
In order to establish a session, you must POST your credentials to an endpoint on the Gateway itself. Session options are passed as query string parameters.
This is the security protocol used between the Gateway and the devices. The default will use Itron network (net_mgr V2) security. It can be disabled by setting it to 'none'.
If this is set to true, the account will only be allowed to create on session at a time on the Gateway.
Once the session is established, the Gateway uses the source address and port of the client to track the session. This has the advantage of not requiring clients to provide a session token with every request. But, it has the disadvantage of causing the Gateway to lose track of sessions if the client's ip address changes (say they restart their laptop and get a new address) or if the client's port changes (because their client uses a new socket for a request, or because operations put a misconfigured UDP proxy in front of the Gateway). In other words, CoAP clients must use the same internet address and port used to establish the session for all subsequent requests.
Gateway has a concept called "sticky sessions". When a client makes a second session request with the same account ID and the sticky=true query parameter Gateway will see if any other session for that account already exists and will update it to use the IP address/port of the new session request. This is intended to allow the session to "follow" a client's IP address and also to keep the client from accidentally opening multiple sessions when it doesn't really need them.
However, this does not transfer existing CoAP operations such as observations or reliability re-transmissions. The CoAP protocol has a number of dependencies on client IP address for doing request/response matching and it's likely that changing client address in the middle of a CoAP operation will cause confusion.
If you do not need multiple concurrent sessions on the Gateway (and you probably don't) then consider setting the query parameter sticky=true. If your client's socket address changes, you will have to re-authenticate, but the Gateway will re-use your previous session.
Message Response, Retry, and Timeout Handling
Note in the diagram that when a CoAP proxy request is sent to the Gateway it is queued to be sent to the mesh. Gateway immediately responds to the client with a CoAP ACK, indicating that the message was received and processed and that the client does not need to retry sending the request. Eventually the message is removed from the queue and sent to the mesh, which will result in one of the following:
- Success - the request makes it to the mesh device, the response comes back to Gateway and Gateway proxies the response to the client. This could include an error response from the server, but the communication was successful.
- Timeout & Retries - Gateway does not receive a response from the mesh device in a timely manner. In this case Gateway will retry sending the request to the device on behalf of the client. Gateway implements the CoAP reliable retransmission scheme and will make four attempts to send the message, each with an increasing timeout. The base timeout for the request is 20 seconds and each subsequent retry roughly doubles the timeout period (with a random factor). If all four retries fail, then Gateway will report an error to the client.
- Milli Request Serialization: The very long milli timeout is driven by the BAP retrying a request queued on the mesh. That timeout is multiplied for each pending request in the BAP queue - i.e. if there are 3 requests in the queue, the third one potentially has to wait for the other two to time out before it is processed. This results in an unpredictable timeout if more than one client is putting requests into the BAP queue. Gateway addresses this issue by strictly serializing requests to each milli to ensure that the BAP queue depth is predictable. Gateway will only allow one request per milli device at a time and will reject subsequent requests to that device until the current one completes. Requests made to "busy" milli devices will be rejected with a 5.03 SERVICE_UNAVAILABLE message with a payload indicating 'Request blocked by device serialization requirements, another request is enqueued or in process for this device.'
- Error - Gateway may report an error to the client, possibly from a timeout, or failure to resolve the target host, or policy limits preventing sending the request on behalf of the session.
In any case, the eventual CoAP response indicates the completion of the request transaction, the ACK just means Gateway received the request.
The Gateway's concept of a CoAP session is based on tracking the trusted IP address and port of a client that has authenticated. This means that if the client sends traffic from a different port or address Gateway will not recognize the client as having a session and will reject the request with an UNAUTHORIZED_401 (message code 129) response. When this occurs Gateway will log something like the following:
2017-04-28T01:37:18.981+0000 [gateway-nio-2-3] WARN SessionAwareHandler - Could not retrieve session for: /18.104.22.168:42278. Anonymous sessions are not allowed, please supply account Id
- Using the popular Firefox Copper CoAP plugin will not work. The plugin uses a different ephemeral port for every request and there's no way to configure it to use a constant port. This means Gateway will never recognize the client as having a session.
- coap-client from libcoap must be used with the "-p" flag for every command. This flag specifies the local port that the client will use to send and receive traffic from. If it is not specified then coap-client also uses an ephemeral port for every command.
- Firewalls that do UDP port forwarding cause subtle problems. The client can think it is sending on a stable port but the firewall may be changing that port behind the client's back to forward the request to Gateway. When this happens, Gateway is receiving traffic from whatever port the firewall decides and cannot match the port to one that the session was established on, which result in all requests being rejected.
CoAP-client uses a tiny buffer to read the CoAP URL from the console. If you are having trouble creating a session, you may need to specify the query options separately. For example:
Observation Aggregation by Gateway
The CoAP Gateway aggregates observations on behalf of multiple clients as described in section 5 of the CoAP observe RFC:
"If two or more clients have registered their interest in a resource with an intermediary, the intermediary MUST register itself only once with the next hop and fan out the notifications it receives to all registered clients. This relieves the next hop from sending the same notifications multiple times and thus enables scalability."
Gateway handles proxied observe requests by observing "on behalf" of the requester, which can be thought of as Gateway "owning" the observation of the mesh device. If additional clients attempt to observe the same resource then the existing observation is shared, Gateway does not start a second observation. When an observe notification is received, Gateway makes a copy of the response for each registered client and fills in the CoAP token for that observer before sending it to the client. The result is that the observe aggregation is essentially invisible to clients.
As described in the observe RFC, Gateway uses request options that are part of the Cache-Key to identify the observed resource. This includes the full request URI (host, port, path, and query parameters) as well as the "Accept" option that indicates the desired Content-Format of the response. Two requests that match all of these options exactly will generate the same Cache-Key and will be served by a single observation owned by Gateway.
The example above shows two clients registering for the same /temp resource on device123. As each client sends its observation request Gateway registers the client's IP address and the client token used to initiate the observation. Gateway then generates its own token (called "GWToken" in the diagram) and uses it to send the request to the server. This means that Gateway is acting as a client and is observing the resource on the server - Gateway owns the observation on behalf of the clients.
Gateway re-sends the initial observe request as each client starts an observation. This ensures that each client is sure to get a fresh response and that the server is still aware of the observation. As described in section 3.3.1 of the observe RFC, it is valid for a client (in this case Gateway) to re-issue a new GET request with options identical to the original observation at any time to refresh its state. The server responds with its current state, which is distributed back to the proxy observing clients. When the server emits a new state notification to Gateway, Gateway then forwards the update (modified to use the client's token) to each observing client.
When a client cancels its observation either via a RST response to an observe update or through an explicit cancellation request, Gateway removes that client from its list of observers. When a resource reaches zero registered observers then Gateway will send a cancellation request to the server, ending the shared Gateway owned observation.
Milli Serialization and Observe Cancelation
The Gateway CoAP proxy only allows a single request at a time to each milli. It is possible to initiate a case where quickly canceling a CoAP observation will result in a 5.03 SERVICE_UNAVAILABLE response caused by the serialization limitation. This occurs when:
- A client initiates a CoAP observation of a milli
- The client receives an initial response of 2.03 VALID to the start of the observation (indicating a successful start but no content)
- The client unsubscribes from the observation before any observe update is sent through the proxy.
This will result in a 5.03 SERVICE_UNAVAILABLE response to the unsubscribe request. This error response can be safely ignored, Gateway will correctly cancel the observation.
Each observation response includes a Max-Age option that indicates how long it is "fresh" for. If no Max-Age is sent in the response then Gateway fills in the default value of 60 seconds. When the last observation notification expires and becomes "stale" it may indicate a problem with the observation. The server may have rebooted and forgotten the observation or the responses may have been lost in the network, but either way the client needs to assume that the observation needs to be repaired. When the last update becomes stale Gateway will automatically re-issue the original observe request to attempt to get a fresh value. The response to this re-establishment request will be distributed out to all observing proxy clients.
Gateway also includes a CoAP response cache. Each confirmable GET request that passes through Gateway is hashed and its options are turned into a Cache-Key. Any CONTENT_205 response is the stored with this Cache-Key until the Max-Age of the response expires, at which point it is removed from the cache. Any subsequent request that matches the Cache-Key will be served the cached response rather than making a round-trip to the device on the mesh.
The one exception to this is that all observe start or observe cancel GET requests are not served from the cache. This is important because these requests alter the state of the server so they must make it through. However, as discussed above, this may result in clients receiving duplicate observe responses.
CoAP Tools & Reference Applications
Gateway exposes a CoAP API to act as a CoAP-to-CoAP forward proxy to allow clients to exchange CoAP messages with devices on Itron mesh network. The API allows clients to issue CoAP requests to mesh devices that expose CoAP servers. These requests are proxied by the GW.
A component of the Itron developer program, the Java CoAP client is a standalone command line utility that can be used to communicate with any Itron based CoAP, IoT device; any of the Hardware Developer Kits available for purchase on the Itron Developer Portal.
The JAVA CoAP Client is packaged as a jar and takes a configuration file, that defines network parameters and the IoT device, as well as command line options. Users can run the client on Linux, Windows, and MacOS.
A sample Windows CoAP Client application available in the SDK provides the capability to communicate with a CoAP sensor device on Itron network. The CoAP client automatically retrieves devices available to the user and enables communication with those devices by issuing CoAP commands. Source code for the application is available in the Document Center.
A sample Windows CoAP Server application available in the SDK provides the capability for a third party to simulate a sensor device connected to the Developers Kit. Source code for the application is available in the Document Center.
A Windows CoAP Server application can assume the role of a sensor device by connecting to a NIC based Dev Kit over UART. This is a useful test tool for simulating a sensor and developing an application that consumes sensor data through the Itron Data Platform or the CoAP Gateway.
CoAP Sensor Emulation
The Windows CoAP Server is written in open-source C# .NET, which, in turn, leverages other open source components. Some of the underlying CoAP implementation is taken from CoAPSharp, a CoAP implementation provided by EXILANT Technologies.
CoAP Server Functionality:
- Resource simulation
- Native NIC/Milli operations, including firmware image management and NIC/Milli-specific operations
- Diagnostic operations, such as plugin validation and review of recent CoAP requests and responses
- Tools for generating plugin shell code in the correct simulator format, and for extracting the core code for insertion into the final sensor server implementation