READY FOR DISCUSSION
For the request interception strategy, we have concluded that using a VPN is the only viable option. Given that choice, there are still options to consider, for instance which VPN protocol to use, where to terminate the VPN tunnel, and which product to use to implement the VPN. This page discusses these options and ends with a conclusion.
Note that the below text does not distinguish HTTP from HTTPS. In theory, we can handle both protocols equally well, through the MITM approach described on the page Intercepting HTTPS traffic, but in practice we are not using this today because of issues with mobile apps.
This section describes a number of solution options:
The purpose of the VPN is to force all network traffic through the classification engine (in practice, Smoothwall). This approach has the benefit of making everything available to the classification engine: not just the URL, but also the full request and response, including payload and headers. This comes at the cost of having to tunnel all of the user's network traffic through the VPN tunnel and the Yona server running the classification engine. That puts a considerable bandwidth demand on the Yona servers, creates a potential bottleneck for the user, and causes considerable battery drain because of the encryption that VPNs apply.
On Android, it is possible to use tun2socks to set up a VPN service on the device that forwards all traffic to a SOCKS server, which could be running on the device itself. That SOCKS server could work in two ways:
- Route the traffic through the classification engine
  The SOCKS server would forward all requests to two destinations:
  - To the actual destination, for regular processing
  - To the classification engine, for classification. The classification engine should not serve the responses, to limit the network bandwidth.
- Send only the URLs to the classification engine
  The SOCKS server would need to do two things:
  - Interpret the requests and send all URLs to the classification engine for classification (can be done asynchronously).
  - Forward all requests to the actual destination
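To make the second option more concrete, here is a minimal sketch of the "interpret the requests" step. It assumes plain, unencrypted HTTP/1.x; for HTTPS the request line is not visible without the MITM approach mentioned earlier. The names `extract_url`, `forward_and_classify` and `classification_queue` are hypothetical, and the actual forwarding to the destination is left out.

```python
import queue
from typing import Optional

# Hypothetical in-process queue; a background worker would drain it and
# send the URLs to the classification engine asynchronously.
classification_queue = queue.Queue()


def extract_url(raw_request: bytes) -> Optional[str]:
    """Return the URL of a plain HTTP/1.x request, or None if it cannot be parsed."""
    head = raw_request.split(b"\r\n\r\n", 1)[0].decode("latin-1")
    lines = head.split("\r\n")
    parts = lines[0].split(" ")
    if len(parts) != 3 or not parts[2].startswith("HTTP/"):
        return None  # not an HTTP request line (for example TLS traffic)
    target = parts[1]
    if target.startswith("http://"):
        return target  # absolute-form target, as sent to proxies
    if not target.startswith("/"):
        return None  # e.g. "CONNECT host:port" or "OPTIONS *"
    for line in lines[1:]:
        name, _, value = line.partition(":")
        if name.strip().lower() == "host":
            return "http://" + value.strip() + target
    return None


def forward_and_classify(raw_request: bytes) -> None:
    """Queue the URL for classification; forwarding itself is out of scope here."""
    url = extract_url(raw_request)
    if url is not None:
        classification_queue.put(url)
```

Even this simplified version has to deal with several request forms, which hints at why interpreting arbitrary app traffic on the device is considered difficult.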
Instead of sending all requests or URLs to the classification engine, we could sample only a percentage. That could give a drastic reduction in load and still give a high probability of a good classification.
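The sampling idea can be sketched as follows. This is an illustration only: it assumes each request is sampled independently with the same probability, and the function names are hypothetical. Under that assumption, the chance that at least one of n requests is sampled is 1 - (1 - p)^n.

```python
import random


def should_sample(sample_rate: float, rng: random.Random) -> bool:
    """Decide whether a single request is sent for classification."""
    return rng.random() < sample_rate


def detection_probability(sample_rate: float, request_count: int) -> float:
    """Probability that at least one of request_count requests is sampled."""
    return 1.0 - (1.0 - sample_rate) ** request_count


# Example: sampling 10% of requests during a session that produces
# 50 requests belonging to one category.
print(detection_probability(0.10, 50))
```

With these example numbers the probability is above 0.99, while the classification load drops to roughly a tenth. The flip side is that activity consisting of only a handful of requests is likely to be missed.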
Comparison of options (+ is positive, - is negative):
|Criterion|Traffic through classification engine|Only URLs sent to classification engine|Comment|
|Sufficient info for classification|+|-|If the traffic flows through the classification engine, it gets the full requests and responses, so it has all possible data to make the right classification decision. With only URLs, it cannot do much.|
|Reduced network bandwidth|-|+|If the traffic flows through the classification engine, the Yona server needs to do all requests, resulting in the same amount of network traffic as before (actually slightly more, as all requests are sent twice).|
|Functionally correct|-|+|If the traffic flows through the classification engine, all requests are executed twice. This will lead to issues with some web servers.|
|Ease of implementation|+|-|The difficulty with the first option is preventing the classification engine from returning all responses. The difficulty with the second one is that the app needs to interpret and understand the network requests. That looks like a very difficult task.|
Concluding: The first option retains our current classification strength and is relatively easy to implement, but has very few benefits. The second one will reduce the classification accuracy and is very difficult to implement. Neither seems particularly attractive.
The tun2socks solution mentioned above could also be used to send the network traffic from the device to a SOCKS endpoint on the Smoothwall server (assuming Smoothwall supports this). This would take the encryption burden away from the device, thus reducing the battery drain.
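For reference, the sketch below builds the two messages a SOCKS5 client (RFC 1928) sends to open a connection without authentication. It assumes the endpoint speaks SOCKS5; the function names are hypothetical, and whether Smoothwall offers such an endpoint is still the open question noted above.

```python
import struct


def socks5_greeting() -> bytes:
    """Version 5, one auth method offered, method 0x00 (no authentication)."""
    return bytes([0x05, 0x01, 0x00])


def socks5_connect_request(host: str, port: int) -> bytes:
    """CONNECT request addressed by domain name (address type 0x03)."""
    name = host.encode("idna")
    if len(name) > 255:
        raise ValueError("host name too long for SOCKS5")
    # VER=5, CMD=1 (CONNECT), RSV=0, ATYP=3 (domain name), then name length
    header = bytes([0x05, 0x01, 0x00, 0x03, len(name)])
    # The port is sent as a 16-bit integer in network byte order
    return header + name + struct.pack("!H", port)
```

Note that SOCKS5 itself adds no encryption, which is where the battery saving comes from. It also means that any traffic the app does not already protect with TLS would travel to the server in the clear.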
Efficient VPN protocol
At the moment, we use SSL VPN with OpenVPN. This has several downsides. On Android, we currently embed OpenVPN in the app. This works well, but it makes the app bigger than necessary and extends the GPL license to the app. On iOS, we cannot bundle OpenVPN because of the GPL license, so users need to download and install it manually, which is cumbersome. Apart from these issues with OpenVPN, battery drain would be lower and speed higher if we could use a VPN protocol that is native to the devices.
The alternative would be to use IPsec and, on Android, the IKEv2 key exchange, as that is known to be very efficient. Unfortunately, Smoothwall does not support IKEv2. To get around that, we could consider not using the VPN option of Smoothwall and instead setting up a different VPN provider, but that brings the difficulty of passing the user identity from the VPN tunnel to Smoothwall.
To be done.