
USED FOR INITIAL ANALYSIS, NOT ACCURATE ANYMORE

Assumptions

  1. The cluster must support 10000 mobile users who are generating moderate internet traffic. Moderate is defined as 250 http requests/hour.
  2. The cluster must be available 24x7x365, with 99.99% availability, which corresponds to approx. 4.3 minutes of downtime per 30-day month. The Yona services for request analysis and status queries must be available 99.999%, and the Yona services for account management must be available 99.9%. The VPN server should be available 99.999%, as blocking user access to the internet is considered a no-go.
  3. The cluster supports the Yona microservices architecture, in which three services are recognized:
    1. Admin service. Provides administrative capabilities like maintaining the set of goals.
    2. Analysis service. Provides SmoothWall with the necessary information, analyzes suspicious HTTP requests, and creates appropriate entries for a person and their buddies.
    3. App service. Provides the services that back the mobile app: user sign up, account maintenance, buddy relationship maintenance, message retrieval, etc.
  4. The cluster avoids single points of failure
  5. The SmoothWall request forwarding to the Yona analysis service can only use Perl scripts.
  6. The cluster supports Docker deployment of all Yona services and all 3rd-party components.
  7. The cluster may be deployed on virtual machines in different ways:
    1. for DEV and TEST a cluster deployment on a single VM
    2. for ACC and PROD a cluster deployment on multiple VMs.
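
The availability targets in assumption 2 can be translated into downtime budgets. The following is a minimal sketch, assuming a 30-day month; the function name and the labels in `TARGETS` are illustrative and not part of the Yona code base.

```python
# Downtime budget implied by an availability target.
MINUTES_PER_MONTH = 30 * 24 * 60  # assumption: a 30-day month


def downtime_minutes_per_month(availability_percent: float) -> float:
    """Return the allowed downtime in minutes per 30-day month."""
    return MINUTES_PER_MONTH * (1 - availability_percent / 100)


# Targets taken from assumption 2; the labels are illustrative.
TARGETS = {
    "cluster": 99.99,
    "analysis and status services": 99.999,
    "account management services": 99.9,
    "VPN server": 99.999,
}

for name, target in TARGETS.items():
    minutes = downtime_minutes_per_month(target)
    print(f"{name}: {target}% -> {minutes:.2f} min/month")
```

Under this assumption, 99.9% allows roughly 43 minutes per month, 99.99% roughly 4.3 minutes, and 99.999% less than half a minute.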

(warning) The split in various microservices is motivated by earlier discussions, the first prototype of the Yona services and a number of requirements mentioned in various emails and discussions. The criteria for dividing functionality in microservices include separation of concerns, domain boundaries, differences in scalability and performance requirements and app-facing versus background batch functionality. 

Components

The following components are foreseen for the cluster architecture:

  1. Load Balancer for the web traffic. Alternatives are, among others, nginx and HAProxy. Proposal is to use nginx.
  2. Public REST web services (microservices), implemented in Java/SpringBoot. Alternatives are Apache Tomcat and Spring Cloud. Current prototype uses Tomcat.
  3. Private REST backend services (microservices), implemented in Java/SpringBoot. See #2 for alternatives
  4. Message streams between SmoothWall and the Analysis service. Alternatives are no messaging (using POSTs from Perl onto Analysis endpoint), RabbitMQ, Redis messaging and Spring AMQP. Proposal is to use direct POSTs in the first iteration. If throttling is required an AMQP messaging solution may need to be included.
  5. Relational database for storing Yona data. Alternatives are MySQL, JavaDB, HSQLDB, MariaDB. Current prototype uses HSQLDB. Proposal is to use MariaDB with Galera multimaster cluster.
  6. Cache for Yona data. Alternatives are Redis and HazelCast. Proposal is to use HazelCast.
  7. A bastion host for secure access to the cluster.
  8. Logging infrastructure. The alternative considered is the ELK (ElasticSearch, LogStash, Kibana) stack. Proposal is to use ELK.
  9. Perl script for posting HTTP requests to analysis services. See the details page.

(warning) For the components a few alternatives are listed. There are typically many others...

We may want to consider some broader infrastructure platforms, which offer a lot more functionality, but increase the learning curve...

  • Rancher/Rancher OS: a Docker-based software platform, OS, infrastructure, load balancer, etc.
  • SpringCloud: A cloud platform
  • OpenStack: A generic cloud platform definition and implementations
  • AWS: A mature cloud platform

Design

A first draft of the cluster is shown below. Message streams and caching are currently not included. It assumes multiple Linux (CoreOS is a well-defined container-supporting version) VMs, residing in different availability zones within CloudVPS.  PowerPoint is attached (no Gliffy alas)

First a more functional setup

 

Then a more technical setup, with specific components filled in

Detailed calculations of load

In order to discuss the cluster architecture based on facts, an Excel sheet with some calculations has been prepared. The calculations assume the original Yona goal: assessing the violations of common and user-level categories (goals). A new requirement for assessing the total time spent on the internet will affect the calculations, and the architecture, in a profound manner. The calculations should be used as input for performance and scalability testing.

Assumptions

  Parameter                                    Value
  # users                                      10000
  # http requests/user/hour                    250
  % of common level violations                 10%
  % of user level violations                   30%
  # Yona status calls/user/hour                10
  # Yona account retrieval calls/user/hour     0.05
  # Yona account update calls/user/hour        0.025
  # Yona account creation calls/hour           5
  # Yona buddy creation calls/hour             10
  Crypto read/write operations                 2
  % of push notifications per violation        10%

Resulting load

  Item                               Hour       Min      Sec
  # VPN servers                      10         10       10
  # http requests                    2500000    41667    694
  Perl script comparisons            2500000    41667    694
  Analysis service POSTS             250000     4167     69
  Goal conflicts                     75000      1250     21
  Crypto operations for write        150000     2500     42
  Database write operations          150000     2500     42
  Status service GETS/hour           100000     1667     28
  Crypto operations for read         200000     3333     56
  Push notifications                 7500       125      2
  Account service GETS/hour          500        8        0
  Account service PUTS/hour          250        4        0
  Account service POSTS/hour         15         0        0
  VPN Profile creation/hour          15         0        0
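
The figures in the table above can be reproduced with a short script. This is a sketch only: the formulas are inferred from the numbers and are not taken from the Excel sheet itself. In particular, it assumes that account service POSTs are the sum of account and buddy creations, and that writes cost two crypto operations and two database operations per goal conflict. Rows whose derivation is not stated (VPN servers, VPN profile creation) are left out.

```python
# Assumption values copied from the table above.
USERS = 10_000
HTTP_REQUESTS_PER_USER_HOUR = 250
COMMON_LEVEL_VIOLATION_RATE = 0.10
USER_LEVEL_VIOLATION_RATE = 0.30
STATUS_CALLS_PER_USER_HOUR = 10
ACCOUNT_RETRIEVALS_PER_USER_HOUR = 0.05
ACCOUNT_UPDATES_PER_USER_HOUR = 0.025
ACCOUNT_CREATIONS_PER_HOUR = 5
BUDDY_CREATIONS_PER_HOUR = 10
CRYPTO_OPS_PER_READ_OR_WRITE = 2
DB_WRITES_PER_GOAL_CONFLICT = 2  # inferred from the table, not stated in it
PUSH_RATE_PER_VIOLATION = 0.10


def hourly_load() -> dict[str, float]:
    """Return the hourly load per item, using the inferred formulas."""
    http_requests = USERS * HTTP_REQUESTS_PER_USER_HOUR
    analysis_posts = http_requests * COMMON_LEVEL_VIOLATION_RATE
    goal_conflicts = analysis_posts * USER_LEVEL_VIOLATION_RATE
    status_gets = USERS * STATUS_CALLS_PER_USER_HOUR
    return {
        "# http requests": http_requests,
        "Perl script comparisons": http_requests,
        "Analysis service POSTS": analysis_posts,
        "Goal conflicts": goal_conflicts,
        "Crypto operations for write": goal_conflicts * CRYPTO_OPS_PER_READ_OR_WRITE,
        "Database write operations": goal_conflicts * DB_WRITES_PER_GOAL_CONFLICT,
        "Status service GETS": status_gets,
        "Crypto operations for read": status_gets * CRYPTO_OPS_PER_READ_OR_WRITE,
        "Push notifications": goal_conflicts * PUSH_RATE_PER_VIOLATION,
        "Account service GETS": USERS * ACCOUNT_RETRIEVALS_PER_USER_HOUR,
        "Account service PUTS": USERS * ACCOUNT_UPDATES_PER_USER_HOUR,
        # Assumption: POSTs are account creations plus buddy creations.
        "Account service POSTS": ACCOUNT_CREATIONS_PER_HOUR + BUDDY_CREATIONS_PER_HOUR,
    }


def per_minute_and_second(per_hour: float) -> tuple[int, int]:
    """Convert an hourly figure to rounded per-minute and per-second figures."""
    return round(per_hour / 60), round(per_hour / 3600)


for item, per_hour in hourly_load().items():
    per_min, per_sec = per_minute_and_second(per_hour)
    print(f"{item:30} {round(per_hour):>9} {per_min:>7} {per_sec:>5}")
```

Changing a single assumption, for example the crypto operations per write, then shows directly how the per-second load shifts.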


12 Comments

  1. Please review the assumptions for the cluster architecture. In my opinion it is important to have consensus about them first, before we move to design.

    1. I've added a few comments to the assumptions. The proposed components are fine with me.

  2. Jan Bosch: please provide feedback on these assumptions.

  3. FYI: The Perl script is available and I've created a small wiki page about it.

  4. I propose to discuss the assumptions, the cluster architecture and the microservices approach in detail during a face2face meeting or a conference call. 

    1. This is very useful!

      A couple of questions and comments on the assumptions and formulas:

      1. "% of common level violations". With this, you mean that 10% of the entries in the SmoothWall log are potentially relevant to Yona, right?
      2. "% of user level violations". Does this imply that, out of the number of "potentially relevant" items, 30% are really relevant?
      3. If the above two are correct, then "Analysis service comparisons" is a bit confusing to me. I would suggest calling this goal conflicts for now. We're still in search of a better term, but it's at least in alignment with the current code.
      4. You currently calculate the number of crypto operations for write through a multiplication by 4. The current code does 3 crypto operations per message, but this isn't really necessary. The URL needs to be encrypted, but the goal association and the user association don't have to, because the data is anonymous. That brings it down to 1 per message.
      5. The multiplication factor for database write operations is not that straightforward. It'll be more than 2, but varying. The initial write implies:
        1. INSERT in MESSAGE_DESTINATIONS_MESSAGES
        2. INSERT in MESSAGES
        3. INSERT in MESSAGES_URLS
        Successive writes of additional URLs (as part of the same event) will imply:
        1. UPDATE of MESSAGES
        2. INSERT in MESSAGES_URLS
      6. It's hard to say something about "Crypto operations for read", see the comments at Architecture. With the changes that I mentioned under 4 above, it will be fairly low, probably less than 4. 
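
The write pattern listed under point 5 can be expressed as a small calculation. This is a sketch under the assumption that each event has one initial write followed by one successive write per additional URL; the function and constant names are hypothetical.

```python
# Operation counts taken from the lists under point 5.
INITIAL_WRITE_OPS = 3     # INSERTs in MESSAGE_DESTINATIONS_MESSAGES, MESSAGES, MESSAGES_URLS
SUCCESSIVE_WRITE_OPS = 2  # UPDATE of MESSAGES, INSERT in MESSAGES_URLS


def db_writes_for_event(url_count: int) -> int:
    """Return the number of database write operations for one event."""
    if url_count < 1:
        raise ValueError("an event contains at least one URL")
    return INITIAL_WRITE_OPS + SUCCESSIVE_WRITE_OPS * (url_count - 1)


for urls in (1, 2, 4):
    print(f"{urls} URL(s): {db_writes_for_event(urls)} write operations")
```

The factor per URL is therefore 3 for a single-URL event and approaches 2 as more URLs are added to the same event.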

       

      1. #1. That is right

        #2. Indeed, which is 30% of 10% = 3% of all requests

        #3. goal conflicts is ok with me. However it shows we need to get the terms & glossary defined (wink)

        #4. all right

        #5. there may be more than only the db writes, such as push notifications (in some way). we may want to calculate these as well.

        #6. fair enough. I added those crypto operations because they are typically more expensive.

        I will change some of the terms in the document.

  5. OK, I started splitting the Yona server. This is what I'm thinking of:

    Microservices

    The Core layer contains the JPA entities and the Spring services. The three services pillars contain the Spring MVC controllers. Makes sense?

    There are two sets of arguments that came together: those provided by you in the comments and the need for different authentication and authorization mechanisms for the different categories of services. I'll let you know whether it's feasible with a reasonable effort.

    1. I thought about it and it is possible to divide Yona in such a way. However, I find it quite generic. Maybe drawing a domain model with some bounded contexts makes sense. From that domain model the services can be derived. Remember it is still in brainstorming, and the mapping into technology has not been made.

      1. The Yona server is now broken down in the chunks described in the previous comment. See this commit.

  6. First proposal of cluster design is added. Still a lot of questions and missing components. 

    Ron van der Wijngaard: do we need per se components like Consul? Or should we then adopt a platform like Rancher, complementary to Docker?

    1. Looks good to me. I've updated the set of services to align it with the current implementation. We might break that down further during the implementation, in which case we will need to update this page.

      Gert Jan Schipper, Richard Quist, Bart Boerendans: Please provide your feedback.