Microservices, Serverless, Event-Based Architecture and API Gateway

The desktop application era was disrupted by the web application era, then by mobile applications, and now by the ‘modern’ architecture, well, at least it is modern right now.

The term modern is loosely associated with microservice architecture, serverless, event-based architecture and API gateways. What is it all about? Each of these patterns is designed to handle high-traffic, high-load conditions. Imagine growing from serving 100 visitors/day to 1 million visitors/day: how can you have a turnkey solution for that kind of traffic and load?

In my experience developing monolithic applications, they work well until they don’t, due to increasing load or growth of the data they manage. Most of the problems come down to database limitations: the simple query that used to run well now takes 15+ seconds, while the same database also supports transactions, reporting, validation and so on, which leads to database overload. Others might argue for materialized views, temporary tables, in-memory databases, or a high-spec server with 64 cores, 2 TB of memory and RAID 10 SSD drives to overcome the issue. Yes, that could work, but wouldn’t it be wonderful if we could solve it just by adding low-spec servers and have every aspect of the application automagically load balanced?

This is the main idea the patterns above are trying to achieve. Some of them still require a high-spec server, but mostly for the core engine, while the rest can scale out on lower-spec machines, or they can deliver the performance of a high-specification solution with much lower specifications.

Let’s start with a case: I have an application with 3 million records. It serves as an online public catalog with many facets, handles backend reporting, and processes transactions from 800+ office units. The main load comes from the online public catalog and falls on the database. How do we optimize this?

Microservices: split the application’s processes into small services that can work independently. Since the main load comes from the online public catalog, use a search engine database to serve it. Add data replication, or push data, from the transactional database to the search engine, then serve the public catalog along with the other search functionality from the search engine. The application accesses the search engine through a simple REST web service. If the load on the online catalog increases, just scale the search engine. The same approach works for reporting: instead of using the transactional database, create a dedicated data warehouse and serve all the reporting queries through a REST web service as well.
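To make that concrete, here is a minimal sketch in Python of the replicate-and-query idea, assuming a generic search engine that exposes REST endpoints for indexing and searching. The host, index paths, table and column names below are placeholders for illustration, not the actual system.

```python
# Sketch: replicate catalog rows from the transactional database to a search
# engine over REST, then serve catalog searches from the search engine instead
# of the database. Endpoint URLs and payload fields are hypothetical.
import sqlite3          # stand-in for the real transactional database
import requests

SEARCH_ENGINE = "http://search-engine:9200"   # hypothetical search engine host

def replicate_catalog(db_path: str) -> None:
    """Push catalog rows to the search engine's indexing endpoint."""
    conn = sqlite3.connect(db_path)
    rows = conn.execute(
        "SELECT id, title, author, subject FROM catalog"
    ).fetchall()
    for id_, title, author, subject in rows:
        requests.post(
            f"{SEARCH_ENGINE}/catalog/_doc/{id_}",      # hypothetical index API
            json={"title": title, "author": author, "subject": subject},
            timeout=5,
        )
    conn.close()

def search_catalog(query: str) -> dict:
    """Serve the public catalog from the search engine, not the database."""
    resp = requests.get(
        f"{SEARCH_ENGINE}/catalog/_search",             # hypothetical search API
        params={"q": query},
        timeout=5,
    )
    resp.raise_for_status()
    return resp.json()
```

If the catalog load grows, only the search engine tier needs to scale; the transactional database never sees the read traffic.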

Are those above really the microservice pattern? Some might argue they aren’t, and I agree: they are just splitting the application’s features. This brings us to another case, a monolithic application with a certain validation process that strains the database. We were already using a 32-core CPU, 32 GB of memory and 1 TB of RAID 10 SSD for the database, but performance was still not sub-second. One might argue for creating temporary tables or optimizing the tables, but this is a distributed application, and changing the database structure is tedious when hundreds of offices have to be upgraded within a short span of time, so those options were not well received. How would you solve this?

First, the problem with database queries usually involves multiple table joins and multiple criteria that sometimes just cannot be solved by further indexing, and upgrading the database to a cluster might not solve the cost of processing the data in the query either. What if we move the processing out of the database and into code instead? How exactly? The application fetches only the data required for processing, with minimal joins, into application memory, and then processes it using application logic. Joins and criteria are not that hard to implement with array manipulation, but what about the memory and CPU consumption of moving the processing from the database to the application?
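As a rough illustration, here is what “joins and criteria as array manipulation” can look like in Python: fetch two narrow result sets, build a lookup table, and apply the criteria in memory. The field names and the validation rule are made up for the example.

```python
# Sketch: move a join + filter that used to live in SQL into application memory.
# Fields and the rule itself are illustrative only.

def validate_orders(orders: list[dict], customers: list[dict]) -> list[dict]:
    """Join orders to customers by customer_id and apply the validation criteria."""
    # Build a lookup table once (a hash join) instead of a nested-loop join.
    customers_by_id = {c["id"]: c for c in customers}

    flagged = []
    for order in orders:
        customer = customers_by_id.get(order["customer_id"])
        if customer is None:
            continue  # orphan order; skip (or flag, depending on the rule)
        # The criteria that used to sit in the SQL WHERE clause:
        if order["amount"] > customer["credit_limit"] and customer["status"] == "active":
            flagged.append({"order_id": order["id"], "customer": customer["name"]})
    return flagged

# Tiny usage example with in-memory data:
orders = [{"id": 1, "customer_id": 10, "amount": 500}]
customers = [{"id": 10, "name": "Office 042", "status": "active", "credit_limit": 300}]
print(validate_orders(orders, customers))   # -> one flagged order
```

The trade-off named above is real: the CPU and memory cost moves from the database server to the application servers, but those are the servers that are cheap to multiply.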

The options are to either keep the processing in the database, which consumes CPU and memory too, and hope that raising the server specification or clustering the database will solve it, or to move it to the application and then scale out with more application servers. Most of the people I’ve met agree that scaling a transactional database is harder than just scaling application servers, like spawning another Apache web server running the same application.

By moving the processing to the application and isolating the function in a small application that can be accessed through a REST service, it becomes a microservice. Load getting higher, or need more capacity to process the validation? Just add more application servers: if one server can process 100 transactions/second and we need 200 transactions/second, add another server, done. We’ve successfully made the application easily scalable.
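A minimal sketch of such a validation microservice, assuming Flask as the web framework; the /validate route, payload shape and validation rule are illustrative, not the real service. Every instance is identical, so adding capacity just means starting another one behind the load balancer.

```python
# Sketch: wrap the in-memory validation logic in a small, stateless REST service.
from flask import Flask, request, jsonify

app = Flask(__name__)

def run_validation(order: dict, customer: dict) -> bool:
    # Placeholder for the real validation rule moved out of the database.
    return order["amount"] <= customer["credit_limit"]

@app.route("/validate", methods=["POST"])
def validate():
    payload = request.get_json()
    ok = run_validation(payload["order"], payload["customer"])
    return jsonify({"valid": ok})

if __name__ == "__main__":
    # Each instance is identical; scaling out means starting more of these.
    app.run(host="0.0.0.0", port=8080)
```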

Serverless: “just add another server and done”, but how exactly do we do that? Setting up the web server, copying the application, load balancing the requests between servers, that’s not so easy! What if we already have, say, 5 servers running the same application and then need to change something? What about 20 servers? It would take forever … I’m being overly melodramatic here. The answer is a VM or container solution: we deploy our application in a VM or container so we can easily add new instances, at the cost of building a new VM/container image every time we change the application. The application should also be stateless, meaning it doesn’t rely on session information; all the information it needs should be provided through the web service call, so it can scale without any dependency on the database, session variables or other components.
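To show what “stateless” means in practice, here is a small contrast, again assuming Flask; the routes and fields are invented. The first handler depends on state held in one instance’s memory, the second takes everything it needs from the request, so any identical VM or container instance can answer it.

```python
# Sketch contrasting server-local state with a stateless handler.
from flask import Flask, request, jsonify

app = Flask(__name__)

# Anti-pattern: state kept in this process's memory; a second instance
# behind the load balancer would not see it.
LOGGED_IN_OFFICES: dict[str, str] = {}

@app.route("/report/stateful")
def report_stateful():
    office = LOGGED_IN_OFFICES.get(request.args["token"])   # tied to one instance
    return jsonify({"office": office})

# Stateless: every request carries all the context it needs, so any
# identical container/VM instance can handle it.
@app.route("/report/stateless")
def report_stateless():
    office = request.args["office_id"]
    return jsonify({"office": office})
```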

API Gateway: now that serverless-style deployment has made adding servers easy, what about the load balancing part? Who manages the request distribution, decides which request goes to which server, and knows when a new server is available to be registered? The API gateway to the rescue: you register all the servers in the API gateway and then specify that requests for /request_type will be load balanced across the designated servers. Consider it a network router.
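A toy sketch of that routing idea: register a pool of upstream servers per request type and hand requests out round-robin. Real API gateways do much more (authentication, rate limiting, health checks), and the paths and hostnames here are invented.

```python
# Toy model of an API gateway's routing table: path prefix -> pool of servers.
import itertools

class TinyGateway:
    def __init__(self):
        self.pools = {}

    def register(self, path_prefix: str, servers: list[str]) -> None:
        """Register the upstream servers that handle a given request type."""
        self.pools[path_prefix] = itertools.cycle(servers)

    def route(self, path: str) -> str:
        """Pick the upstream server that should handle this request (round robin)."""
        for prefix, pool in self.pools.items():
            if path.startswith(prefix):
                return next(pool)
        raise LookupError(f"no upstream registered for {path}")

gateway = TinyGateway()
gateway.register("/validate", ["http://app-1:8080", "http://app-2:8080"])
gateway.register("/catalog", ["http://search-1:9200"])

print(gateway.route("/validate/order"))   # -> http://app-1:8080
print(gateway.route("/validate/order"))   # -> http://app-2:8080
```

Need more capacity for /validate? Start another application instance and register it; the gateway spreads the load, exactly like the “just add another server” promise above.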
