Azure – Akshay Surve

This blog post was cross-posted from DeltaX Engineering Blog - {recursion} where it was published first.

In the second half of 2016, we decided to migrate our multi-tenant app from bare-metal servers to Azure. While you can find numerous benchmarks for various cloud platforms, there are very few relatable drill-downs on the thought process behind such as-is migrations to the cloud. More importantly, this was not just a migration; it was literally a war with all hands on deck. Keeping the existing usage, client data, and growth intact, we successfully migrated over 1.4 TB of data and existing clients to the cloud.

This is how we declared WAR
Declaration of WAR

Finally, we emerged as winners after the last tenant migration.

I knew this story needed to be told. I got an opportunity to speak at the [Software Architects Bangalore meetup](https://www.meetup.com/SoftwareArchitectsBangalore/events/237117024/) and share this journey with the larger community. Here are the slides from the talk:

Migrating a multi-tenant app to Azure (war biopic) from ★ Akshay Surve

The overall response and feedback after the talk was humbling; everyone was amazed at what we were able to achieve.

So, here is my humble request to the team

If I look back at our journey, we have recovered from massive failures, seen through classic disasters, and built innovative and meaningful solutions. While we are moving mountains, working on disaster recovery, or building that fancy little new feature, let’s share our story on this blog.

This blog post was cross-posted from DeltaX Engineering Blog - {recursion} where it was published first.

Advancements by cloud-based IaaS providers (Amazon Web Services, Google Cloud and Azure) have made on-demand scale and flexibility a reality. Today, as a startup, you don’t need to worry about over-provisioning infrastructure, forecasting growth, or negotiating long-term infrastructure contracts to meet your demands. Interestingly, a new suite of cloud services is questioning the very existence of a core aspect of common application architectures, the ‘server’, and has been coined serverless.

What is the ‘server’ in `serverless`?

Let’s say you wanted to run a service on the cloud - for this, you would need to do the following:

1. Decide the type of computing resources you need: instance type, cores, memory and storage space.
2. Choose an OS / machine image to install on the instance.
3. Set up / deploy your service.

Steps 1 & 2 above constitute the ‘server’ in the serverless paradigm and in effect, these are the steps you wouldn’t have to worry about. All you need to do is to choose your execution environment and submit your code.
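To make "submit your code" concrete, here is a minimal sketch of a function written in the style of an AWS Lambda Python handler. The function name `handler` and the `name` field on the event are assumptions made up for this illustration; the platform supplies the `event` and `context` objects when it invokes the function.

```python
import json


def handler(event, context):
    """Entry point the platform calls on each invocation.

    `event` carries the trigger payload; `context` carries runtime
    metadata and is unused in this sketch.
    """
    # `name` is a hypothetical field, used only for illustration.
    name = event.get("name", "world")
    return {
        "statusCode": 200,
        "body": json.dumps({"message": f"Hello, {name}!"}),
    }
```

There is no instance to size and no OS image to pick here; the function body is the whole deployable unit.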

Available Options

When it comes to the serverless paradigm, each of the major cloud IaaS providers has launched its own option. Here is a quick summary of the options available:

Provider	Serverless Offering	Supported Environments
Amazon Web Services	AWS Lambda	Node.js, Java, Python, C# (.NET Core)
Microsoft Azure	Azure Functions	Node.js, C#, F#, Python, PHP, and shell
Google Cloud	Cloud Functions	Node.js

Ref: A detailed comparison is available on Stack Overflow.

There are slight differences in the extent of support and capabilities, but the process to get started is as follows:

Select a development environment
Choose the amount of memory, execution timeout etc.
Setup a trigger for launch

Proof of Concept

In part, to test drive the paradigm and at the same time build something useful, I worked on two POCs.

Azure Function: Cachewarmer Function

When it comes to our web application, we use Entity Framework as the ORM. Considering the multi-tenant nature of the application and the volume of tables - context initialization takes an unexpectedly long time. It’s for this exact reason we had to build a mechanism to warm the context cache to initialize it and keep it ready for external requests.

Trigger: CRON

Dev Environment: shell

Description: I put together a sequence of cURL requests that ping a special endpoint on the web application, which initiates a context load. Since we have over 500 tenants, we had to batch the requests, and to avoid hitting the max execution time I had to split them across two separate functions.
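The original function was a shell script of cURL calls; the same idea can be sketched in Python as below. This is only an illustration under assumptions: the URL shape `/<tenant>/warmup`, the base URL, and the helper names are hypothetical, since the real endpoint is not shown in this post.

```python
import urllib.request
from typing import Callable, List


def split_evenly(items: List[str], parts: int) -> List[List[str]]:
    """Split `items` into `parts` contiguous chunks of near-equal size."""
    if parts < 1:
        raise ValueError("parts must be at least 1")
    size, extra = divmod(len(items), parts)
    chunks, start = [], 0
    for i in range(parts):
        # The first `extra` chunks take one additional item each.
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def http_ping(url: str) -> None:
    """Issue a GET and discard the body; failures raise OSError subclasses."""
    with urllib.request.urlopen(url, timeout=30) as response:
        response.read()


def warm_tenants(tenants: List[str], base_url: str,
                 ping: Callable[[str], None] = http_ping) -> List[str]:
    """Ping the warm-up endpoint for each tenant; return tenants that failed."""
    failed = []
    for tenant in tenants:
        # Hypothetical URL shape, assumed for this sketch.
        url = f"{base_url}/{tenant}/warmup"
        try:
            ping(url)
        except OSError:
            failed.append(tenant)
    return failed
```

Each of the two scheduled functions would then call `warm_tenants` on one element of `split_evenly(all_tenants, 2)`, so that neither run gets close to the execution time limit.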

Honestly, this was really a trivial function, but it is exactly why having a serverless architecture was justified. Not to forget, we were up and running within 20 mins.

AWS Lambda: Slackbot dxdb

In retrospect, this was a solid use case. Let me take a deep dive into this one:

Purpose: As noted earlier, we have over 500 tenant databases. Querying them is pretty cumbersome when you have to connect to each one individually using SSMS and then run individual queries. For small queries that check data, it is far more convenient to simply fire the query in the Slack channel and see the results. A welcome side effect of using Slack is that one can also fire the query from the Slack mobile application and see the results on the go.

Features Supported:

Detect the DB to connect with intelligently from the schema
Support delayed responses: some queries take longer to execute, while Slack allows a window of only 3 seconds for an immediate response.
Formatting output to the extent possible
Minimal error notifications
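As a rough idea of what "formatting output" can mean for a chat client, the sketch below renders query results as a fixed-width text table wrapped in a preformatted block, which Slack displays in a monospace font. This is not the bot's actual code; the function name `format_table` and the `max_rows` cutoff are assumptions for illustration.

```python
from typing import Sequence


def format_table(columns: Sequence[str], rows: Sequence[Sequence[object]],
                 max_rows: int = 20) -> str:
    """Render rows as a fixed-width table inside a preformatted block."""
    shown = [[str(value) for value in row] for row in rows[:max_rows]]

    # Each column is as wide as its longest cell (header included).
    widths = [len(column) for column in columns]
    for row in shown:
        for i, cell in enumerate(row):
            widths[i] = max(widths[i], len(cell))

    def line(cells: Sequence[str]) -> str:
        padded = (cell.ljust(width) for cell, width in zip(cells, widths))
        return " | ".join(padded).rstrip()

    out = [line(columns), "-+-".join("-" * width for width in widths)]
    out.extend(line(row) for row in shown)
    if len(rows) > max_rows:
        # Keep the message short; say how much was left out.
        out.append(f"... {len(rows) - max_rows} more row(s)")

    fence = "`" * 3  # marks a preformatted block in Slack messages
    return fence + "\n" + "\n".join(out) + "\n" + fence
```

Capping the number of rows keeps the reply readable in a channel and on mobile.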

How it works? Slack command dxdb

Every invocation of the command makes a POST request to the AWS API Gateway with the command and the request text; in our case the query.
The AWS API Gateway invokes the AWS lambda function dxdbExecuteSQL and passes the request params. Tip: The AWS API Gateway is probably the most underrated yet one of the most powerful and flexible services AWS has launched. Will explore this in the future.
The dxdbExecuteSQL function authenticates the request, does minimal checks on the kind of query (in our case, only read-only queries are allowed) and then does two things. First, it formats the intermediate response in the form of an MSSQL prompt to be sent back to Slack through the API Gateway. Next, it invokes the dxdbDelayedSlackResponse Lambda function.
The dxdbDelayedSlackResponse Lambda function parses the query, identifies the tenant, fires the query, reads the results, formats the response and makes a POST request back to Slack.
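The flow above can be sketched as follows. This is a simplified illustration, not the project's code. It assumes the API Gateway passes the raw form-encoded Slack payload in `event["body"]`, that authentication is a comparison against a shared token, and that the prompt-style acknowledgement looks like `1> ... 2> GO`. The names `execute_sql_handler`, `invoke_delayed` and `build_delayed_request` are hypothetical.

```python
import json
import re
import urllib.request
from urllib.parse import parse_qs

READ_ONLY = re.compile(r"^\s*select\b", re.IGNORECASE)


def is_read_only(query: str) -> bool:
    """Rough check only: one statement that starts with SELECT."""
    single = ";" not in query.rstrip().rstrip(";")
    return bool(READ_ONLY.match(query)) and single


def slack_reply(text: str) -> dict:
    """Wrap a message as an HTTP response that Slack can display."""
    body = json.dumps({"response_type": "in_channel", "text": text})
    return {"statusCode": 200,
            "headers": {"Content-Type": "application/json"},
            "body": body}


def execute_sql_handler(event: dict, expected_token: str, invoke_delayed) -> dict:
    """First function: authenticate, validate, hand off, acknowledge."""
    form = parse_qs(event.get("body") or "")
    token = form.get("token", [""])[0]
    query = form.get("text", [""])[0]
    response_url = form.get("response_url", [""])[0]

    if token != expected_token:
        return {"statusCode": 401, "body": "unauthorized"}
    if not is_read_only(query):
        return slack_reply("Only read-only (SELECT) queries are allowed.")

    # Hand the slow work to the second function, then acknowledge
    # immediately so the reply fits in Slack's 3-second window.
    invoke_delayed({"query": query, "response_url": response_url})
    return slack_reply(f"1> {query}\n2> GO")


def build_delayed_request(response_url: str, text: str) -> urllib.request.Request:
    """Second function's last step: the POST that carries results to Slack."""
    payload = json.dumps({"response_type": "in_channel", "text": text})
    return urllib.request.Request(
        response_url,
        data=payload.encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
```

In a deployed setup, `invoke_delayed` would wrap an asynchronous invocation of the second Lambda function, and the request returned by `build_delayed_request` would be sent with `urllib.request.urlopen`. Passing the invoker in as a parameter keeps the sketch testable without cloud credentials.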

Although the setup is complex and layered, I only had to focus on the workflow and the business logic; picking an instance, setting it up and keeping it running was not something I had to worry about. Another interesting thing about this setup is that the function is not running all the time; it is executed only on invocation, and the icing on the cake is that you are billed only for the time it executes, in increments of 100ms.
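The billing granularity is easy to reason about with a little arithmetic. The sketch below assumes the 100ms increment described above and rounds each run up to the next increment; the function names are made up for this example, and no prices are assumed.

```python
import math
from typing import Iterable


def billed_ms(duration_ms: float, increment_ms: int = 100) -> int:
    """Round one invocation's run time up to the next billing increment."""
    if duration_ms <= 0:
        return 0
    return math.ceil(duration_ms / increment_ms) * increment_ms


def total_billed_ms(durations_ms: Iterable[float]) -> int:
    """Sum the billed time across invocations; idle time never appears."""
    return sum(billed_ms(duration) for duration in durations_ms)
```

For instance, a query that takes 101ms is billed as 200ms, while a function that is never invoked accrues no billed time at all.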

Code: The project is available on GitHub.

Follow-up Thoughts

Going serverless is an extension of adopting the cloud but demands a change in the thought process of layering your architecture. The recent trend around microservices-based architecture also fits well with the serverless paradigm.

Interestingly, each of the cloud services offers a minimal code editor. I can see how in the future you could probably have a full-fledged IDE available at your disposal. Looking at the pace of innovation, we are another step closer to not just programming for the cloud but literally in the cloud.
