Evaluate languages and frameworks, learn asyncio and layered design, and set up lint, test, and code coverage tooling from the very beginning.
At SlangLabs, we are building a platform for programmers to easily and quickly add multilingual, multimodal Voice Augmented eXperiences (VAX) to their mobile and web apps. Think of an assistant like Alexa or Siri, but running inside your app and tailored for your app.
The platform consists of:
This blog post shares the best practices and lessons we have learned while building the microservices.
At the idea phase of a startup, one has some sense of destination and direction but does not know exactly what to build. That clarity emerges only through iteration and experimentation. We were no different, so we had to pick a programming language and microservice framework suitable for rapid prototyping. These were our key considerations:
No programming language ticks all of the above. For us, it finally boiled down to Python vs. Java/Scala because these were the only feasible languages for machine learning work. While Java has better performance and tooling, Python is apt for rapid prototyping. At that stage, we favoured rapid development and machine learning over other considerations, and therefore picked Python.
A microservice architecture lets each service independently choose its programming language and framework, so there is no need to standardize on one. However, a heterogeneous system adds DevOps and infrastructure overhead, which we wanted to avoid as we were just a couple of guys hacking the system. With time, our team and platform grew, and the platform now has microservices in Go and JavaScript too.
With Python came its infamous Global Interpreter Lock (GIL). In brief, a thread can execute Python code only if it has acquired the interpreter lock. Since it is a global lock, only one thread of the program can hold it, and therefore run, at a time, even if the hardware has multiple CPUs. It effectively limits a multi-threaded Python program to single-threaded performance.
While the GIL is a serious limitation for CPU-bound concurrent Python apps, for IO-bound apps the cooperative multitasking of AsyncIO offers good performance (more about it later). For performance, we wanted a web framework that is lightweight yet mature, and has AsyncIO APIs.
We evaluated three Python Web Frameworks: Django, Flask, and Tornado.
Tornado was just right for our needs. But most of our design tactics are independent of that choice, and are applicable regardless of the chosen web framework.
Recently, more AsyncIO Python web frameworks have been emerging: Sanic, Vibora, Quart, FastAPI. Even Django is beginning to support async.
Before we plunge into design and code, let's understand some key concepts: cooperative multitasking, non-blocking calls, and AsyncIO.
Threads follow the model of preemptive multitasking. Each thread executes one task. The OS schedules a thread on a CPU, and after a fixed interval (or when the thread gets blocked, typically due to an IO operation, whichever happens first), the OS interrupts the thread and schedules another waiting thread on the CPU. In this model of concurrency, multiple threads can execute in parallel on multiple CPUs, as well as interleaved on a single CPU.
In cooperative multitasking, there is a queue of tasks. When a task is scheduled for execution, it executes till a point of its choice (typically an IO wait) and yields control back to the event loop scheduler, which puts it in the waiting queue and schedules another task. At any time, only one task is executing, but it gives an appearance of concurrency.
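To make the yielding behaviour concrete, here is a minimal sketch using Python's asyncio. The worker names and the shared log are illustrative assumptions, and asyncio.sleep(0) merely stands in for a real IO wait.

```python
import asyncio


async def worker(name: str, steps: int, log: list) -> None:
    for i in range(steps):
        log.append(f"{name}:{i}")
        # Voluntarily yield to the event loop; sleep(0) stands in for an IO wait.
        await asyncio.sleep(0)


async def main() -> list:
    log: list = []
    # Both workers run on one thread; each runs until it chooses to yield.
    await asyncio.gather(worker("A", 2, log), worker("B", 2, log))
    return log


order = asyncio.run(main())
print(order)
```

The printed log shows the two workers interleaved, even though only one of them is ever executing at a given moment.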
In synchronous or blocking function calls, control returns to the caller only after the called function completes. Consider the following pseudocode:
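A blocking call can be sketched in Python as follows. The helper names read_file and process are hypothetical, chosen only for illustration; the temporary file exists just to make the sketch runnable.

```python
import os
import tempfile


def read_file(path: str) -> bytes:
    # Blocking call: returns only after the whole file has been read.
    with open(path, "rb") as f:
        return f.read()


def process(path: str) -> int:
    data = read_file(path)  # the caller waits here until the read finishes
    return len(data)        # runs only after read_file has returned


# Create a small throwaway file so the example is self-contained.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello")

size = process(tmp.name)
os.remove(tmp.name)
print(size)
```

The execution order matches the lexical order: nothing after the read_file call runs until the read is done.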
In asynchronous or non-blocking function calls, control returns immediately to the caller, and the called function can pause during its execution. It takes a callback routine as an argument; when the called function finishes and the results are ready, it invokes the callback with the results. Meanwhile, the caller resumes execution even before the called function has completed. Assume there is a non-blocking async_read function, which takes a callback function and calls it with the bytes read. Consider the following pseudocode:
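The callback style can be sketched with a toy event queue. Everything here is an assumption made for illustration: async_read does no real IO, and pending and run_pending stand in for a real event loop.

```python
from collections import deque

pending: deque = deque()  # toy event queue of (callback, result) pairs
events: list = []         # records the order in which things happen


def async_read(source: bytes, callback) -> None:
    # Hypothetical non-blocking read: queue the completion and return at once.
    pending.append((callback, source))


def run_pending() -> None:
    # Toy event loop: deliver each completed result to its callback.
    while pending:
        callback, result = pending.popleft()
        callback(result)


def on_read(data: bytes) -> None:
    events.append(f"callback: read {len(data)} bytes")


events.append("before async_read")
async_read(b"hello", on_read)
events.append("after async_read")  # runs before on_read is invoked
run_pending()
print(events)
```

In this sketch, the line written after the async_read call runs before the callback does, so the order of events differs from the order of the code on the page.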
As you can see, asynchronous code with callbacks is hard to understand because the execution order of the code can differ from its lexical order.
The AsyncIO syntax of async and await lets you write asynchronous code in a synchronous style instead of using callbacks, making the code easier to understand.
A function declared async is called a coroutine. It must be awaited, as its results will be available only in the future. An await expression yields control to the scheduler. The code after the await expression acts like a callback: control resumes there later, when the awaited coroutine completes and its results are ready.
AsyncIO has an IO event loop, which keeps a queue of coroutines whose awaited operations have completed and which are ready to be resumed.
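The same read can be sketched with async and await. As before, async_read is a hypothetical stand-in, and asyncio.sleep simulates the IO wait.

```python
import asyncio


async def async_read(source: bytes) -> bytes:
    # Hypothetical coroutine: the sleep stands in for a real IO wait.
    await asyncio.sleep(0.01)
    return source


async def process(source: bytes) -> int:
    data = await async_read(source)  # yields control to the event loop here
    return len(data)                 # resumes here once the data is ready


size = asyncio.run(process(b"hello"))
print(size)
```

The code reads top to bottom like the blocking version, yet the event loop is free to run other coroutines while process is suspended at the await.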
While Tornado has worked out well for us so far, we did not know that it would back then. We designed our microservices such that Tornado-dependent code was segregated and localized, so that we could easily migrate to a different framework if the need arose. Regardless, it is a good idea to structure your microservice into two layers: a Web Framework Layer and a framework-independent Service Layer.
The Web Framework Layer is responsible for REST service endpoints over HTTP. It does not contain any business logic. It processes incoming requests, extracts relevant information from the payload, and calls a function in the Service Layer, which performs the business logic. It then packages the returned results appropriately and sends the response. For Tornado, it consists of two files:
The Service Layer contains only business logic, and knows nothing about HTTP or REST. That allows any communication protocol to be stitched on top of it without touching the business logic. There is only one requirement for this layer:
Logical service APIs allow the Web Framework Layer to be implemented (and replaced) without getting into the nitty-gritty of the inner workings of the service. They also facilitate standardization and sharing of a large portion of web framework code across services.
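The split between the two layers can be sketched as below. This is not the code from our services: AddressBookService, handle_post, and handle_get are hypothetical names, and the handlers are plain functions standing in for framework request handlers (Tornado or otherwise).

```python
import asyncio
import json
import uuid
from typing import Dict, Tuple


class AddressBookService:
    """Service Layer: business logic only, no knowledge of HTTP or REST."""

    def __init__(self) -> None:
        self._addrs: Dict[str, dict] = {}

    async def create_address(self, addr: dict) -> str:
        key = uuid.uuid4().hex
        self._addrs[key] = addr
        return key

    async def get_address(self, key: str) -> dict:
        return self._addrs[key]  # raises KeyError if the key is unknown


# Web Framework Layer (framework-neutral stand-in): parse the request,
# call the service, and package the result as (status, body).
async def handle_post(service: AddressBookService, body: str) -> Tuple[int, str]:
    key = await service.create_address(json.loads(body))
    return 201, json.dumps({"id": key})


async def handle_get(service: AddressBookService, key: str) -> Tuple[int, str]:
    try:
        addr = await service.get_address(key)
    except KeyError:
        return 404, json.dumps({"error": "not found"})
    return 200, json.dumps(addr)


async def demo():
    service = AddressBookService()
    status, body = await handle_post(service, '{"city": "Bengaluru"}')
    key = json.loads(body)["id"]
    found = await handle_get(service, key)
    missing = await handle_get(service, "missing")
    return status, found, missing


created_status, found, missing = asyncio.run(demo())
print(created_status, found, missing)
```

Only the two handler functions would need rewriting to move to another framework; the service class would stay untouched.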
Few startups automate testing and code coverage from the very beginning, but we did. It may appear counter-intuitive, but we did it to maintain high velocity and to fearlessly change any part of the system. Tests offered us the safety net needed while developing in a dynamically typed, interpreted language. It was also partly due to paranoia regarding our non-obvious choice of Tornado, to safeguard us in case we needed to change it.
There are three types of tests:
We wrote integration tests both for the Service Layer, to test business logic, and for the Web Framework Layer, to test the functioning of REST endpoints in the Tornado server.
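A Service Layer test can be sketched with the standard library's unittest.IsolatedAsyncioTestCase (available from Python 3.8). The service class here is a hypothetical stand-in, and this is only one possible approach; Tornado also ships its own async test helpers.

```python
import unittest


class AddressBookService:
    """Hypothetical stand-in for a Service Layer class."""

    def __init__(self) -> None:
        self._addrs: dict = {}

    async def create_address(self, key: str, addr: dict) -> None:
        self._addrs[key] = addr

    async def get_address(self, key: str) -> dict:
        return self._addrs[key]  # raises KeyError if the key is unknown


class AddressBookServiceTest(unittest.IsolatedAsyncioTestCase):
    async def asyncSetUp(self) -> None:
        # A fresh service per test keeps the tests independent.
        self.service = AddressBookService()

    async def test_create_then_get(self) -> None:
        await self.service.create_address("k1", {"city": "Bengaluru"})
        addr = await self.service.get_address("k1")
        self.assertEqual(addr, {"city": "Bengaluru"})

    async def test_get_missing_raises(self) -> None:
        with self.assertRaises(KeyError):
            await self.service.get_address("nope")


# Run the suite programmatically so the example does not call sys.exit().
suite = unittest.defaultTestLoader.loadTestsFromTestCase(AddressBookServiceTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Because the test talks only to the service class, it does not need an HTTP server or any web framework to run.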
Clone the GitHub repo and inspect the content:
The directory addrservice holds the source code of the service, and the directory tests holds the tests.
Using a virtual environment is one of the best practices, especially when you work on multiple projects. Create one for this project, and install the dependencies from requirements.txt:
The script run.py is a handy utility script to run the static type checker, linter, unit tests, and code coverage. In this series, you will see that using these tools from the very beginning is actually the most economical approach, and does not add the overhead people often expect.
Let's try running these. In each of the following, you can use either of the commands.
Static Type Checker: mypy package
Linter: flake8 package
Unit Tests: Python unittest framework
This will run all tests in the directory tests. You can run the unit or integration test suites (in the tests/unit and tests/integration directories respectively) as follows:
Code Coverage: coverage package
After running tests with code coverage, you can get the report:
You can also generate an HTML report:
If you are able to run all these commands, your project setup is complete.
There are several language choices for building microservices: Java, JavaScript, Python, and Go. If a microservice involves interfacing with ML libraries, the choices reduce to Java and Python.
For quick prototyping, Python is more suitable, but it comes with the drawback of the Global Interpreter Lock. Cooperative multitasking with non-blocking asynchronous calls using asyncio comes to the rescue. When we made our choice, Tornado was the only mature Python web framework with asyncio APIs.
A layered design reduces the risk if the framework has to be changed in the future. Tests can also be layered: unit, integration, and end-to-end. It is easy to set up lint, test, and code coverage tooling from the very beginning of the project.