AWS China, Big Data and IoT (PART 1)
Apart from Docker and DevOps methodologies, the topics I currently enjoy the most at work are Cloud Architecture and Big Data/IoT.
In this article I would like to describe the situation of AWS in China, especially the current status of the available services and the improvements we can hopefully expect in the coming months.
I will also try to provide a step-by-step guide to creating an infrastructure designed with IoT in mind, where we can achieve a scalable, secure and serverless installation using only AWS services that are currently available in China.
AWS China VS AWS Europe
For this side-by-side comparison, I will take the AWS China North region (Beijing), which is currently the only AWS region available in China, against the AWS Europe West region (Ireland), which is one of the most advanced and up-to-date AWS regions (Frankfurt is usually a bit behind in terms of services and features).
Let’s see a screenshot with the current list of Services:
It is easy to see the big difference in the number of services between these two regions. If we focus on IoT/Big Data/Predictive Modeling, we quickly find some painful absences, like Lambda, Database Migration Service, Athena, Kinesis Firehose, Greengrass, QuickSight or the whole Artificial Intelligence section.
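The same comparison can be approximated from code instead of from a screenshot. The following is a minimal sketch, not an official availability check: `missing_services` and `services_in_region` are hypothetical helper names, and the second one relies on the endpoint data bundled with the locally installed boto3, so its answer is only as fresh (and as complete) as that SDK version.

```python
"""Sketch: list the services present in one region but absent in another."""


def missing_services(reference, target):
    """Return the services offered in `reference` but not in `target`, sorted."""
    return sorted(set(reference) - set(target))


def services_in_region(region_name, partition_name):
    """Ask the locally installed boto3 which services it lists for a region.

    This only reads the endpoint metadata shipped with the SDK, so no AWS
    credentials are needed, but the result may lag behind the real console.
    """
    import boto3  # imported lazily so missing_services() works without it

    session = boto3.session.Session()
    return [
        service
        for service in session.get_available_services()
        if region_name
        in session.get_available_regions(service, partition_name=partition_name)
    ]
```

Assuming boto3 is installed, calling `missing_services(services_in_region("eu-west-1", "aws"), services_in_region("cn-north-1", "aws-cn"))` should give a rough list of what Ireland offers and Beijing does not. Note that the China regions live in their own partition (`aws-cn`), which is why the partition name has to be passed explicitly.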
Does this mean that we can’t create an infrastructure for Big Data or IoT in AWS China? Of course not. We just need a bit more imagination, and sometimes a bit more manual glue.
NOTE: In general, it seems the situation is going to improve soon. AWS is not fond of releasing roadmaps with exact dates, but from my conversations with the AWS guys it seems that we will have Lambda next month, Greengrass before the end of the year and, around Christmas, the new AWS region in Ningxia. This region is supposed to be on a different level from the current Beijing region: it will start from the beginning with 3 Availability Zones and many more services than those currently available in Beijing.
If we were in another region, I would propose a database-free, serverless, S3-based infrastructure, with Firehose and probably Athena or QuickSight on top of S3.
But in China we can still propose some cool solutions, like the following one:
The main goals when designing this architecture were:
- Serverless (only the BI dashboard solution requires a server that has to be administered)
- Easy to Scale
- Encrypted in transit and at rest
- Easy to manage
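To make the encryption goal a bit more concrete, here is a minimal sketch of what "encrypted in transit and at rest" could look like for a single object upload. It assumes S3 is one of the storage pieces, which is my assumption for illustration and not something stated in the list above. The helper names `encrypted_put_params` and `upload_encrypted` are hypothetical, and whether the KMS variant is an option depends on KMS being available in the region.

```python
"""Sketch: request parameters for an encrypted S3 upload in the Beijing region."""

REGION = "cn-north-1"  # AWS China (Beijing)


def encrypted_put_params(bucket, key, body, kms_key_id=None):
    """Build put_object arguments that ask S3 to encrypt the object at rest.

    Without a KMS key the object is encrypted with S3-managed keys (SSE-S3);
    with one, S3 uses that KMS key instead (SSE-KMS).
    """
    params = {"Bucket": bucket, "Key": key, "Body": body}
    if kms_key_id is None:
        params["ServerSideEncryption"] = "AES256"
    else:
        params["ServerSideEncryption"] = "aws:kms"
        params["SSEKMSKeyId"] = kms_key_id
    return params


def upload_encrypted(bucket, key, body, kms_key_id=None):
    """Upload over HTTPS (the boto3 default) with encryption at rest enabled."""
    import boto3  # lazy import: only needed when actually talking to AWS

    s3 = boto3.client("s3", region_name=REGION)
    return s3.put_object(**encrypted_put_params(bucket, key, body, kms_key_id))
```

Keeping the parameter building separate from the actual call makes it easy to check the encryption settings without touching AWS. Encryption in transit needs no extra parameter here, because boto3 clients use HTTPS endpoints unless told otherwise.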
In my opinion we achieved our goals with this infrastructure; I will explain why in the next sections of this article.
On top of that, AWS EMR (or AWS ML, or TensorFlow) can easily be set up to optimize data models, and other services like Glacier can be used for cheap long-term backup storage (I might detail the integration of these services in a following article).
In the next article (PART 2), I will go through the step-by-step implementation.