AWS: Lambda (Serverless)

Long time no see, I guess, but I am back to posting more often over here. Today’s topic is an interesting technology: serverless.

Serverless and containers (most popularly in the form of Docker) have been buzzwords for the last couple of years, and while containers finally took off (though many teams are still migrating or evaluating), serverless did not enjoy quite the same success. A lot of people talk about it, yet few have had the courage to actually use a serverless architecture at large scale in production systems.

Now, the two big cloud providers both offer serverless platforms: Azure in the form of Azure Functions and AWS in the form of AWS Lambda. I haven’t had the chance to try Azure Functions, but I have heard they are somewhat better than AWS’s offering, and I will definitely test that out. On the AWS platform I tried Lambda about two years ago and again just a few weeks ago, and I have some thoughts to share from the perspective of a developer.

These days you can create AWS Lambdas from scratch, from predefined blueprints, or from a repository of functions submitted by AWS, individual users, or companies. That wasn’t the case two years ago: you could only start from scratch or from blueprints provided by AWS. Those were definitely helpful, but the addition of a repository of Lambdas is a nice one. Pretty much anyone can submit a Lambda for others to use, provided they follow the publishing guide linked here.

Assuming you are going to start your Lambda from scratch, you choose a name for your function and a runtime: C# (.NET Core), Go, Java, Node.js, or Python. That wasn’t the case two years ago, when you could at most choose between Java, Python, and Node.js, so we can definitely see improvement there.
After you’ve chosen the runtime, you can create a role from scratch or from a template; the role is essentially the list of access policies describing what this Lambda is allowed to do. No worries, you can change all of this later as well, which is great!
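For the record, the same function creation can be done from code. Here is a minimal sketch using the boto3 SDK for Python; the function name, role ARN and zip file are hypothetical placeholders:

    import boto3

    client = boto3.client("lambda")

    # All names and ARNs below are hypothetical placeholders.
    with open("deployment.zip", "rb") as package:
        response = client.create_function(
            FunctionName="my-function",
            Runtime="python3.6",                                   # one of the runtimes above
            Role="arn:aws:iam::123456789012:role/my-lambda-role",  # the execution role
            Handler="lambda_function.lambda_handler",              # file.function entry point
            Code={"ZipFile": package.read()},
        )
    print(response["FunctionArn"])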

When you are done with all that, you’ll be faced with a screen that allows further configuration of your Lambda function, although the Lambda is already created. The way a Lambda works is that it is triggered by some sort of event or API call and then does whatever is defined in its code. On the left-hand side you’ll see all the types of events the Lambda can react to (e.g. API calls from API Gateway, or S3 events such as file creations and deletions). As soon as you click on one of them you are prompted with the configuration for that event; it may make sense to cover some of these events in future posts. Under the Lambda, on the right side, you’ll see the access policies you previously configured. Although these can be changed, you must go to the IAM service to do it, which is not very intuitive, especially since there is no link on this page to guide you.
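As an illustration, a minimal Python handler reacting to an S3 event could look like the sketch below; the event follows the documented S3 notification shape, with one or more records per invocation:

    def lambda_handler(event, context):
        # Each record carries the bucket and object key that fired the event.
        for record in event["Records"]:
            bucket = record["s3"]["bucket"]["name"]
            key = record["s3"]["object"]["key"]
            print("S3 event for s3://%s/%s" % (bucket, key))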

Next is a nice little editor which you can customize to your liking, where you can also set the entry point for your Lambda. The sad part is that the editor stops loading once your package gets too big (>3 MB), a limit that is easily exceeded since your Lambda must contain all of its dependencies and libraries; in any case, your Lambda may never exceed 50 MB, which I think is reasonable. You also get the chance to set environment variables, which can be quite useful if you want to distinguish, for instance, production Lambdas from test Lambdas.
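For instance, the function could pick its target resources from an environment variable set in the console; STAGE and the table naming scheme below are assumptions for illustration:

    import os

    # STAGE is a hypothetical variable configured per Lambda in the console.
    STAGE = os.environ.get("STAGE", "test")
    TABLE_NAME = "orders-" + STAGE  # e.g. orders-production vs orders-test

    def lambda_handler(event, context):
        print("running against " + TABLE_NAME)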

The important part!
Now you get to configure the physical resources allocated to your Lambda. Note that so far you didn’t choose any OS, CPU, RAM, network configuration, drives or anything of the sort. You can choose your RAM, the maximum execution time (timeout), the VPC, concurrency, and alerts for when the maximum number of retries is exceeded (more on that below); a sketch of setting these through the SDK follows the list.

  • RAM: you can select up to 3 GB of RAM, and CPU is scaled proportionally. This part is not very transparent, but one thing to note is that at around 1.5 GB of memory AWS adds a second core, so single-threaded code won’t be able to take advantage of it. This is nonetheless a huge improvement from two years ago, when you couldn’t go above 256 MB or 512 MB (I can’t remember exactly).
  • VPC: the Virtual Private Cloud acts as a logical division of the network in AWS, which you can VPN into and extend a corporate network into, in case, for instance, your Lambda needs to call internal corporate resources. This wasn’t available at all two years ago. I have heard, though, that it has significant performance drawbacks, adding more than 10 seconds to your execution time because ENIs (Elastic Network Interfaces) must be attached to your Lambda.
  • Maximum execution time: this is now set to 5 minutes, up from 2 minutes two years ago. A big improvement, but still not enough for some long-running tasks; Azure supports a 10-minute maximum.
  • Concurrency: here you can limit the number of Lambdas that can run in parallel. Once you exceed this limit, execution is throttled and all further invocations fail; an alert will indicate this on the Monitoring tab of your Lambda. This can be useful in a number of ways depending on your needs (e.g. preventing unauthorized triggering of the Lambda, or DoS attacks).
  • Alerting: you can either publish a message to SQS or send an email via SNS when the maximum number of retries is exceeded.
  • Default temporary disk space is set to a maximum of 512 MB.
  • The request body for API calls cannot exceed 6 MB, while event payloads cannot be larger than 128 KB.
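For reference, the same knobs can be set programmatically; here is a minimal sketch using the boto3 SDK for Python, with a hypothetical function name:

    import boto3

    client = boto3.client("lambda")

    # Memory (CPU scales with it) and the timeout, within the limits above.
    client.update_function_configuration(
        FunctionName="my-function",  # hypothetical name
        MemorySize=1536,             # MB; around this point a second core appears
        Timeout=300,                 # seconds; 5 minutes is the current maximum
    )

    # Cap how many copies of this function may run in parallel.
    client.put_function_concurrency(
        FunctionName="my-function",
        ReservedConcurrentExecutions=10,
    )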

So, all in all, there have been lots of improvements over the last two years, but some problems are still left unsolved:

  • There is a limitation on the resources you can choose, and the ceiling is rather low. The lack of transparency about CPU power and network bandwidth is also a bummer, as some people, myself included, would like more control.
  • The execution time is limited to 5 minutes, but you can work around that if you really need to (e.g. if you are processing large sets of data), since you have access to the remaining time allocated for the Lambda execution. I share an example Lambda after this list.
  • If a Lambda execution fails, it is automatically retried up to 2 times with some delay in between, and there is no way to configure this. That’s definitely not ideal: if a Lambda times out, for instance, it will be retried and you might get duplicate data, depending on what your Lambda does. You can make the Lambda always succeed in its code; this is included in the example below.
  • Cold starts suck. These are the first executions of a Lambda, which are slower; subsequent executions reuse the initialized Lambda to avoid the cost. However, if you get bursts of traffic around certain times (e.g. a food ordering service), you’ll have more concurrent Lambdas and therefore more cold starts.
  • CloudWatch logs are not the nicest to look at.
  • You kind of need to design with AWS Lambda in mind. If you think you can just migrate your code from an ordinary EC2 instance or Docker container, you might need to re-evaluate, as Lambdas have only one entry point and, in the case of APIs, can host only one API. This causes lock-in and could cause migration problems if another technology emerges.
  • I also have some concerns around deployments, automated testing and versioning, as these can get quite complicated.
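Here is the example Lambda I promised: a minimal Python sketch that checks the remaining execution time so it can stop cleanly before the timeout, and that swallows errors so the invocation always reports success and is never retried. The do_work function is a hypothetical placeholder for the real per-item processing.

    import json

    def do_work(item):
        # Hypothetical placeholder for the actual per-item processing.
        return item

    def lambda_handler(event, context):
        items = event.get("items", [])
        processed = 0
        try:
            for item in items:
                # Leave a 10-second safety margin before the hard timeout,
                # so we can return cleanly instead of being killed mid-item.
                if context.get_remaining_time_in_millis() < 10000:
                    break  # re-invoke later with the remaining items
                do_work(item)
                processed += 1
        except Exception as exc:
            # Log and fall through to a success response, so Lambda does
            # not retry the invocation and produce duplicate data.
            print("error suppressed to avoid retry: %s" % exc)
        return {"statusCode": 200, "body": json.dumps({"processed": processed})}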

For these reasons, I think Lambda has not been widely adopted yet, but it is promising, it has made a lot of progress over the last couple of years, and I hope it continues on this path.

Pricing
Lambda pricing is a function of the resources allocated and the execution time, and I found it quite cheap. Note, though, that if you access other services like S3 you’ll be billed for those too, so costs can spiral out of control easily, as is often the case with AWS.
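To get a rough feel for the numbers, here is a back-of-the-envelope estimate in Python. The rates are the ones published at the time of writing ($0.20 per million requests and $0.00001667 per GB-second); check the AWS pricing page for current figures, and note the workload numbers are made up for illustration.

    # Hypothetical workload: 5M invocations/month, 300 ms each, 512 MB RAM.
    requests_per_month = 5000000
    avg_duration_s = 0.3
    memory_gb = 0.512

    request_cost = requests_per_month / 1000000.0 * 0.20
    compute_cost = requests_per_month * avg_duration_s * memory_gb * 0.00001667
    print("~$%.2f/month" % (request_cost + compute_cost))  # roughly $13.80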

Cloud First: AWS

AWS (Amazon Web Services) is one of the biggest cloud providers along with Microsoft Azure with data centers across the globe. They provide a mixture of IaaS (infrastructure as a service) and PaaS (platform as a service). Their offerings include a large number of services that can easily replace most of the infrastructure you would have in a traditional datacenter and perhaps even more.

AWS regions

As you can see, they cover most of the world, except Africa and some parts of Asia and the Middle East. So if most of your users are located in Europe, the US, most of Asia, or Australia and New Zealand, you’re probably okay using AWS. One of the important factors in choosing AWS is therefore which locations matter to you and your users.

That being said, AWS has an extensive IaaS offering including but not limited to:

  • Computing: EC2, ELB
  • Storage: S3, Glacier
  • Databases: RDS, DynamoDB
  • Networking: VPC, R53

They have also started making PaaS offerings, the notable mentions being EB (Elastic Beanstalk) and Lambda. EB combines multiple IaaS pieces, such as EC2 instances, load balancers, networking and a database, into a platform that manages itself; the user only has to deploy the application to EB, which in turn takes care of the rest. With Lambda there is no visible infrastructure at all: you just deploy your application and it runs on certain triggers.

Scalability is one of the main drivers for AWS compared to traditional infrastructure. With AWS you don’t necessarily have to plan your infrastructure in advance, unless you are really cost-conscious, because AWS can scale up and down as needed, even automatically if a PaaS is chosen. I remember times when you’d have to plan a datacenter build-out months or perhaps years in advance, with all the requirements and the tremendous costs associated with it. With AWS you can just spin up as many machines or as much storage as you want and simply pay for it; once you no longer need it, you can terminate it and everything will be fine. As for reliability and security, I could easily say that AWS is more reliable and probably more secure than most traditional data centers out there. Not that they are perfect, but they rarely have incidents, and when they do, they alert their users and usually fix things quickly.

I think one of the main disadvantages of AWS is the learning curve, especially for regular software developers, who make up most of AWS’s users. The amount of detail and configuration in AWS is extensive, and getting to know each and every option takes time, plenty of time. Luckily, AWS has decent documentation and a CLI you can interact with, so you don’t necessarily have to remember all the UIs. And once you have learned it, it will be hard to give up, because the learning curve is steep and some of the concepts and technology are proprietary. For instance, once you start using Lambda, you’ll start writing your code in a certain way so that it fits the Lambda model. So if one day you’d like to switch from AWS to some other cloud provider, that might not be as straightforward as you’d hope.

I think there are many things to say about AWS, from the individual services in the AWS offering to advanced infrastructure design. Personally, I really like AWS Elastic Beanstalk and CloudFormation. The first is a PaaS solution that takes care of VMs, monitoring, scaling, alerting, deployments, logging and many other things, while CloudFormation lets you design the infrastructure you need in a very friendly way, finally producing a JSON file you can export and use as a script to later create the physical infrastructure. CloudFormation really turns infrastructure into code, which is something developers have always liked.
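To give a taste of infrastructure as code, here is a minimal sketch that feeds a tiny template (a single S3 bucket; all names are hypothetical) to CloudFormation through the boto3 SDK for Python:

    import boto3

    # A tiny CloudFormation template declaring one S3 bucket.
    template = '''{
      "Resources": {
        "MyBucket": {
          "Type": "AWS::S3::Bucket",
          "Properties": {"BucketName": "my-example-bucket"}
        }
      }
    }'''

    cfn = boto3.client("cloudformation")
    cfn.create_stack(StackName="my-stack", TemplateBody=template)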

Maybe some people are concerned about CI/CD pipelines when working with AWS, but there is little reason to be, because AWS fully supports those principles. AWS is really good at providing SDKs and a CLI, in a variety of languages, for interacting with their services, which helps a lot with development and scripting.
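For example, a build job can push a freshly built package straight to an existing Lambda through the SDK; a minimal sketch, where the function name and artifact are hypothetical:

    import boto3

    client = boto3.client("lambda")

    # Upload the new build artifact to the existing function.
    with open("deployment.zip", "rb") as package:
        client.update_function_code(
            FunctionName="my-function",  # hypothetical name
            ZipFile=package.read(),
        )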

Obviously, most of these services would require a dedicated post each, but to keep this one short I won’t go into details now. I will definitely have separate posts for Elastic Beanstalk and CloudFormation soon.

AWS 5 Pillars

Finally, I can really recommend the AWS Well-Architected Framework, a set of principles and guidelines from AWS, organized into pillars, that will help you design and maintain the infrastructure your services need.

LambdaWrap, a Ruby GEM for AWS Lambda

Author: Markus Thurner
Reblogged from: http://lifeinvistaprint.com/techblog/lambdawrap-ruby-gem-aws-lambda/

We’ve released LambdaWrap, a Ruby gem that simplifies deployment of a serverless application based on AWS Lambda. This blog post talks about the motivation behind the gem and the technical details of LambdaWrap.

Where did we start?

There was a lot of talk among my team regarding the benefits of AWS Lambda, an offering from Amazon Web Services to run code without provisioning or managing servers. So, while we were skeptical of how that would work out, we gave it a try for a simple RESTful service.

Since our service was very simple, we didn’t want to start with a framework like serverless. Instead, we simply had some basic unit tests, uploaded our scripts manually and tested everything online.

As our thoughts evolved on how to support this service in a production environment, we weren’t satisfied with our manual approach. We turned to AWS CloudFormation, but quickly realized that it supported neither versioning of Lambda functions nor setup of API Gateway. So we decided to create a rakefile and leverage the AWS SDK for Ruby. We started simple, just supporting uploads of the Lambda functions, then added a wrapper for aws-apigateway-importer and, since we were using AWS DynamoDB, a simple wrapper for DynamoDB table creation.

What does LambdaWrap do?

LambdaWrap lets you create a build pipeline for an AWS Lambda-based RESTful service in less than an hour, so that you can immediately start focusing on creating value, namely writing your service code. LambdaWrap is easy to get into, since it focuses on the deployment pipeline only.

Let’s look at a basic example: a RESTful service that exposes a PUT and a GET method on a resource and uses AWS DynamoDB to store data. With a documentation-first approach, go to editor.swagger.io, document the two endpoints, and save the swagger.json locally. Then, create your rakefile and add a deploy task that calls LambdaWrap to upload the functions, import the Swagger definition and create the DynamoDB table.

That’s it. A few lines of Ruby code and your deployment is fully automated, including support for short-lived environments such as those for developers or pull request builds. (Yes, we recommend creating a full environment during a pull request, to ensure the deployment pipeline works and also to run some service-level tests against it.) Once you start doing that, you’ll probably also want to be able to destroy all those short-lived environments, to avoid unnecessary expenses from unused resources.

How did this help Cimpress?

We’re still in the early stages of our service’s lifecycle, but so far deploying new versions has been a smooth experience. Unifying this into a Ruby gem already benefited our small team by avoiding copies of the same deployment script being duplicated over and over again or, worse, diverging over time. So while it took a little effort to bundle everything into a reusable library, the return on investment was almost immediate. We believe the creation of LambdaWrap is a good story of moving a small team forward while showing what can be achieved with a simple, generic script, and of sharing our AWS Lambda experience to lower the entry hurdle for other teams.

What’s next?

We believe LambdaWrap allows others to learn from our experience of creating a deployment pipeline for services based on AWS Lambda. It’s so simple (just a few Ruby files) that it can easily be copied for more specific use cases, extended with additional generic features, or both. And yes, we would appreciate your feedback and contributions. That said, we do not have a long-term roadmap for LambdaWrap, although we’ll continuously extend it. We will certainly continue to focus on deployment, not local execution or other alternatives; other, larger frameworks such as serverless already do that job particularly well.

Cloud First: Azure

Hello everyone! As promised, this is a follow-up post to this one, a high-level overview of the Microsoft Azure cloud service.

I’ve been using Azure for a couple of years now, and if I were to summarize, I would say it’s a good service with a great, rich interface, a rather poorly organized SDK, and quite expensive, I might add. But let’s get into it, shall we?

First things first: you can register for a free trial right here and get 200 USD to spend on Azure services right away, and I encourage you to do so. Afterwards you can dig into the Azure portal, which has recently been redesigned; it looks great and performance is also slightly better. But let’s get into more details.

Availability


They have data centers scattered across the globe (Europe, Asia, the Americas, Australia), so it shouldn’t be a problem to get close to where your customers are.

Interface


As with every Microsoft product, we all know the interface is usually good, and this one is no exception. It’s beautifully done, from the dashboard, which can be customized with simple drag and drop, to the organization of information so that you don’t get confused. Don’t underestimate this: there is quite a bit of information going on, and I can easily see how someone could get this wrong, but they did a great job and I love it.

Services

Obviously this is the most important part, right?
Well, there is a lot going on here: from classical VMs to load balancers, app services, CDN (content delivery network), Redis caching, Active Directory, VPN, machine learning services (HDInsight), storage, databases (both SQL and NoSQL, such as DocumentDB), IoT processing and schedulers, just to name a few.
So there’s plenty to choose from, and some services are unique to Azure; I’ll give you a few examples in a minute. Naturally, Azure has very good integration with Visual Studio, leveraging Team Services.
I believe one of the unique features of Azure is the Marketplace, where Microsoft but also third-party vendors offer cloud services. Among them are Auth0, a load testing service, a cloud-based RabbitMQ service, face recognition APIs, Logentries, speech APIs, a testing service and much more.
Furthermore, there are templates for VMs and app/web services. You want a SharePoint server farm, ELK + Kibana, an Oracle DB, a WordPress blog? You can have almost anything you might want in just a few minutes, and people can contribute their own templates.

SDK

You can find all of the SDKs, including the CLI, right here, with a decent set of supported languages: Java, .NET, Node.js, PHP, Python and Ruby. That being said, the documentation is rather chaotically organized: while for Ruby it is easy to find, for other languages it is not as simple as I would like. For instance, the documentation for Python is here.

Pricing

They do have a pricing calculator right here, but my overall impression is that Azure is rather expensive. Just to give you an example: a quad-core VM with 28 GB of RAM (D12 v2) costs 317.69 USD/month in Azure as of this post, whereas I could rent from OVH, for instance, a quad-core dedicated server with 32 GB of RAM for roughly 85 EUR in any of their regions (including France, Germany and the Netherlands).

Conclusion

It’s a great cloud service which I really recommend, but one can clearly see some areas for improvement. There is more to say here, and I could go through each service and describe it, but that would be too much; this is just an overview. If you are particularly interested in any of their services, please let me know.

I totally loved the Marketplace and the template-based services, which let you start from step 2 or 3 rather than step 0, plus the amazing variety of ready-made APIs/services you can choose from.
If you can deal with the price, that is.

Welcome to the cloud!

Hello and welcome to 2016! This is the first post of the year, and I want to start with a series on cloud services, because the cloud appears to be a cool place everyone wants to know about, and there is quite a competition between the big four providers out there: Amazon with Amazon Web Services (AWS), Microsoft with Azure, Google with Google Cloud Platform, and IBM. I’ve been using Azure for years and relatively recently started with AWS, so I can talk about those two from a developer’s perspective. I haven’t used the other two, so I really can’t say.

But first, let’s clear up a few concepts and explain what this cloud thing is.

PaaS – Platform as a Service
With PaaS, developers get a framework they can build upon, which makes development, testing and deployment of applications quick and simple. As a developer you don’t care about OS management, virtualization software, servers, storage, networking and other such things; you just manage your application.

IaaS – Infrastructure as a Service
With IaaS you get self-service access to monitor and manage infrastructure that is often located across the globe. A great advantage over traditional infrastructure is that you don’t have to pay for it up front: you pay as you go, based on actual needs, and scaling is very easy and efficient. As opposed to PaaS, you have to manage the servers, storage, networking and OS yourself.

Both Azure and AWS support IaaS and PaaS, and they offer very similar services, although the user experience differs a bit. If you are still wondering what the advantages of PaaS/IaaS over traditional infrastructure are, here is a summary:

  • It’s just there. You don’t need to manage your infrastructure, at least not entirely.
  • You can focus on the application you are building instead of infrastructure that brings no value.
  • You can scale up and down as you please. You don’t have to buy a ton of infrastructure that you’re only going to use during the peak season.
  • You don’t care if servers break down; it’s all taken care of.
  • Getting the infrastructure you need may be just a click away.
  • You can have pre-configured infrastructure for you to deploy to right away.
  • You can have the infrastructure in the geographical area you want so that the response time is as small as possible.

Because I want to cover both Azure and AWS in depth, I will create separate posts for each of them, starting with Azure in the following days.

See you soon!
