There are moments in the development of a platform when you know you have hit a point of no return. Remember Emmett Brown in Back to the Future Part III? Yep, pass that point and you either succeed or fall into the ravine and die, nothing in between.
What are we talking about here? SEARCH, in upper case. Search is one of those hidden things that kills platforms when they receive more traffic than expected, or more traffic than the business owner can afford. Hitting the database for absolutely everything is not something a good engineer should feel comfortable with.
For the same reason we all use caching solutions to avoid generating the same results again and again, often by simply storing static HTML files, we should also think about indexing the data in our database so it’s optimised for search.
Let’s put it this way… a database is a good way to store, read and manipulate data. It is fast overall and does its job quite nicely and efficiently most of the time (given that your SQL queries are good, can’t fix stupid). But when your backend has to deal with many reads and writes for its core functionality, plus a search box on the website that visitors keep hitting, the database gives up and starts eating resources like Godzilla.
ElasticSearch to the rescue
ElasticSearch is a search engine based on Apache Lucene, an open source search engine library. It provides a distributed, multitenant-capable full-text search engine with an HTTP web interface and schema-free JSON documents.
It is a go-to solution for use cases such as search, logging and analytics. Elastic, the company behind it, signed an agreement with Alibaba Cloud back in 2017 to provide a managed deployment of this platform.
In a normal situation, you’ll grab a VM or Docker image and deploy your own ElasticSearch on the cloud. For basic applications that is a good start, but don’t be fooled: it gets messy quickly, especially with storage and when upgrading versions.
Alibaba Cloud’s managed solution
The managed instances are based on the n4 and sn2ne ECS families. Which one you pick depends on your network throughput requirements, and both perform at the level you would expect for a production environment.
Whether you use the web interface or launch the service through the API or Terraform, you’ll be able to adjust the number of replicas for the cluster easily. Once done, just sit and relax at your mission control place of choice.
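To give an idea of what "adjusting" looks like on the Terraform side, here is a minimal sketch. It assumes that "replicas" maps to the number of data nodes, which is what the `data_node_amount` argument controls in the full example later in this post. The variable name is my own choice, not something the provider requires.

```hcl
# Hypothetical input variable holding the cluster size.
variable "data_node_amount" {
  description = "Number of data nodes in the ElasticSearch cluster"
  type        = number
  default     = 2
}

# Inside the alicloud_elasticsearch_instance resource, reference the
# variable instead of a hard-coded value:
#
#   data_node_amount = var.data_node_amount
```

With that in place, resizing becomes a matter of running something like `terraform apply -var="data_node_amount=3"` and letting the provider do the rest.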
All ElasticSearch deployments come with Kibana by default, a great web interface to play with search queries, get good top-level views and create amazing dashboards and data visualisation screens.
Let’s do it
We are going to deploy a basic ElasticSearch cluster with Terraform. The example below creates the most basic ElasticSearch you can imagine: a cluster with 2 data nodes, subscribed for 1 month, with 20GB of storage and deployed in a single zone.
resource "alicloud_elasticsearch_instance" "main" {
  # Subscription billing, paid for one month
  instance_charge_type = "PrePaid"
  period               = 1

  # Data nodes: count, instance spec and disk
  data_node_amount    = 2
  data_node_spec      = "elasticsearch.n4.small"
  data_node_disk_size = 20
  data_node_disk_type = "cloud_efficiency"

  # Replace with the ID of an existing vSwitch and a real password
  vswitch_id  = "vsw-xxxxxxxxxxxxxxxxxxxxx"
  password    = "password-goes-here"
  version     = "7.7_with_X-Pack"
  description = "app-es-cluster"
  zone_count  = 1

  # Access from inside the private network
  private_whitelist = [
    "192.168.0.0/16",
  ]

  # Public access open to every IP: fine for playing around, not for production
  enable_public = true
  public_whitelist = [
    "0.0.0.0/0",
  ]
}

output "domain" {
  value = alicloud_elasticsearch_instance.main.domain
}

output "kibana_domain" {
  value = alicloud_elasticsearch_instance.main.kibana_domain
}
As you can see, I whitelisted the internal IP CIDR and also all external IPs from the Internet, so you can play with it easily. For obvious reasons, don’t do this in production, or do it only if you are sure of what you are doing and are taking other security measures to protect your data.
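For something closer to production, a tighter variant could look like the fragment below. This is only a sketch: the `allowed_public_cidr` variable is a hypothetical name of mine, and its default is a documentation-only range (RFC 5737) that you would replace with your own office or VPN address. The resource arguments are the same ones used in the example above.

```hcl
# Hypothetical variable: the only external range allowed to reach the cluster.
variable "allowed_public_cidr" {
  description = "External CIDR allowed to reach the public endpoint"
  type        = string
  default     = "203.0.113.0/24" # placeholder range, replace with your own
}

resource "alicloud_elasticsearch_instance" "main" {
  # ... all other arguments unchanged from the example above ...

  private_whitelist = [
    "192.168.0.0/16",
  ]

  # Keep the public endpoint, but only for the range you trust
  enable_public = true
  public_whitelist = [
    var.allowed_public_cidr,
  ]
}
```

If nothing outside your private network needs to reach the cluster, setting `enable_public = false` and dropping `public_whitelist` altogether is the simpler choice.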
For the outputs I used the exported attributes “domain” and “kibana_domain”. That way it will be very easy for you to start pointing your tools to the new cluster as soon as it gets created.
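If your tools want a host and a port in one string, the outputs can be combined. The sketch below assumes the provider also exports `port` and `kibana_port` next to the two domain attributes, so check the provider documentation for your version before relying on it. The output names are my own.

```hcl
# Assumes the resource exports "port" alongside "domain".
output "elasticsearch_endpoint" {
  value = "${alicloud_elasticsearch_instance.main.domain}:${alicloud_elasticsearch_instance.main.port}"
}

# Assumes the resource exports "kibana_port" alongside "kibana_domain".
output "kibana_endpoint" {
  value = "${alicloud_elasticsearch_instance.main.kibana_domain}:${alicloud_elasticsearch_instance.main.kibana_port}"
}
```

After `terraform apply`, running `terraform output elasticsearch_endpoint` prints the value so you can paste it straight into your client configuration.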
Conclusion
I guess this is one of the main benefits of the cloud: being able to launch open source projects with a couple of clicks or API calls is very convenient. This is especially true with Alibaba Cloud, as it respects the open source solutions very well, making it easy to move your project off the managed solution if you decide to.