Automate Elasticsearch Deployment in GCP with Terraform

田德运
2023-12-01

We all want to utilize the awesome powers of Elasticsearch, but the number of plugins, configurations, and labyrinthine documentation creates a giant roadblock for getting started.

NB! This guide expects you to have some basic knowledge about Terraform, Bash scripting, and Elasticsearch.

In GCP (Google Cloud Platform), there are only a few options to get started with Elasticsearch at the moment: use the hosted SaaS from Elastic or one of the offerings in the marketplace. Both options have drawbacks, and I intend to show my approach on how to get started and see some results fast while remaining in full control of your deployment.

Later, you can absorb the details in the documentation and get a deep understanding at your own pace. This article is by no means a description of a perfect Elasticsearch cluster setup; it's intended more as a way to get all the tools you need up and running fast, and then it's up to you to modify the scripts according to your own use case. I've been working in GCP, but these examples should be applicable in any cloud environment.

Elasticsearch, Kibana, Logstash, and all associated plugins are open-source, so the only cost is the VMs (virtual machines) and infrastructure running in GCP or any other cloud environment.

Terraform is perfect for automating deployments, since you can tear down and spin up VMs and infrastructure in a matter of minutes with a single command.

1. Terraform

1.1. Virtual machines

First of all, we need the VMs running Elasticsearch:

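A minimal sketch of one such node is shown below; the resource and variable names are illustrative, and the service account and subnet it references are defined in sections 1.2 and 1.3.

```hcl
# One Elasticsearch node per count index; zones are spread for redundancy.
resource "google_compute_instance" "elastic" {
  count        = var.elastic_node_count
  name         = "my-elastic-instance-${count.index + 1}"
  machine_type = var.elastic_machine_type
  zone         = var.zones[count.index]

  boot_disk {
    initialize_params {
      image = "ubuntu-os-cloud/ubuntu-1804-lts"
      type  = "pd-ssd"
      size  = var.disk_size_gb # 200 GB SSD
    }
  }

  network_interface {
    subnetwork = google_compute_subnetwork.elastic_subnet.self_link
    # no access_config block -> no external IP
  }

  service_account {
    email  = google_service_account.elastic_sa.email
    scopes = ["cloud-platform"]
  }

  # startup-elastic.sh does the actual installation (section 2); how you inject
  # its parameters (templatefile(), instance metadata, ...) is up to you.
  metadata_startup_script = file("${path.module}/startup-elastic.sh")
}
```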

There is a lot happening here. You can choose how many machines you'd like to deploy by iterating over the above snippet. The most interesting part is the startup script, which I call ./startup-elastic.sh. We will get back to this topic in section 2.

We also need a new instance for running Kibana:

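A sketch of that instance, mirroring the Elasticsearch nodes above (names are again illustrative):

```hcl
resource "google_compute_instance" "kibana" {
  name         = "my-kibana-instance"
  machine_type = var.kibana_machine_type
  zone         = var.kibana_zone

  boot_disk {
    initialize_params {
      image = "ubuntu-os-cloud/ubuntu-1804-lts"
    }
  }

  network_interface {
    subnetwork = google_compute_subnetwork.elastic_subnet.self_link
  }

  service_account {
    email  = google_service_account.elastic_sa.email
    scopes = ["cloud-platform"]
  }

  # Installs Kibana, the Beats and APM (section 3)
  metadata_startup_script = file("${path.module}/startup-kibana.sh")
}
```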

So, this is the heart of what we need: the actual GCE (Google Compute Engine) VMs running Elasticsearch and Kibana, expressed in HashiCorp Configuration Language.

You can choose whatever settings suit you in the var file. This configuration will deploy a VM with a 200 GB SSD drive in its own subnet, with a service account that has the minimal required rights. There are no external IPs, so we have to figure out another way of accessing the VMs.

My tfvar file looks something like this:

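Something along these lines, with example values for the variable names assumed in the sketches above:

```hcl
# my-elastic.tfvars -- example values, adjust to your project
project_id           = "my-gcp-project"
region               = "europe-west1"
zones                = ["europe-west1-d", "europe-west1-d", "europe-west1-c"]
elastic_node_count   = 3
elastic_machine_type = "n1-standard-4"
kibana_machine_type  = "n1-standard-2"
kibana_zone          = "europe-west1-d"
disk_size_gb         = 200
subnet_cidr          = "10.0.0.0/28"
backup_bucket        = "my-elastic-backup-bucket"
cert_bucket          = "my-elastic-cert-bucket"
# note: no elastic_pw here -- the password is supplied at deploy time (section 2.7)
```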

So far so good, but we need more than just VMs, right? Let's set up the IAM and give the service account used by the VMs the correct permissions.

1.2. Permissions

To be able to create a backup of our data in GCS (Google Cloud Storage), the service account will need some fine-grained permissions.

First, we create a custom role with the permissions the VMs need to back up to GCS. Then we create a service account and apply that role to it. Last of all, we generate a key for that service account. This key we will save in the Keystore in Elasticsearch and use it to create a backup repository in GCS.

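A sketch of that setup, using the permissions commonly recommended for the Elasticsearch GCS repository plugin (role and account names are illustrative):

```hcl
# Custom role with just enough rights to write snapshots to a GCS bucket
resource "google_project_iam_custom_role" "elastic_backup" {
  role_id = "elasticBackup"
  title   = "Elasticsearch GCS backup"
  permissions = [
    "storage.buckets.get",
    "storage.objects.create",
    "storage.objects.delete",
    "storage.objects.get",
    "storage.objects.list",
  ]
}

# Service account attached to the Elasticsearch and Kibana VMs
resource "google_service_account" "elastic_sa" {
  account_id   = "elastic-vm-sa"
  display_name = "Service account for the Elasticsearch VMs"
}

resource "google_project_iam_member" "elastic_backup_binding" {
  project = var.project_id
  role    = google_project_iam_custom_role.elastic_backup.id
  member  = "serviceAccount:${google_service_account.elastic_sa.email}"
}

# Key that ends up in the Elasticsearch keystore (section 2.5)
resource "google_service_account_key" "elastic_key" {
  service_account_id = google_service_account.elastic_sa.name
}
```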

1.3. Network

We also need some networking infrastructure for security.

We start by creating a separate VPC (Virtual Private Cloud) and subnet for the VMs and infrastructure. Here we specify the IP range that you can give to the machines.

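For example (names and the CIDR are placeholders):

```hcl
resource "google_compute_network" "elastic_vpc" {
  name                    = "my-elastic-vpc"
  auto_create_subnetworks = false
}

resource "google_compute_subnetwork" "elastic_subnet" {
  name                     = "my-elastic-subnet"
  region                   = var.region
  network                  = google_compute_network.elastic_vpc.id
  ip_cidr_range            = var.subnet_cidr # e.g. "10.0.0.0/28"
  private_ip_google_access = true            # reach GCS without external IPs
}
```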

Since the VMs don't have an external IP, they cannot access the internet to download any software. To get around this problem, we need to open a NAT gateway with a router so they can download anything they need from the internet. Once the software is installed, we can remove the NAT gateway.

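A Cloud Router plus Cloud NAT pair roughly like this does the job (a sketch, with illustrative names):

```hcl
resource "google_compute_router" "elastic_router" {
  name    = "my-elastic-router"
  region  = var.region
  network = google_compute_network.elastic_vpc.id
}

resource "google_compute_router_nat" "elastic_nat" {
  name                               = "my-elastic-nat"
  region                             = var.region
  router                             = google_compute_router.elastic_router.name
  nat_ip_allocate_option             = "AUTO_ONLY"
  source_subnetwork_ip_ranges_to_nat = "ALL_SUBNETWORKS_ALL_IP_RANGES"
}
```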

In GCP, if you want to access your VPC network from, for example, App Engine or Cloud Functions, you'll need a VPC connector. In my setup, I host an API on GAE that invokes the Elasticsearch internal load balancer via the VPC connector.

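A serverless VPC access connector can be declared like this (the /28 range must not overlap the subnet above; names are illustrative):

```hcl
resource "google_vpc_access_connector" "elastic_connector" {
  name          = "my-elastic-connector"
  region        = var.region
  network       = google_compute_network.elastic_vpc.name
  ip_cidr_range = "10.8.0.0/28"
}
```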

An Elasticsearch cluster with more than one node needs a load balancer to distribute the requests. To put the VMs behind a load balancer, we need to create instance groups. For redundancy, we put the VMs in the same region but in different zones; if there is a problem in one zone, the others won't be affected. In this example we're deploying three VMs: my-elastic-instance-1 and 2 are in zone d, while my-elastic-instance-3 is in zone c.

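Unmanaged instance groups, one per zone, sketched here against the three-node layout from the tfvars example:

```hcl
resource "google_compute_instance_group" "elastic_group_d" {
  name = "my-elastic-group-d"
  zone = "europe-west1-d"
  instances = [
    google_compute_instance.elastic[0].self_link,
    google_compute_instance.elastic[1].self_link,
  ]

  named_port {
    name = "elastic"
    port = 9200
  }
}

resource "google_compute_instance_group" "elastic_group_c" {
  name      = "my-elastic-group-c"
  zone      = "europe-west1-c"
  instances = [google_compute_instance.elastic[2].self_link]

  named_port {
    name = "elastic"
    port = 9200
  }
}
```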

Since we also want to access Kibana through a load balancer, we’ll create an instance group for that too.

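For example:

```hcl
resource "google_compute_instance_group" "kibana_group" {
  name      = "my-kibana-group"
  zone      = var.kibana_zone
  instances = [google_compute_instance.kibana.self_link]

  named_port {
    name = "kibana"
    port = 5601
  }
}
```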

Now, all we need is the load balancer with health-checks and forwarding rules.

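For Elasticsearch, an internal TCP load balancer can be wired up roughly like this (a sketch; names are illustrative):

```hcl
resource "google_compute_health_check" "elastic_hc" {
  name = "my-elastic-healthcheck"

  tcp_health_check {
    port = 9200
  }
}

resource "google_compute_region_backend_service" "elastic_backend" {
  name                  = "my-elastic-backend"
  region                = var.region
  protocol              = "TCP"
  load_balancing_scheme = "INTERNAL"
  health_checks         = [google_compute_health_check.elastic_hc.id]

  backend {
    group = google_compute_instance_group.elastic_group_d.self_link
  }

  backend {
    group = google_compute_instance_group.elastic_group_c.self_link
  }
}

# Internal IP that clients (e.g. the VPC connector) use to reach the cluster
resource "google_compute_forwarding_rule" "elastic_lb" {
  name                  = "my-elastic-ilb"
  region                = var.region
  load_balancing_scheme = "INTERNAL"
  backend_service       = google_compute_region_backend_service.elastic_backend.id
  network               = google_compute_network.elastic_vpc.id
  subnetwork            = google_compute_subnetwork.elastic_subnet.id
  ports                 = ["9200"]
}
```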

1.4. Firewall rules

And of course, the part that I always forget: the firewall rules. These allow internal communication between the nodes in the subnet, and let the load balancer and health checks communicate with the VMs.

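Something like this (the two source ranges in the second rule are Google's documented load balancer and health check ranges):

```hcl
# Node-to-node and Kibana traffic inside the subnet (9200 REST, 9300 transport)
resource "google_compute_firewall" "allow_internal" {
  name    = "allow-elastic-internal"
  network = google_compute_network.elastic_vpc.name

  allow {
    protocol = "tcp"
    ports    = ["9200", "9300", "5601"]
  }

  source_ranges = [var.subnet_cidr]
}

# Let Google's load balancers and health checks reach the instances
resource "google_compute_firewall" "allow_health_checks" {
  name    = "allow-lb-health-checks"
  network = google_compute_network.elastic_vpc.name

  allow {
    protocol = "tcp"
    ports    = ["9200", "5601"]
  }

  source_ranges = ["130.211.0.0/22", "35.191.0.0/16"]
}
```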

2. Elasticsearch Bash script

In this section, we will look at the fun stuff: the startup script running on the machines when they are deployed.

The script takes a few input parameters:

  • Name of the Elasticsearch cluster's master node
  • IPs of all nodes in the cluster
  • GCS backup-bucket name (where the backup is stored)
  • GCS bucket name where the certificate is stored
  • Service account key
  • Password to Elasticsearch

The buckets were created manually, but if you prefer to do it in Terraform, that’s also fine.

Before we take a look at the script, there's another manual step: the creation of certificates. This could probably be automated, but I found it easier to generate them manually, put them in a GCS bucket, and copy them into Elasticsearch at startup.

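If you follow the same path, Elasticsearch ships a certutil tool for this; a minimal run on any machine with Elasticsearch installed might look like the following (bucket name is a placeholder):

```bash
# Generate a CA and a node certificate, then park the certificate in GCS
# so the startup script can fetch it later (section 2.7)
/usr/share/elasticsearch/bin/elasticsearch-certutil ca --out elastic-stack-ca.p12 --pass ""
/usr/share/elasticsearch/bin/elasticsearch-certutil cert \
  --ca elastic-stack-ca.p12 --ca-pass "" --out elastic-certificates.p12 --pass ""
gsutil cp elastic-certificates.p12 gs://my-elastic-cert-bucket/
```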

The certificates are important to secure the communications within the cluster and to enable some features in Kibana.

I’ll break down the script into smaller chunks and explain them one by one.

2.1. Startup check

We start by checking whether the credentials.json file is present, so that the startup script only runs once rather than on every restart of the machine. If the file is present, we exit the script.

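A sketch of that guard, assuming the service account key is written to /etc/elasticsearch/credentials.json later in the script (section 2.5):

```bash
#!/bin/bash
# First-boot guard: if the key file already exists, this is a reboot rather
# than a fresh deployment, so skip the installation.
if [ -f /etc/elasticsearch/credentials.json ]; then
  echo "Startup script has already run, exiting."
  exit 0
fi
```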

2.2. Install prerequisites

Then we download and install the prerequisites. You can specify which version you like, or just take the latest 7.x version as below.

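Roughly the standard Debian/Ubuntu install from Elastic's APT repository:

```bash
# Add Elastic's 7.x APT repository and install Elasticsearch
apt-get update && apt-get install -y apt-transport-https gnupg
wget -qO - https://artifacts.elastic.co/GPG-KEY-elasticsearch | apt-key add -
echo "deb https://artifacts.elastic.co/packages/7.x/apt stable main" \
  > /etc/apt/sources.list.d/elastic-7.x.list
apt-get update && apt-get install -y elasticsearch
systemctl daemon-reload
systemctl enable elasticsearch
```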

2.3. Configure elasticsearch.yml

Now, we need to configure Elasticsearch in the elasticsearch.yml file. Here we are setting the IPs of the nodes and deciding which one is the master node.

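A minimal version of that configuration; MASTER_NODE_NAME and NODE_IPS are two of the script parameters listed just below:

```bash
# Append the cluster settings; node.name defaults to the hostname
cat >> /etc/elasticsearch/elasticsearch.yml <<EOF
cluster.name: my-elastic-cluster
network.host: 0.0.0.0
discovery.seed_hosts: [${NODE_IPS}]
cluster.initial_master_nodes: ["${MASTER_NODE_NAME}"]
EOF
```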

If you remember from section 1, the startup script took some input variables. Those were:

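In the sketches in this section they show up as plain shell variables; the values below are illustrative placeholders for what Terraform injects when it renders the script:

```bash
MASTER_NODE_NAME="my-elastic-instance-1"   # name of the cluster's master node
NODE_IPS="10.0.0.2,10.0.0.3,10.0.0.4"      # internal IPs of all nodes
BACKUP_BUCKET="my-elastic-backup-bucket"   # GCS bucket for snapshots
CERT_BUCKET="my-elastic-cert-bucket"       # GCS bucket holding the certificate
SA_KEY_B64="<base64 service account key>"  # key generated in section 1.2
ELASTIC_PW="<elastic password>"            # password for the elastic user
```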

2.4. Configure heap-size

If you have any previous experience with Elasticsearch, you know how important RAM is. I won't go into that discussion here; I'm simply setting the heap size to 50% of total RAM, capped below 32 GB.

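One way to do that from the startup script (the jvm.options.d directory requires Elasticsearch 7.7+; on older 7.x releases you would edit jvm.options instead):

```bash
# Heap = half of total RAM, capped just below 32 GB
TOTAL_MB=$(free -m | awk '/^Mem:/ {print $2}')
HEAP_MB=$((TOTAL_MB / 2))
if [ "$HEAP_MB" -gt 31744 ]; then
  HEAP_MB=31744
fi
cat > /etc/elasticsearch/jvm.options.d/heap.options <<EOF
-Xms${HEAP_MB}m
-Xmx${HEAP_MB}m
EOF
```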

2.5. Install GCS backup plugin

In this step, we install the plugin needed to use GCS as a backup target for all our Elasticsearch indices. We also add the service account key to the Keystore and restart Elasticsearch.

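A sketch of those steps; the key file doubles as the first-boot guard from section 2.1:

```bash
# Write the service account key to disk, install the GCS repository plugin
# and register the key in the Elasticsearch keystore
echo "${SA_KEY_B64}" | base64 -d > /etc/elasticsearch/credentials.json
/usr/share/elasticsearch/bin/elasticsearch-plugin install --batch repository-gcs
/usr/share/elasticsearch/bin/elasticsearch-keystore add-file \
  gcs.client.default.credentials_file /etc/elasticsearch/credentials.json
systemctl restart elasticsearch
```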

2.6. Enable monitoring

The next command enables X-Pack monitoring in the cluster settings.

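For example (once security is enabled in section 2.7, the call also needs -u "elastic:$ELASTIC_PW"):

```bash
# Enable X-Pack monitoring collection on the cluster
curl -s -X PUT "localhost:9200/_cluster/settings" \
  -H 'Content-Type: application/json' \
  -d '{"persistent": {"xpack.monitoring.collection.enabled": true}}'
```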

2.7. Extend elasticsearch.yml and copy certificate

Here we copy the certificate that we placed in the GCS bucket into Elasticsearch, and we extend the elasticsearch.yml file with some security settings. Last, we add the password to the Elasticsearch Keystore. In these examples, we're not setting a password on the Keystores themselves.

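A sketch of those three steps; the certificate file name matches the certutil output from earlier, and the exact TLS settings depend on how far you want to take them:

```bash
# 1. Copy the certificate from GCS into the Elasticsearch config directory
gsutil cp "gs://${CERT_BUCKET}/elastic-certificates.p12" /etc/elasticsearch/
chown root:elasticsearch /etc/elasticsearch/elastic-certificates.p12
chmod 660 /etc/elasticsearch/elastic-certificates.p12

# 2. Turn on security and TLS for node-to-node transport
cat >> /etc/elasticsearch/elasticsearch.yml <<EOF
xpack.security.enabled: true
xpack.security.transport.ssl.enabled: true
xpack.security.transport.ssl.verification_mode: certificate
xpack.security.transport.ssl.keystore.path: elastic-certificates.p12
xpack.security.transport.ssl.truststore.path: elastic-certificates.p12
EOF

# 3. Set the bootstrap password for the built-in elastic user
echo "${ELASTIC_PW}" | /usr/share/elasticsearch/bin/elasticsearch-keystore add --stdin bootstrap.password
systemctl restart elasticsearch
```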

Now we don’t want to keep our password in the tfvar file like all the other arguments that we are sending into this script. If you paid attention in the beginning, our tfvar file did not have any elastic_pw. We are adding it when we deploy our terraform code, to keep the password away from the code and repository.

Like this:

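For example (the tfvars file name and variable name follow the earlier sketches):

```bash
# Keep the password out of the repo: pass it on the command line (or via the
# TF_VAR_elastic_pw environment variable) at deploy time
terraform apply -var-file="my-elastic.tfvars" -var="elastic_pw=${ELASTIC_PW}"
```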

2.8. Register backup repository, create custom roles, etc.

This last step registers the backup repository and creates a policy so that snapshots are taken daily and stored in our backup bucket. We also create some custom roles. After this step, you can append anything else you'd like from the Elasticsearch documentation.

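A sketch of those calls; repository, policy, and role names are illustrative, and snapshot lifecycle management is available from Elasticsearch 7.4 onwards:

```bash
# Register the GCS bucket as a snapshot repository
curl -s -u "elastic:${ELASTIC_PW}" -X PUT "localhost:9200/_snapshot/gcs_backup" \
  -H 'Content-Type: application/json' \
  -d "{\"type\": \"gcs\", \"settings\": {\"bucket\": \"${BACKUP_BUCKET}\"}}"

# Snapshot all indices every night at 01:30 and keep them for 30 days
curl -s -u "elastic:${ELASTIC_PW}" -X PUT "localhost:9200/_slm/policy/nightly-snapshots" \
  -H 'Content-Type: application/json' \
  -d '{
        "schedule": "0 30 1 * * ?",
        "name": "<nightly-snap-{now/d}>",
        "repository": "gcs_backup",
        "config": {"indices": ["*"]},
        "retention": {"expire_after": "30d"}
      }'

# Example of a custom role: read-only access to application indices
curl -s -u "elastic:${ELASTIC_PW}" -X PUT "localhost:9200/_security/role/app_read" \
  -H 'Content-Type: application/json' \
  -d '{"indices": [{"names": ["app-*"], "privileges": ["read"]}]}'
```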

When you install Elasticsearch, your life will be a lot easier if you also install Kibana (frontend for Elasticsearch) and some of the plugins to monitor your cluster.

3. Kibana Bash script

In this last part, we will go through the Kibana startup script and tie it all together by explaining how we will access Kibana safely through the browser.

For simplicity, I've chosen to run all the Beats on the same instance as Kibana.

3.1. Startup check

As we mentioned earlier in section 2, we only want to run this script once.

3.2. Install prerequisites

3.3. The Keystore and passwords

Just like Elasticsearch, Kibana has a Keystore to securely store sensitive information. The trick with the Keystores is that Kibana reads its settings from both kibana.yml and kibana.keystore. Now Kibana can connect to Elasticsearch without exposing our password in plain text. Beats and APM have Keystores that operate in a similar way.

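A sketch of how the Kibana startup script can fill its Keystore; it uses the built-in elastic user for brevity, while a dedicated kibana_system user would be the cleaner choice:

```bash
# Create the Kibana keystore and store the Elasticsearch credentials in it;
# kibana.yml can then omit elasticsearch.username/password entirely
sudo -u kibana /usr/share/kibana/bin/kibana-keystore create
echo "elastic" | sudo -u kibana /usr/share/kibana/bin/kibana-keystore add elasticsearch.username --stdin
echo "${ELASTIC_PW}" | sudo -u kibana /usr/share/kibana/bin/kibana-keystore add elasticsearch.password --stdin
```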

3.4. Copy certificates to Kibana

3.5. Append the Kibana config file

3.6. Start Kibana

To send data to Elasticsearch, we need Beats. They function as data shippers for Elasticsearch and Logstash. I will avoid describing the details, since they are best found in the documentation.

3.7. Filebeat
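
A minimal, illustrative Filebeat setup; it assumes the Elastic APT repository was added in section 3.2 and that ELASTIC_LB_IP holds the internal load balancer address:

```bash
apt-get install -y filebeat

# Point Filebeat at Elasticsearch (via the internal LB) and at the local Kibana
cat > /etc/filebeat/filebeat.yml <<EOF
filebeat.config.modules:
  path: \${path.config}/modules.d/*.yml
setup.kibana:
  host: "localhost:5601"
output.elasticsearch:
  hosts: ["http://${ELASTIC_LB_IP}:9200"]
  username: "elastic"
  password: "${ELASTIC_PW}"
EOF

# Ship syslog/auth logs and load the bundled dashboards
filebeat modules enable system
filebeat setup
systemctl enable filebeat
```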

3.8. Start Filebeat

3.9. Metricbeat

3.10. Heartbeat

3.11. APM

3.12. Logstash

Phew, that was a lot of configuration!

Showtime, let’s deploy!

Now we have all configurations in our repository and have deployed Elasticsearch with Terraform without exposing the passwords or certificates. Plus, you can easily add more plugins and configurations as you need them.

Note that everything is deployed within its own VPC network with no external IP addresses. That’s great, but how do we access the Kibana web server securely through the browser?

3.13. Access Kibana

This part we are going to do manually in the Google Console, but it could also be automated with Terraform.

We reserve a static IP address and create an HTTPS load balancer. The backend configuration points to our Kibana instance group, while the frontend configuration is set to our external static IP address and port 443. Then we create a certificate and assign it to the load balancer. The certificate can be created directly in the console while setting up the load balancer.

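If you prefer the command line over the console, the same steps map roughly to these gcloud commands (all names and the domain are placeholders):

```bash
# Reserve the external IP and describe the Kibana backend
gcloud compute addresses create kibana-ip --global
gcloud compute health-checks create http kibana-hc --port 5601
gcloud compute backend-services create kibana-backend \
  --protocol=HTTP --port-name=kibana --health-checks=kibana-hc --global
gcloud compute backend-services add-backend kibana-backend \
  --instance-group=my-kibana-group --instance-group-zone=europe-west1-d --global

# HTTPS frontend: managed certificate, proxy and forwarding rule on port 443
gcloud compute url-maps create kibana-lb --default-service=kibana-backend
gcloud compute ssl-certificates create kibana-cert --domains=kibana.example.com --global
gcloud compute target-https-proxies create kibana-proxy \
  --url-map=kibana-lb --ssl-certificates=kibana-cert
gcloud compute forwarding-rules create kibana-https \
  --address=kibana-ip --global --target-https-proxy=kibana-proxy --ports=443
```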

So now Kibana can be accessed on the IP you’ve reserved. But so can anyone else… Which leads us to the next point:

Enter IAP (Identity-Aware Proxy)

IAP lets us manage who has access to our load balancer by setting up the OAuth client and giving users the IAP-secured Web App User permission.

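Granting that permission to a user is a single IAM binding (project and user below are placeholders):

```bash
# "IAP-secured Web App User" corresponds to roles/iap.httpsResourceAccessor
gcloud projects add-iam-policy-binding my-gcp-project \
  --member="user:jane.doe@example.com" \
  --role="roles/iap.httpsResourceAccessor"
```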

Now only the users with this permission in the GCP project can access your IP. On top of that, they have to enter the password that we passed into Kibana.

Brilliant! Now we can Terraform up a completely new environment running Elasticsearch, Kibana, and Logstash with a single command, in just a few minutes!

What an ELK deployment in one command feels like

Thanks to Anders Akerberg for his knowledge and support.

Translated from: https://towardsdatascience.com/automate-elasticsearch-deployment-in-gcp-part-1-terraform-3f51b4fcf5e6
