Terraform 是一个 IT 基础架构自动化编排工具,它的口号是 "Write, Plan, and create Infrastructure as Code", 基础架构即代码。Terraform 几乎可以支持所有市面上能见到的云服务。
Terraform 要解决的就是在云上那些硬件资源分配管理的问题。相比较 Chef, Puppet, Ansible 这些软件配置工具,Terraform 提供的是软件配置之前,软硬件(基础)资源构建的问题。
当我们创建资源时,使用 terraform 比 ansible 好在哪里?
就创建资源这个角度来说,terraform 和 ansible 都能完成,terraform能够并发,效率高很多,另外它在资源生产成功之后会在本地以一个state文件的形式记录整个资源的详细信息,而这些信息的记录使得整个模板所定义的资源可以保证前后端的高度一致性,可以有利于后续对于整个一套资源的有效的版本控制。同时Terraform拥有一个Data Source功能,利用这个功能可以实现对于已有资源的获取,比如在生产资源之前想要查看当前有哪些可用区,有哪些可用镜像等,所有的这些都可以通过DataSource实现。
Terraform 和 Ansible 的结合
新建目录,并生成配置文件,比如 azure.tf
# Configure the provider
provider "azurerm" {
version = "=1.20.0"
}
# Create a new resource group
resource "azurerm_resource_group" "rg" {
name = "royTR"
location = "eastasia"
}
配置有两部分:provider 和 resource。provider 告知与哪一个云平台打交道,这里是Azure;如果使用AWS,这里就写成 provider "aws"。第二部分是资源,说明要生成哪些资源,例子中是resource group,还可以继续往下写,比如网卡,存储,虚拟机等。
格式:resource resource_type resource_name { }
A resource block has two string parameters before opening the block: the resource type (first parameter) and the resource name (second parameter). The combination of the type and name must be unique in the configuration.
我已经通过Azure CLI 登陆过,所以上面provider 部分没有提供用户验证信息,如果单独配置,使用如下形式:
# Configure the Microsoft Azure Provider
provider "azurerm" {
# More information on the authentication methods supported by
# the AzureRM Provider can be found here:
# http://terraform.io/docs/providers/azurerm/index.html
subscription_id = "..."
client_id = "..."
client_secret = "..."
tenant_id = "..."
}
这些信息怎么获取? 可以用Azure CLI 的命令生成:
az ad sp create-for-rbac --role="Contributor" --scopes="/subscriptions/${SUBSCRIPTION_ID}"
详细信息参考微软文档
初始化
在初始化项目的时候,Terraform 会解析目录下的*.tf文件并加载相关的 provider插件。
$ terraform init
Initializing provider plugins...
- Checking for available provider plugins on https://releases.hashicorp.com...
- Downloading plugin for provider "azurerm" (1.20.0)...
Terraform has been successfully initialized!
You may now begin working with Terraform. Try running "terraform plan" to see
any changes that are required for your infrastructure. All Terraform commands
should now work.
If you ever set or change modules or backend configuration for Terraform,
rerun this command to reinitialize your working directory. If you forget, other
commands will detect it and remind you to do so if necessary.
apply changes
This output shows the execution plan, describing which actions Terraform will take in order to change real infrastructure to match the configuration.
$ terraform apply .
An execution plan has been generated and is shown below.
Resource actions are indicated with the following symbols:
+ create
Terraform will perform the following actions:
+ azurerm_resource_group.rg
id: <computed>
location: "eastasia"
name: "royTR"
tags.%: <computed>
Plan: 1 to add, 0 to change, 0 to destroy.
Do you want to perform these actions?
Terraform will perform the actions described above.
Only 'yes' will be accepted to approve.
Enter a value: yes # 查看 execution plan 符合期望,输入 yes 确认,之后真正执行。
azurerm_resource_group.rg: Creating...
location: "" => "eastasia"
name: "" => "royTR"
tags.%: "" => "<computed>"
azurerm_resource_group.rg: Creation complete after 1s (ID: /subscriptions/7c91db0e-eb7f-491b-997f-32cf55b85dea/resourceGroups/royTR)
Apply complete! Resources: 1 added, 0 changed, 0 destroyed.
$ terraform state show
id = /subscriptions/7c91db0e-eb7f-491b-997f-32cf55b85dea/resourceGroups/royTR
location = eastasia
name = royTR
tags.% = 0
更多
$ terraform state list
module.roy-azure.azurerm_availability_set.hdp-avset
module.roy-azure.azurerm_network_interface.bastion-nic
...
$ terraform state show module.roy-azure.azurerm_virtual_machine.hdp-slave[1]
...
location = japaneast
name = roy-tf0-hdp-slave-02
...
$ terraform state show module.roy-azure.azurerm_network_interface.hdp[0]
...
ip_configuration.0.load_balancer_backend_address_pools_ids.# = 0
ip_configuration.0.load_balancer_inbound_nat_rules_ids.# = 0
ip_configuration.0.name = hdp-01-ip-conf
....
private_ip_address = 10.0.10.8
...
改配置
修改刚才的文件,添加tag部分。
# Configure the provider
provider "azurerm" {
version = "=1.20.0"
}
# Create a new resource group
resource "azurerm_resource_group" "rg" {
name = "royTR"
location = "eastasia"
tags {
environment = "TF sandbox"
}
}
apply changes
An execution plan has been generated and is shown below.
Resource actions are indicated with the following symbols:
~ update in-place
Terraform will perform the following actions:
~ azurerm_resource_group.rg
tags.%: "0" => "1"
tags.environment: "" => "TF sandbox"
Plan: 0 to add, 1 to change, 0 to destroy.
terraform destroy
$ terraform destroy
azurerm_resource_group.rg: Refreshing state... (ID: /subscriptions/xxxx/resourceGroups/royTR-rg)
An execution plan has been generated and is shown below.
Resource actions are indicated with the following symbols:
- destroy
Terraform will perform the following actions:
- azurerm_resource_group.rg
Plan: 0 to add, 0 to change, 1 to destroy.
Do you really want to destroy all resources?
Terraform will destroy all your managed infrastructure, as shown above.
There is no undo. Only 'yes' will be accepted to confirm.
Enter a value: yes
azurerm_resource_group.rg: Destroying... (ID: /subscriptions/xxxxx/resourceGroups/royTR-rg)
azurerm_resource_group.rg: Still destroying... (ID: /subscriptions/xxxx/resourceGroups/royTR-rg, 10s elapsed)
azurerm_resource_group.rg: Still destroying... (ID: /subscriptions/xxxxx/resourceGroups/royTR-rg, 20s elapsed)
azurerm_resource_group.rg: Still destroying... (ID: /subscriptions/xxxxx/resourceGroups/royTR-rg, 30s elapsed)
azurerm_resource_group.rg: Still destroying... (ID: /subscriptions/xxxxx/resourceGroups/royTR-rg, 40s elapsed)
azurerm_resource_group.rg: Destruction complete after 48s
Destroy complete! Resources: 1 destroyed.
单独删除一个资源:
$ terraform destroy -target=module.roy-azure.azurerm_virtual_machine.hdp[2]
...
An execution plan has been generated and is shown below.
Resource actions are indicated with the following symbols:
- destroy
Terraform will perform the following actions:
- module.roy-azure.azurerm_virtual_machine.hdp[2]
Plan: 0 to add, 0 to change, 1 to destroy.
Do you really want to destroy all resources?
....
Destroy complete! Resources: 1 destroyed.
要创建一个VM,需要一些资源已经具备,这些资源可能包括:
先来一个简单的例子,创建网络:
# Create virtual network
resource "azurerm_virtual_network" "vnet" {
name = "royTFVnet"
address_space = ["10.0.0.0/16"]
location = "${azurerm_resource_group.rg.location}"
resource_group_name = "${azurerm_resource_group.rg.name}"
}
在location
等部分引入了插值(interpolation),它已经在前面的资源定义,之后直接调用,格式是 TYPE.NAME.ATTRIBUTE.
Azure 网络和虚拟机的基础架构如下图所示:
azure_vm_structure.png
把上面的图,变成代码,创建VM需要的整个文件:
# Configure the provider
provider "azurerm" {
version = "=1.20.0"
}
# Create a new resource group
resource "azurerm_resource_group" "rg" {
name = "royTR"
location = "eastasia"
tags {
environment = "TF sandbox"
}
}
# Create virtual network
resource "azurerm_virtual_network" "vnet" {
name = "royTFVnet"
address_space = ["10.0.0.0/16"]
location = "${azurerm_resource_group.rg.location}"
resource_group_name = "${azurerm_resource_group.rg.name}"
}
# Create subnet
resource "azurerm_subnet" "subnet" {
name = "royTFSubnet"
resource_group_name = "${azurerm_resource_group.rg.name}"
virtual_network_name = "${azurerm_virtual_network.vnet.name}"
address_prefix = "10.0.1.0/24"
#address_prefix = "${cidrsubnet(var.cluster_cidr, 8, 10)}"
}
# Create public IP
resource "azurerm_public_ip" "publicip" {
name = "myTFPublicIP"
location = "${azurerm_resource_group.rg.location}"
resource_group_name = "${azurerm_resource_group.rg.name}"
public_ip_address_allocation = "dynamic"
}
# Create Network Security Group and rule
resource "azurerm_network_security_group" "nsg" {
name = "myTFNSG"
location = "${azurerm_resource_group.rg.location}"
resource_group_name = "${azurerm_resource_group.rg.name}"
security_rule {
name = "SSH"
priority = 1001
direction = "Inbound"
access = "Allow"
protocol = "Tcp"
source_port_range = "*"
destination_port_range = "22"
source_address_prefix = "*"
destination_address_prefix = "*"
}
}
# Create network interface
resource "azurerm_network_interface" "nic" {
name = "myNIC"
location = "${azurerm_resource_group.rg.location}"
resource_group_name = "${azurerm_resource_group.rg.name}"
network_security_group_id = "${azurerm_network_security_group.nsg.id}"
ip_configuration {
name = "myNICConfg"
subnet_id = "${azurerm_subnet.subnet.id}"
private_ip_address_allocation = "dynamic"
public_ip_address_id = "${azurerm_public_ip.publicip.id}"
}
}
# Create a Linux virtual machine
resource "azurerm_virtual_machine" "vm" {
name = "royTFVM"
location = "${azurerm_resource_group.rg.location}"
resource_group_name = "${azurerm_resource_group.rg.name}"
network_interface_ids = ["${azurerm_network_interface.nic.id}"]
vm_size = "Standard_DS1_v2"
storage_os_disk {
name = "myOsDisk"
caching = "ReadWrite"
create_option = "FromImage"
managed_disk_type = "Premium_LRS"
}
storage_image_reference {
publisher = "Canonical"
offer = "UbuntuServer"
sku = "16.04.0-LTS"
version = "latest"
}
os_profile {
computer_name = "royvm"
admin_username = "royzeng"
}
os_profile_linux_config {
disable_password_authentication = true
ssh_keys {
path = "/home/royzeng/.ssh/authorized_keys"
key_data = "ssh-rsa AAAAB3Nz{snip}hwhqT9h"
}
}
}
Provisioners 通常用来引导一个资源,在销毁资源前完成清理工作,进行配置管理等。
Provisioners拥有多种类型可以满足多种需求,如:文件传输(file),本地执行(local-exec),远程执行(remote-exec)等 Provisioners可以添加在任何的resource当中:
# Create a Linux virtual machine
resource "azurerm_virtual_machine" "vm" {
<...snip...>
provisioner "file" {
connection {
type = "ssh"
user = "royzeng"
private_key = "${file("~/.ssh/id_rsa")}"
}
source = "newfile.txt"
destination = "newfile.txt"
}
provisioner "remote-exec" {
connection {
type = "ssh"
user = "royzeng"
private_key = "${file("~/.ssh/id_rsa")}"
}
inline = [
"ls -a",
"cat newfile.txt"
]
}
}
上面的方式适合有public ip,能够直接连接的机器,对于不能直接连接的vm,通过跳板来实现。
官方的方法,定义 bastion_host
resource "null_resource" "connect_private" {
connection {
bastion_host = "${aws_instance.bastion.public_ip}"
host = "${aws_instance.private.private_ip}"
user = "ubuntu"
private_key = "${file("~/.ssh/id_rsa")}"
}
provisioner "remote-exec" {
inline = ["echo 'CONNECTED to PRIVATE!'"]
}
}
或者
resource "azurerm_virtual_machine" "vm" {
<...snip...>
provisioner "remote-exec" {
connection {
bastion_host= "${azurerm_public_ip.bastion.ip_address}"
type = "ssh"
user = "${var.admin-username}"
private_key = "${file("~/.ssh/id_rsa")}"
}
inline = [
"sudo parted /dev/disk/azure/scsi1/lun0 mklabel msdos",
"sudo parted /dev/disk/azure/scsi1/lun0 mkpart primary 1 100%",
"sudo partprobe",
"sleep 5; sudo mkfs.xfs /dev/disk/azure/scsi1/lun0-part1",
"sudo mkdir /roytest",
"sudo mount /dev/disk/azure/scsi1/lun0-part1 ${var.mount_path[0]}",
"echo 'UUID='`sudo blkid -s UUID -o value $(readlink -f /dev/disk/azure/scsi1/lun0-part1)` ${var.mount_path[0]} 'xfs defaults 0 0' | sudo tee -a /etc/fstab",
"df -hl | grep /dev/sd"
]
}
}
另一种方法,用local-exec 来跳转
provisioner "local-exec" {
## 简化方式
command = "ssh -o "ProxyCommand ssh -q -W %h:%p -i mykey jump_server” -C 'echo hello'"
## 真实环境用的方式
command = <<EOF
sleep 30; ansible-playbook -i '${element(azurerm_network_interface.master_bind.*.private_ip_address, count.index)},' ${local.ansible_ssh_args} ${var.ansible_path}/mount_disk.yml --extra-vars '{
"root_user": "centos",
"deviceName": "/dev/disk/azure/scsi1/lun0",
"mountPath": "${var.mount_path}",
"bind_zone_name": "${var.bind_zone_name}"
}'
EOF
}
为了让ansible 脚本单独运行,而不需要创建或销毁资源,可以用 null_resource 调用 provisioner 来实现。
resource "null_resource" "datanode" {
count = "${var.count.datanode}"
triggers {
instance_ids = "${element(aws_instance.datanode.*.id, count.index)}"
}
provisioner "remote-exec" {
inline = [
...
]
connection {
type = "ssh"
user = "centos"
host = "${element(aws_instance.datanode.*.private_ip, count.index)}"
}
}
}
# file variables.tf
---
variable "prefix" {
default = "royTF"
}
variable "location" { }
variable "tags" {
type = "map"
default = {
Environment = "royDemo"
Dept = "Engineering"
}
}
文件中 location
部分没有定义,运行terraform的时候,会提示输入:
$ terraform plan -out royplan
var.location
Enter a value: eastasia
<...snip...>
This plan was saved to: royplan
To perform exactly these actions, run the following command to apply:
terraform apply "royplan"
命令行输入
$ terraform apply \
>> -var 'prefix=tf' \
>> -var 'location=eastasia'
文件输入
$ terraform apply \
-var-file='secret.tfvars'
默认读取文件 terraform.tfvars
,这个文件不需要单独指定。
环境变量输入
TF_VAR_name
,比如 TF_VAR_location
对于 list 变量
# 定义 list 变量
variable "image-RHEL" {
type = "list"
default = ["RedHat", "RHEL", "7.5", "latest"]
}
# 调用 list 变量
storage_image_reference {
publisher = "${var.image-RHEL[0]}"
offer = "${var.image-RHEL[1]}"
sku = "${var.image-RHEL[2]}"
version = "${var.image-RHEL[3]}"
}
map 是一个可以被查询的表。
variable "sku" {
type = "map"
default = {
westus = "16.04-LTS"
eastus = "18.04-LTS"
}
}
查询方式(使用 lookup)
storage_image_reference {
publisher = "Canonical"
offer = "UbuntuServer"
sku = "${lookup(var.sku, var.location)}"
version = "latest"
}
定义输出
output "ip" {
value = "${azurerm_public_ip.publicip.ip_address}"
}
测试
$ terraform apply
...
Apply complete! Resources: 0 added, 0 changed, 0 destroyed.
Outputs:
ip = 52.184.97.1
$ terraform output ip
52.184.97.1
Bug? 第一次运行,ip 输出是空的,terraform output ip
命令的结果也是空的,过一段时间才能看到结果。
$ terraform output -module=roy-azure
bastion-private-ip = 10.0.1.4
bastion-public-ip = 40.115.243.72
cluster_cidr = 10.0.0.0/16
cluster_location = japaneast
cluster_prefix = roy-tf0
cluster_resource_group = roy-tf0-rg
hdp-master-ip = 10.0.10.4,10.0.10.6,10.0.10.7
hdp-master-name = roy-tf0-hdp-master-01,roy-tf0-hdp-master-02,roy-tf0-hdp-master-03
hdp-slave-ip = 10.0.10.5,10.0.10.9,10.0.10.8
hdp-slave-name = roy-tf0-hdp-slave-01,roy-tf0-hdp-slave-02,roy-tf0-hdp-slave-03
k8s-master-ip = 10.0.20.8,10.0.20.5
k8s-master-name = roy-tf0-k8s-master-01,roy-tf0-k8s-master-02
k8s-slave-ip = 10.0.20.6,10.0.20.7,10.0.20.4
k8s-slave-name = roy-tf0-k8s-slave-01,roy-tf0-k8s-slave-02,roy-tf0-k8s-slave-03
virtual_network = roy-tf0-vnet
DataSource 的作用可以通过输入一个资源的变量名,然后获得这个变量的其他属性字段。
用 Azure网络 来举例,提供一些信息,查询其它的属性。具体必须提供什么,能查到什么,参考这个链接。
data "azurerm_virtual_network" "test" {
name = "production"
resource_group_name = "networking"
}
output "virtual_network_id" {
value = "${data.azurerm_virtual_network.test.id}"
}
output "virtual_network_subnet" {
value = "${data.azurerm_virtual_network.test.subnets[0]}"
}
ansible通过主机列表来连接目标主机,我们就要想办法让terraform来生成。(local-exec 也是一种方式,这是另一种思路:用terraform 调用ansible)
terraform 生成inventory的思路是:从模板到文件,需要先用template_file 渲染成一个字符串,然后用local_file把这个字符串输出到一个文件。
模版文件
## file inventory.tpl
[backend]
${bastion_private_ip}
[frontend]
${bastion_pub_ip}
[all:vars]
ansible_ssh_private_key_file = ${key_path}
ansible_ssh_user = dcpuser
渲染和输出
## file inventory.tf
data "template_file" "inventory" {
template = "${file("./test/inventory.tpl")}"
vars {
bastion_private_ip = "${element(azurerm_network_interface.bastion-nic.*.private_ip_address, count.index)}"
bastion_pub_ip = "${element(azurerm_public_ip.bastion.*.ip_address, count.index)}"
key_path = "~/.ssh/id_rsa"
}
}
resource "local_file" "save_inventory" {
content = "${data.template_file.inventory.rendered}"
filename = "./myhost"
}
运行后,当前目录生成文件myhost
[backend]
13.78.94.242
[frontend]
10.0.1.4
[all:vars]
ansible_ssh_private_key_file = ~/.ssh/id_rsa
ansible_ssh_user = dcpuser
对于多个主机,使用 join
来把它们合在一起。
File inventory.tf
data "template_file" "k8s" {
template = "${file("./templates/k8s.tpl")}"
vars {
k8s_master_name = "${join("\n", azurerm_virtual_machine.k8s-master.*.name)}"
}
}
resource "local_file" "k8s_file" {
content = "${data.template_file.k8s.rendered}"
filename = "./inventory/k8s-host"
}
File k8s.tpl
[kube-master]
${k8s_master_name}
Final result
[kube-master]
k8s-master-01
k8s-master-02
k8s-master-03
Module 是 Terraform 为了管理单元化资源而设计的,是子节点,子资源,子架构模板的整合和抽象。将多种可以复用的资源定义为一个module,通过对 module 的管理简化模板的架构,降低模板管理的复杂度,这就是module的作用。
Terraform中的模块是以组的形式管理不同的Terraform配置。模块用于在Terraform中创建可重用组件,以及用于基本代码组织。每一个module都可以定义自己的input与output,方便代码进行模块化组织。
用模块,可以写更少的代码。比如用下面的代码,调用已有的module 创建vm。
# declare variables and defaults
variable "location" {}
variable "environment" {
default = "dev"
}
variable "vm_size" {
default = {
"dev" = "Standard_B2s"
"prod" = "Standard_D2s_v3"
}
}
# Use the network module to create a resource group, vnet and subnet
module "network" {
source = "Azure/network/azurerm"
version = "2.0.0"
location = "${var.location}"
resource_group_name = "roytest-rg"
address_space = "10.0.0.0/16"
subnet_names = ["mySubnet"]
subnet_prefixes = ["10.0.1.0/24"]
}
# Use the compute module to create the VM
module "compute" {
source = "Azure/compute/azurerm"
version = "1.2.0"
location = "${var.location}"
resource_group_name = "roytest-rg"
vnet_subnet_id = "${element(module.network.vnet_subnets, 0)}"
admin_username = "royzeng"
admin_password = "Password1234!"
remote_port = "22"
vm_os_simple = "UbuntuServer"
vm_size = "${lookup(var.vm_size, var.environment)}"
public_ip_dns = ["roydns"]
}
## file main.cf
module "roy-azure" {
source = "./test"
}
## file test/resource.tf
variable "cluster_prefix" {
type = "string"
}
variable "cluster_location" {
type = "string"
}
resource "azurerm_resource_group" "core" {
name = "${var.cluster_prefix}-rg"
location = "${var.cluster_location}"
}
https://docs.microsoft.com/en-us/azure/virtual-machines/linux/terraform-install-configure
https://learn.hashicorp.com/terraform/azure/install_az
https://www.terraform.io/docs/providers/azurerm/r/virtual_machine.html
Create a VM cluster with Terraform https://docs.microsoft.com/en-us/azure/terraform/terraform-create-vm-cluster-with-infrastructure
http://aukjan.vanbelkum.nl/2016/02/23/Ansible-inventory-from-Terraform/
Terraform Azure modules https://registry.terraform.io/browse?offset=9&provider=azurerm
How to use Ansible with Terraform https://alex.dzyoba.com/blog/terraform-ansible/