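Terraform support for Amazon OpenSearch Ingestion

Key points

In today's release, we introduce Terraform support for Amazon OpenSearch Ingestion. Terraform is an infrastructure as code (IaC) tool that helps you build, deploy, and manage cloud resources efficiently. OpenSearch Ingestion is a fully managed, serverless data collector that delivers real-time log, metric, and trace data to Amazon OpenSearch Service domains and Amazon OpenSearch Serverless collections. This post explains how to deploy an OpenSearch Ingestion pipeline with Terraform, using an HTTP source as input and an Amazon OpenSearch Service domain index as output.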

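Solution overview

The steps in this post use Terraform to deploy a publicly accessible OpenSearch Ingestion pipeline, along with the other resources the pipeline needs. We implement a Terraform-based version of the tutorial "Ingesting data into a domain using Amazon OpenSearch Ingestion".

We use Terraform to create the following resources: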

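| Resource | Description |
| --- | --- |
| Amazon OpenSearch domain | A publicly accessible domain |
| AWS Identity and Access Management (IAM) role | The role required by the OpenSearch Ingestion pipeline |
| Amazon CloudWatch log group | The group that stores the logs |
| OpenSearch Ingestion pipeline | The actual ingestion pipeline |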

The pipeline you create uses an HTTP source as input and an Amazon OpenSearch sink to save batches of events.

Prerequisites

To follow the steps in this post, you need:

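- An active AWS account.
- Terraform installed on your local machine. For more information, see Install Terraform.
- The IAM permissions required to create the AWS resources that Terraform manages.
- awscurl, for sending HTTPS requests from the command line with AWS SigV4 authentication. For installation instructions, see the GitHub repository.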
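Before continuing, it can help to confirm that the command-line tools from the list above are on your PATH. The following is a minimal sketch rather than part of the original walkthrough: `missing_tools` is a hypothetical helper name, and the sketch assumes the tools are invoked as `terraform` and `awscurl`.

```bash
# Print the name of every listed tool that cannot be found on PATH.
# `missing_tools` is a hypothetical helper used only for this sketch.
missing_tools() {
  for tool in "$@"; do
    # `command -v` succeeds only when the shell can resolve the command.
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Tools this walkthrough relies on (assumed command names).
missing=$(missing_tools terraform awscurl)

if [ -n "$missing" ]; then
  echo "Missing tools:"
  echo "$missing"
else
  echo "All required tools found."
fi
```

The check only reports what is missing and does not exit, so it is safe to paste into an interactive shell.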

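Create a directory

In Terraform, infrastructure is managed as code in what is called a project. A Terraform project contains various Terraform configuration files, such as main.tf, provider.tf, variables.tf, and output.tf. Create a directory on a machine that can connect to AWS services through the AWS Command Line Interface (AWS CLI):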

```bash
mkdir osis-pipeline-terraform-example
```

Change to the directory:

```bash
cd osis-pipeline-terraform-example
```

Create the Terraform configuration

Create a file to define the AWS resources:

```bash
touch main.tf
```

Enter the following configuration in main.tf and save the file:

```hcl
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.36"
    }
  }

  required_version = ">= 1.2.0"
}

provider "aws" {
  region = "eu-central-1"
}

data "aws_region" "current" {}
data "aws_caller_identity" "current" {}

locals {
  account_id = data.aws_caller_identity.current.account_id
}

# Endpoint to which clients send data once the pipeline is active.
output "ingest_endpoint_url" {
  value = tolist(aws_osis_pipeline.example.ingest_endpoint_urls)[0]
}

# Role that the OpenSearch Ingestion pipeline assumes.
resource "aws_iam_role" "example" {
  name = "exampleosisrole"
  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Action = "sts:AssumeRole"
        Effect = "Allow"
        Sid    = ""
        Principal = {
          Service = "osis-pipelines.amazonaws.com"
        }
      }
    ]
  })
}

resource "aws_opensearch_domain" "test" {
  domain_name    = "osi-example-domain"
  engine_version = "OpenSearch_2.7"

  cluster_config {
    instance_type = "r5.large.search"
  }

  encrypt_at_rest {
    enabled = true
  }

  domain_endpoint_options {
    enforce_https       = true
    tls_security_policy = "Policy-Min-TLS-1-2-2019-07"
  }

  node_to_node_encryption {
    enabled = true
  }

  ebs_options {
    ebs_enabled = true
    volume_size = 10
  }

  # Domain access policy that grants the pipeline role access.
  access_policies = <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "${aws_iam_role.example.arn}"
      },
      "Action": "es:*"
    }
  ]
}
EOF
}

resource "aws_iam_policy" "example" {
  name        = "osis_role_policy"
  description = "Policy for OSIS pipeline role"
  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Action   = ["es:DescribeDomain"]
        Effect   = "Allow"
        Resource = "arn:aws:es:${data.aws_region.current.name}:${local.account_id}:domain/*"
      },
      {
        Action = ["es:ESHttp*"]
        Effect = "Allow"
        # The domain name here must match domain_name in aws_opensearch_domain.test.
        Resource = "arn:aws:es:${data.aws_region.current.name}:${local.account_id}:domain/osi-example-domain/*"
      }
    ]
  })
}

resource "aws_iam_role_policy_attachment" "example" {
  role       = aws_iam_role.example.name
  policy_arn = aws_iam_policy.example.arn
}

resource "aws_cloudwatch_log_group" "example" {
  name              = "/aws/vendedlogs/OpenSearchIngestion/example-pipeline"
  retention_in_days = 365
  tags = {
    Name = "AWS Blog OSIS Pipeline Example"
  }
}

# The pipeline: HTTP source, date processor, OpenSearch sink.
resource "aws_osis_pipeline" "example" {
  pipeline_name               = "example-pipeline"
  pipeline_configuration_body = <<-EOT
    version: "2"
    example-pipeline:
      source:
        http:
          path: "/test_ingestion_path"
      processor:
        - date:
            from_time_received: true
            destination: "@timestamp"
      sink:
        - opensearch:
            hosts: ["https://${aws_opensearch_domain.test.endpoint}"]
            index: "application_logs"
            aws:
              sts_role_arn: "${aws_iam_role.example.arn}"
              region: "${data.aws_region.current.name}"
  EOT
  max_units                   = 1
  min_units                   = 1

  log_publishing_options {
    is_logging_enabled = true
    cloudwatch_log_destination {
      log_group = aws_cloudwatch_log_group.example.name
    }
  }

  tags = {
    Name = "AWS Blog OSIS Pipeline Example"
  }
}
```

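Create the resources

Initialize the directory:

```bash
terraform init
```

Review the plan to see which resources will be created:

```bash
terraform plan
```

Apply the configuration and answer yes to run the plan:

```bash
terraform apply
```

The whole process takes about 7 to 10 minutes.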

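Test the pipeline

After the resources are created, you should see the ingest_endpoint_url output displayed. Copy this value and export it as an environment variable:

```bash
export OSIS_PIPELINE_ENDPOINT_URL=<replace with the copied value>
```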
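As an alternative to copying the value by hand, you can read it from the Terraform state with `terraform output -raw`. The following is a minimal sketch, assuming you run it from the project directory after `terraform apply` has succeeded. `build_ingest_url` is a hypothetical helper, and the sketch assumes the output value is a bare host name without a scheme, as the awscurl command in this post implies.

```bash
# Build the full ingestion URL from the pipeline endpoint host and the
# HTTP source path configured in main.tf.
# `build_ingest_url` is a hypothetical helper name for this sketch.
build_ingest_url() {
  host=$1
  path=$2
  printf 'https://%s%s\n' "$host" "$path"
}

# `terraform output -raw` prints a single output value without quotes.
# Guarded so the snippet is harmless when Terraform or its state is absent.
if command -v terraform >/dev/null 2>&1; then
  OSIS_PIPELINE_ENDPOINT_URL=$(terraform output -raw ingest_endpoint_url 2>/dev/null) || OSIS_PIPELINE_ENDPOINT_URL=""
  export OSIS_PIPELINE_ENDPOINT_URL
fi

if [ -n "${OSIS_PIPELINE_ENDPOINT_URL:-}" ]; then
  build_ingest_url "$OSIS_PIPELINE_ENDPOINT_URL" "/test_ingestion_path"
else
  echo "Pipeline endpoint not available; run terraform apply first."
fi
```

Reading the value from state avoids copy-and-paste mistakes and keeps the variable in sync if the pipeline is recreated.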


Send a sample log with awscurl, using the AWS credentials profile appropriate for your account:

```bash
awscurl --service osis --region eu-central-1 \
  -X POST \
  -H "Content-Type: application/json" \
  -d '[{"time":"2014-08-11T11:40:13+00:00","remote_addr":"122.226.223.69","status":"404","request":"GET http://www.k2proxy.com//hello.html HTTP/1.1","http_user_agent":"Mozilla/4.0 (compatible; WOW64; SLCC2;)"}]' \
  https://$OSIS_PIPELINE_ENDPOINT_URL/test_ingestion_path
```

You should receive a 200 OK response.

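To verify that the data was ingested by the OpenSearch Ingestion pipeline and saved in OpenSearch, navigate to OpenSearch and get its domain endpoint. Replace <OPENSEARCH ENDPOINT URL> in the following snippet with the actual domain endpoint and run it:

```bash
awscurl --service es --region eu-central-1 -X GET "https://<OPENSEARCH ENDPOINT URL>/application_logs/_search" | json_pp
```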

You should see output similar to the following:

[Screenshot: search response for the application_logs index]
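If you only need to confirm that documents arrived, the OpenSearch `_count` API returns the number of documents in an index without the full list of hits. The following is a hedged sketch and not part of the original walkthrough: `OPENSEARCH_ENDPOINT` is a hypothetical environment variable that you would set to the domain endpoint host, and `count_url` is a hypothetical helper.

```bash
# Build the _count URL for an index on the domain.
# `count_url` is a hypothetical helper name for this sketch.
count_url() {
  printf 'https://%s/%s/_count\n' "$1" "$2"
}

# OPENSEARCH_ENDPOINT is an assumed variable; set it to your domain endpoint host.
if [ -n "${OPENSEARCH_ENDPOINT:-}" ] && command -v awscurl >/dev/null 2>&1; then
  awscurl --service es --region eu-central-1 -X GET \
    "$(count_url "$OPENSEARCH_ENDPOINT" application_logs)"
else
  echo "Set OPENSEARCH_ENDPOINT and install awscurl to run the count query."
fi
```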

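Clean up

To destroy the resources you created, run the following command and answer yes when prompted:

```bash
terraform destroy
```

The whole process takes about 30 to 35 minutes.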

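Conclusion

In this post, we showed how to deploy an OpenSearch Ingestion pipeline with Terraform. AWS offers various resources to help you quickly start building OpenSearch Ingestion pipelines and deploying them with Terraform. You can use the built-in pipeline integrations to quickly ingest data from sources such as Amazon DynamoDB, Amazon Managed Streaming for Apache Kafka (Amazon MSK), [Amazon Security Lake](https://aws.amazon.com/security-lake/), and Fluent Bit. The OpenSearch Ingestion blueprints let you build data pipelines with minimal configuration changes and manage them easily with Terraform. To learn more, see the Terraform documentation for Amazon OpenSearch Ingestion.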