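This tutorial series covers the main use cases of txtai, an AI-powered semantic search platform. Each chapter in the series has accompanying code and can also be run in Colab.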
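The txtai API is a web-based service backed by FastAPI. All txtai functionality, including similarity search, extractive QA, and zero-shot labeling, is available through the API.
This article installs the txtai API and shows an example using each of the language bindings that txtai supports.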
Install txtai and all dependencies. Since this article uses the API, we need to install the api extras package.
pip install txtai[api]
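The first method we'll try is direct access through Python. We'll use zero-shot labeling for all of the examples here. See this article for more details on zero-shot classification.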
import os

# display and HTML live in IPython.display; IPython.core.display is deprecated for this use
from IPython.display import display, HTML

from txtai.pipeline import Labels

def table(rows):
    # Render (text, label) rows as a styled HTML table in the notebook
    html = """
    <style type='text/css'>
    @import url('https://fonts.googleapis.com/css?family=Oswald&display=swap');
    table {
      border-collapse: collapse;
      width: 900px;
    }
    th, td {
        border: 1px solid #9e9e9e;
        padding: 10px;
        font: 20px Oswald;
    }
    </style>
    """

    html += "<table><thead><tr><th>Text</th><th>Label</th></tr></thead>"
    for text, label in rows:
        html += "<tr><td>%s</td><td>%s</td></tr>" % (text, label)
    html += "</table>"

    display(HTML(html))

# Create labels model
labels = Labels()

data = ["Wears a red suit and says ho ho",
        "Pulls a flying sleigh",
        "This is cut down and decorated",
        "Santa puts these under the tree",
        "Best way to spend the holidays"]

# List of labels
tags = ["Santa Clause", "Reindeer", "Cookies", "Christmas Tree", "Gifts", "Family"]

# Render output to table, using the index of the top-ranked label for each text
table([(text, tags[labels(text, tags)[0][0]]) for text in data])
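Text | Label |
---|---|
Wears a red suit and says ho ho | Santa Clause |
Pulls a flying sleigh | Reindeer |
This is cut down and decorated | Christmas Tree |
Santa puts these under the tree | Gifts |
Best way to spend the holidays | Family |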
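Once again we see the power of zero-shot labeling. The model wasn't trained on any data specific to this example. It's still remarkable how much knowledge is stored in large NLP models.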
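The indexing used in the Python example, `labels(text, tags)[0][0]`, suggests that the pipeline returns (tag index, score) pairs with the best match first. Here is a minimal sketch of that selection step. The `top_label` helper is hypothetical and the scores are made up for illustration, not real model output.

```python
def top_label(results, tags):
    """Return the (tag, score) pair with the highest score.

    results is assumed to be a list of (tag index, score) pairs,
    mirroring how the example above indexes the pipeline output.
    """
    index, score = max(results, key=lambda pair: pair[1])
    return tags[index], score


tags = ["Santa Clause", "Reindeer", "Cookies", "Christmas Tree", "Gifts", "Family"]

# Hypothetical scores for "Pulls a flying sleigh"; a real model will produce different values
results = [(1, 0.82), (0, 0.09), (4, 0.04), (5, 0.03), (3, 0.01), (2, 0.01)]

print(top_label(results, tags))
```

Taking the maximum by score instead of relying on position keeps the helper correct even if the results arrive unsorted.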
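Now we'll start an API instance to run the remaining examples. The API needs a configuration file to run. The example below is simplified to include only labeling. See this link for a more detailed configuration example.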
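Here is a minimal sketch of producing such a configuration file from Python. It assumes that a bare `labels:` key is enough to enable the labels pipeline with its default settings; that is an assumption to check against the txtai configuration documentation. The file name `index.yml` matches the `CONFIG` value used when the API is started, and `write_config` is a hypothetical helper.

```python
from pathlib import Path

# Assumed minimal configuration: a bare "labels:" key with no options
CONFIG_TEXT = "# Labels settings\nlabels:\n"


def write_config(path="index.yml"):
    """Write the assumed labels-only configuration and return its path."""
    target = Path(path)
    target.write_text(CONFIG_TEXT, encoding="utf-8")
    return target
```

Calling `write_config()` in the working directory creates `index.yml` there.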
The API instance is started in the background.
CONFIG=index.yml nohup uvicorn "txtai.api:app" &> api.log & sleep 90
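The fixed `sleep 90` is a guess at how long startup takes. As a sketch, the Python below polls the port until it accepts connections and then builds a labeling request. It assumes the API listens on `localhost:8000`, as the binding examples do, and exposes a `POST /label` endpoint that takes a JSON body with `text` and `labels` fields. The endpoint shape is an assumption and may differ between txtai versions, and all helper names are hypothetical.

```python
import json
import socket
import time
import urllib.request


def wait_for_port(host, port, timeout=90.0, interval=0.5):
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Server not accepting connections yet; retry after a short pause
            time.sleep(interval)
    return False


def build_label_request(base_url, text, tags):
    """Build (but do not send) a POST request for the assumed /label endpoint."""
    body = json.dumps({"text": text, "labels": tags}).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + "/label",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def label(base_url, text, tags):
    """Send the request and decode the JSON response."""
    request = build_label_request(base_url, text, tags)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running, `wait_for_port("localhost", 8000)` followed by `label("http://localhost:8000", "Pulls a flying sleigh", tags)` would send a single request. Judging by how the binding examples read `label[0].id`, the response is probably a list of objects with an `id` field.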
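txtai.js is available via NPM and can be installed as follows.
npm install txtai
For this example, we'll clone the txtai.js project to import the example build configuration.
git clone https://github.com/neuml/txtai.js
The following file is the JavaScript version of the labels example.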
import {Labels} from "txtai";
import {sprintf} from "sprintf-js";

const run = async () => {
    try {
        let labels = new Labels("http://localhost:8000");

        let data = ["Wears a red suit and says ho ho",
                    "Pulls a flying sleigh",
                    "This is cut down and decorated",
                    "Santa puts these under the tree",
                    "Best way to spend the holidays"];

        // List of labels
        let tags = ["Santa Clause", "Reindeer", "Cookies", "Christmas Tree", "Gifts", "Family"];

        console.log(sprintf("%-40s %s", "Text", "Label"));
        console.log("-".repeat(75));

        for (let text of data) {
            let label = await labels.label(text, tags);
            label = tags[label[0].id];

            console.log(sprintf("%-40s %s", text, label));
        }
    }
    catch (e) {
        console.trace(e);
    }
};

run();
cd txtai.js/examples/node
npm install
npm run build
node dist/labels.js
Text                                     Label
---------------------------------------------------------------------------
Wears a red suit and says ho ho          Santa Clause
Pulls a flying sleigh                    Reindeer
This is cut down and decorated           Christmas Tree
Santa puts these under the tree          Gifts
Best way to spend the holidays           Family
The JavaScript program shows the same results as the native Python run!
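The binding examples all print a 40-character left-justified text column followed by the label, under a 75-character rule. To compare console output against the Python run line by line, the same layout could be produced in Python. This is a small sketch, and `format_table` is a hypothetical helper rather than part of txtai.

```python
def format_table(rows, width=40, rule=75):
    """Format (text, label) rows in the fixed-width layout used by the binding examples."""
    lines = [f"{'Text':<{width}} Label", "-" * rule]
    for text, label in rows:
        # Left-justify the text to the column width, then a space, then the label
        lines.append(f"{text:<{width}} {label}")
    return "\n".join(lines)


print(format_table([("Pulls a flying sleigh", "Reindeer")]))
```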
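txtai.java integrates with standard Java build tools (Gradle, Maven, SBT). The following shows how to add txtai as a dependency in Gradle.
implementation 'com.github.neuml:txtai.java:v2.0.0'
For this example, we'll clone the txtai.java project to import the example build configuration.
git clone https://github.com/neuml/txtai.java
The following file is the Java version of the labels example.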
import java.util.Arrays;
import java.util.ArrayList;
import java.util.List;

import txtai.API.IndexResult;
import txtai.Labels;

public class LabelsDemo {
    public static void main(String[] args) {
        try {
            Labels labels = new Labels("http://localhost:8000");

            List<String> data =
                Arrays.asList("Wears a red suit and says ho ho",
                              "Pulls a flying sleigh",
                              "This is cut down and decorated",
                              "Santa puts these under the tree",
                              "Best way to spend the holidays");

            // List of labels
            List<String> tags = Arrays.asList("Santa Clause", "Reindeer", "Cookies", "Christmas Tree", "Gifts", "Family");

            System.out.printf("%-40s %s%n", "Text", "Label");
            System.out.println(new String(new char[75]).replace("\0", "-"));

            for (String text: data) {
                List<IndexResult> label = labels.label(text, tags);
                System.out.printf("%-40s %s%n", text, tags.get(label.get(0).id));
            }
        }
        catch (Exception ex) {
            ex.printStackTrace();
        }
    }
}
cd txtai.java/examples
../gradlew -q --console=plain labels 2> /dev/null
Text                                     Label
---------------------------------------------------------------------------
Wears a red suit and says ho ho          Santa Clause
Pulls a flying sleigh                    Reindeer
This is cut down and decorated           Christmas Tree
Santa puts these under the tree          Gifts
Best way to spend the holidays           Family
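The Java program shows the same results as the native Python run!
txtai.rs is available via crates.io and can be installed by adding the following to your Cargo.toml file.
[dependencies]
txtai = { version = "2.0" }
tokio = { version = "0.2", features = ["full"] }
For this example, we'll clone the txtai.rs project to import the example build configuration. First we need to install Rust.
apt-get install rustc
git clone https://github.com/neuml/txtai.rs
The following file is the Rust version of the labels example.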
use std::error::Error;

use txtai::labels::Labels;

pub async fn labels() -> Result<(), Box<dyn Error>> {
    let labels = Labels::new("http://localhost:8000");

    let data = ["Wears a red suit and says ho ho",
                "Pulls a flying sleigh",
                "This is cut down and decorated",
                "Santa puts these under the tree",
                "Best way to spend the holidays"];

    println!("{:<40} {}", "Text", "Label");
    println!("{}", "-".repeat(75));

    for text in data.iter() {
        // List of labels
        let tags = vec!["Santa Clause", "Reindeer", "Cookies", "Christmas Tree", "Gifts", "Family"];
        let label = labels.label(text, &tags).await?[0].id;

        println!("{:<40} {}", text, tags[label]);
    }

    Ok(())
}
cd txtai.rs/examples/demo
cargo build
cargo run labels
Text                                     Label
--------------------------------------------------------------------------------
Wears a red suit and says ho ho          Santa Clause
Pulls a flying sleigh                    Reindeer
This is cut down and decorated           Christmas Tree
Santa puts these under the tree          Gifts
Best way to spend the holidays           Family
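The Rust program shows the same results as the native Python run!
txtai.go can be installed by adding the following import statement. When using modules, txtai.go is installed automatically. Otherwise, use go get.
import "github.com/neuml/txtai.go"
For this example, we'll create a standalone labels program. First we need to install Go.
apt install golang-go
go get "github.com/neuml/txtai.go"
The following file is the Go version of the labels example.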
package main

import (
	"fmt"
	"strings"

	"github.com/neuml/txtai.go"
)

func main() {
	labels := txtai.Labels("http://localhost:8000")

	data := []string{"Wears a red suit and says ho ho",
		"Pulls a flying sleigh",
		"This is cut down and decorated",
		"Santa puts these under the tree",
		"Best way to spend the holidays"}

	// List of labels
	tags := []string{"Santa Clause", "Reindeer", "Cookies", "Christmas Tree", "Gifts", "Family"}

	fmt.Printf("%-40s %s\n", "Text", "Label")
	fmt.Println(strings.Repeat("-", 75))

	for _, text := range data {
		label := labels.Label(text, tags)
		fmt.Printf("%-40s %s\n", text, tags[label[0].Id])
	}
}
go run labels.go
Text                                     Label
--------------------------------------------------------------------------------
Wears a red suit and says ho ho          Santa Clause
Pulls a flying sleigh                    Reindeer
This is cut down and decorated           Christmas Tree
Santa puts these under the tree          Gifts
Best way to spend the holidays           Family
https://dev.to/neuml/tutorial-series-on-txtai-ibg