How to Build a Real-Time Fullstack Search Engine with React, Express, and Postgres

钱跃
2023-12-01

In this tutorial we will set up a full stack search engine with React as the front end, Node and Express for the server, and PostgreSQL for the database.

This search engine will be slightly more complex than a simple text search setup. For example, a search will match plural forms of words as well as their past and present tenses. A search for "cats" will also return results for "cat", a search for "walked" will return results for "walk", and so on.

Instead of starting from scratch, we can use a simple starter project:

https://github.com/iqbal125/react-hooks-complete-fullstack

You can watch a fullstack video version of this tutorial here: https://www.youtube.com/playlist?list=PLMc67XEAt-yzxRboCFHza4SBOxNr7hDD5

PostgreSQL's tsvector and tsquery

To accomplish this more complex search functionality we will use PostgreSQL's built-in text search.

The two data types that make this possible are PostgreSQL's tsvector and tsquery.

tsvector: a list of lexemes. A lexeme is the normalized base form of a word, which lets different variations of that word be treated as one. For example, the text "walked" is converted and stored as the lexeme "walk", so searches for "walk", "walking" and "walked" all match it.

tsquery: a list of lexemes that is compared against tsvectors. A piece of search text is first converted to a tsquery and then compared with a tsvector to see if there is a match.
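To see the two types work together outside the app, a small query can compare one document against one search term. The sketch below is illustrative only: buildMatchQuery is a hypothetical helper, and pinning the 'english' text search configuration is an assumption (the tutorial's own code relies on the database default).

```javascript
// Hypothetical helper: builds a node-postgres query config that asks whether
// a document matches a search term once both are reduced to lexemes.
// The 'english' configuration is an assumption, not part of the tutorial code.
function buildMatchQuery(documentText, searchTerm) {
  return {
    text:
      "SELECT to_tsvector('english', $1::text) @@ " +
      "to_tsquery('english', $2::text) AS is_match",
    values: [String(documentText), String(searchTerm)],
  };
}

// Usage, assuming `pool` is an already configured pg Pool:
// pool.query(buildMatchQuery('walked', 'walking'))
//   .then((res) => console.log(res.rows[0].is_match));
```

With the English configuration both "walked" and "walking" reduce to the lexeme "walk", so this query should report a match.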

Vectorization works like this: when a user submits a post, the post together with its author is converted into a single tsvector and saved in one row.

Duplicates are also removed, and the base form of each word is used as the lexeme.

Real World Example

Say you submit a post with the title of "cats" and body of "fishes".

Searching for "cat" will return the post titled "cats".

This also works with non-standard pluralization: "fish" will return a result for "fishes".

This also applies to present and past tenses of words. Say we have a post containing "walking" and "acted":

Searching for "walk" will return the post containing "walking".

The same goes for "act" and "acted".

If you want to look under the hood, the lexemes look like this in the PSQL database.

The search vector column is 'cat':1 'fish':2 'test91':3. Notice that even though we submitted our post with the title "cats" and body "fishes", the words are converted into their root form.

This is essentially what allows for comparisons with other forms of the word and makes this complex searching possible.  

If that sounds good we can get started with the code setup.  

React setup

// posts.js

....
const handleSearch = (event) => {
  // Clear the previous results before running a new search
  setState({ posts_search: [] });
  const search_query = event.target.value;

  // Send the input text to the server as a query-string parameter
  axios.get('/api/get/searchpost', { params: { search_query } })
    .then((res) => {
      // Only update state when the server returned at least one match
      if (res.data.length !== 0) {
        setState({ posts_search: [...res.data] });
      }
    })
    .catch((error) => {
      console.log(error);
    });
};

....

    <TextField
      id="search"
      label="Search"
      margin="normal"
      onChange={handleSearch}
    />

 ...

We only really need two main parts on our front end to make this happen: the function that makes the API call to the server, and the input element that fires that function on every keystroke.

The handleSearch() function extracts the text from the input element and sends it as a parameter in an axios GET request.

Both pieces can easily be inserted into any React component.
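Because the input fires on every keystroke, it can help to decide up front whether a keystroke deserves a request at all. The following is a minimal sketch, not part of the original project: buildSearchRequest is a hypothetical helper that skips blank input and otherwise returns the URL and params that handleSearch() passes to axios.

```javascript
// Hypothetical helper: returns the request details for a search, or null
// when the input is blank and no round trip to the server is needed.
function buildSearchRequest(rawInput) {
  const search_query = String(rawInput).trim();
  if (search_query.length === 0) return null; // nothing to search for
  return {
    url: '/api/get/searchpost', // same route the tutorial's server exposes
    params: { search_query },
  };
}

// Usage inside handleSearch(), assuming axios is imported in the component:
// const request = buildSearchRequest(event.target.value);
// if (request) axios.get(request.url, { params: request.params });
```

Keeping this decision in a plain function makes it easy to unit test without rendering the component.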

That is really all there is to the React setup. The real magic happens on the server and database side.

Database Setup

Here is the SQL schema for the posts. Notice that we have a single column, search_vector, of data type TSVECTOR. We don't have a TSQUERY column, since the query is not stored in our database; it is only used for the comparison.

CREATE TABLE posts (
  pid SERIAL PRIMARY KEY,
  title VARCHAR(255),
  body VARCHAR,
  search_vector TSVECTOR,
  user_id INT REFERENCES users(uid),
  author VARCHAR REFERENCES users(username),
  date_created TIMESTAMP,
  like_user_id INT[] DEFAULT ARRAY[]::INT[],
  likes INT DEFAULT 0
);

This search vector column will contain the lexemes for the title, body and author of the post, combined into a single tsvector. We can see how this is used in the server setup.

Server setup

// Search posts
router.get('/api/get/searchpost', (req, res, next) => {
  const search_query = String(req.query.search_query);
  // Convert the search text to a tsquery and match it against the stored tsvector
  pool.query(`SELECT * FROM posts
              WHERE search_vector @@ to_tsquery($1)`,
    [search_query], (q_err, q_res) => {
      if (q_err) return next(q_err);
      res.json(q_res.rows);
    });
});

// Save posts to db
router.post('/api/post/posttodb', (req, res, next) => {
  const body_vector = String(req.body.body);
  const title_vector = String(req.body.title);
  const username_vector = String(req.body.username);

  // Everything that should be searchable goes into one array
  const search_array = [title_vector,
                        body_vector,
                        username_vector];

  const values = [req.body.title,
                  req.body.body,
                  search_array,
                  req.body.uid,
                  req.body.username];

  // $3 is converted by to_tsvector() and stored in the search_vector column
  pool.query(`INSERT INTO
              posts(title, body, search_vector, user_id, author, date_created)
              VALUES($1, $2, to_tsvector($3), $4, $5, NOW())`,
    values, (q_err, q_res) => {
      if (q_err) return next(q_err);
      res.json(q_res.rows);
    });
});

The search engine works because of what we do at the time we save the posts, not at the time the search takes place.

You can see that in our second function we start by turning the title, body and author of our post into strings. Then we combine them in an array called search_array.

Then we use a simple SQL insert command to insert the entire post into the database. While we do this we also run the to_tsvector() function on our search_array.

to_tsvector() is a built-in PostgreSQL function. It is what turns our array into a tsvector and allows for searching later on.
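to_tsvector() operates on a single text document. As far as I know, node-postgres serializes a JavaScript array parameter into an array literal string before the function sees it, which works here because the punctuation is discarded during parsing. A more explicit alternative is to join the fields into one string first. The sketch below assumes that approach; buildSearchDocument is a hypothetical helper, not part of the original project.

```javascript
// Hypothetical helper: joins the searchable fields of a post into a single
// text document that can be passed to to_tsvector() as one parameter.
function buildSearchDocument(title, body, username) {
  return [title, body, username]
    .map((part) => String(part ?? '').trim()) // tolerate missing fields
    .filter((part) => part.length > 0)        // drop empty pieces
    .join(' ');
}

// Usage in the save route, replacing search_array in the values list:
// const search_document = buildSearchDocument(
//   req.body.title, req.body.body, req.body.username);
```

Passing one plain string keeps the stored lexemes independent of how the driver happens to format arrays.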

At this point searching becomes simple. We just get our text from the front end and convert it into a string.

Then we use the to_tsquery() function to turn it into a tsquery data type. We can then use the @@ operator to compare this tsquery with the search_vector column and see if there is a match.
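One caveat: to_tsquery() expects operators such as & between terms, so raw input like "cat fish" raises a tsquery syntax error. PostgreSQL's plainto_tsquery() accepts plain text and is one way around this. Another is to shape the input on the server first, as in the sketch below. toTsQueryString is a hypothetical helper, and the :* prefix-match suffix is an optional assumption meant to let partially typed words match while the user is still typing.

```javascript
// Hypothetical helper: turns free-form user input into a string that
// to_tsquery() can parse. Each term is stripped of punctuation, given a
// prefix-match suffix (:*), and the terms are combined with AND (&).
function toTsQueryString(rawInput) {
  const terms = String(rawInput)
    .split(/\s+/)
    .map((term) => term.replace(/[^\p{L}\p{N}]/gu, '')) // keep letters and digits
    .filter((term) => term.length > 0);
  return terms.map((term) => `${term}:*`).join(' & ');
}

// Usage in the search route, in place of the raw string:
// const search_query = toTsQueryString(req.query.search_query);
// if (search_query === '') return res.json([]); // nothing left to search for
```

Dropping the :* suffix restores whole-word matching if prefix matching returns too many results.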

If there is a match, we return the matching posts. They are sent back to our front end as a regular API response, which resolves the promise returned by axios.

Since a React app is a single-page app, the browser will not reload and the search will feel real time.

Thanks for Reading!

Connect with me on Twitter for more updates on future tutorials: https://twitter.com/iqbal125sf

Translated from: https://www.freecodecamp.org/news/react-express-fullstack-search-engine-with-psql/
