FastAPI + React Full-Stack Development 10: MongoDB Aggregation Queries

news/2024/10/18 7:48:20/

Chapter02 Setting Up the Document Store with MongoDB

10 Aggregation framework

In the following pages, we will provide a brief introduction to the MongoDB aggregation framework: what it is, what benefits it offers, and why it is regarded as one of the strongest selling points of the MongoDB ecosystem.

Centered around the concept of a pipeline (something that you might be familiar with if you have done some analytics or if you have ever connected a few commands in Linux), the aggregation framework is, at its simplest, an alternative way to retrieve sets of documents from a collection. It is similar to the find method that we have already used extensively, with the additional benefit that the data can be processed in different stages or steps.

With the aggregation pipeline, we pull documents from a MongoDB collection and feed them sequentially through the stages of the pipeline: each stage's output becomes the next stage's input until the final set of documents is returned. Each stage performs some data-processing operation on the documents it receives, which can include modifying them, so the output documents often have a completely different structure.
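
The chaining of stages can be imitated in plain Python. The sketch below is only a toy model that runs on an in-memory list, not a description of how MongoDB works internally; the sample documents and the helper names (match, group_avg, sort_desc, limit) are made up for illustration.

```python
from collections import defaultdict

# Hypothetical sample documents; the field names mirror the ones used in this chapter.
CARS = [
    {"brand": "Fiat", "make": "Panda", "price": 4000},
    {"brand": "Fiat", "make": "Panda", "price": 6000},
    {"brand": "Fiat", "make": "Punto", "price": 3000},
    {"brand": "Opel", "make": "Corsa", "price": 5000},
]


def match(docs, field, value):
    """Keep only the documents whose field equals value (in the spirit of $match)."""
    return [d for d in docs if d.get(field) == value]


def group_avg(docs, key_field, value_field):
    """Group by key_field and average value_field (in the spirit of $group with $avg)."""
    buckets = defaultdict(list)
    for d in docs:
        buckets[d[key_field]].append(d[value_field])
    return [{"_id": key, "avgPrice": sum(vals) / len(vals)} for key, vals in buckets.items()]


def sort_desc(docs, field):
    """Order the documents by field, highest first (in the spirit of $sort with -1)."""
    return sorted(docs, key=lambda d: d[field], reverse=True)


def limit(docs, n):
    """Keep at most n documents (in the spirit of $limit)."""
    return docs[:n]


def run_toy_pipeline(docs):
    # Each stage receives the output of the previous one.
    stage1 = match(docs, "brand", "Fiat")
    stage2 = group_avg(stage1, "make", "price")
    stage3 = sort_desc(stage2, "avgPrice")
    return limit(stage3, 10)


result = run_toy_pipeline(CARS)
print(result)
```

Note how the documents that leave the grouping step no longer look like the car documents that entered it: they only carry the grouping key and the computed average.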

1. $match: Match only specific documents, e.g. those of a particular brand.

2. $project: Select existing fields or derive new ones, e.g. brand and model.

3. $group: Group according to a categorical feature, such as brand.

4. $sort: Sort in ascending or descending order by a field.

5. $limit: Limit the results to a predefined number.
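
As a rough illustration, the five stages listed above can be written down as a list of plain Python dictionaries, which is the shape a driver such as pymongo expects for a pipeline. The helper name build_pipeline and its parameters are hypothetical; the field names (brand, make, price) are the ones assumed throughout this chapter.

```python
def build_pipeline(brand, top_n):
    """Return a five-stage aggregation pipeline as a list of dictionaries."""
    return [
        {"$match": {"brand": brand}},                        # 1. keep one brand only
        {"$project": {"_id": 0, "make": 1, "price": 1}},     # 2. keep only the fields we need
        {"$group": {"_id": "$make", "avgPrice": {"$avg": "$price"}}},  # 3. average price per make
        {"$sort": {"avgPrice": -1}},                         # 4. highest average first
        {"$limit": top_n},                                   # 5. cut off after top_n groups
    ]


pipeline = build_pipeline("Fiat", 5)
# Each stage is a dictionary with exactly one key: the stage operator.
stage_names = [next(iter(stage)) for stage in pipeline]
print(stage_names)
```

The resulting list is what would be handed to a collection's aggregate method; nothing here contacts a database.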

The stages can perform operations such as matching (which includes only a subset of the entire collection), sorting, grouping, and projection. The MongoDB documentation site is the best place to start if you want to get acquainted with all the possibilities, but we want to start with a couple of simple examples.

The syntax for aggregation is similar to that of other methods: we use the aggregate method, which takes a list of stages as its parameter.

Probably the best aggregation to begin with is one that mimics the find method. Let's try to get all the Fiat cars in our collection as follows.

db.cars.aggregate([{$match: {brand: "Fiat"}}])

import pymongo

client = pymongo.MongoClient('mongodb://zhangdapeng:zhangdapeng520@192.168.234.130:27017/')
db = client["carsDB"]
cars = db["cars"]

query = [{"$match": {"brand": "Fiat"}}]  # a single stage: keep only the Fiat cars
r = cars.aggregate(query)
print(list(r))

This is probably the simplest possible aggregation. It consists of just one stage, the $match stage, which tells MongoDB that we only want the Fiats, so the output of the first stage is exactly that.
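
To make the correspondence with find more concrete, here is a minimal sketch of how the usual find options could be rewritten as pipeline stages. The function find_as_pipeline and its parameters are hypothetical helpers, not part of any driver, and the stage order chosen here (match, sort, limit, project) is just one reasonable arrangement.

```python
def find_as_pipeline(filter_doc, projection=None, sort=None, limit=None):
    """Translate find-style arguments into a list of aggregation stages."""
    pipeline = [{"$match": filter_doc}]
    if sort:
        # sort is a list of (field, direction) pairs, e.g. [("price", -1)]
        pipeline.append({"$sort": dict(sort)})
    if limit:
        pipeline.append({"$limit": limit})
    if projection:
        pipeline.append({"$project": projection})
    return pipeline


# The single-stage aggregation from the text:
simple = find_as_pipeline({"brand": "Fiat"})

# A richer query: the five cheapest Fiats, showing only make and price.
richer = find_as_pipeline(
    {"brand": "Fiat"},
    projection={"_id": 0, "make": 1, "price": 1},
    sort=[("price", 1)],
    limit=5,
)
print(simple)
print(richer)
```

The value of the aggregation framework shows up once stages such as $group are added, since those have no counterpart among the find options.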

Let's say that in the second stage we want to group our Fiat cars by model and then check the average price for every model. The second stage is a bit more complicated, but bear with us; it is not that hard. Run the following lines of code.

import pymongo

client = pymongo.MongoClient('mongodb://zhangdapeng:zhangdapeng520@192.168.234.130:27017/')
db = client["carsDB"]
cars = db["cars"]

query = [
    {"$match": {"brand": "Fiat"}},  # keep only the Fiat cars
    {"$group": {"_id": {"model": "$make"}, "avgPrice": {"$avg": "$price"}}},  # group by the make field and average the price
]
r = cars.aggregate(query)
print(list(r))

The second stage uses the $group stage, which requires an _id key that defines how the documents are grouped. The part {model: "$make"} is a bit counterintuitive, but it just gives MongoDB the following two important pieces of information:

  • model: Without quotes or the dollar sign, it is the key that will be used for the grouping, and in our case, it makes sense that it is called model. We can call it any way we want; it is the key that will indicate the field that we are doing the grouping by.
  • $make: It is actually required to be one of the fields present in the documents. In our case, it is called make and the dollar sign means that it is a field in the document. Other possibilities would be the year, the gearbox, and really any document field that has a categorical or ordinal meaning. The price wouldn’t make much sense.

The second argument in the group stage is the actual aggregation, as follows:

  • avgPrice: This is the chosen name for the quantity that we wish to map. In our case, it makes sense to call it avgPrice, but we can choose this variable’s name as we please.
  • $avg: This is one of the available aggregation functions such as average, count, sum, maximum, minimum, and so on. In this example, we could have used the minimum function instead of the average function in order to get the cheapest Fiat for every model.
  • $price: Like $make in the preceding part of the expression, this is a field belonging to the documents, and it should be numeric, since calculating the average or the minimum of a string doesn't make much sense.
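
The pieces described above can be assembled programmatically. The following sketch uses a hypothetical helper, group_stage, to show where each name ends up in the stage and how swapping the average for the minimum changes only one operator; it only builds dictionaries and does not talk to a database.

```python
def group_stage(key_name, key_field, out_name, operator, value_field):
    """Build a $group stage from its five moving parts.

    key_name    -- free label for the grouping key (e.g. "model")
    key_field   -- existing document field to group by (e.g. "make")
    out_name    -- free label for the computed quantity (e.g. "avgPrice")
    operator    -- accumulator name without the dollar sign (e.g. "avg", "min")
    value_field -- existing numeric field to aggregate (e.g. "price")
    """
    return {
        "$group": {
            "_id": {key_name: f"${key_field}"},
            out_name: {f"${operator}": f"${value_field}"},
        }
    }


# Average price per model, as in the text.
avg_stage = group_stage("model", "make", "avgPrice", "avg", "price")

# Cheapest price per model: only the accumulator and the output name change.
min_stage = group_stage("model", "make", "minPrice", "min", "price")

print(avg_stage)
print(min_stage)
```

The dollar sign is added only to the names that refer to existing fields or to operators; the labels we choose ourselves (model, avgPrice, minPrice) stay plain.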

Pipelines can also include data processing through the $project stage, a handy tool for creating entirely new fields, derived from existing document fields, that are then carried into the next stages.

We will provide just one more example to showcase the power of $project in a pipeline stage. Let's consider the following aggregation.

import pymongo

client = pymongo.MongoClient('mongodb://zhangdapeng:zhangdapeng520@192.168.234.130:27017/')
db = client["carsDB"]
cars = db["cars"]
query = [
    {"$match": {"brand": "Opel"}},  # keep only the Opel cars
    {"$project": {"_id": 0, "price": 1, "year": 1, "fullName": {"$concat": ["$make", " ", "$brand"]}}},  # keep price and year, derive fullName
    {"$group": {"_id": {"make": "$fullName"}, "avgPrice": {"$avg": "$price"}}},  # group by fullName and average the price
    {"$sort": {"avgPrice": -1}},  # sort by average price, descending
    {"$limit": 10},  # return at most 10 results
]
r = cars.aggregate(query)
print(list(r))

This might look intimidating at first, but it is mostly composed of elements that we have already seen. There is the $match stage (we select only the Opel cars), a $group stage that averages the price for each full name, a $sort stage that orders the groups by average price in descending order, and a $limit stage that cuts the result off at the 10 priciest groups. But the projection in the middle? It is just a way to craft new fields in a stage using existing ones; here, fullName is built by concatenating make and brand.
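
What the projection does to a single document can be mimicked in plain Python. This is only an illustration of the resulting shape under the assumption that all fields are present; the sample document and the function name project_full_name are made up.

```python
def project_full_name(doc):
    """Mimic the $project stage above: drop _id, keep price and year, derive fullName."""
    return {
        "price": doc["price"],
        "year": doc["year"],
        # Same order as in the $concat expression: make, a space, then brand.
        "fullName": doc["make"] + " " + doc["brand"],
    }


# Hypothetical input document.
sample = {"_id": 1, "brand": "Opel", "make": "Corsa", "price": 5000, "year": 2015}
projected = project_full_name(sample)
print(projected)
```

The later stages only ever see the projected shape, which is why the $group stage can refer to $fullName even though no document in the collection has such a field.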


http://www.ppmy.cn/news/1401943.html
