Avoiding Row-by-Row Processing

news/2024/12/22 15:09:08/
Set-based processing and row-by-row processing are not mutually exclusive. Some rules do call for row-by-row processing, but these rules are the exceptions, and you can have a row-by-row component within a mostly set-based program.


For example, suppose your program contains five rules that you will run against your data. Four of those rules lend themselves well to a set-based approach, while the fifth requires a row-by-row process. In this situation, run the four set-based rules first, and then run the row-by-row step last to resolve the exceptions. Although this is not pure set-based processing, you obtain better performance than if the entire program used a row-by-row approach.


When performing a row-by-row update, reduce the number of rows and the number of columns that you select to an absolute minimum to decrease the data transfer time.


For logic that cannot be coded entirely in a set-based manner, try to process most of the transactions as a set and process only the exceptions in a row-by-row loop. A good example of an exception is the sequence numbering of detail lines within a transaction when most transactions have only a single detail line. You can set the sequence number on all the detail lines to 1 by default in an initial set-based operation, and then run a Select statement to retrieve only the exceptions (duplicates) and update their sequence numbers to 2, 3, and so on.
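
The sequence-numbering case can be sketched as follows. This is a minimal illustration in Python with the standard sqlite3 module, not a definitive implementation; the detail_line table and its columns (line_id, txn_id, seq_num) are hypothetical names chosen for the example.

```python
import sqlite3

def number_detail_lines(conn):
    """Default every line to 1 in one statement, then loop over exceptions only."""
    # Set-based step: every detail line gets sequence number 1.
    conn.execute("UPDATE detail_line SET seq_num = 1")

    # Retrieve only the exceptions: transactions with more than one detail line.
    exceptions = conn.execute(
        "SELECT txn_id FROM detail_line GROUP BY txn_id HAVING COUNT(*) > 1"
    ).fetchall()

    # Row-by-row step, limited to the exception rows and to the one column needed.
    for (txn_id,) in exceptions:
        lines = conn.execute(
            "SELECT line_id FROM detail_line WHERE txn_id = ? ORDER BY line_id",
            (txn_id,),
        ).fetchall()
        for seq, (line_id,) in enumerate(lines, start=1):
            if seq > 1:  # the first line already holds the default value 1
                conn.execute(
                    "UPDATE detail_line SET seq_num = ? WHERE line_id = ?",
                    (seq, line_id),
                )
    conn.commit()

# Hypothetical sample data: only transaction 101 has several detail lines.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE detail_line ("
    "line_id INTEGER PRIMARY KEY, txn_id INTEGER, seq_num INTEGER)"
)
conn.executemany(
    "INSERT INTO detail_line (line_id, txn_id) VALUES (?, ?)",
    [(1, 100), (2, 101), (3, 101), (4, 101), (5, 102)],
)
number_detail_lines(conn)
result = conn.execute(
    "SELECT line_id, seq_num FROM detail_line ORDER BY line_id"
).fetchall()
```

With this sample data the loop touches only the two extra lines of transaction 101; every other row is handled by the single set-based Update.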


Avoid the tendency to expand row-by-row processing beyond what is necessary. For example, if you are touching all of the rows in a table in a specific row-based process, you do not necessarily gain efficiency by running the rest of your logic on that table in a row-based manner.


When updating a table, you can add another column to be set in the Update statement. However, do not add another SQL statement to your loop simply because your program is looping. If you can apply that SQL in a set-based manner, then in most cases you achieve better performance with a set-based SQL statement outside the loop.
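
A rough sketch of this guidance, using Python's standard sqlite3 module, is shown here. The pay_line table, its columns, and the row_rule function are hypothetical; row_rule merely stands in for logic that is assumed to need a row-by-row step.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE pay_line (line_id INTEGER PRIMARY KEY, hours REAL, rate REAL, "
    "amount REAL, status TEXT, currency TEXT)"
)
conn.executemany(
    "INSERT INTO pay_line (line_id, hours, rate) VALUES (?, ?, ?)",
    [(1, 8.0, 20.0), (2, 4.0, 25.0), (3, 10.0, 15.0)],
)

def row_rule(hours, rate):
    # Hypothetical placeholder for a rule that must run one row at a time.
    return round(hours * rate, 2)

# Row-by-row loop: a single Update per row. The extra column (status) is
# added to the same statement instead of being set by a second statement.
rows = conn.execute("SELECT line_id, hours, rate FROM pay_line").fetchall()
for line_id, hours, rate in rows:
    conn.execute(
        "UPDATE pay_line SET amount = ?, status = 'CALC' WHERE line_id = ?",
        (row_rule(hours, rate), line_id),
    )

# This rule does not depend on the loop, so it runs once, outside the loop,
# as a set-based statement.
conn.execute("UPDATE pay_line SET currency = 'USD' WHERE currency IS NULL")
conn.commit()

final = conn.execute(
    "SELECT line_id, amount, status, currency FROM pay_line ORDER BY line_id"
).fetchall()
```

The loop issues three statements for three rows; placing the currency Update inside the loop would have doubled that count without changing the result.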


The rest of this section describes techniques for avoiding row-by-row processing and enhancing performance.


Filtering

Using SQL, filter the set to contain only those rows that are affected or meet the criteria and then run the rule on them. Use a Where clause to minimize the number of rows to reflect only the set of affected rows.
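
As an illustration, the sketch below applies an overtime rule only to the rows that qualify. It uses Python's standard sqlite3 module; the employee table, its columns, and the 40-hour threshold are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, pay_type TEXT, "
    "hours REAL, overtime_hours REAL DEFAULT 0)"
)
conn.executemany(
    "INSERT INTO employee (emp_id, pay_type, hours) VALUES (?, ?, ?)",
    [(1, "H", 45.0), (2, "H", 38.0), (3, "S", 50.0), (4, "H", 52.5)],
)

# The Where clause restricts the set to hourly employees above 40 hours,
# so the rule runs only against the affected rows.
cursor = conn.execute(
    "UPDATE employee SET overtime_hours = hours - 40 "
    "WHERE pay_type = 'H' AND hours > 40"
)
affected = cursor.rowcount  # number of rows the statement changed
conn.commit()

overtime = conn.execute(
    "SELECT emp_id, overtime_hours FROM employee ORDER BY emp_id"
).fetchall()
```

Only employees 1 and 4 are updated; the other rows are never touched by the rule.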


Two-Pass Approach

Use a two-pass approach, wherein the first pass runs a rule on all of the rows and the second pass resolves any rows that are exceptions to the rule. For instance, bypass exceptions to the rule during the first pass, and then address the exceptions individually in a row-by-row manner.


Parallel Processes

Divide sets into distinct groups and then run the appropriate rules or logic against each set in parallel processes. For example, you could split an employee data population into distinct sets of hourly and salary employees, and then you could run the appropriate logic for each set in parallel.
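
The hourly and salaried split can be sketched as follows. This is only an illustration: Python threads from concurrent.futures stand in for the separate database processes you would run in practice, and the employee records, field names, and pay formulas are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical employee population.
employees = [
    {"emp_id": 1, "pay_type": "H", "hours": 40.0, "rate": 20.0},
    {"emp_id": 2, "pay_type": "S", "annual_salary": 52000.0},
    {"emp_id": 3, "pay_type": "H", "hours": 30.0, "rate": 18.0},
    {"emp_id": 4, "pay_type": "S", "annual_salary": 78000.0},
]

def pay_hourly(group):
    # Logic that applies only to the hourly set.
    return {e["emp_id"]: e["hours"] * e["rate"] for e in group}

def pay_salaried(group):
    # Logic that applies only to the salaried set (weekly pay assumed).
    return {e["emp_id"]: e["annual_salary"] / 52 for e in group}

# Divide the population into two distinct, non-overlapping sets.
hourly = [e for e in employees if e["pay_type"] == "H"]
salaried = [e for e in employees if e["pay_type"] == "S"]

# Run the appropriate logic for each set concurrently.
with ThreadPoolExecutor(max_workers=2) as pool:
    hourly_future = pool.submit(pay_hourly, hourly)
    salaried_future = pool.submit(pay_salaried, salaried)
    gross_pay = {**hourly_future.result(), **salaried_future.result()}
```

Because the two sets share no rows, the workers never contend for the same data, which is what makes the split safe to run in parallel.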


Flat Temporary Tables

Flatten your temporary tables. The best temporary tables are denormalized and follow a flat file model for improved transaction processing.


For example, payroll control data might be keyed by setID and effective dates rather than by business unit and accounting date. Use the temporary table to denormalize the data and switch the keys to business unit and accounting date. Afterwards, you can construct a straight join to the Time Clock table and key it by business unit and date.
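
One possible shape for this payroll example is sketched below with Python's standard sqlite3 module. All table and column names are hypothetical, including the bu_setid table that maps a business unit to its setID, and the effective-date rule (latest row on or before the accounting date) is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Control data keyed by setID and effective date (hypothetical layout).
    CREATE TABLE pay_control (setid TEXT, effdt TEXT, overtime_factor REAL);
    -- Hypothetical mapping from business unit to the setID it uses.
    CREATE TABLE bu_setid (business_unit TEXT, setid TEXT);
    CREATE TABLE time_clock (business_unit TEXT, accounting_dt TEXT,
                             emp_id INTEGER, overtime_hours REAL);

    INSERT INTO pay_control VALUES ('SHARE', '2024-01-01', 1.5),
                                   ('SHARE', '2024-07-01', 2.0);
    INSERT INTO bu_setid VALUES ('US001', 'SHARE');
    INSERT INTO time_clock VALUES ('US001', '2024-03-15', 1, 2.0),
                                  ('US001', '2024-08-01', 1, 3.0);
""")

# Flatten once: resolve setID and effective date into a temporary table
# keyed by business unit and accounting date.
conn.execute("""
    CREATE TEMP TABLE pay_control_flat AS
    SELECT d.business_unit, d.accounting_dt, c.overtime_factor
    FROM (SELECT DISTINCT business_unit, accounting_dt FROM time_clock) AS d
    JOIN bu_setid AS m ON m.business_unit = d.business_unit
    JOIN pay_control AS c ON c.setid = m.setid
    WHERE c.effdt = (SELECT MAX(c2.effdt)
                     FROM pay_control AS c2
                     WHERE c2.setid = c.setid
                       AND c2.effdt <= d.accounting_dt)
""")

# Straight join to the time clock data on business unit and date.
weighted = conn.execute("""
    SELECT t.emp_id, t.accounting_dt, t.overtime_hours * f.overtime_factor
    FROM time_clock AS t
    JOIN pay_control_flat AS f
      ON f.business_unit = t.business_unit
     AND f.accounting_dt = t.accounting_dt
    ORDER BY t.accounting_dt
""").fetchall()
```

The effective-date lookup runs once while the temporary table is built; later statements join on two plain key columns.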

Techniques to Avoid

Note that:


  • If you have a series of identical temporary tables, examine your refinement process.
  • Do not attempt to accomplish a task that your database platform does not support, such as complex mathematics, non-standard SQL, or complex analytical modeling.

Use standard SQL for set processing.


  • Although subqueries are a useful tool for refining your set, make sure that you are not using the same subquery multiple times.

If you are using the same subquery in more than one statement, you should probably have denormalized the query results into a temporary table. Identify the subqueries that appear frequently and, if possible, denormalize the queried data into a temporary table.
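
A small sketch of the repeated-subquery case follows, again using Python's standard sqlite3 module. The employee, timesheet, and bonus tables and the active_ops temporary table are hypothetical names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, dept TEXT, status TEXT);
    CREATE TABLE timesheet (emp_id INTEGER, approved INTEGER DEFAULT 0);
    CREATE TABLE bonus (emp_id INTEGER, payable INTEGER DEFAULT 0);
    INSERT INTO employee VALUES (1, 'OPS', 'A'), (2, 'OPS', 'I'), (3, 'FIN', 'A');
    INSERT INTO timesheet (emp_id) VALUES (1), (2), (3);
    INSERT INTO bonus (emp_id) VALUES (1), (2), (3);
""")

# The same filter would otherwise appear as a subquery in both Updates.
# Evaluate it once and keep the result in a temporary table.
conn.execute("""
    CREATE TEMP TABLE active_ops AS
    SELECT emp_id FROM employee WHERE dept = 'OPS' AND status = 'A'
""")

# Both statements now read the small temporary table.
conn.execute(
    "UPDATE timesheet SET approved = 1 "
    "WHERE emp_id IN (SELECT emp_id FROM active_ops)"
)
conn.execute(
    "UPDATE bonus SET payable = 1 "
    "WHERE emp_id IN (SELECT emp_id FROM active_ops)"
)
conn.commit()

approved = conn.execute("SELECT emp_id FROM timesheet WHERE approved = 1").fetchall()
payable = conn.execute("SELECT emp_id FROM bonus WHERE payable = 1").fetchall()
```

The filter against the employee table is evaluated once rather than once per statement, and any later statement can reuse active_ops in the same way.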




