pg 提升子链接源码分析

news/2024/12/2 22:35:52/

pg 上拉子链接源码分析

此分析基于 version 13.8, 在pg 中把子查询分为两类,一类是from 中的子查询; 一类是where 中的子查询,叫做子链接(sublink)。

下面用sublink(子链接)指代where中的子查询, subquery(子查询) 指代from 中的子查询。

1. pull_up_sublinks

void pull_up_sublinks(PlannerInfo *root)
  • 被 subquery_planner 调用;也可能被pull_up_simple_subquery 调用,用于在 pull_up_subqueries 时处理子查询中的sublink。
  • 上拉子链接的入口
  • 调用 pull_up_sublinks_jointree_recurse 来递归的上拉子链接, 传入parse->jointree,返回处理后的jointree
  • 设置 root->parse->jointree 为处理后的 jointree (不为 FromExpr节点类型 则构建为FromExpr节点)

1.1 pull_up_sublinks_jointree_recurse

/** Recurse through jointree nodes for pull_up_sublinks()** In addition to returning the possibly-modified jointree node, we return* a relids set of the contained rels into *relids.*/
static Node *
pull_up_sublinks_jointree_recurse(PlannerInfo *root, Node *jtnode,Relids *relids)
{if (jtnode == NULL){*relids = NULL;}else if (IsA(jtnode, RangeTblRef)){int			varno = ((RangeTblRef *) jtnode)->rtindex;*relids = bms_make_singleton(varno);/* jtnode is returned unmodified */}else if (IsA(jtnode, FromExpr)){FromExpr   *f = (FromExpr *) jtnode;List	   *newfromlist = NIL;Relids		frelids = NULL;FromExpr   *newf;Node	   *jtlink;ListCell   *l;/* First, recurse to process children and collect their relids */foreach(l, f->fromlist){Node	   *newchild;Relids		childrelids;newchild = pull_up_sublinks_jointree_recurse(root,lfirst(l),&childrelids);newfromlist = lappend(newfromlist, newchild);frelids = bms_join(frelids, childrelids);}/* Build the replacement FromExpr; no quals yet */newf = makeFromExpr(newfromlist, NULL);/* Set up a link representing the rebuilt jointree */jtlink = (Node *) newf;/* Now process qual --- all children are available for use */newf->quals = pull_up_sublinks_qual_recurse(root, f->quals,&jtlink, frelids,NULL, NULL);/** Note that the result will be either newf, or a stack of JoinExprs* with newf at the base.  We rely on subsequent optimization steps to* flatten this and rearrange the joins as needed.** Although we could include the pulled-up subqueries in the returned* relids, there's no need since upper quals couldn't refer to their* outputs anyway.*/*relids = frelids;jtnode = jtlink;}else if (IsA(jtnode, JoinExpr)){JoinExpr   *j;Relids		leftrelids;Relids		rightrelids;Node	   *jtlink;/** Make a modifiable copy of join node, but don't bother copying its* subnodes (yet).*/j = (JoinExpr *) palloc(sizeof(JoinExpr));memcpy(j, jtnode, sizeof(JoinExpr));jtlink = (Node *) j;/* Recurse to process children and collect their relids */j->larg = pull_up_sublinks_jointree_recurse(root, j->larg,&leftrelids);j->rarg = pull_up_sublinks_jointree_recurse(root, j->rarg,&rightrelids);/** Now process qual, showing appropriate child relids as available,* and attach any pulled-up jointree items at the right place. In the* inner-join case we put new JoinExprs above the existing one (much* as for a FromExpr-style join).  In outer-join cases the new* JoinExprs must go into the nullable side of the outer join. The* point of the available_rels machinations is to ensure that we only* pull up quals for which that's okay.** We don't expect to see any pre-existing JOIN_SEMI or JOIN_ANTI* nodes here.*/switch (j->jointype){case JOIN_INNER:j->quals = pull_up_sublinks_qual_recurse(root, j->quals,&jtlink,bms_union(leftrelids,rightrelids),NULL, NULL);break;case JOIN_LEFT:j->quals = pull_up_sublinks_qual_recurse(root, j->quals,&j->rarg,rightrelids,NULL, NULL);break;case JOIN_FULL:/* can't do anything with full-join quals */break;case JOIN_RIGHT:j->quals = pull_up_sublinks_qual_recurse(root, j->quals,&j->larg,leftrelids,NULL, NULL);break;default:elog(ERROR, "unrecognized join type: %d",(int) j->jointype);break;}/** Although we could include the pulled-up subqueries in the returned* relids, there's no need since upper quals couldn't refer to their* outputs anyway.  But we *do* need to include the join's own rtindex* because we haven't yet collapsed join alias variables, so upper* levels would mistakenly think they couldn't use references to this* join.*/*relids = bms_join(leftrelids, rightrelids);if (j->rtindex)*relids = bms_add_member(*relids, j->rtindex);jtnode = jtlink;}elseelog(ERROR, "unrecognized node type: %d",(int) nodeTag(jtnode));return jtnode;
}
  • 根据节点类型进行处理,返回修改后的新 jtnode 及 以此jtnode作为root涉及的 relids。
  • 第一次调用,节点类型必为 FromExpr。
  • 节点类型为 RangeTblRef – 叶子节点
    • 获取relids(bitmap), 返回
    • 可能为subquery 表, 这里不处理子查询中的sublink。 在调用pull_up_subqueries (subquery_planner)时处理
  • 节点类型为 FromExpr f=(FromExpr *) jtnode
    • 遍历 f->fromlist, 对于fromlist 中的每个成员递归调用 pull_up_sublinks_jointree_recurse处理(一直处理到叶子节点 RangeTblRef 才返回), 返回处理后的成员。根据处理后的结构构建 newfromlist 和 frelids (涉及的关系集合)
    • 根据 newfromlist 创建新的 FromExpr(newf)。
    • 调用 pull_up_sublinks_qual_recurse处理约束 f->quals(包含子链接,即实际在pull_up_sublinks_qual_recurse 中处理子链接) , 返回新的quals,设置到 newf->quals。
    • *relids = frelids; 返回涉及的relids
    • jtnode = jtlink; 返回新的jtnode
  • 节点类型为 JoinExpr j=jtnode actually copy
    • 对于join的左右子树(j->larg,j->rarg)递归调用 pull_up_sublinks_jointree_recurse处理, 并赋为新值。
    • 根据 j->jointype 处理 join 的关联条件(约束),都调用 pull_up_sublinks_qual_recurse, 传参不同
      • JOIN_INNER 内连接:传入 jtlink, bms_union(leftrelids,rightrelids)
      • JOIN_LEFT 左连接:传入 j->rarg,rightrelids, 只处理右表(nullable)相关子链接,并上拉子链接到rarg
      • JOIN_FULL 全连接, 不处理
      • JOIN_RIGHT 右连接:传入 j->larg,leftrelids, 只处理左表(nullable)相关子链接, 并上拉子链接到larg
    • 构造 relids(leftrelids, rightrelids,j->rtindex)返回
    • 返回新jtnode

1.2 pull_up_sublinks_qual_recurse

/** Recurse through top-level qual nodes for pull_up_sublinks()** jtlink1 points to the link in the jointree where any new JoinExprs should* be inserted if they reference available_rels1 (i.e., available_rels1* denotes the relations present underneath jtlink1).  Optionally, jtlink2 can* point to a second link where new JoinExprs should be inserted if they* reference available_rels2 (pass NULL for both those arguments if not used).* Note that SubLinks referencing both sets of variables cannot be optimized.* If we find multiple pull-up-able SubLinks, they'll get stacked onto jtlink1* and/or jtlink2 in the order we encounter them.  We rely on subsequent* optimization to rearrange the stack if appropriate.** Returns the replacement qual node, or NULL if the qual should be removed.*/
static Node *
pull_up_sublinks_qual_recurse(PlannerInfo *root, Node *node,Node **jtlink1, Relids available_rels1,Node **jtlink2, Relids available_rels2)
{if (node == NULL)return NULL;if (IsA(node, SubLink)){SubLink    *sublink = (SubLink *) node;JoinExpr   *j;Relids		child_rels;/* Is it a convertible ANY or EXISTS clause? */if (sublink->subLinkType == ANY_SUBLINK){if ((j = convert_ANY_sublink_to_join(root, sublink,available_rels1)) != NULL){/* Yes; insert the new join node into the join tree */j->larg = *jtlink1;*jtlink1 = (Node *) j;/* Recursively process pulled-up jointree nodes */j->rarg = pull_up_sublinks_jointree_recurse(root,j->rarg,&child_rels);/** Now recursively process the pulled-up quals.  Any inserted* joins can get stacked onto either j->larg or j->rarg,* depending on which rels they reference.*/j->quals = pull_up_sublinks_qual_recurse(root,j->quals,&j->larg,available_rels1,&j->rarg,child_rels);/* Return NULL representing constant TRUE */return NULL;}if (available_rels2 != NULL &&(j = convert_ANY_sublink_to_join(root, sublink,available_rels2)) != NULL){/* Yes; insert the new join node into the join tree */j->larg = *jtlink2;*jtlink2 = (Node *) j;/* Recursively process pulled-up jointree nodes */j->rarg = pull_up_sublinks_jointree_recurse(root,j->rarg,&child_rels);/** Now recursively process the pulled-up quals.  Any inserted* joins can get stacked onto either j->larg or j->rarg,* depending on which rels they reference.*/j->quals = pull_up_sublinks_qual_recurse(root,j->quals,&j->larg,available_rels2,&j->rarg,child_rels);/* Return NULL representing constant TRUE */return NULL;}}else if (sublink->subLinkType == EXISTS_SUBLINK){if ((j = convert_EXISTS_sublink_to_join(root, sublink, false,available_rels1)) != NULL){/* Yes; insert the new join node into the join tree */j->larg = *jtlink1;*jtlink1 = (Node *) j;/* Recursively process pulled-up jointree nodes */j->rarg = pull_up_sublinks_jointree_recurse(root,j->rarg,&child_rels);/** Now recursively process the pulled-up quals.  Any inserted* joins can get stacked onto either j->larg or j->rarg,* depending on which rels they reference.*/j->quals = pull_up_sublinks_qual_recurse(root,j->quals,&j->larg,available_rels1,&j->rarg,child_rels);/* Return NULL representing constant TRUE */return NULL;}if (available_rels2 != NULL &&(j = convert_EXISTS_sublink_to_join(root, sublink, false,available_rels2)) != NULL){/* Yes; insert the new join node into the join tree */j->larg = *jtlink2;*jtlink2 = (Node *) j;/* Recursively process pulled-up jointree nodes */j->rarg = pull_up_sublinks_jointree_recurse(root,j->rarg,&child_rels);/** Now recursively process the pulled-up quals.  Any inserted* joins can get stacked onto either j->larg or j->rarg,* depending on which rels they reference.*/j->quals = pull_up_sublinks_qual_recurse(root,j->quals,&j->larg,available_rels2,&j->rarg,child_rels);/* Return NULL representing constant TRUE */return NULL;}}/* Else return it unmodified */return node;}if (is_notclause(node)){/* If the immediate argument of NOT is EXISTS, try to convert */SubLink    *sublink = (SubLink *) get_notclausearg((Expr *) node);JoinExpr   *j;Relids		child_rels;if (sublink && IsA(sublink, SubLink)){if (sublink->subLinkType == EXISTS_SUBLINK){if ((j = convert_EXISTS_sublink_to_join(root, sublink, true,available_rels1)) != NULL){/* Yes; insert the new join node into the join tree */j->larg = *jtlink1;*jtlink1 = (Node *) j;/* Recursively process pulled-up jointree nodes */j->rarg = pull_up_sublinks_jointree_recurse(root,j->rarg,&child_rels);/** Now recursively process the pulled-up quals.  Because* we are underneath a NOT, we can't pull up sublinks that* reference the left-hand stuff, but it's still okay to* pull up sublinks referencing j->rarg.*/j->quals = pull_up_sublinks_qual_recurse(root,j->quals,&j->rarg,child_rels,NULL, NULL);/* Return NULL representing constant TRUE */return NULL;}if (available_rels2 != NULL &&(j = convert_EXISTS_sublink_to_join(root, sublink, true,available_rels2)) != NULL){/* Yes; insert the new join node into the join tree */j->larg = *jtlink2;*jtlink2 = (Node *) j;/* Recursively process pulled-up jointree nodes */j->rarg = pull_up_sublinks_jointree_recurse(root,j->rarg,&child_rels);/** Now recursively process the pulled-up quals.  Because* we are underneath a NOT, we can't pull up sublinks that* reference the left-hand stuff, but it's still okay to* pull up sublinks referencing j->rarg.*/j->quals = pull_up_sublinks_qual_recurse(root,j->quals,&j->rarg,child_rels,NULL, NULL);/* Return NULL representing constant TRUE */return NULL;}}}/* Else return it unmodified */return node;}if (is_andclause(node)){/* Recurse into AND clause */List	   *newclauses = NIL;ListCell   *l;foreach(l, ((BoolExpr *) node)->args){Node	   *oldclause = (Node *) lfirst(l);Node	   *newclause;newclause = pull_up_sublinks_qual_recurse(root,oldclause,jtlink1,available_rels1,jtlink2,available_rels2);if (newclause)newclauses = lappend(newclauses, newclause);}/* We might have got back fewer clauses than we started with */if (newclauses == NIL)return NULL;else if (list_length(newclauses) == 1)return (Node *) linitial(newclauses);elsereturn (Node *) make_andclause(newclauses);}/* Stop if not an AND */return node;
}
  • 分为三块来处理:1. sublink 2. not clause ; 3. and clause

  • 节点类型为 SubLink pull up any/some/in/exists

    • 只对 ANY_SUBLINK(包括any, some, in) 和 EXISTS_SUBLINK 类型进行处理

    • sublink->subLinkType 类型为 ANY_SUBLINK,则依次对available_rels1 和 available_rels2 调用 convert_ANY_sublink_to_join,若成功则构造 JoinExpr并返回。此时把 sublink 作为一个subquery给 pull up 为 join 的右表, 左表为入参的 jtlink1(fromexpr or JoinExpr)

    • 新JoinExpr的构造(以available_rels1为例):

      • convert_ANY_sublink_to_join 中会创建新的 JoinExpr, 并设置rarg 为sublink转化的subquery, quals设置为对于的约束。
      • 把 jtlink1 作为 larg , 并复制给jtlink1(替换jtlink1)。然后继续对上拉的j->rarg 调用pull_up_sublinks_jointree_recurse,尝试上拉rarg中的子链接。最后对上拉的quals,尝试继续上拉其中的表,与larg 或rarg 关联,构成新的join关系。
    • 若 sublink->subLinkType 为 EXISTS_SUBLINK, 则是调用 convert_EXISTS_sublink_to_join,直接上拉sublink 中的jointree( from clause), 其他逻辑同ANY_SUBLINK

  • 节点为 BoolExpr && NOT_EXPR is_notclause pull up not exists

    • 若为 not sublink 且type 为 EXISTS_SUBLINK,则进行处理。即只支持对not exists 进行pullup
    • 处理逻辑类似上面,只是调用convert_EXISTS_sublink_to_join时设置under_not 参数为true(表明使用antijoin),对上拉的quals调用pull_up_sublinks_qual_recurse 尝试上拉时,只支持上拉后与 rargs join。
  • 节点为 BoolExpr && AND_EXPR is_andclause

    • 遍历((BoolExpr *) node)->args, 对每个条件调用pull_up_sublinks_qual_recurse 处理。

1.2.1 convert_ANY_sublink_to_join

/** convert_ANY_sublink_to_join: try to convert an ANY SubLink to a join** The caller has found an ANY SubLink at the top level of one of the query's* qual clauses, but has not checked the properties of the SubLink further.* Decide whether it is appropriate to process this SubLink in join style.* If so, form a JoinExpr and return it.  Return NULL if the SubLink cannot* be converted to a join.** The only non-obvious input parameter is available_rels: this is the set* of query rels that can safely be referenced in the sublink expression.* (We must restrict this to avoid changing the semantics when a sublink* is present in an outer join's ON qual.)  The conversion must fail if* the converted qual would reference any but these parent-query relids.** On success, the returned JoinExpr has larg = NULL and rarg = the jointree* item representing the pulled-up subquery.  The caller must set larg to* represent the relation(s) on the lefthand side of the new join, and insert* the JoinExpr into the upper query's jointree at an appropriate place* (typically, where the lefthand relation(s) had been).  Note that the* passed-in SubLink must also be removed from its original position in the* query quals, since the quals of the returned JoinExpr replace it.* (Notionally, we replace the SubLink with a constant TRUE, then elide the* redundant constant from the qual.)** On success, the caller is also responsible for recursively applying* pull_up_sublinks processing to the rarg and quals of the returned JoinExpr.* (On failure, there is no need to do anything, since pull_up_sublinks will* be applied when we recursively plan the sub-select.)** Side effects of a successful conversion include adding the SubLink's* subselect to the query's rangetable, so that it can be referenced in* the JoinExpr's rarg.*/
JoinExpr *
convert_ANY_sublink_to_join(PlannerInfo *root, SubLink *sublink,Relids available_rels)
{JoinExpr   *result;Query	   *parse = root->parse;Query	   *subselect = (Query *) sublink->subselect;Relids		upper_varnos;int			rtindex;ParseNamespaceItem *nsitem;RangeTblEntry *rte;RangeTblRef *rtr;List	   *subquery_vars;Node	   *quals;ParseState *pstate;Assert(sublink->subLinkType == ANY_SUBLINK);/** The sub-select must not refer to any Vars of the parent query. (Vars of* higher levels should be okay, though.)*/if (contain_vars_of_level((Node *) subselect, 1))return NULL;/** The test expression must contain some Vars of the parent query, else* it's not gonna be a join.  (Note that it won't have Vars referring to* the subquery, rather Params.)*/upper_varnos = pull_varnos(root, sublink->testexpr);if (bms_is_empty(upper_varnos))return NULL;/** However, it can't refer to anything outside available_rels.*/if (!bms_is_subset(upper_varnos, available_rels))return NULL;/** The combining operators and left-hand expressions mustn't be volatile.*/if (contain_volatile_functions(sublink->testexpr))return NULL;/* Create a dummy ParseState for addRangeTableEntryForSubquery */pstate = make_parsestate(NULL);/** Okay, pull up the sub-select into upper range table.** We rely here on the assumption that the outer query has no references* to the inner (necessarily true, other than the Vars that we build* below). Therefore this is a lot easier than what pull_up_subqueries has* to go through.*/nsitem = addRangeTableEntryForSubquery(pstate,subselect,makeAlias("ANY_subquery", NIL),false,false);rte = nsitem->p_rte;parse->rtable = lappend(parse->rtable, rte);rtindex = list_length(parse->rtable);/** Form a RangeTblRef for the pulled-up sub-select.*/rtr = makeNode(RangeTblRef);rtr->rtindex = rtindex;/** Build a list of Vars representing the subselect outputs.*/subquery_vars = generate_subquery_vars(root,subselect->targetList,rtindex);/** Build the new join's qual expression, replacing Params with these Vars.*/quals = convert_testexpr(root, sublink->testexpr, subquery_vars);/** And finally, build the JoinExpr node.*/result = makeNode(JoinExpr);result->jointype = JOIN_SEMI;result->isNatural = false;result->larg = NULL;		/* caller must fill this in */result->rarg = (Node *) rtr;result->usingClause = NIL;result->quals = quals;result->alias = NULL;result->rtindex = 0;		/* we don't need an RTE for it */return result;
}
  • 转为 JOIN_SEMI
  • 把整个 subquery 作为rargs 提升,而不是提升subquery 中的表,提升后为表和subquery 进行semijoin。subquery 可以通过pull_up_subquery 继续提升。
    • 通过调用addRangeTableEntryForSubquery生成RangeTblEntry
  • ANY 类型子链接如果是相关子链接,即子查询中引用 了父查询的列属性,则不能提升。
  • ANY 类型子链接 sublink->testexpr(Opexpr arg1(left) arg2)没有引用上一层的列,不能提升,这种情况下上层的表和子链接中的表不能构成连接关系。if (bms_is_empty(upper_varnos))
  • 只能与指定的表 构成连接关系 , 即 sublink->testexpr 只能包含 available_rels 中的关系 if (!bms_is_subset(upper_varnos, available_rels))
  • sublink->testexpr 没有列属性也不能提升
  • sublink->testexpr 中不能包含volatile functions
  • 返回设置了rarg(新创建的RangeTblRef,指向新创建的rte(subquery)),quals(sublink->testexpr转换),jointype的 JoinExpr 对象

1.2.2 convert_EXISTS_sublink_to_join

/** convert_EXISTS_sublink_to_join: try to convert an EXISTS SubLink to a join** The API of this function is identical to convert_ANY_sublink_to_join's,* except that we also support the case where the caller has found NOT EXISTS,* so we need an additional input parameter "under_not".*/
JoinExpr *
convert_EXISTS_sublink_to_join(PlannerInfo *root, SubLink *sublink,bool under_not, Relids available_rels)
{JoinExpr   *result;Query	   *parse = root->parse;Query	   *subselect = (Query *) sublink->subselect;Node	   *whereClause;int			rtoffset;int			varno;Relids		clause_varnos;Relids		upper_varnos;Assert(sublink->subLinkType == EXISTS_SUBLINK);/** Can't flatten if it contains WITH.  (We could arrange to pull up the* WITH into the parent query's cteList, but that risks changing the* semantics, since a WITH ought to be executed once per associated query* call.)  Note that convert_ANY_sublink_to_join doesn't have to reject* this case, since it just produces a subquery RTE that doesn't have to* get flattened into the parent query.*/if (subselect->cteList)return NULL;/** Copy the subquery so we can modify it safely (see comments in* make_subplan).*/subselect = copyObject(subselect);/** See if the subquery can be simplified based on the knowledge that it's* being used in EXISTS().  If we aren't able to get rid of its* targetlist, we have to fail, because the pullup operation leaves us* with noplace to evaluate the targetlist.*/if (!simplify_EXISTS_query(root, subselect))return NULL;/** Separate out the WHERE clause.  (We could theoretically also remove* top-level plain JOIN/ON clauses, but it's probably not worth the* trouble.)*/whereClause = subselect->jointree->quals;subselect->jointree->quals = NULL;/** The rest of the sub-select must not refer to any Vars of the parent* query.  (Vars of higher levels should be okay, though.)*/if (contain_vars_of_level((Node *) subselect, 1))return NULL;/** On the other hand, the WHERE clause must contain some Vars of the* parent query, else it's not gonna be a join.*/if (!contain_vars_of_level(whereClause, 1))return NULL;/** We don't risk optimizing if the WHERE clause is volatile, either.*/if (contain_volatile_functions(whereClause))return NULL;/** The subquery must have a nonempty jointree, but we can make it so.*/replace_empty_jointree(subselect);/** Prepare to pull up the sub-select into top range table.** We rely here on the assumption that the outer query has no references* to the inner (necessarily true). Therefore this is a lot easier than* what pull_up_subqueries has to go through.** In fact, it's even easier than what convert_ANY_sublink_to_join has to* do.  The machinations of simplify_EXISTS_query ensured that there is* nothing interesting in the subquery except an rtable and jointree, and* even the jointree FromExpr no longer has quals.  So we can just append* the rtable to our own and use the FromExpr in our jointree. But first,* adjust all level-zero varnos in the subquery to account for the rtable* merger.*/rtoffset = list_length(parse->rtable);OffsetVarNodes((Node *) subselect, rtoffset, 0);OffsetVarNodes(whereClause, rtoffset, 0);/** Upper-level vars in subquery will now be one level closer to their* parent than before; in particular, anything that had been level 1* becomes level zero.*/IncrementVarSublevelsUp((Node *) subselect, -1, 1);IncrementVarSublevelsUp(whereClause, -1, 1);/** Now that the WHERE clause is adjusted to match the parent query* environment, we can easily identify all the level-zero rels it uses.* The ones <= rtoffset belong to the upper query; the ones > rtoffset do* not.*/clause_varnos = pull_varnos(root, whereClause);upper_varnos = NULL;while ((varno = bms_first_member(clause_varnos)) >= 0){if (varno <= rtoffset)upper_varnos = bms_add_member(upper_varnos, varno);}bms_free(clause_varnos);Assert(!bms_is_empty(upper_varnos));/** Now that we've got the set of upper-level varnos, we can make the last* check: only available_rels can be referenced.*/if (!bms_is_subset(upper_varnos, available_rels))return NULL;/* Now we can attach the modified subquery rtable to the parent */parse->rtable = list_concat(parse->rtable, subselect->rtable);/** And finally, build the JoinExpr node.*/result = makeNode(JoinExpr);result->jointype = under_not ? JOIN_ANTI : JOIN_SEMI;result->isNatural = false;result->larg = NULL;		/* caller must fill this in *//* flatten out the FromExpr node if it's useless */if (list_length(subselect->jointree->fromlist) == 1)result->rarg = (Node *) linitial(subselect->jointree->fromlist);elseresult->rarg = (Node *) subselect->jointree;result->usingClause = NIL;result->quals = whereClause;result->alias = NULL;result->rtindex = 0;		/* we don't need an RTE for it *//*LightDB add, 2022/2/9, for S202201136600, semijoin hints*/result->qb_name = ((Query *) sublink->subselect)->qb_name;return result;
}
  • JOIN_ANTI(not exists) 或JOIN_SEMI

  • 直接提升subquery 中的表, 提升后是表之间join

    	if (list_length(subselect->jointree->fromlist) == 1)result->rarg = (Node *) linitial(subselect->jointree->fromlist);elseresult->rarg = (Node *) subselect->jointree;
    
  • 不能包含CTE

  • simplify_EXISTS_query(其他地方也会使用, modify cte 仍需判断)

    • 不能含有set operation(UNION/INTERSECT/EXCEPT), agg, group sets(group by grouping sets((brand,segment),(brand),(segment),())), 窗口函数, set-returning function(返回多行), modify ctes(有insert/update/delete), having 条件, offset (limit), rowmark(for update/share)

    • 对于limit 不为常量不支持, 为常量但不大于0也不支持pull up, 其他情况会把limit 优化掉(直接去掉,不影响语义),然后就可以pull up了.

    • 其他优化:

        query->targetList = NIL;query->groupClause = NIL;query->windowClause = NIL;query->distinctClause = NIL;query->sortClause = NIL;query->hasDistinctOn = false;
      
  • 去除了whereclause(subselect->jointree->quals)的子链接中不能引用外层对象,The rest of the sub-select must not refer to any Vars of the parent query. (Vars of higher levels should be okay, though.)

  • whereclause(subselect->jointree->quals) 需要引用外层对象,即需要为相关子链接。非相关子链接不能pull up

  • whereclause(subselect->jointree->quals)中不能包含volatile 函数

  • 检测引用的上层的对象都在 available_rels 中, 不在则不支持提升。Now that we’ve got the set of upper-level varnos, we can make the last check: only available_rels can be referenced.

  • 返回设置了rarg(subselect->jointree->fromlist| subselect->jointree),quals(subselect->jointree->quals),jointype的 JoinExpr 对象

2. 总结

pg 支持对any 类型(包括in, any, some)的子链接提升,把子链接中的查询提升到from中作为一个子查询与表构成semijoin。此子查询可以被 pull up subquery 流程继续提升。

提升条件:

  • 子链接的查询中不能引用父查询中表的列属性 any (select * from t2 where )

    chuhx@postgres=# explain select * from t1 a where key1=any (select key1 from t2 where a.key1=t2.key1);QUERY PLAN                                   
    -------------------------------------------------------------------------------Seq Scan on t1 a  (cost=0.00..2671.15 rows=1130 width=8)Filter: (SubPlan 1)SubPlan 1->  Index Only Scan using t2_pkey on t2  (cost=0.15..2.17 rows=1 width=4)Index Cond: (key1 = a.key1)
    (5 rows)chuhx@postgres=# 
  • 子链接的条件需要引用父查询中的表的列属性 t1.key1 = any

  • 子链接的条件引用的表需要符合要求available_rels,比如左连接,子链接的条件只能使用右表的字段才能提升

    chuhx@postgres=# explain select * from t1 a left join t3 c on a.key1=c.key1 and a.key2=any (select key1 from t2);QUERY PLAN                                  
    -----------------------------------------------------------------------------Merge Left Join  (cost=196.91..694.38 rows=12769 width=16)Merge Cond: (a.key1 = c.key1)Join Filter: (hashed SubPlan 1)->  Index Scan using key1_i on t1 a  (cost=0.15..45.06 rows=2260 width=8)->  Sort  (cost=158.51..164.16 rows=2260 width=8)Sort Key: c.key1->  Seq Scan on t3 c  (cost=0.00..32.60 rows=2260 width=8)SubPlan 1->  Seq Scan on t2  (cost=0.00..32.60 rows=2260 width=4)
    (9 rows)chuhx@postgres=# explain select * from t1 a left join t3 c on a.key1=c.key1 and c.key2=any (select key1 from t2);QUERY PLAN                                  
    ------------------------------------------------------------------------------Merge Left Join  (cost=225.45..659.07 rows=25538 width=16)Merge Cond: (a.key1 = c.key1)->  Index Scan using key1_i on t1 a  (cost=0.15..45.06 rows=2260 width=8)->  Sort  (cost=225.30..230.95 rows=2260 width=8)Sort Key: c.key1->  Hash Join  (cost=60.85..99.39 rows=2260 width=8)Hash Cond: (c.key2 = t2.key1)->  Seq Scan on t3 c  (cost=0.00..32.60 rows=2260 width=8)->  Hash  (cost=32.60..32.60 rows=2260 width=4)->  Seq Scan on t2  (cost=0.00..32.60 rows=2260 width=4)
    (10 rows)
    
  • 子链接的条件中不能使用volatile 函数, 如下使用了random

    chuhx@postgres=# explain select * from t1 a where key1+random()=any (select key1 from t2);QUERY PLAN                          
    --------------------------------------------------------------Seq Scan on t1 a  (cost=38.25..99.10 rows=1130 width=8)Filter: (hashed SubPlan 1)SubPlan 1->  Seq Scan on t2  (cost=0.00..32.60 rows=2260 width=4)
    (4 rows)chuhx@postgres=# explain select * from t1 a where key1=any (select key1 from t2);QUERY PLAN                            
    ------------------------------------------------------------------Hash Join  (cost=60.85..99.39 rows=2260 width=8)Hash Cond: (a.key1 = t2.key1)->  Seq Scan on t1 a  (cost=0.00..32.60 rows=2260 width=8)->  Hash  (cost=32.60..32.60 rows=2260 width=4)->  Seq Scan on t2  (cost=0.00..32.60 rows=2260 width=4)
    (5 rows)chuhx@postgres=# 
    

pg 支持对exists/not exists 类型的子链接提升,把子链接中的表提升到from中与表构成semijoin/antijoin。

  • 不能包含CTE

  • 不能含有set operation(UNION/INTERSECT/EXCEPT), agg, group sets(group by grouping sets((brand,segment),(brand),(segment),())), 窗口函数, set-returning function(返回多行), modify ctes(有insert/update/delete), having 条件, offset (limit), rowmark(for update/share)

  • 对于limit 不为常量不支持, 为常量但不大于0也不支持pull up, 其他情况会把limit 优化掉(直接去掉,不影响语义)

    chuhx@postgres=# explain select * from t1 a where exists (select * from t2 where a.key1=t2.key1 limit 1);QUERY PLAN                            
    ------------------------------------------------------------------Hash Join  (cost=60.85..99.39 rows=2260 width=8)Hash Cond: (a.key1 = t2.key1)->  Seq Scan on t1 a  (cost=0.00..32.60 rows=2260 width=8)->  Hash  (cost=32.60..32.60 rows=2260 width=4)->  Seq Scan on t2  (cost=0.00..32.60 rows=2260 width=4)
    (5 rows)chuhx@postgres=# explain select * from t1 a where exists (select * from t2 where a.key1=t2.key1 limit -1);QUERY PLAN                                   
    --------------------------------------------------------------------------------Seq Scan on t1 a  (cost=0.00..4942.45 rows=1130 width=8)Filter: (SubPlan 1)SubPlan 1->  Limit  (cost=0.15..2.17 rows=1 width=8)->  Index Scan using t2_pkey on t2  (cost=0.15..2.17 rows=1 width=8)Index Cond: (key1 = a.key1)
    (6 rows)
    
  • 去除了whereclause(subselect->jointree->quals)的子链接中不能引用外层对象

    chuhx@postgres=# explain select * from t1 a where exists (select * from (select * from t2 where t2.key2=a.key1) x where a.key1=x.key1);QUERY PLAN                                
    --------------------------------------------------------------------------Seq Scan on t1 a  (cost=0.00..4948.10 rows=1130 width=8)Filter: (SubPlan 1)SubPlan 1->  Index Scan using t2_pkey on t2  (cost=0.15..2.18 rows=1 width=0)Index Cond: (key1 = a.key1)Filter: (key2 = a.key1)
    (6 rows)
    
  • whereclause(subselect->jointree->quals) 需要引用外层对象,即需要为相关子链接

    chuhx@postgres=# explain select * from t1 a where exists (select * from t2 where key1>10);QUERY PLAN                          
    --------------------------------------------------------------Result  (cost=0.05..32.65 rows=2260 width=8)One-Time Filter: $0InitPlan 1 (returns $0)->  Seq Scan on t2  (cost=0.00..38.25 rows=753 width=0)Filter: (key1 > 10)->  Seq Scan on t1 a  (cost=0.05..32.65 rows=2260 width=8)
    (6 rows)
    
  • whereclause(subselect->jointree->quals)中不能包含volatile 函数

  • 检测引用的上层的对象都在 available_rels 中, 不在则不支持提升


http://www.ppmy.cn/news/59175.html

相关文章

2023年第二十届五一数学建模竞赛题目 B题超详细思路

详细思路以及发布视频版&#xff0c;大家可以去观看&#xff0c;这里是对应的文字版&#xff0c;内容相差不多。 B题&#xff1a;快递需求分析问题 B题的问题难度不大&#xff0c;难点就在于后几问的模型求解。问题多、模型多、冗杂&#xff0c;就是B题的特点。 难度 A>B…

Flutter 组件抽取:日期(DatePicker)、时间(TimePicker)弹窗选择器【仿照】

简介 仿照《Flutter 仿ios自定义一个DatePicker》实行的日期弹窗选择器&#xff08;DatePicker&#xff09;、时间弹窗选择器&#xff08;TimePicker&#xff09; 效果 范例 class _TestPageState extends State<TestPage> {overridevoid initState() {super.initStat…

每日一题141——字符串中的最大奇数

给你一个字符串 num &#xff0c;表示一个大整数。请你在字符串 num 的所有 非空子字符串 中找出 值最大的奇数 &#xff0c;并以字符串形式返回。如果不存在奇数&#xff0c;则返回一个空字符串 "" 。 子字符串 是字符串中的一个连续的字符序列。 示例 1&#xff1…

MySQL索引概述

MySQL索引概述 当表中的数据量到达几十万甚至上百万的时候&#xff0c;SQL查询所花费的时间会很长&#xff0c;导致业务超时出错&#xff0c;此时就需要用索引来加速SQL查询。 由于索引也是需要存储成索引文件的&#xff0c;因此对索引的使用也会涉及磁盘I/O操作。如果索引创建…

探索三维世界【2】:Three.js 的 Texture 纹理

缤纷三维世界大揭秘&#xff1a;探索 Three.js 的 Texture 纹理 1、Texture纹理2、TextureLoader 纹理加载器2.1、创建纹理加载器2.2、纹理属性设置2.3、设置纹理渲染2.4、打光 3、完整代码与展示 1、Texture纹理 Texture 是 three.js 中的“纹理”概念。纹理是指将一张图像映…

Addictive Multiplicative in NN

特征交叉是特征工程中的重要环节&#xff0c;在以表格型&#xff08;或结构化&#xff09;数据为输入的建模中起到了很关键的作用。 特征交互的作用&#xff0c;一是尽可能挖掘对目标有效的模式、特征&#xff0c;二是具有较好的可解释性&#xff0c;三是能够将对数据的洞见引…

基于max30102的物联网病房监测系统(中断处理和主题逻辑)

目录 五、中断处理 六、主体框架 对采集数据的初始化 核心功能的实现 烟雾 通信帧格式 wifi接收数据的处理 OLED显示 五、中断处理 void SysTick_Handler(void) {TimingDelay_Decrement(); }void ESP8266_USART_INT_FUN(void) {uint8_t ucCh;if ( USART_GetITStatus (…

CSS(二)-- 选择器的运用(针对基本选择器和复合选择器的详细解析)

目录 1. 选择器的作用 2. 选择器的分类 3. 基本选择器 3.1 标签选择器 3.2 类选择器