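The crop_chains function in the AlphaFold3 feature_processing_multimer module crops the MSA (multiple sequence alignment) features and the template features of each chain in a multi-chain protein structure prediction task. Cropping caps the number of MSA sequences and templates passed to the model, so that the input fits the model's size limits and the computation stays efficient.
Source code: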
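As a rough illustration of what cropping means here, the sketch below keeps only the first few rows of an MSA array and the first few templates of a single chain. It is a simplified stand-in, not the module's actual logic: the helper name `crop_features_sketch`, the array shapes, and the assumption that every feature is cropped along its first axis are all hypothetical and chosen only for illustration.

```python
import numpy as np


def crop_features_sketch(features, msa_crop_size, max_templates):
    """Hypothetical helper: keep at most `msa_crop_size` MSA rows
    and at most `max_templates` templates (first axis of each array)."""
    cropped = dict(features)  # shallow copy so the input mapping is untouched
    cropped["msa"] = features["msa"][:msa_crop_size]
    cropped["template_aatype"] = features["template_aatype"][:max_templates]
    return cropped


# Toy chain: 10 MSA sequences and 6 templates, each of length 5 (assumed shapes).
rng = np.random.default_rng(0)
chain = {
    "msa": rng.integers(0, 21, size=(10, 5)),
    "template_aatype": rng.integers(0, 21, size=(6, 5)),
}

cropped = crop_features_sketch(chain, msa_crop_size=4, max_templates=2)
print(cropped["msa"].shape)              # (4, 5)
print(cropped["template_aatype"].shape)  # (2, 5)
```

Slicing past the end of an array is harmless in NumPy, so a chain with fewer sequences or templates than the limit is returned unchanged.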
def crop_chains(
    chains_list: List[Mapping[str, np.ndarray]],
    msa_crop_size: int,
    pair_msa_sequences: bool,
    max_templates: int
) -> List[Mapping[str, np.ndarray]]:
    """Crops the MSAs for a set of chains.

    Args:
        chains_list: A list of chains to be cropped.
        msa_crop_size: The total number of sequences to crop from the MSA.
        pair_msa_sequences: Whether we are operating in sequence-pairing mode.
        max_templates: The maximum templates to use per chain.

    Returns:
        The chains cropped.
    """
    # Apply the cropping.
    cropped_chains = []
    for chain in chains_list:
        cropped_chain = _crop_single_chain(
            chain,
            msa_crop_size=msa_crop_size,
            pair_msa_sequences=pair_msa_sequences,
            max_templates=max_templates)
        cropped_chains.append(cropped_chain)

    return cropped_chains


def _crop_single_chain(
    chain: Mapping[str, np.ndarray],
    msa_crop_size: int,
    pair_msa_sequences: bool,
    max_templates: int
) -> Mapping[str, np.ndarray]:
    """Crops msa sequences to `msa_crop_size`."""
    msa_size = chain['num_alignments']

    if pair_msa_sequences:
        msa_size_all_seq = chain['num_alignments_all_seq']
        msa_crop_size_all_seq = np.minimum(msa_size_all_seq, msa_crop_size // 2)

        # We reduce the number of un-paired sequences, by the nu