Partition Manager#
The partitioning module (par
) is based on TritonPart, an open-source
constraints-driven partitioner. par
can be used
to partition a hypergraph or a gate-level netlist.
Highlights#
Start of the art multiple-constraints driven partitioning “multi-tool”
Optimizes cost function based on user requirement
Permissive open-source license
Solves multi-way partitioning with following features:
Multidimensional real-value weights on vertices and hyperedges
Multilevel coarsening and refinement framework
Fixed vertices constraint
Timing-driven partitioning framework
Group constraint: Groups of vertices need to be in same block
Embedding-aware partitioning
Dependency#
We use Google OR-Tools as our ILP solver.
Our recommendation is to follow the OpenROAD DependencyInstaller for installation of this requirement.
Alternatively, you may also install Google OR-Tools following these instructions.
Warning
Due to a build issue, TritonPart is not supported for macOS. Stay tuned to this page for updates!
Main Algorithm#
An overview of the TritonPart algorithm is shown below. It takes as inputs
Hypergraph \(H(V,E)\) in
.hgr
format.Vertex weight \(w_v \in \mathcal{R}_+^m\)
Hyperedge weight \(w_e \in \mathcal{R}_+^n\)
Number of blocks \(K\).
Imbalance factor \(\epsilon\).
User-specified cost function \(\phi\).
There are five main steps in the main algorithm, mainly 1) constraints-driven coarsening, 2) initial partitioning, 3) refinement, 4) cut-overlay clustering and partitioning (COCP), and 5) V-cycle refinement. The steps for the timing-aware algorithm may be found in the next section.
Constraints-Driven Coarsening
The first step involves multilevel coarsening. Specifically, at each level, clusters of vertices are identified, and the merged and represented as a single vertex in the resulting coarser hypergraph. In this algorithm, the First-Choice scheme is used, which traverses the vertices in the hypergraph according to a given ordering and merges pairs of vertices with high connectivity. The connectivity between a pair of vertices \((u,v)\) is measured as follows:
To efficiently manage multiple constraints, the following enhancements are made to the coarsening scheme above:
Fixed Vertex Constraint: Fixed vertices that belong to the same partitioning block are merged into a single vertex.
Grouping Constraint: Vertices that belong to the same group are merged into a single vertex.
Embedding Constraint: The embedding information is incorporated into the heavy-edge rating function. The new connectivity is updated as follows:
where \(\rho\) is a normalization factor set to the average distance between two vertex embeddings. When vertices \(v_1, ... , v_t\) are merged into a single vertex \(v_{coarse}\), the corresponding vertex embedding \(X_{v_{coarse}}\) is defined as the center of gravity of \(t\) vertices:
Community Guidance: Only vertices within the same community are considered for merging.
Tie-breaking mechanism: If multiple neighbor pairs have the same rating score, combine the lexicographically first unmatched vertex to break ties.
Initial Partitioning
After completing the coarsening process, an initial partitioning solution for the coarsest hypergraph \(H_c\) is derived. Two sub-steps are involved in this: the best partitioning solution from random and VILE partitioning is chosen from \(\eta = 50\) runs as a warm-start to the ILP-based partitioner. The optimization is based on only the cut size rather than the cost function \(\phi\) to keep the runtime reasonable.
Refinement
After a feasible solution \(H_{c_\xi}\) is obtained by initial partitioning, uncoarsening and move-based refinement is performed to improve the partitioning solution. Three refinement heuristics are applied in sequence:
\(K\)-way pairwise FM (PM): This addresses multi-way partitioning as concurrent bi-partitioning problems in a restricted version of K-way Fiduccia–Mattheyses (FM) algorithm. First, \(\lfloor K/2\rfloor \) pairs of blocks are obtained, with refinement-specific vertex movements restricted to associated paired blocks. Next, two-way FM is concurrently performed on all the block pairs. finally, a new configuration of block pairs is computed at the end of the PM.
Direct \(K\)-way FM: Using \(K\) priority queues, for each block \(V_i\), establish a priority queue that stores the vertices that can be potentially moved from the current block to block \(V_i\). This queue is ordered according to the gain of the vertices. Gain is defined as the reduction in cost function from the movement of the vertex from the current block to \(V_i\). Next, after a vertex move, each priority queue is updated independently, thus enabling parallel updates via multi-threading. Next, early-stop is implemented by limiting the maximum number of vertices moved to 100 per pass. Finally, the corking effect is mitigated by traversing the priority queue belonging to the vertex with the highest gain and identifying a feasible vertex move.
Greedy Hyperedge Refinement (HER): First, randomly visit all hyperedges. For each hyperedge \(e\) that crosses the partition boundary, determine whether a subset of vertices in \(e\) can be moved without violating the multi-dimensional balance constraints. The objective is to make \(e\) entirely constrained in a block.
Cut-Overlay Clustering and Partitioning (COCP)
Cut-overlay Clustering and Partitioning (COCP) is a mechanism to combine multiple good-quality partitioning solutions to generate an improved solution. To begin, the sets of hyperedges cut in the \(\theta\) candidate solutions are denoted as \(E_1,..., E_\theta \subset E\). First, \(\cup_{i=1}^\theta E_i\) is removed from the hypergraph \(H(V, E)\), resulting in a number of connected components. Next, all vertices within each connected component are merged to form a coarser hypergraph \(H_{overlay}\). If the number of vertices in \(H_{overlay}\) is less than \(thr_{ilp}\), ILP-based partitioning is performed. If not, a single round of constraints-driven coarsening is conducted to further reduce the size of \(H_{overlay}\) and generate a coarser hypergraph \(H_{overlay}^{'}\). Finally, multilevel refinement is performed to further improve the partitioning solution at each level of the hierarchy and return the improved solution \(S^{'}\).
V-Cycle Refinement
Cut-overlay clustering and partitioning produces a high-quality partitioning solution \(S^{'}\). To improve it, there are three phases similar to hMETIS, namely multilevel coarsening, ILP-based partitioning, and refinement. Firstly, in multilevel partitioning, \(S^{'}\) is used as a community guidance for the constraints-driven coarsening. Only vertices within the same block are permitted to be merged to ensure that the current solution \(S^{'}\) is preserved in the coarsest hypergraph \(H_{c_\xi}\). In the ILP-based partitioning phase, if the number of vertices in \(H_{c_\xi}\) does not exceed \(thr_{ilp}\), run ILP-based partitioning to improve \(S^{'}\). Otherwise, continue with \(S^{'}\) in successive iterations of these two steps (default set to 2). The refinement phase is conducted as per step 3.
Timing Aware Algorithm#
par
can also be used as a timing-aware partitioning framework. A slack
propagation methodology is used that optimizes cuts for both timing-critical
and timing-noncritical paths.
Extraction of Timing Paths and Slack Information
First, the top \(P\) timing-critical paths and the slack information of each hyperedge using the wireload model (WLM) is obtained from OpenSTA. The timing cost of each path is then calculated:
where a fixed extra delay \(\Delta\) is introduced for timing guardband, and \(\mu\) (default 2) is the exponent.
The snaking factor of a path \(SF(p)\) is defined as the maximum number of block reentries along the path \(p\). The timing cost of a hyperedge is computed using the timing weight corresponding to hyperedge slack \(slack_e\) and the accumulated timing cost of all paths traversing the hyperedge.
Timing-aware Coarsening
The timing-aware feature is achieved by adding a timing cost of hyperedge \(t_e\) to the connectivity score earlier mentioned. If vertices \((u,v)\) are associated with multiple critical paths, then they are more likely to be merged. This is reflected in the update score function:
Timing-aware Refinement
Timing-aware refinement is based on a similar cost function as the main algorithm. Instead, an additional slack propagation step is performed at the end of each PM/FM/HER pass.
Commands#
Note
Parameters in square brackets
[-param param]
are optional.Parameters without square brackets
-param2 param2
are required.
Partition Netlist#
triton_part_hypergraph
-hypergraph_file hypergraph_file
-num_parts num_parts
-balance_constraint balance_constraint
[-base_balance base_balance]
[-seed seed]
[-vertex_dimension vertex_dimension]
[-hyperedge_dimension hyperedge_dimension]
[-placement_dimension placement_dimension]
[-fixed_file fixed_file]
[-community_file community_file]
[-group_file group_file]
[-placement_file placement_file]
[-e_wt_factors e_wt_factors]
[-v_wt_factors <v_wt_factors>]
[-placement_wt_factors <placement_wt_factors>]
[-thr_coarsen_hyperedge_size_skip thr_coarsen_hyperedge_size_skip]
[-thr_coarsen_vertices thr_coarsen_vertices]
[-thr_coarsen_hyperedges thr_coarsen_hyperedges]
[-coarsening_ratio coarsening_ratio]
[-max_coarsen_iters max_coarsen_iters]
[-adj_diff_ratio adj_diff_ratio]
[-min_num_vertices_each_part min_num_vertices_each_part]
[-num_initial_solutions num_initial_solutions]
[-num_best_initial_solutions num_best_initial_solutions]
[-refiner_iters refiner_iters]
[-max_moves max_moves]
[-early_stop_ratio early_stop_ratio]
[-total_corking_passes total_corking_passes]
[-v_cycle_flag v_cycle_flag ]
[-max_num_vcycle max_num_vcycle]
[-num_coarsen_solutions num_coarsen_solutions]
[-num_vertices_threshold_ilp num_vertices_threshold_ilp]
[-global_net_threshold global_net_threshold]
Options#
Switch Name |
Description |
---|---|
|
Number of partitions. The default value is |
|
Allowed imbalance between blocks. The default value is |
|
Tcl list of baseline imbalance between partitions. The default value is |
|
Random seed. The default value is |
|
Number of vertices in the hypergraph. The default value is |
|
Number of hyperedges in hypergraph. The default value is |
|
Number of dimensions for canvas if placement information is provided. The default value is |
|
Path to hypergraph file. |
|
Path to fixed vertices constraint file. |
|
Path to |
|
Path to |
|
Placement information file, each line corresponds to a group fixed vertices, community, and placement attributes following the hMETIS format. |
|
Hyperedge weight factor. |
|
Vertex weight factors. |
|
Placement weight factors. |
|
Threshold for ignoring large hyperedge (default 200, integer). |
|
Number of vertices of coarsest hypergraph (default 10, integer). |
|
Number of vertices of coarsest hypergraph (default 50, integer). |
|
Coarsening ratio of two adjacent hypergraphs (default 1.6, float). |
|
Number of iterations (default 30, integer). |
|
Minimum difference of two adjacent hypergraphs (default 0.0001, float). |
|
Minimum number of vertices in each partition (default 4, integer). |
|
Number of initial solutions (default 50, integer). |
|
Number of top initial solutions to filter out (default 10, integer). |
|
Refinement iterations (default 10, integer). |
|
The allowed moves for each Fiduccia-Mattheyes (FM) algorithm pass or greedy refinement (default 60, integer). |
|
Describes the ratio \(e\) where if the \(n_{moved vertices} > n_{vertices} * e\), the tool exits the current FM pass. The intention behind this is that most of the gains are achieved by the first few FM moves. (default 0.5, float). |
|
Maximum level of traversing the buckets to solve the “corking effect” (default 25, integer). |
|
Disables v-cycle is used to refine partitions (default true, bool). |
|
Maximum number of |
|
Number of coarsening solutions with different randoms seed (default 3, integer). |
|
Describes threshold \(t\), the number of vertices used for integer linear programming (ILP) partitioning. if \(n_{vertices} > t\), do not use ILP-based partitioning.(default 50, integer). |
|
If the net is larger than this, it will be ignored by TritonPart (default 1000, integer). |
Evaluate Hypergraph Partition#
evaluate_hypergraph_solution
-num_parts num_parts
-balance_constraint balance_constraint
-hypergraph_file hypergraph_file
-solution_file solution_file
[-base_balance base_balance]
[-vertex_dimension vertex_dimension]
[-hyperedge_dimension hyperedge_dimension]
[-fixed_file fixed_file]
[-group_file group_file]
[-e_wt_factors e_wt_factors]
[-v_wt_factors v_wt_factors]
Options#
Switch Name |
Description |
---|---|
|
Number of partitions. The default value is |
|
Allowed imbalance between blocks. The default value is |
|
Number of vertices in the hypergraph. The default value is |
|
Number of hyperedges in hypergraph. The default value is |
|
Path to hypergraph file. |
|
Path to solution file. |
|
Tcl list of baseline imbalance between partitions. The default value is |
|
Path to fixed vertices constraint file. |
|
Path to |
|
Hyperedge weight factor. |
|
Vertex weight factor. |
Partition Netlist#
triton_part_design
[-num_parts num_parts]
[-balance_constraint balance_constraint]
[-base_balance base_balance]
[-seed seed]
[-timing_aware_flag timing_aware_flag]
[-top_n top_n]
[-placement_flag placement_flag]
[-fence_flag fence_flag]
[-fence_lx fence_lx]
[-fence_ly fence_ly]
[-fence_ux fence_ux]
[-fence_uy fence_uy]
[-fixed_file fixed_file]
[-community_file community_file]
[-group_file group_file]
[-solution_file solution_file]
[-net_timing_factor net_timing_factor]
[-path_timing_factor path_timing_factor]
[-path_snaking_factor path_snaking_factor]
[-timing_exp_factor timing_exp_factor]
[-extra_delay extra_delay]
[-guardband_flag guardband_flag]
[-e_wt_factors e_wt_factors]
[-v_wt_factors v_wt_factors]
[-placement_wt_factors placement_wt_factors]
[-thr_coarsen_hyperedge_size_skip thr_coarsen_hyperedge_size_skip]
[-thr_coarsen_vertices thr_coarsen_vertices]
[-thr_coarsen_hyperedges thr_coarsen_hyperedges]
[-coarsening_ratio coarsening_ratio]
[-max_coarsen_iters max_coarsen_iters]
[-adj_diff_ratio adj_diff_ratio]
[-min_num_vertices_each_part min_num_vertices_each_part]
[-num_initial_solutions num_initial_solutions]
[-num_best_initial_solutions num_best_initial_solutions]
[-refiner_iters refiner_iters]
[-max_moves max_moves]
[-early_stop_ratio early_stop_ratio]
[-total_corking_passes total_corking_passes]
[-v_cycle_flag v_cycle_flag ]
[-max_num_vcycle max_num_vcycle]
[-num_coarsen_solutions num_coarsen_solutions]
[-num_vertices_threshold_ilp num_vertices_threshold_ilp]
[-global_net_threshold global_net_threshold]
Options#
Switch Name |
Description |
---|---|
|
Number of partitions. The default value is |
|
Allowed imbalance between blocks. The default value is |
|
Tcl list of baseline imbalance between partitions. The default value is |
|
Random seed. The default value is |
|
Enable timing-driven mode. The default value is |
|
Extract the top n critical timing paths. The default value is |
|
Enable placement driven partitioning. The default value is |
|
Consider fences in the partitioning. The default value is |
|
Fence lower left x in microns. The default value is |
|
Fence lower left y in microns. The default value is |
|
Fence upper right x in microns. The default value is |
|
Fence upper right y in microns. The default value is |
|
Path to fixed vertices constraint file |
|
Path to |
|
Path to |
|
Path to solution file. |
|
Hyperedge timing weight factor (default 1.0, float). |
|
Cutting critical timing path weight factor (default 1.0, float). |
|
Snaking a critical path weight factor (default 1.0, float). |
|
Timing exponential factor for normalized slack (default 1.0, float). |
|
Extra delay introduced by a cut (default 1e-9, float). |
|
Enable timing guardband option (default false, bool). |
|
Hyperedge weight factor. |
|
Vertex weight factor. |
|
Placement weight factor. |
|
Threshold for ignoring large hyperedge. The default value is |
|
Number of vertices of coarsest hypergraph. The default value is |
|
Number of vertices of the coarsest hypergraph. The default value is |
|
Coarsening ratio of two adjacent hypergraphs. The default value is |
|
Number of iterations. The default value is |
|
Minimum ratio difference of two adjacent hypergraphs. The default value is |
|
Minimum number of vertices in each partition. The default value is |
|
Number of initial solutions. The default value is |
|
Number of top initial solutions to filter out. The default value is |
|
Refinement iterations. The default value is |
|
The allowed moves for each Fiduccia-Mattheyes (FM) algorithm pass or greedy refinement. The default value is |
|
Describes the ratio \(e\) where if the \(n_{moved vertices} > n_{vertices} * e\), the tool exists the current FM pass. The intention behind this is that most of the gains are achieved by the first few FM moves. The default value is |
|
Maximum level of traversing the buckets to solve the “corking effect”. The default value is |
|
Disables v-cycle is used to refine partitions. The default value is |
|
Maximum number of vcycles. The default value is |
|
Number of coarsening solutions with different randoms seed. The default value is |
|
Describes threshold \(t\), the number of vertices used for integer linear programming (ILP) partitioning. if \(n_{vertices} > t\), do not use ILP-based partitioning. The default value is |
|
If the net is larger than this, it will be ignored by TritonPart. The default value is |
Evaluation Netlist Partition#
evaluate_part_design_solution
[-num_parts num_parts]
[-balance_constraint balance_constraint]
[-base_balance base_balance]
[-timing_aware_flag timing_aware_flag]
[-top_n top_n]
[-fence_flag fence_flag]
[-fence_lx fence_lx]
[-fence_ly fence_ly]
[-fence_ux fence_ux]
[-fence_uy fence_uy]
[-fixed_file fixed_file]
[-community_file community_file]
[-group_file group_file]
[-hypergraph_file hypergraph_file]
[-hypergraph_int_weight_file hypergraph_int_weight_file]
[-solution_file solution_file]
[-net_timing_factor net_timing_factor]
[-path_timing_factor path_timing_factor]
[-path_snaking_factor path_snaking_factor]
[-timing_exp_factor timing_exp_factor]
[-extra_delay extra_delay]
[-guardband_flag guardband_flag]
[-e_wt_factors e_wt_factors]
[-v_wt_factors v_wt_factors]
Options#
Switch Name |
Description |
---|---|
|
Number of partitions. The default value is |
|
Allowed imbalance between blocks. The default value is |
|
Tcl list of baseline imbalance between partitions. The default value is |
|
Enable timing-driven mode. The default value is |
|
Extract the top n critical timing paths. The default value is |
|
Consider fences in the partitioning. The default value is |
|
Fence lower left x in microns. The default value is |
|
Fence lower left y in microns. The default value is |
|
Fence upper right x in microns. The default value is |
|
Fence upper right y in microns. The default value is |
|
Path to fixed vertices constraint file. |
|
Path to |
|
Path to |
|
Path to hypergraph file. |
|
Path to |
|
Path to solution file. |
|
Hyperedge timing weight factor. The default value is |
|
Cutting critical timing path weight factor. The default value is |
|
Snaking a critical path weight factor. The default value is |
|
Timing exponential factor for normalized slack. The default value is |
|
Extra delay introduced by a cut. The default value is |
|
Enable timing guardband option. The default value is 1 |
|
Hyperedge weight factors. |
|
Vertex weight factors. |
Write Partition to Verilog#
write_partition_verilog
[-port_prefix prefix]
[-module_suffix suffix]
[file]
Options#
Switch Name |
Description |
---|---|
|
Port name prefix. |
|
Module name suffix. |
|
Filename to write partition verilog to. |
Read the Partition file#
read_partitioning
-read_file name
[-instance_map_file file_path]
## Example Scripts
### How to partition a hypergraph in the way you would using hMETIS (min-cut partitioning)
```tcl
triton_part_hypergraph -hypergraph_file des90.hgr -num_parts 5 -balance_constraint 2 -seed 2
You can also check the provided example here.
How to perform the embedding-aware partitioning#
set num_parts 2
set balance_constraint 2
set seed 0
set design sparcT1_chip2
set hypergraph_file "${design}.hgr"
set placement_file "${design}.hgr.ubfactor.2.numparts.2.embedding.dat"
set solution_file "${design}.hgr.part.${num_parts}"
triton_part_hypergraph -hypergraph_file $hypergraph_file -num_parts $num_parts \
-balance_constraint $balance_constraint \
-seed $seed \
-placement_file ${placement_file} -placement_wt_factors { 0.00005 0.00005 } \
-placement_dimension 2
You can find the provided example here.
How to partition a netlist#
# set technology information
set ALL_LEFS “list_of_lefs”
set ALL_LIBS “list_of_libs”
# set design information
set design “design_name”
set top_design “top_design”
set netlist “netlist.v”
set sdc “timing.sdc”
foreach lef_file ${ALL_LEFS} {
read_lef $lef_file
}
foreach lib_file ${ALL_LIBS} {
read_lib $lib_file
}
read_verilog $netlist
link_design $top_design
read_sdc $sdc
set num_parts 5
set balance_constraint 2
set seed 0
set top_n 100000
# set the extra_delay_cut to 20% of the clock period
# the extra_delay_cut is introduced for each cut hyperedge
set extra_delay_cut 9.2e-10
set timing_aware_flag true
set timing_guardband true
set part_design_solution_file "${design}_part_design.hgr.part.${num_parts}"
##############################################################################################
### TritonPart with slack progagation
##############################################################################################
puts "Start TritonPart with slack propagation"
# call triton_part to partition the netlist
triton_part_design -num_parts $num_parts -balance_constraint $balance_constraint \
-seed $seed -top_n $top_n \
-timing_aware_flag $timing_aware_flag -extra_delay $extra_delay_cut \
-guardband_flag $timing_guardband \
-solution_file $part_design_solution_file
You can find the provided example here.
Regression tests#
There are a set of regression tests in ./test
. For more information, refer to this section.
Simply run the following script:
./test/regression
References#
Bustany, I., Kahng, A. B., Koutis, I., Pramanik, B., & Wang, Z. (2023). K-SpecPart: A Supervised Spectral Framework for Multi-Way Hypergraph Partitioning Solution Improvement. arXiv preprint arXiv:2305.06167. (.pdf)
Bustany, I., Gasparyan, G., Kahng, A. B., Koutis, I., Pramanik, B., & Wang, Z. (2023). “An Open-Source Constraints-Driven General Partitioning Multi-Tool for VLSI Physical Design”, Proc. ACM/IEEE International Conference of Computer-Aided Design 2023,(.pdf).
License#
BSD 3-Clause License. See LICENSE file.