MULTIPLICATION OF LARGE OPERANDS
To multiply two multi-word operands, a number e of caching registers is used to cache the values of operand words. The multiplication is done using several runs, which each com¬ prise several parts (R0Q1, R0Q2, R1Q4). In an initial part (R0Q1, R1Q1) words of the operands are loaded into caching registers, and a first set of partial products are processed; the initial part leaves a number e of words of a first operand in caching registers. Because of the cached words of one operand, a sequential inner part (R0Q2, R1Q2; R0Q3, R1Q3) re-uses cached operand words without requiring load operations for that operand, and only words of the other operand are loaded for processing of partial products, preferably according to a product-scanning multiplication method, namely, by grouping together operations for partial products of the same product index (k); each inner part again leaves a number of operand words in caching registers, though of the respective other operand. A final part (R0Q4, R1Q4) processed a final set of partial products using cached operand words.
1. A method for performing a multiplication of two large operands (a, b) on a processing system with a multiplication circuit (MC), said multiplication circuit configured to calculate the product of a pair of word-wide operand inputs into a two-word-wide product result, where a word is a specified number of bits, wherein each of the operands (a, b) is represented by a plurality of contiguous ordered word-wide operand segments (A[i], B[j]), each identified by means of a respective operand index (i, j), and a result (c) of the multiplication is represented by a plurality of contiguous ordered word-wide product segments (C[k]) identified by means of a product index (k), the method comprising processing of partial products by multiplication of operand segments of one of the two operands and operand segments of the other of the two operands according to the steps of: - loading operand segments of the two operands corresponding to specific values of the operand indices (i, j) into the multiplication circuit, with the exception of operand segments that are already held in registers of the multiplication circuit, - performing a multiplication operation on the operand segments in the multiplication circuit to obtain a respective two-word-wide intermediate product (X[i, j]), and - updating product segments by adding the two-word-wide intermediate product to product segments (C[k]) which have a product index value equal to the sum of the operand index values of the operand segments as well as the next product index value, respectively; wherein said processing of partial products is repeated for each value pair ([i, j]) of the two operand indices according to a specified sequence which is composed of runs (Rp, RB, R0', R1', R2'), each of said runs corresponding to a subset of the set of index value pairs, the subsets of different runs being disjoint, and a number (e) of caching registers (CR) is used in each run for caching operand segments of at least one of the operands, said caching registers being at least word-wide registers of the multiplication circuit, characterized in that each run comprises several parts (Q1, Q2, Q3, Q4), wherein each part ischaracterized bya parameter number (Eq) which specifies the number of operand segments of one of the two operands being cached in caching registers and used for processing of partial products, wherein the parts are consecutively: - an initial part (Q1) in which: operand segments of a first of the two operands and at least one operand segment of the respective other operand are loaded into caching registers, partial products are processed for a number of product index values (k), wherein the number of partial products processed for each product index value increases from one to the parameter number (E1) of the initial part, and at the end of the initial part, a number (e) of first operand segments are left in caching registers, which number corresponds to the parameter number (E2, E4) of the next part; - optionally one or more inner parts (Q2, Q3) wherein in each inner part: partial products are processed wherein for each product index value the same number of partial products is processed, which number corresponds to the parameter number (E2, E3) of the respective part, wherein all operand segments of one of the operands used for the partial products in the part are held in caching registers as a result of a respective preceding part, whereas at least one operand segment of the respective other operand is loaded into the multiplication circuit, namely, for each product index processed in the part at least one operand segment, and after processing of the partial products in the respective part, a number of operand segments of the respective other operand are left in caching registers, said number corresponding to the parameter number of the respective next part (Q3, Q4); and - a final part (Q4) in which partial products are processed for a number of product index values (k) where the number of partial products processed for each product index value decreases from the parameter number (E4) of the final part to one, and wherein all operand segments of one of the operands used for the partial products in the final part are held in caching registers as a result of a respective preceding part; wherein at least one of the runs (R0, R1) comprises at least one inner part (Q2, Q3).