Technology

The following example shows code given to the SequenceL™ compiler to produce a parallel matrix multiply application. For comparison, the same application is shown written in Haskell, a functional programming language with support for parallel programming, to illustrate the simplicity of the SequenceL™ representation.

Compact coding syntax

Matrix multiply application written for the SequenceL™ compiler:

Matmul (x(2), y(2)) [i,j] := sum ( x[i, all] * y[all, j] ) ;
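
For readers unfamiliar with the notation, the definition above says that entry [i,j] of the result is the sum of the element-wise products of row i of x and column j of y. As a point of reference only, that specification can be sketched sequentially in Haskell as follows. The name matmulSpec is hypothetical and appears nowhere else on this page, and the sketch says nothing about how the work is parallelized.

```haskell
import Data.List (transpose)

-- Hypothetical reference function, not part of SequenceL or of any listing
-- on this page. Entry (i, j) of the result is the sum of the element-wise
-- products of row i of x and column j of y.
matmulSpec :: [[Int]] -> [[Int]] -> [[Int]]
matmulSpec x y = [[sum (zipWith (*) row col) | col <- transpose y] | row <- x]

main :: IO ()
main = print (matmulSpec [[1, 2], [3, 4]] [[5, 6], [7, 8]])
-- prints [[19,22],[43,50]]
```

For the 2 x 2 inputs shown, entry [1,1] is 1*5 + 2*7 = 19, which matches the first value printed.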

Matrix multiply application written in Haskell, an efficient language for parallel computing, using its parallel strategies library:

-- Assumes the original Control.Parallel.Strategies API (before the Eval
-- monad redesign), in which rnf is itself a strategy.
import Control.Parallel.Strategies
import Data.List (transpose)

-- Sequential multiply: transpose m2 so that its columns can be read as rows.
multMat :: [[Int]] -> [[Int]] -> [[Int]]
multMat m1 m2 = multMatT m1 (transpose m2)

multMatT :: [[Int]] -> [[Int]] -> [[Int]]
multMatT m1 m2T = [[multVec row col | col <- m2T] | row <- m1]

-- Dot product of two vectors.
multVec :: [Int] -> [Int] -> Int
multVec v1 v2 = sum (zipWith (*) v1 v2)

-- Parallel multiply: z controls the granularity of the evaluation strategy.
multMatPar :: Int -> [[Int]] -> [[Int]] -> [[Int]]
multMatPar z m1 m2 = multMat m1 m2 `using` strat z

strat = blockStrat

lineStrat c = parListChunk c rnf

blockStrat c matrix -- best?
    = let blocks = concat (splitIntoClusters numB matrix) -- result split
                                                          -- in numB * numB blocks
          numB   = round (sqrt (fromIntegral (length matrix) / fromIntegral c))
          -- approx. same num/granularity of sparks as in others...
      in parList rnf blocks

type Vector = [Int]
type Matrix = [Vector]

-- Split a matrix into a c x c grid of sub-matrices.
splitIntoClusters :: Int -> Matrix -> [[Matrix]]
splitIntoClusters c m | c < 1 = splitIntoClusters 1 m
splitIntoClusters c m1 = mss
  where bh  = kPartition (length m1) c
        bhsplit [] [] = []
        bhsplit [] _  = error "some elements left over"
        bhsplit (t:ts) xs = hs : (bhsplit ts rest)
          where (hs,rest) = splitAt t xs
        ms = bhsplit bh m1 -- blocks of rows
        mss = map (colsplit bh) ms
        colsplit [] _  = []
        colsplit (t:ts) rs
          | head rs == [] = []
          | otherwise = (cab:colsplit ts resto)
          where (cab,resto) = unzip (map (splitAt t) rs)

-- helper for splitIntoClusters (formerly bresenham)
kPartition :: Int -> Int -> [Int]
kPartition n k = zipWith (+) ((replicate (n `mod` k) 1) ++ repeat 0)
    (replicate k (n `div` k))
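
To make the block partitioning in the Haskell version easier to follow, the short sketch below restates kPartition so that it runs on its own and prints a few partitions. It suggests that kPartition n k splits n rows into k block sizes that differ by at most one, with the larger blocks first. The inputs are illustrative values chosen here, not figures from the original application.

```haskell
-- Restates kPartition from the listing so that this block runs on its own.
-- The first (n `mod` k) blocks get one extra element each.
kPartition :: Int -> Int -> [Int]
kPartition n k =
  zipWith (+) (replicate (n `mod` k) 1 ++ repeat 0) (replicate k (n `div` k))

main :: IO ()
main = do
  print (kPartition 10 3)  -- prints [4,3,3]
  print (kPartition 8 4)   -- prints [2,2,2,2]
  print (kPartition 3 5)   -- prints [1,1,1,0,0]
```

The last call shows that asking for more blocks than there are rows yields some empty blocks rather than an error.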

Performance advantage

This coding simplicity also enables better performance. The SequenceL™ compiler produces more efficient output code, which results in higher performance on multicore systems. The following chart shows the relative performance of the matrix multiply code above when run on a multicore system.

Products

TMT's multicore solutions are available today in two product offerings:

Contact

Texas Multicore Technologies, Inc.
12912 Hill Country Blvd
Building F, Suite 200
Austin, TX 78738

To learn more about our revolutionary development technologies and solution implementation team, please contact:

For help with specific questions or ongoing testing and evaluations contact: