SIMD enhanced Array operations
SIMDArray FSharp
SIMD and other Performance enhanced Array operations for F#
Example Usage
// Faster map
let array = [| 1 .. 1000 |]
let squaredArray = array |> Array.SIMD.map (fun x -> x*x) (fun x -> x*x)

// Map and many other functions need one lambda to map the Vector<T>,
// and one to handle any leftover elements if the array length is not divisible by
// Vector<T>.Count. In the case of simple arithmetic operations they can
// often be the same, as shown here. If you arrange your arrays such that
// they will never have leftovers, or don't care how leftovers are treated,
// just pass a nop like so:
open SIMDArrayUtils
let array = [| 1; 2; 3; 4; 5; 6; 7; 8 |]
let squaredArray = array |> Array.SIMD.map (fun x -> x*x) nop

// Some functions can be used just like the existing array functions but run faster,
// such as create and sum:
let newArray = Array.SIMD.create 1000 5 // create a new array of length 1000 filled with 5
let sum = Array.SIMD.sum newArray

// The Performance module has functions that are faster and/or use less memory
// via other means than SIMD, usually by relaxing ordering constraints or adding
// constraints to predicates:
let distinctElements = Array.Performance.distinctUnordered someArray
let filteredElements = Array.Performance.filterLessThan 5 someArray
let filteredElements = Array.Performance.filterSimplePredicate (fun x -> x*x < 100) someArray
Array.Performance.mapInPlace (fun x -> x*x) someArray

// The SIMDParallel module has parallelized versions of some of the SIMD operations:
let sum = Array.SIMDParallel.sum array
let map = Array.SIMDParallel.map (fun x -> x*x) array

// Two extensions are added to System.Threading.Tasks.Parallel to enable Parallel.For loops
// with a stride length efficiently. They also have much less overhead. You can use them to roll your own
// parallel SIMD functions, or any parallel operation that needs a stride length > 1.

// Using:
// ForStride (fromInclusive : int) (toExclusive : int) (stride : int) (f : int -> unit)
// You can map each Vector in an array and store it in result
Parallel.ForStride 0 array.Length (Vector<^T>.Count) (fun i -> (vf (Vector<^T>(array, i))).CopyTo(result, i))

// Using:
// ForStrideAggreagate (fromInclusive : int) (toExclusive : int) (stride : int) (acc : ^T) (f : int -> ^T -> ^T) combiner
// You can sum or otherwise aggregate the elements of an array a Vector at a time, starting from an initial acc.
// The combiner merges the results from each task into a final Vector that is returned.
let result = Parallel.ForStrideAggreagate 0 array.Length (Vector<^T>.Count) (Vector<^T>(0)) (fun i acc -> acc + (Vector<^T>(array, i))) (fun x acc -> x + acc)
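
The two-lambda pattern described in the comments above (one lambda for whole vectors, one for leftovers) can be pictured with plain System.Numerics alone. The sketch below is not SIMDArray's implementation: `mapWithLeftovers` is a hypothetical helper, specialized to `int` and run sequentially, that steps through the array with a stride of `Vector<int>.Count` and then applies the scalar lambda to whatever elements remain.

```fsharp
open System.Numerics

// Hypothetical helper, not part of SIMDArray: maps an int array one
// Vector<int> at a time, then handles the leftover tail element by element.
let mapWithLeftovers (vf: Vector<int> -> Vector<int>) (sf: int -> int) (array: int[]) =
    let count = Vector<int>.Count
    let result = Array.zeroCreate<int> array.Length
    // First index past the last complete vector
    let lastFull = array.Length - array.Length % count
    let mutable i = 0
    while i < lastFull do
        // Load count elements starting at i, transform them, store them at i
        (vf (Vector<int>(array, i))).CopyTo(result, i)
        i <- i + count
    // Leftover elements that do not fill a whole vector
    for j in lastFull .. array.Length - 1 do
        result.[j] <- sf array.[j]
    result

// 10 is not a multiple of the usual vector widths (4, 8 or 16 ints),
// so on typical hardware both lambdas are exercised here.
let squares = [| 1 .. 10 |] |> mapWithLeftovers (fun v -> v * v) (fun x -> x * x)
```

Passing a no-op as the second lambda would leave the tail of `result` at its zero-initialized values in this sketch, which is why it only makes sense when the array length is a multiple of the vector width or the tail does not matter.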
Notes
Only 64-bit builds are supported. Mono should work with 5.0+, but I have not yet tested it. Performance improvements will vary depending on your CPU architecture, the width of the Vector type, and the operations you apply. For small arrays the core libs may be faster due to SIMD overhead.
When measuring performance be sure to use Release builds with optimizations turned on.
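
Since the gains depend on the vector width, it can help to check what the runtime reports on a given machine before benchmarking. A minimal sketch using only System.Numerics (the binding names are illustrative):

```fsharp
open System.Numerics

// True when the JIT maps Vector<T> operations onto SIMD instructions
let accelerated = Vector.IsHardwareAccelerated

// Elements per vector; depends on the CPU's register width and the element size
let intLanes = Vector<int>.Count
let doubleLanes = Vector<float>.Count

printfn "Hardware accelerated: %b" accelerated
printfn "Vector<int>.Count:    %d" intLanes
printfn "Vector<float>.Count:  %d" doubleLanes
```

If `accelerated` is false, vector operations fall back to software and SIMD versions are unlikely to beat the core library.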
Floating point addition is not associative, so results from SIMD operations will not always be identical to those of the sequential core library functions, though often
they will be more accurate, such as in the case of sum or average.
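
The effect is easy to reproduce without any SIMD at all. A vectorized sum typically accumulates several partial sums (one per lane) and combines them at the end, which amounts to regrouping the additions. The sketch below only demonstrates the regrouping, using ordinary doubles:

```fsharp
// Same three values, different grouping, different rounding
let leftToRight = (0.1 + 0.2) + 0.3   // 0.6000000000000001
let regrouped   = 0.1 + (0.2 + 0.3)   // 0.6

printfn "%b" (leftToRight = regrouped) // false
```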