Swift and Metal GPU Programming on OSX 10.11 / El Capitan

Apple recently (at WWDC in June 2015) announced Metal GPU programming support for OS X 10.11 El Capitan, which is great news for Metal developers on iOS!
This post gives a basic example of how to use Swift with Metal on OS X. The code is available at github.com/atveit/SwiftMetalForOSX, and so far I have tested it on my laptop (MacBook Pro Retina, Mid 2012, with an NVIDIA GeForce GT 650M 1024 MB GPU).

But first, what is Metal?

Apple originally described Metal as: “Metal provides the lowest-overhead access to the GPU, enabling you to maximize the graphics and compute potential of your iOS 8 app. With a streamlined API, precompiled shaders, and support for efficient multi-threading, Metal can take your game or graphics app to the next level of performance and capability.” – source: https://developer.apple.com/metal/

I consider Metal an alternative to OpenGL for graphics processing, but also an alternative to OpenCL and NVIDIA’s CUDA
for data-parallel programming on GPUs (also called general-purpose GPU programming – GPGPU).

1. Swift and Metal GPU Programming on OSX 10.11 / El Capitan

This example shows a very simple Metal shader function – doubler – that doubles each element in the input vector
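
As a point of comparison, here is a minimal CPU-only sketch of what the doubler kernel computes. It is written in current Swift syntax rather than the Xcode 7 beta syntax used in the appendix, and the helper name doubledOnCPU is hypothetical (it is not part of the project). A function like this can be handy for checking the GPU output.

```swift
// CPU reference for the doubler kernel: each output element is twice the input.
// `doubledOnCPU` is a hypothetical helper name, not part of the original project.
func doubledOnCPU(_ input: [Float]) -> [Float] {
    return input.map { 2 * $0 }
}

// Same kind of input as in the appendix: 0, 1, 2, ...
let referenceInput = (0..<5).map { Float($0) }
let referenceOutput = doubledOnCPU(referenceInput)
print(referenceOutput)
```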

1.1 Summary of steps:

  1. Create a new OSX Cocoa Swift project in Xcode (requires Xcode 7.0 beta or newer)
  2. Add dependency libraries (the Metal and QuartzCore frameworks in this case)
  3. Create a Metal Shader method – doubler – in Shaders.metal
  4. Create a (convenience) MetalViewController class in MetalViewController.swift that provides the methods:
    1. setupMetal() – creates metalDevice, metalCommandQueue, metalDefaultLibrary and metalCommandBuffer objects
    2. setupShaderInMetalPipeline(shaderName:String) – returns shader function, computePipelineState and computePipelineErrors objects
    3. createMetalBuffer(var vector:[Float]) – creates a Metal byte buffer from a Swift float array
  5. Create a ViewController in ViewController.swift that:
    1. creates an input vector and a zero-filled output vector of the same length
    2. creates Metal byte buffers for those vectors
    3. creates an MTLComputeCommandEncoder – command encoder (primary object for using Metal)
    4. adds metal input and output buffers to the command encoder
    5. connects the shader function to the command encoder (indirect via computePipelineState object)
    6. configures number of threads and thread groups to use on the GPU for the command encoder
    7. finalizes configuration of command encoder (i.e. ends encoding)
    8. starts metal job by commit() and waits until it is completed
    9. gets output data from the GPU into an NSData object
    10. converts NSData object into a regular Swift Array
    11. prints output array
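
Step 5.6 (configuring threads and thread groups) comes down to a small piece of integer arithmetic. Below is a rough CPU-only sketch in current Swift syntax. The helper name threadgroupCount and the width of 32 are assumptions for illustration; the real width comes from threadExecutionWidth and depends on the device.

```swift
// Number of thread groups needed to cover `elementCount` elements,
// rounding up so that a partial last group is still launched.
// Hypothetical helper; the name and labels are illustrative only.
func threadgroupCount(elementCount: Int, threadsPerGroup: Int) -> Int {
    precondition(threadsPerGroup > 0, "threadsPerGroup must be positive")
    return (elementCount + threadsPerGroup - 1) / threadsPerGroup
}

let elementCount = 100   // N in the appendix
let assumedWidth = 32    // assumption: stands in for threadExecutionWidth
let groups = threadgroupCount(elementCount: elementCount, threadsPerGroup: assumedWidth)
let launchedThreads = groups * assumedWidth
let threadsBeyondInput = launchedThreads - elementCount
print("groups: \(groups), launched threads: \(launchedThreads), beyond the input: \(threadsBeyondInput)")
```

Since whole thread groups are launched, the number of threads is usually a bit larger than the number of elements, which is worth keeping in mind when the kernel indexes into a buffer.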

Conclusion

I have shown a simple example of using Metal for GPU programming on OS X. As you might see, there were some minor changes in the Metal API that came with Xcode 7 beta compared with prior Xcode versions, but it was quite easy to make it work.

Best regards,

Amund Tveit

amund@memkite.com


Appendix – Code

Appendix 1 – Metal Shader – Shaders.metal

#include <metal_stdlib>
using namespace metal;

kernel void doubler(const device float *inVector [[ buffer(0) ]],
                    device float *outVector [[ buffer(1) ]],
                    uint id [[ thread_position_in_grid ]]) {
    outVector[id] = 2*inVector[id];
}
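
To get a feel for how a kernel like this runs once per thread id, here is a small CPU model in current Swift syntax. It is only a sketch: simulateDispatch and the other names are made up for illustration and are not Metal API, and the 4 groups of 32 threads are an assumed configuration for 100 elements. The guard shows one way to deal with thread ids that fall past the end of the data.

```swift
// CPU model of a 1-D compute dispatch: `kernel` is called once per thread id,
// similar to how the GPU runs a kernel once per thread_position_in_grid.
// All names here are illustrative, not Metal API.
func simulateDispatch(threadgroups: Int, threadsPerGroup: Int, kernel: (Int) -> Void) {
    for id in 0..<(threadgroups * threadsPerGroup) {
        kernel(id)
    }
}

let simulatedInput: [Float] = (0..<100).map { Float($0) }
var simulatedOutput = [Float](repeating: 0, count: simulatedInput.count)
var skippedThreads = 0

// 4 groups of 32 threads = 128 threads for 100 elements (32 is an assumed width).
simulateDispatch(threadgroups: 4, threadsPerGroup: 32) { id in
    // Skip ids past the end of the data; without this guard the write
    // below would be out of bounds for ids 100...127.
    guard id < simulatedInput.count else {
        skippedThreads += 1
        return
    }
    simulatedOutput[id] = 2 * simulatedInput[id]
}
print("last element: \(simulatedOutput[99]), skipped threads: \(skippedThreads)")
```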

Appendix 2 – View Controller – ViewController.swift

import Cocoa
import Metal

@available(OSX 10.11, *)
class ViewController: MetalViewController {
    
    override func viewDidLoad() {
        super.viewDidLoad()
        setupMetal()
        let (_, computePipelineState, _) = setupShaderInMetalPipeline("doubler")
        
        // Create input and output vectors, and corresponding metal buffer
        let N = 100
        let inputVector = createInputVector(N)
        let inputMetalBuffer = createMetalBuffer(inputVector)
        var outputVector = [Float](count: N, repeatedValue: 0.0)
        let outputMetalBuffer = createMetalBuffer(outputVector)
        
        // Create Metal Compute Command Encoder and add input and output buffers to it
        metalComputeCommandEncoder = metalCommandBuffer.computeCommandEncoder()
        metalComputeCommandEncoder.setBuffer(inputMetalBuffer, offset: 0, atIndex: 0)
        metalComputeCommandEncoder.setBuffer(outputMetalBuffer, offset: 0, atIndex: 1)
        
        // Set the shader function that Metal will use
        metalComputeCommandEncoder.setComputePipelineState(computePipelineState)
        
        // threadExecutionWidth is the number of threads the GPU executes together for this pipeline;
        // it is used here as the thread group width
        let threadExecutionWidth = computePipelineState.threadExecutionWidth
        
        // Set up thread groups on GPU, rounding up so that a partial last group is still launched.
        // Note: threads with id >= inputVector.count still run, and doubler does not bounds-check id.
        let threadsPerGroup = MTLSize(width:threadExecutionWidth,height:1,depth:1)
        let numThreadgroups = MTLSize(width:(inputVector.count+threadExecutionWidth-1)/threadExecutionWidth, height:1, depth:1)
        metalComputeCommandEncoder.dispatchThreadgroups(numThreadgroups, threadsPerThreadgroup: threadsPerGroup)
        
        // Finalize configuration
        metalComputeCommandEncoder.endEncoding()
        
        print("outputVector before job is running: \(outputVector)")
        
        // Start job
        metalCommandBuffer.commit()
        
        // Wait for it to finish
        metalCommandBuffer.waitUntilCompleted()
        
        // Get output data from Metal/GPU into Swift
        let data = NSData(bytesNoCopy: outputMetalBuffer.contents(),
            length: outputVector.count*sizeof(Float), freeWhenDone: false)
        data.getBytes(&outputVector, length:inputVector.count * sizeof(Float))
        print("inputVector = \(inputVector)")
        print("outputVector = \(outputVector)")
        exit(0)
    }
    
    func createInputVector(N: Int) -> [Float] {
        var vector = [Float](count: N, repeatedValue: 0.0)
        for (index, _) in vector.enumerate() {
            vector[index] = Float(index)
        }
        return vector
    }
}
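
The NSData round trip in viewDidLoad copies the bytes of the output buffer into a regular Swift array. The same idea can be sketched with plain pointers in current Swift syntax. This is a CPU-only illustration under stated assumptions: the memory is allocated by hand as a stand-in for the pointer the output buffer's contents() would provide, and the helper names floats(from:count:) and demoReadback are hypothetical.

```swift
// Copies `count` Float values out of raw memory into a regular Swift array.
// Hypothetical helper, written in current Swift syntax.
func floats(from pointer: UnsafeRawPointer, count: Int) -> [Float] {
    let typed = pointer.bindMemory(to: Float.self, capacity: count)
    return Array(UnsafeBufferPointer(start: typed, count: count))
}

// Stand-in for GPU buffer memory: allocated by hand here (an assumption),
// where the Metal code would use the output buffer's contents pointer.
func demoReadback() -> [Float] {
    let count = 4
    let raw = UnsafeMutableRawPointer.allocate(
        byteCount: count * MemoryLayout<Float>.stride,
        alignment: MemoryLayout<Float>.alignment)
    defer { raw.deallocate() }

    // Pretend the GPU wrote the doubled values 0, 2, 4, 6 into the buffer.
    let typed = raw.bindMemory(to: Float.self, capacity: count)
    for i in 0..<count {
        (typed + i).initialize(to: Float(2 * i))
    }

    // The array copy is made before the memory is released by `defer`.
    return floats(from: UnsafeRawPointer(raw), count: count)
}

let readbackResult = demoReadback()
print(readbackResult)
```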

Appendix 3 – MetalViewController – MetalViewController.swift

import Foundation
import Cocoa
import Metal
import MetalKit
import QuartzCore

@available(OSX 10.11, *)
class MetalViewController : NSViewController {
    
    var metalDevice:MTLDevice!
    var metalCommandQueue:MTLCommandQueue!
    var metalDefaultLibrary:MTLLibrary!
    var metalCommandBuffer:MTLCommandBuffer!
    var metalComputeCommandEncoder:MTLComputeCommandEncoder!
    
    
    func setupMetal() {
        // Get access to OSX GPU
        metalDevice = MTLCreateSystemDefaultDevice()
        
        // Queue to handle an ordered list of command buffers
        metalCommandQueue = metalDevice.newCommandQueue()
        
        // Access to Metal functions that are stored in Shaders.metal file, e.g. doubler()
        metalDefaultLibrary = metalDevice.newDefaultLibrary()
        
        // Buffer for storing encoded commands that are sent to GPU
        metalCommandBuffer = metalCommandQueue.commandBuffer()
    }
    
    
    func setupShaderInMetalPipeline(shaderName:String) -> (shader:MTLFunction!,
        computePipelineState:MTLComputePipelineState!,
        computePipelineErrors:NSErrorPointer!)  {
            
            // Look up the kernel function compiled from Shaders.metal
            let shader = metalDefaultLibrary.newFunctionWithName(shaderName)
            let computePipelineErrors = NSErrorPointer()
            var computePipelineState:MTLComputePipelineState? = nil
            do {
                // Build the pipeline state directly from the function; the descriptor-based
                // call takes extra options/reflection arguments in later Xcode 7 SDKs
                computePipelineState = try metalDevice.newComputePipelineStateWithFunction(shader!)
            } catch {
                print("Failed to create compute pipeline state for \(shaderName): \(error)")
            }
            return (shader, computePipelineState, computePipelineErrors)
    }
    
    func createMetalBuffer(var vector:[Float]) -> MTLBuffer {
        let byteLength = vector.count*sizeof(Float)
        return metalDevice.newBufferWithBytes(&vector, length: byteLength, options: MTLResourceOptions.CPUCacheModeDefaultCache)
    }
}
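
createMetalBuffer needs the size of the vector in bytes and a pointer to its contents. A rough CPU-only sketch of those two pieces in current Swift syntax follows. The helper names byteLength(of:) and rawBytes(of:) are made up for illustration, and the byte array only stands in for the GPU buffer.

```swift
// Size in bytes of a Float array's contents (illustrative helper name).
func byteLength(of vector: [Float]) -> Int {
    return vector.count * MemoryLayout<Float>.stride
}

// Copies the array's underlying bytes into a plain byte array,
// a CPU-only stand-in for handing those bytes to a GPU buffer.
func rawBytes(of vector: [Float]) -> [UInt8] {
    return vector.withUnsafeBytes { Array($0) }
}

let sampleVector: [Float] = [0, 1, 2, 3]
print("bytes needed: \(byteLength(of: sampleVector)), bytes copied: \(rawBytes(of: sampleVector).count)")
```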

About Amund Tveit (@atveit - amund@memkite.com)

Amund Tveit works at Memkite on developing large-scale Deep Learning and Search (Convolutional Neural Network) with Swift and Metal for iOS (see deeplearning.education for a Memkite app video demo). He also maintains the deeplearning.university bibliography (github.com/memkite/DeepLearningBibliography).

Amund previously co-founded Atbrox, a cloud computing/big data service company (partner with Amazon Web Services), also doing some “sweat equity” startup investments in US and Nordic startups. His presentations about Hadoop/MapReduce Algorithms and Search were among the top 3% of all SlideShare presentations in 2013, and his blog posts have been frequently quoted by Big Data Industry Leaders and featured on the front pages of YCombinator News and Reddit Programming.

He previously worked for Google, where he was tech lead for Google News for iPhone (mentioned as “Google News Now Looks Beautiful On Your iPhone” on Mashable.com), led a team measuring and improving Google Services in the Scandinavian Countries (Maps and Search) and worked as a software engineer on infrastructure projects. Other work experience includes telecom (IBM Canada) and insurance/finance (Storebrand).

Amund has a PhD in Computer Science. His publications have been cited more than 500 times. He also holds 4 US patents in the areas of search and advertisement technology, and a pending US patent in the area of brain-controlled search with consumer-level EEG devices.

Amund enjoys coding, in particular Python, C++ and Swift (iOS).

3 Responses to "Swift and Metal GPU Programming on OSX 10.11 / El Capitan"

  • Arjun Jain
    August 9, 2015 - 2:45 am

    I get an error at metalDevice.newComputePipelineStateWithDescriptor

  • Jim Witte
    August 22, 2015 - 6:05 am

    When I try to compile on xCode (7 beta 5; 7A176x) for El Capitan beta (15A262e), I get an error on the line:

    computePipelineState = try metalDevice.newComputePipelineStateWithDescriptor(computePipeLineDescriptor

    of:

    cannot invoke ‘newComputePipelineStateWithDescriptor’ with an argument list of type ‘(MTLComputePipelineDescriptor)’

    Does this mean it wants more arguments? (and what would they be) Or has the Metal API and/or Swift syntax changed since this was written?

    Thanks,
    Jim
