Advanced Image Processing with Core Image

Datetime: 2016-08-22 22:44:28 · Topic: OpenGL

Core Image is Apple’s framework for image processing and analysis. With over 170 built-in filters that can be used alone or together in complex graphs, as well as support for custom image kernels, Core Image offers unlimited creative potential for visual effects that can be applied to still or moving images. In this talk, we’ll look at Core Image from its very basics right through to advanced techniques.

Core Image Introduction

Core Image is an image processing and analysis framework. It’s designed to provide near-real-time image processing for both still and moving images. The framework contains over 170 built-in filters. These include blurs, color adjustments, blends and composites, and also some special effects, such as distortions.

The filters can be used alone or they can be used together, either in a simple, linear chain or in more complex graphs. They’re also designed to be as performant as possible, so some, such as the Gaussian blur, are actually backed by Metal Performance Shaders, and offer really phenomenal speeds, even with very large blur radii. In addition to image processing, Core Image also offers image analysis.

The CIDetector class can search for and identify faces, rectangles, bar codes and also areas of text in both still and moving images. Core Image also plays nicely with other frameworks such as SceneKit and SpriteKit, so we can get scenes created in either of those frameworks, and apply a sort of post-processing effect to add things like motion blur or film grain or color treatments.
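As a taste of the analysis side, here's a minimal sketch of face detection with `CIDetector` (the asset name "people.jpg" is hypothetical):

```swift
import UIKit
import CoreImage

// Sketch: create a face detector; passing nil for the context
// lets Core Image manage one internally.
let detector = CIDetector(ofType: CIDetectorTypeFace,
    context: nil,
    options: [CIDetectorAccuracy: CIDetectorAccuracyHigh])

if let sourceImage = UIImage(named: "people.jpg"),
    ciImage = CIImage(image: sourceImage)
{
    // Each feature is a CIFaceFeature with a bounds rect in image coordinates.
    for face in detector.featuresInImage(ciImage)
    {
        print("Found a face at \(face.bounds)")
    }
}
```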

Key Core Image Classes

CIKernel

At the heart of every filter, whether it’s a built-in filter or one coded by us, is at least one kernel, and the kernel is the function that is executed for every single pixel of the final destination image. It contains the image processing algorithm that we want to run to generate the output image.

CIFilter

CIFilter is a sort of lightweight, mutable object that we’ll use in Swift to create a final image. Now, most of them, but not all of them, accept an input image and a range of parameters. So, for example, the color controls filter accepts four parameters: one is obviously the input image, and it has three additional numeric parameters to control the brightness, contrast and the saturation.

CIImage

Now Core Image has its own image data type called CIImage . However, CIImage doesn’t contain the bitmap data that you might expect. It actually just contains the instructions for how to treat the image. Core Image filters accept CIImage as inputs and send them out as outputs. And it’s only when the output is converted to a renderable format, like a UIImage , for example, that the filters in the chain or the graph are actually executed, so often you’ll hear a CIImage described as the recipe for how to construct the final image.

CIContext

And the fundamental class for rendering Core Image output is the Core Image context, CIContext . This is responsible for compiling and running the filters, and it represents a drawing destination - either the GPU or the CPU. Instantiating a CIContext is quite an expensive operation, so we should create one and reuse the same Core Image context. Because contexts are immutable and thread safe, the same context can be shared across different threads.
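In practice that might look like this (a sketch; the option shown simply requests GPU rendering):

```swift
import CoreImage

// Create one context up front and reuse it for every render.
// CIContext is immutable and thread safe, so sharing it is fine.
let sharedContext = CIContext(options: [kCIContextUseSoftwareRenderer: false])
```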

Querying For Filters

Filters are grouped by category

All the built-in Core Image filters belong to one or more of 21 categories. The categories include a blur category, which contains lots of different blur filters. There’s a color adjustment category that includes filters that change the color values of an image, so this category includes filters for adjusting the hue, and for changing things like exposure and the white point of an image.

A more artistic category called Color Effects includes filters such as false color. Also, all the photo effect filters that come bundled with the iOS camera app exist inside the Color Effects category. There’s a Distortion category, and these change the image’s geometry; they include things like bulging and twisting and warping. The Generator category includes filters that create images without an input image, so it includes things like stripes and solid colors and checkerboards, but also things to generate random noise, halos and sunbeams, or a starshine like we see here. And there’s even a Stylize category for simulating painting or sketching.
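To illustrate a generator, here's a sketch that conjures a checkerboard from nothing (the parameter values are arbitrary; generators produce images of infinite extent, so we crop to a usable size):

```swift
import UIKit
import CoreImage

// Sketch: a generator filter needs no input image.
let checkerboard = CIFilter(name: "CICheckerboardGenerator",
    withInputParameters: [
        "inputColor0": CIColor(red: 0, green: 0, blue: 0),
        "inputColor1": CIColor(red: 1, green: 1, blue: 1),
        "inputWidth": 40])?
    .outputImage?
    .imageByCroppingToRect(CGRect(x: 0, y: 0, width: 400, height: 400))
```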

Core Image provides a single class, CIFilter , which is used to create all the filters. The framework offers methods for querying the system for the names of all the available filters, and it’s these names that are used to create instances of the filters. Here, we’re querying for all available filters in the blur category. We get back a handful of names of filters, and then we use one of those names, CIZoomBlur , to actually instantiate a zoom blur filter.

CIFilter.filterNamesInCategory(kCICategoryBlur)

let zoomBlurFilter = CIFilter(name: "CIZoomBlur")

If we pass nil to filterNamesInCategory , we get back all the filters. I have an open source app on GitHub called Filterpedia , and it demonstrates almost every single built-in filter as well as a handful of filters I’ve created myself.
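As a sketch, querying with nil gives everything:

```swift
import CoreImage

// Passing nil returns the names of every available filter,
// across all categories.
let allFilterNames = CIFilter.filterNamesInCategory(nil)
print("\(allFilterNames.count) filters available")
```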

Creating a CIImage & CIFilter

All filters accept an input image, which is of type CIImage , and we can easily create a CIImage from a UIImage using this initializer. Using a name returned from filterNamesInCategory , we can create a filter instance and set its parameters with setValue .

let vegas = CIImage(image: UIImage(named: "vegas.jpg")!)!

We can create an edge filter with an intensity of 10, and the Vegas image as its input image. All filters have an output image property, and it’s querying that output image property that actually starts building the CIImage recipe from the kernel or kernels.

let edgeDetectFilter = CIFilter(name: "CIEdges")!

edgeDetectFilter.setValue(10, forKey: kCIInputIntensityKey)

edgeDetectFilter.setValue(vegas, forKey: kCIInputImageKey)

However, it’s not until the output image is converted to a displayable image – here we’re going to convert it to a UIImage – that the filter is actually executed. There are some convenience methods that can simplify the code.
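As a minimal sketch of that conversion (assuming the edgeDetectFilter created above; `imageView` is a hypothetical UIImageView):

```swift
// Querying outputImage only builds the recipe; creating the UIImage
// is what actually executes the filter chain.
if let output = edgeDetectFilter.outputImage
{
    let displayImage = UIImage(CIImage: output)
    imageView.image = displayImage
}
```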

Filters can be created with a dictionary of parameters in the constructor, so we can remove those setValue calls, or we can even skip explicitly creating a filter by using CIImage ’s imageByApplyingFilter method.

let glowingImage = CIFilter(name: "CIColorControls",
    withInputParameters: [
        kCIInputImageKey: edgesImage,
        kCIInputSaturationKey: 1.75])?
    .outputImage?
    .imageByApplyingFilter("CIBloom",
        withInputParameters: [
            kCIInputRadiusKey: 2.5,
            kCIInputIntensityKey: 1.25])
    .imageByCroppingToRect(vegas.extent)

It’s worth noting that the bloom filter we’re using here, and some other filters such as blurs, will change the extent of the image, because they have pixels that bleed out past the edges of the original extent. We can remove those unwanted pixels and keep the image size the same with imageByCroppingToRect , passing in the original image’s extent.

Compositing

let darkVegasImage = vegas.imageByApplyingFilter("CIPhotoEffectNoir", withInputParameters: nil)
                          .imageByApplyingFilter("CIExposureAdjust", withInputParameters: ["inputEV": -1.5])

let finalComposite = glowingImage!.imageByApplyingFilter("CIAdditionCompositing", withInputParameters: [kCIInputBackgroundImageKey: darkVegasImage])

Images can be composited with themselves or other images, using lots and lots of blend modes. We create another version of our Vegas image in monochrome, a bit dark and moody, we’ll use a combination of the built-in photo effects noir filter, then we’ll also use an exposure adjust to darken it a little, then we can use a composite to composite both of those images over each other. So now we get the glowing edges over the dark and moody version.

Now we’ve created quite a rich filter graph here, we’ve got a single image, it’s acting as the input to two separate chains. One which is with the edges and the bloom, and the other with the noir filter and the exposure adjust.

We might assume that Core Image creates intermediate images, or intermediate buffers, at each step. However, one of the great things about Core Image is that it’s able to concatenate all the code from these kernels into a single kernel where it can, eliminating the requirement to create any intermediate buffers or intermediate images. So, we can take that whole filter graph we’ve created and wrap it up into a new CIFilter :

class EdgeGlowFilter: CIFilter
{
    var inputImage: CIImage?
    override var outputImage: CIImage?
    {
        guard let inputImage = inputImage else
        {
            return nil
        }
        let edgesImage = inputImage
            .imageByApplyingFilter(
                "CIEdges",
                withInputParameters: [
                    kCIInputIntensityKey: 10])
        let glowingImage = CIFilter(
            name: "CIColorControls",
            withInputParameters: [
                kCIInputImageKey: edgesImage,
                kCIInputSaturationKey: 1.75])?
            .outputImage?
            .imageByApplyingFilter(
                "CIBloom",
                withInputParameters: [
                    kCIInputRadiusKey: 2.5,
                    kCIInputIntensityKey: 1.25])
            .imageByCroppingToRect(inputImage.extent)
        let darkImage = inputImage
            .imageByApplyingFilter(
                "CIPhotoEffectNoir",
                withInputParameters: nil)
            .imageByApplyingFilter(
                "CIExposureAdjust",
                withInputParameters: [
                    "inputEV": -1.5])
        let finalComposite = glowingImage!
            .imageByApplyingFilter(
                "CIAdditionCompositing",
                withInputParameters: [
                    kCIInputBackgroundImageKey: darkImage])
        return finalComposite
    }
}

The actual mechanics of the filtering are identical to our previous example; all I’ve done is move the code into the new filter’s outputImage getter, where the composite effect is built when the output image is queried. And we could use our new EdgeGlowFilter simply by creating an instance of it.

To use the same CIFilter API as the built-in filters, we’ll need to register it using the registerFilterName method. We pass registerFilterName our filter name, EdgeGlowFilter ; a reference to a filter vendor, which conforms to the CIFilterConstructor protocol; and also some attributes, such as categories. Inside the filter vendor, which conforms to CIFilterConstructor , we have to implement filterWithName , and this will return a new instance of the filter we just created, based on a string name. And now, after we’ve registered the filter, we can create it using the same syntax as any other built-in filter.
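Here's a sketch of what that registration might look like; the vendor class name and the "CustomFilters" category string are my own choices, not part of the framework:

```swift
class CustomFiltersVendor: NSObject, CIFilterConstructor
{
    static func registerFilters()
    {
        CIFilter.registerFilterName("EdgeGlowFilter",
            constructor: CustomFiltersVendor(),
            classAttributes: [
                kCIAttributeFilterCategories: ["CustomFilters"]])
    }

    // Required by CIFilterConstructor: return a new filter for a given name.
    func filterWithName(name: String) -> CIFilter?
    {
        switch name
        {
        case "EdgeGlowFilter":
            return EdgeGlowFilter()
        default:
            return nil
        }
    }
}

// After registering, the custom filter is created like any built-in one:
CustomFiltersVendor.registerFilters()
let edgeGlowFilter = CIFilter(name: "EdgeGlowFilter")
```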

Creating Custom Filters

Core Image kernels allow us to implement our own filter algorithms from scratch. These kernels act on every single pixel of the destination image, the output image, individually. Conceptually, imagine a loop over every pixel of the output image; in the Core Image Kernel Language, a dialect of GLSL, we take the body of that loop and use it as the kernel function. There are three different types of Core Image kernel: two really high-performance, specialized types, and one more general type. A CIWarpKernel is specifically designed for changing the position of a given pixel, but it’s unable to change that pixel’s color.

Filters for rotating or distorting an image would be based on a CIWarpKernel . The color kernel, CIColorKernel , on the other hand, is designed to change the color of a pixel in place. It has no knowledge of any pixels apart from the one it’s currently working with, and it can’t change their position.

for x in 0 ..< imageWidth
{
    for y in 0 ..< imageHeight
    {
       outputImage[x][y] = exposureAdjust(inputImage[x][y], 0.75)
    }
}

And then finally, the CIKernel is the more general version. It can access the entire image, so it’s able to sample other pixels. So for example, a blur would be based on a CIKernel because it needs to output a pixel’s color based on that pixel’s neighbors. And it’s worth noting that if your code requires the capability of a general kernel, but can be expressed as a separate warp and color kernel, using those two more specialized versions will yield better performance than using a general kernel.

Warp Kernel

Let’s look at warp kernels in a little more detail. They’re designed specifically for warping, translating or deforming images, and this could be as simple as rotating or scaling an image, or it could be as complex as simulating the refraction of a lens. Now the kernel is aware of the coordinate of the pixel currently being computed in the output image, and it returns the coordinates to where that pixel’s colors should be sourced from in the input image. And this function destCoord returns the coordinates of the pixel currently being computed.

kernel vec2 upsideDownWarp(vec2 extent)
{
    return vec2(destCoord().x, extent.y - destCoord().y);
}

To turn an image upside-down, we’d simply subtract the current pixel’s y coordinate from the overall height of the input image. To implement this, we pass that code as a string to the CIWarpKernel constructor, and this string, as I’ve done here, could be hard-coded in Swift, or it could come from an external file. And there we have an upside-down Mona Lisa.

As with our previous composite example, we need to override the outputImage getter in a CIFilter to execute the upside-down kernel. We’ll pass in the input image’s extent as a two-component vector, with x and y, so the kernel can subtract the current y position from the height, and applyWithExtent will build the kernel into the CIImage recipe. Because we’re moving pixels around, we need to supply this thing called a region of interest callback.
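Wrapped in a filter, it might look like this (a sketch; the region of interest callback mirrors each requested rect vertically so Core Image knows which source pixels the kernel will read):

```swift
class UpsideDownFilter: CIFilter
{
    var inputImage: CIImage?

    let kernel = CIWarpKernel(string:
        "kernel vec2 upsideDownWarp(vec2 extent)" +
        "{ return vec2(destCoord().x, extent.y - destCoord().y); }")

    override var outputImage: CIImage?
    {
        guard let inputImage = inputImage,
            kernel = kernel else
        {
            return nil
        }
        let extent = inputImage.extent
        return kernel.applyWithExtent(extent,
            roiCallback: { (index, rect) in
                // The source of a destination rect is its vertical mirror.
                CGRect(x: rect.origin.x,
                    y: extent.height - rect.maxY,
                    width: rect.width,
                    height: rect.height)
            },
            inputImage: inputImage,
            arguments: [CIVector(x: extent.width, y: extent.height)])
    }
}
```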

Here’s another warp kernel. This time we’re going to use some basic trigonometry to change the sample coordinates, based on the sine of the destination pixel’s coordinates. So, in this case, we don’t need to know the extent, we don’t need to know anything more than destCoord , so the kernel actually doesn’t require any arguments.

kernel vec2 carnivalMirror()
{
    float y = destCoord().y + sin(destCoord().y / 20.0) * 20.0;
    float x = destCoord().x + sin(destCoord().x / 10.0) * 10.0;
    return vec2(x, y);
}

Color Kernel

So whilst a warp kernel can move pixels around but not change their color, a color kernel is able to change the color of pixels, but only in place. The only thing it really knows about is the color of the current pixel, which is passed as an argument of type __sample , and that’s basically a four-component vector; the components are red, green, blue and alpha for transparency.

kernel vec4 thresholdFilter(__sample image)
{
    return vec4(sqrt(image.bgr), image.a);
}

A color kernel has to return a new color, which is also a vector of size four. So in this example, we’re going to swizzle the input pixel’s color components, then return the square root of that swizzled value, and we should get an oddly-colored and slightly brightened image. As with the warp kernel, this code is passed into a CIColorKernel as a string to its constructor, and because a color kernel doesn’t affect the geometry or size of an image, there’s no requirement for a region of interest callback. We simply pass our input image as the only argument, and for applyWithExtent we pass the extent of the entire image.

override var outputImage: CIImage!
{
    guard let inputImage = inputImage,
        kernel = kernel else
    {
        return nil
    }
    let extent = inputImage.extent
    let arguments = [inputImage]
    return kernel.applyWithExtent(extent,
        arguments: arguments)
}

And we have a slightly oddly-colored Mona Lisa.

Now, although color kernels can’t access other pixels, they do know their own coordinates, via samplerCoord , so here’s an example of a kernel that uses the position of the current pixel, and it uses the modulo function to calculate a brightness value. And by multiplying the current color with that brightness value, you get this funny, semi-shaded effect.

There are circumstances where we need data about the entire image. We may want to calculate a pixel’s color based on the values of other pixels and neither a warp kernel nor a color kernel can do that. So blurs and other convolution kernels are examples of this.

kernel vec4 shadedFilter(__sample pixel)
{
    vec2 coord = samplerCoord(pixel);
    float brightness = mod(coord.y, 40.0) / 40.0;
    brightness *= 1.0 - mod(coord.x, 40.0) / 40.0;
    return vec4(brightness * pixel.rgb, pixel.a);
}

General Purpose Kernels

Here, we’ll look at a kernel that applies a box blur to an image, but it’s based on the brightness of another image. So let’s step through the GLSL to do that.

A kernel will accept three arguments. It’s going to accept a source image, which will blur, a second image whose luminance will control that blur amount, and then finally, a float value that controls a maximum amount of blur. We’ll sample our blur image to get its color at the current coordinate, and then we’ll use this thing called the dot function to calculate the luminance. The blur radius for this pixel is that luminance-based blur amount, which is a range between zero and one, and it’s going to be multiplied by our blur radius argument.

Now, we’re going to begin two nested, C-style for loops - this is the heart of the kernel - and we’ll use those loop indexes as offsets to sample the neighboring pixels. We create a coordinate by adding the x and y loop indexes to destCoord , and pass that value to a function called samplerTransform . This gives us a coordinate which we can use to sample and accumulate the colors of our input image across the surrounding box. Lastly we divide that accumulated color by the number of samples, and return that as a color with an alpha of one.

kernel vec4 lumaVariableBlur(sampler image, sampler blurImage, float blurRadius)
{
    vec2 d = destCoord();
    vec3 blurPixel = sample(blurImage, samplerCoord(blurImage)).rgb;
    float blurAmount = dot(blurPixel, vec3(0.2126, 0.7152, 0.0722));
    float n = 0.0;
    int radius = int(blurAmount * blurRadius);
    vec3 accumulator = vec3(0.0, 0.0, 0.0);
    for (int x = -radius; x <= radius; x++)
    {
        for (int y = -radius; y <= radius; y++)
        {
            vec2 workingSpaceCoordinate = d + vec2(x, y);
            vec2 imageSpaceCoordinate = samplerTransform(image, workingSpaceCoordinate);
            vec3 color = sample(image, imageSpaceCoordinate).rgb;
            accumulator += color;
            n += 1.0;
        }
    }
    accumulator /= n;
    return vec4(accumulator, 1.0);
}

The image we’re going to use to control the amount of blur could be any image, but we’re going to use a gradient, and we’re going to use a CIRadialGradient filter, to actually generate it for us.

It’s interesting to note that CIImages created with these generators have infinite extent. By adding imageByCroppingToRect , we can crop it back down to the same size as our original image, and then the applyWithExtent for the general kernel is actually the same as applyWithExtent for a warp kernel.
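A sketch of that gradient generation (the center, radii and `monaLisa` source image are arbitrary stand-ins for whatever is being blurred):

```swift
import CoreImage

// Sketch: generate the blur-control gradient, black at the center,
// white at the edges, then crop its infinite extent to the source image.
let gradientImage = CIFilter(name: "CIRadialGradient",
    withInputParameters: [
        kCIInputCenterKey: CIVector(x: 320, y: 320),
        "inputRadius0": 50,
        "inputRadius1": 400,
        "inputColor0": CIColor(red: 0, green: 0, blue: 0),
        "inputColor1": CIColor(red: 1, green: 1, blue: 1)])?
    .outputImage?
    .imageByCroppingToRect(monaLisa.extent)
```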

The result with the gradient image controlling the blur amount shows that there’s no blur at the center of the gradient, where it’s black, and there is more pronounced blur toward the edge of the screen, where the gradient image is white.

Displaying with UIImage

To display a final output, we could use a UIImageView , which accepts a UIImage , and converting the output from our filters to a UIImage is as simple as using this initializer. However, that conversion can take a moment.

When we use this approach to create a UIImage from a CIImage , a new CIContext is created for us in the background, and this will affect performance. And it’s not just performance that’s an issue: images created with this technique don’t respect content mode. So if our image view has a different aspect ratio to our image, we’re going to end up with a weird, stretched result.

We can improve performance by creating our own Core Image context. This can be reused, so we don’t create a new one with each conversion; we use the context’s createCGImage method to create a Core Graphics image, which we can use to create a UIImage to populate the UIImageView with. The other advantage, as well as performance, is that with this technique the result respects content mode. However, this process means we’re running a Core Image filter on the GPU, going back down to the CPU to do that conversion, and then back up to the GPU to display the image. That may be fine for still or static images, but if we’re working with video and want something a bit more responsive, we’re going to need a faster approach.
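A sketch of that improved path, with one shared context:

```swift
import UIKit
import CoreImage

// Create the context once and reuse it for every conversion.
let context = CIContext()

func renderToUIImage(image: CIImage) -> UIImage
{
    // createCGImage forces the filter graph to execute,
    // producing a bitmap-backed image that respects content mode.
    let cgImage = context.createCGImage(image, fromRect: image.extent)
    return UIImage(CGImage: cgImage)
}
```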

Rendering Core Image Output With OpenGL

GLKView actually solves this issue. This is part of GLKit - it’s an OpenGL ES view which manages a frame buffer, and simplifies displaying GPU-bound content. The first thing to do is to create an EAGLContext , which is an OpenGL ES rendering context, that’s going to be shared between a CIContext that we create and also the GLKView .

let eaglContext = EAGLContext(API: .OpenGLES2)
let ciEAGLContext = CIContext(EAGLContext: eaglContext,
    options: [kCIContextWorkingColorSpace: NSNull()])
let glkView = GLKView(frame: CGRect(x: 0,
        y: 0,
        width: 400,
        height: 800),
    context: eaglContext)

Giving the Core Image context that EAGLContext means that all its drawing, using the drawImage function, will be drawn to that context. When we’re ready to execute our filter graph and display the result, we call setNeedsDisplay on the GLKView , and this will invoke glkView(_:drawInRect:) on its delegate.
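The delegate method might look like this (a sketch; `imageToDisplay` is a hypothetical property holding the filter output):

```swift
func glkView(view: GLKView, drawInRect rect: CGRect)
{
    guard let imageToDisplay = imageToDisplay else
    {
        return
    }

    // drawImage renders straight into the GLKView's framebuffer,
    // so the pixels never leave the GPU.
    ciEAGLContext.drawImage(imageToDisplay,
        inRect: CGRect(x: 0, y: 0,
            width: view.drawableWidth,
            height: view.drawableHeight),
        fromRect: imageToDisplay.extent)
}
```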

To squeeze out even a little bit more performance, we can use Metal to display CIImages .

let colorSpace = CGColorSpaceCreateDeviceRGB()!

This approach is going to use MetalKit’s MTKView , so we’re going to wrap up our code in a new class, MetalImageView , and this will expose an image property of type CIImage . Using Metal has a lower CPU overhead than the GLKit solution, and may be the best choice for some performance-critical apps.

lazy var commandQueue: MTLCommandQueue =
{
        [unowned self] in
        return self.device!.newCommandQueue()
}()

lazy var ciContext: CIContext =
{
        [unowned self] in
        return CIContext(MTLDevice: self.device!)
}()

We create a command queue, which will queue and submit commands to the GPU. Finally, we’ll create a CIContext , and this is based on the Metal device. Our initializer for the class will accept an optional device, and this is the interface to the GPU, this is the Metal device.

override init(frame frameRect: CGRect,
device: MTLDevice?)
    {
        super.init(frame: frameRect,
            device: device ??
            MTLCreateSystemDefaultDevice())
        if super.device == nil
        {
            fatalError("Device doesn't support Metal")
        }
        framebufferOnly = false
    }

If this is passed as nil, we use nil coalescing when calling super.init to make sure we always have a device, and if that fails, it means the device doesn’t support Metal, and we throw a fatal error. We set framebufferOnly to false, because we’re going to be reading from and writing to the MTKView ’s texture, and we’ll expose an image property with a didSet observer.

var image: CIImage?
{
    didSet
    {
        renderImage()
    }
}

We make sure that there’s an image to render and that the display buffer, the current drawable, is available; then we do some quick transforms to make sure the image fits within the drawable, and we’re ready to draw the input image to the Metal texture, which we do with the CIContext ’s render method.

func renderImage()
{
    guard let
        image = image,
        targetTexture = currentDrawable?.texture else
    {
        return
    }

    let commandBuffer = commandQueue.commandBuffer()

    let bounds = CGRect(origin: CGPointZero,
        size: drawableSize)

    let originX = image.extent.origin.x
    let originY = image.extent.origin.y
    let scaleX = drawableSize.width / image.extent.width
    let scaleY = drawableSize.height / image.extent.height
    let scale = min(scaleX, scaleY)

    let scaledImage = image
        .imageByApplyingTransform(
            CGAffineTransformMakeTranslation(-originX, -originY))
        .imageByApplyingTransform(
            CGAffineTransformMakeScale(scale, scale))

    ciContext.render(scaledImage,
        toMTLTexture: targetTexture,
        commandBuffer: commandBuffer,
        bounds: bounds,
        colorSpace: colorSpace)

    commandBuffer.presentDrawable(currentDrawable!)
    commandBuffer.commit()
}
Finally, we wrap up by presenting the current drawable and scheduling the command buffer to be executed on the GPU.

Q: How does Core Image compare to GPUImage?

Core Image does use the GPU, and you have a choice: the filters will execute on the GPU, and then you can choose between the CPU and the GPU for where the conversion happens, for the drawing destination. If you elect to use the CPU, it means you can use threading, and I think GPUImage won’t allow that.

You are also reaching out to a third party for a third-party library, and if development stops, you’ll be scuppered.

See the discussion on Hacker News .

Transcription below provided by Realm: a replacement for SQLite & Core Data with first-class support for Swift! Check out the Swift docs!




