Getting Started: Android CameraX
A quick guide (and sample code) to get you started on using the new Android Jetpack CameraX API as a Barcode Scanner with MLKit.
As the name implies, Jetpack (a collection of modern APIs for Android development) truly gives developers a productivity boost — and the introduction of the Jetpack CameraX API is no different.
With CameraX, developing camera-based Android views is a more streamlined process, with some standout features compared to its predecessor (Camera2):
- Abstracts away the complexities of differences in OEM sensor hardware drivers.
- Supports Android API 21 and up.
- As with other Jetpack components, it is lifecycle aware, and takes care of bringing up and breaking down the required resources in response to app lifecycle events.
- To me, the most exciting feature is the introduction of what can be called the “Use Case Pipeline”.
A Use Case pipeline?
Essentially, a UseCase in the context of the CameraX API is a class that accepts an image frame from the camera, does something with it, and notifies the API when it is done, at which point the next Use Case is called. Rinse and repeat.
As an example, this is how you would initialize and set up a camera instance with CameraX:

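A minimal sketch of that setup, assuming a PreviewView named previewView and an AppCompatActivity host (the original gist is not reproduced here):

```kotlin
import androidx.appcompat.app.AppCompatActivity
import androidx.camera.core.CameraSelector
import androidx.camera.core.Preview
import androidx.camera.lifecycle.ProcessCameraProvider
import androidx.camera.view.PreviewView
import androidx.core.content.ContextCompat

// Minimal sketch: obtain the provider and bind a Preview use case to the lifecycle.
fun setupCamera(activity: AppCompatActivity, previewView: PreviewView) {
    val providerFuture = ProcessCameraProvider.getInstance(activity)
    providerFuture.addListener({
        val cameraProvider = providerFuture.get()
        val preview = Preview.Builder().build().also {
            it.setSurfaceProvider(previewView.surfaceProvider)
        }
        cameraProvider.unbindAll()
        // LifecycleOwner, CameraSelector, then a vararg of use cases
        cameraProvider.bindToLifecycle(
            activity, CameraSelector.DEFAULT_BACK_CAMERA, preview
        )
    }, ContextCompat.getMainExecutor(activity))
}
```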
As you can see, the bindToLifecycle function is passed:
- The Activity (LifecycleOwner) that will host the camera instance.
- A helper (CameraSelector) that is used to nominate which hardware sensor (camera) to use.
- A list (vararg) of UseCase types to run each frame through.
This last parameter allows you to chain together multiple Use Cases as required.
Moreover, as mentioned earlier, this is bound to a lifecycle-aware Camera instance, so, for example, if you background the app, CameraX will take care of suspending the hardware and making the associated resources available for garbage collection.
Base Use Cases
As of now, the CameraX API has three distinct base Use Cases, namely:
- Preview (Preview): accepts a surface for displaying a preview.
- Image analysis (ImageAnalysis): provides CPU-accessible buffers for analysis, such as for machine learning inference.
- Image capture (ImageCapture): captures and saves a photo.
It is safe to assume that, most of the time, the Preview Use Case will be placed in the pipeline. It takes the received image frame and renders it to a Surface (androidx.camera.view.PreviewView) for the user to see.
Similarly, ImageCapture provides the functionality to save the received frame as a photo.
The most exciting is obviously ImageAnalysis: this gives you the raw frame buffer to run inference on with an ML model of your choosing.
For the purpose of this article, I will use the BarcodeScanning API (available as part of Google’s MLKit framework) to run inference on the image frame, recognizing and decoding the different barcode standards found in the image.
Suffice it to say, this approach to scanning barcodes blows everything else (for example, the ZXing library) out of the water in terms of accuracy and speed.
In my tests I got recognition speeds of under 200 ms. As a comparison, using deterministic approaches from libraries like ZXing et al. results in speeds an order of magnitude larger (over 2 seconds) under ideal conditions, not to mention the major difference in CPU utilization.
Digging In
I now introduce a demo Android project that implements CameraX and MLKit to demonstrate how you can use CameraX to run inference on an image frame (as mentioned, a Barcode Scanner).
Here is a video of the result, looking at some packaging that contains both an EAN barcode and a QR code (notice the decoded barcode string at the bottom of the screen).
Getting Started
Although you can use the example code on GitHub, I will point out some (albeit obvious) things to remember:
- Add the CameraX (and optionally the MLKit) dependencies in your app module's build.gradle file.
- Remember to declare the AndroidManifest.xml permissions and features for using the device camera hardware.
- The project uses dataBinding, so ensure your version of Android Studio is compatible to auto-generate the binding classes.
- In the example project, the actual integration is placed in CameraHelper.kt to keep it portable.
A general paradigm to be aware of: image processing and inference on a mobile device can be relatively expensive operations, and pipelining multiple Use Cases (especially inefficient ones) together can have a drastic impact on device battery and resources.
At the end of this article, I discuss how to let Use Cases skip frames above some load threshold, trading reduced throughput for better resource utilization.
Main Activity
The first order of business is creating the view where the preview (the camera feed the user can see) is rendered:
Here I use a view specifically designed for rendering camera previews. Note that although the view fills the parent, the actual dimensions of the rendered frames are set with an aspect ratio calculation that depends on device screen size and orientation.
Next up, the activity that binds this view:
Nothing fancy here:
- The CameraHelper class is initialized with the owner, context, the view to render the preview on, and a callback function to receive decoded barcode recognition results.
- A permission handler override passes the permission result back to the helper.
The CameraHelper itself is where the action is. You can have a look at the class directly on GitHub:
Although not essential, for convenience we create a typealias for the listener object that the analyzer uses to call back results from the barcode inference:
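Something along these lines; the alias name BarcodeListener is my assumption:

```kotlin
// The helper just needs "a function that receives the decoded barcode string".
typealias BarcodeListener = (barcode: String) -> Unit
```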
Next, we create an ExecutorService to run the ImageAnalysis Analyzer in the background on its own thread, in addition to a few global properties:
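A sketch of that state; the property names are my assumptions:

```kotlin
import java.util.concurrent.ExecutorService
import java.util.concurrent.Executors
import androidx.camera.core.CameraSelector
import androidx.camera.lifecycle.ProcessCameraProvider

// Single-threaded executor so analysis runs off the main thread:
private val cameraExecutor: ExecutorService = Executors.newSingleThreadExecutor()
// Which sensor to use; can be swapped to LENS_FACING_FRONT and rebound:
private var lensFacing: Int = CameraSelector.LENS_FACING_BACK
// Held so use cases can be rebound later:
private var cameraProvider: ProcessCameraProvider? = null
```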
The start function checks if permissions are granted; if not, it launches the camera permission request, which results in the Activity above receiving the permission result, which is in turn passed to the onRequestPermissionsResult of this class:
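A hedged sketch of that loop; REQUEST_CODE_CAMERA, activity, context, and startCamera are assumed names:

```kotlin
import android.Manifest
import android.content.pm.PackageManager
import androidx.core.app.ActivityCompat
import androidx.core.content.ContextCompat

fun start() {
    if (ContextCompat.checkSelfPermission(context, Manifest.permission.CAMERA)
        == PackageManager.PERMISSION_GRANTED
    ) {
        startCamera()
    } else {
        ActivityCompat.requestPermissions(
            activity, arrayOf(Manifest.permission.CAMERA), REQUEST_CODE_CAMERA
        )
    }
}

fun onRequestPermissionsResult(
    requestCode: Int, permissions: Array<String>, grantResults: IntArray
) {
    // Re-enter start(): this loops until the user grants the camera permission
    if (requestCode == REQUEST_CODE_CAMERA) start()
}
```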
As you can see, this process will loop indefinitely until the user grants the camera permission, as the app cannot do what it is designed to do without the camera.
Next, the startCamera function sets up the camera provider future, resolved on the context's main executor:
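A sketch of that step, assuming the properties introduced above:

```kotlin
import androidx.camera.lifecycle.ProcessCameraProvider
import androidx.core.content.ContextCompat

private fun startCamera() {
    val providerFuture = ProcessCameraProvider.getInstance(context)
    providerFuture.addListener({
        // The future is ready: keep the provider and bind the use cases
        cameraProvider = providerFuture.get()
        bindCameraUseCases()
    }, ContextCompat.getMainExecutor(context))
}
```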
Here we “waterfall” the selected camera (with back camera preferred) and then proceed to bind the use cases to the provider:
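A hedged sketch of that binding; BarcodeAnalyzer, listener, owner, and previewView are assumed names:

```kotlin
import androidx.camera.core.CameraSelector
import androidx.camera.core.ImageAnalysis
import androidx.camera.core.Preview

private fun bindCameraUseCases() {
    val provider = cameraProvider ?: return
    // 1. Nominate the sensor (back camera preferred)
    val selector = CameraSelector.Builder().requireLensFacing(lensFacing).build()
    // 2. Nominate the use cases to run each frame through
    val preview = Preview.Builder().build()
    val analysis = ImageAnalysis.Builder()
        .setBackpressureStrategy(ImageAnalysis.STRATEGY_KEEP_ONLY_LATEST)
        .build()
        .also { it.setAnalyzer(cameraExecutor, BarcodeAnalyzer(listener)) }
    provider.unbindAll()
    provider.bindToLifecycle(owner, selector, preview, analysis)
    // 3. Set the surface to show the preview on
    preview.setSurfaceProvider(previewView.surfaceProvider)
}
```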
- As per the bindToLifecycle method signature described at the start of this article, a CameraSelector instance is required to indicate which hardware sensor (camera) to use. If you wanted to add a button on the view to toggle the camera, you would update lensFacing with the required sensor and call bindCameraUseCases again; but for the purpose of scanning barcodes, the “selfie” camera is not generally used.
- Next, we nominate the Use Cases to run the image frames through.
- And finally, we set the surface to show the preview on.
At this point the app is running, the camera is capturing frames, and the CameraX API is sending them to the nominated Use Cases.
Reading Barcodes (The ImageAnalysis Use Case)
This is the exciting part, and demonstrates what CameraX and its concept of Use Cases does:
- We extend the ImageAnalysis.Analyzer base Use Case…
- …and override the analyze function, which is passed an ImageProxy instance.
The imageProxy contains helpers to get and set data on the image. In this example we simply get the frame as an Image instance, but the proxy also:
- Exposes the raw pixel buffer.
- Enables setting and getting a cropped rectangle (sub-image).
- Provides the image type, dimensions, and rotation, among others.
Note: Rotation (or orientation) of the image frame is important for inference tasks as many CV models are trained on data in a certain orientation. You can also see it is passed to the BarcodeScanner (MLKit) API.
Finally, we run the frame through the BarcodeScanner API as provided by MLKit, which has a callback containing all the recognized barcodes in the frame; in turn, we call our listener(s) with the result (if any).
Lastly, and most importantly, we call imageProxy.close(). This is required to let the CameraProvider know we are done processing the frame. Not doing this will cause other Use Cases (including the Preview) to freeze, as the API assumes the Use Case is still processing.
A Note on Performance
When we initialize the analyzer, we stipulate a few flags that define the frame buffer behaviour for the CameraProvider:
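A hedged sketch of that configuration; the chosen values are examples, not the project's exact settings:

```kotlin
import androidx.camera.core.ImageAnalysis

val analysis = ImageAnalysis.Builder()
    // Deliver only the most recent frame, dropping any the analyzer missed:
    .setBackpressureStrategy(ImageAnalysis.STRATEGY_KEEP_ONLY_LATEST)
    // Or block the producer and buffer frames instead:
    // .setBackpressureStrategy(ImageAnalysis.STRATEGY_BLOCK_PRODUCER)
    // .setImageQueueDepth(6) // only honoured with STRATEGY_BLOCK_PRODUCER
    .build()
```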
Backpressure
This is perhaps the most useful flag with regard to performance, and depends on whether you want to process each frame from the camera, or if you are happy only receiving the latest frame.
ImageAnalysis.STRATEGY_KEEP_ONLY_LATEST will send your Use Case the latest available frame.
ImageAnalysis.STRATEGY_BLOCK_PRODUCER will “pause” the camera until the provided frame is analyzed (and imageProxy.close() is called).
Queue Depth
Of course, if you need to process every frame, there will be a performance hit if your code is not optimized or takes too long. To further help in this case, an ImageQueueDepth can be specified. As per the official documentation:
Sets the number of images available to the camera pipeline for ImageAnalysis.STRATEGY_BLOCK_PRODUCER mode.

The image queue depth is the number of images available to the camera to fill with data. This includes the image currently being analyzed by ImageAnalysis.Analyzer.analyze(ImageProxy). Increasing the image queue depth may make camera operation smoother, depending on the backpressure strategy, at the cost of increased memory usage.

When the backpressure strategy is set to ImageAnalysis.STRATEGY_BLOCK_PRODUCER, increasing the image queue depth may make the camera pipeline run smoother on systems under high load. However, the time spent analyzing an image should still be kept under a single frame period for the current frame rate, on average, to avoid stalling the camera pipeline.

The value only applies to ImageAnalysis.STRATEGY_BLOCK_PRODUCER mode. For ImageAnalysis.STRATEGY_KEEP_ONLY_LATEST the value is ignored.

If not set, and this option is used by the selected backpressure strategy, the default will be a queue depth of 6 images.
Advanced Note: Tracking FPS
As hinted at in the official Google example app code for CameraX, an ImageAnalysis Use Case can calculate the current FPS rate as it's running.
Although the executor usually runs on its own background thread, it could, in some cases, be useful to simply call imageProxy.close() if your FPS drops below a certain threshold instead of processing the frame.
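A hedged sketch of that idea as a wrapping analyzer; MIN_FPS is an assumed tuning constant, not part of the API, and the instantaneous-FPS calculation is a simplification of the smoothed version in Google's example code:

```kotlin
import androidx.camera.core.ImageAnalysis
import androidx.camera.core.ImageProxy

class ThrottledAnalyzer(private val delegate: (ImageProxy) -> Unit) : ImageAnalysis.Analyzer {

    private var lastTimestampNs = 0L

    override fun analyze(imageProxy: ImageProxy) {
        val now = System.nanoTime()
        // FPS from the inter-frame gap; treat the first frame as "fast enough"
        val fps = if (lastTimestampNs == 0L) Double.MAX_VALUE
                  else 1e9 / (now - lastTimestampNs)
        lastTimestampNs = now
        if (fps < MIN_FPS) {
            imageProxy.close() // under load: skip this frame instead of analyzing it
            return
        }
        delegate(imageProxy) // the delegate must eventually call imageProxy.close()
    }

    private companion object { const val MIN_FPS = 10.0 }
}
```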
Using this technique together with the buffer flags set above, you can fine-tune performance as needed.
Conclusion
This article gives you a basic example of using the new CameraX API. There are many more possibilities around the ImageAnalysis base Use Case, and I would like to explore them in future posts.
Thanks for reading and I hope you got some useful insights!