How to Create a Document Scanner App in Android Studio Using the New ML Kit Document Scanner API

Introduction

In this article, we are going to create an Android app using Google’s new Document Scanner API from ML Kit. The API is easy to integrate, and we will build the app with Jetpack Compose.

Previously, developers needed to write their own code for document detection, cropping, and other features, which was complex and time-consuming. Integrating external libraries also increased app size.

Now, with the Document Scanner API, the document scanning feature in this app needs just 2 dependencies:

implementation("com.google.android.gms:play-services-mlkit-document-scanner:16.0.0-beta1")
implementation("io.coil-kt:coil-compose:2.5.0")

The first is Google’s own ML Kit Document Scanner dependency.
The second is Coil, an image loading library for Android that makes it easy to load and display images in an app.
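
As a rough sketch, these two lines usually go inside the dependencies block of the module-level Gradle file. The file name app/build.gradle.kts and the surrounding block are assumptions about a typical Kotlin DSL project setup; only the two coordinates come from this article.

```kotlin
// app/build.gradle.kts (assumed module-level build file)
dependencies {
    // ML Kit Document Scanner, delivered through Google Play services
    implementation("com.google.android.gms:play-services-mlkit-document-scanner:16.0.0-beta1")
    // Coil, used here to load and display the scanned images in Compose
    implementation("io.coil-kt:coil-compose:2.5.0")
}
```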

What is ML Kit?

ML Kit is a toolkit that allows developers to add machine learning features to Android apps. It contains many APIs, such as object detection, text recognition, translation, and more. The APIs are easy to use, and we don’t need to be experts in machine learning.

Features of Document Scanner API

  1. Protects user privacy by keeping everything on the device.
  2. Automatically detects documents and crops them precisely.
  3. Easy to implement, with a user-friendly interface.
  4. Allows users to refine scans with cropping, filters, and more.
  5. Saves documents in PDF or JPEG format.
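
Several of these features map onto the scanner options. The sketch below shows one possible way to wrap that configuration in a helper. The function name buildScannerOptions and its parameters are hypothetical and not part of the library; the builder methods and constants come from the ML Kit document scanner dependency. The notes on what each mode offers reflect my reading of the documentation, so please verify them there.

```kotlin
import com.google.mlkit.vision.documentscanner.GmsDocumentScannerOptions
import com.google.mlkit.vision.documentscanner.GmsDocumentScannerOptions.RESULT_FORMAT_JPEG
import com.google.mlkit.vision.documentscanner.GmsDocumentScannerOptions.RESULT_FORMAT_PDF
import com.google.mlkit.vision.documentscanner.GmsDocumentScannerOptions.SCANNER_MODE_BASE
import com.google.mlkit.vision.documentscanner.GmsDocumentScannerOptions.SCANNER_MODE_FULL

// Hypothetical helper (not part of ML Kit): builds scanner options from two app-level choices.
fun buildScannerOptions(fullEditing: Boolean, maxPages: Int): GmsDocumentScannerOptions =
    GmsDocumentScannerOptions.Builder()
        // FULL is assumed to offer the richest editing UI; BASE a simpler one.
        .setScannerMode(if (fullEditing) SCANNER_MODE_FULL else SCANNER_MODE_BASE)
        // Let the user pick existing photos in addition to using the camera.
        .setGalleryImportAllowed(true)
        // Upper bound on the number of pages in one scanning session.
        .setPageLimit(maxPages)
        // Ask for per-page JPEG images and one combined PDF.
        .setResultFormats(RESULT_FORMAT_JPEG, RESULT_FORMAT_PDF)
        .build()
```

A call such as buildScannerOptions(fullEditing = true, maxPages = 10) would produce the same configuration that the activity in the Code section uses.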

Code

In this article, I’m developing a Docs Scanner app using Jetpack Compose. The complete MainActivity is shown below.

package com.ghanshyam.docsscanner

import android.net.Uri
import android.os.Bundle
import android.widget.Toast
import androidx.activity.ComponentActivity
import androidx.activity.compose.rememberLauncherForActivityResult
import androidx.activity.compose.setContent
import androidx.activity.result.IntentSenderRequest
import androidx.activity.result.contract.ActivityResultContracts
import androidx.compose.foundation.layout.Arrangement
import androidx.compose.foundation.layout.Column
import androidx.compose.foundation.layout.fillMaxSize
import androidx.compose.foundation.layout.fillMaxWidth
import androidx.compose.foundation.rememberScrollState
import androidx.compose.foundation.verticalScroll
import androidx.compose.material3.Button
import androidx.compose.material3.MaterialTheme
import androidx.compose.material3.Surface
import androidx.compose.material3.Text
import androidx.compose.runtime.getValue
import androidx.compose.runtime.mutableStateOf
import androidx.compose.runtime.remember
import androidx.compose.runtime.setValue
import androidx.compose.ui.Alignment
import androidx.compose.ui.Modifier
import androidx.compose.ui.layout.ContentScale
import coil.compose.AsyncImage
import com.ghanshyam.docsscanner.ui.theme.DocsScannerTheme
import com.google.mlkit.vision.documentscanner.GmsDocumentScannerOptions
import com.google.mlkit.vision.documentscanner.GmsDocumentScannerOptions.RESULT_FORMAT_JPEG
import com.google.mlkit.vision.documentscanner.GmsDocumentScannerOptions.RESULT_FORMAT_PDF
import com.google.mlkit.vision.documentscanner.GmsDocumentScannerOptions.SCANNER_MODE_FULL
import com.google.mlkit.vision.documentscanner.GmsDocumentScanning
import com.google.mlkit.vision.documentscanner.GmsDocumentScanningResult
import java.io.File
import java.io.FileOutputStream

class MainActivity : ComponentActivity() {
    override fun onCreate(savedInstanceState: Bundle?) {
        super.onCreate(savedInstanceState)

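        // Configure the scanner: full editing UI, gallery import, up to 10 pages,
        // and results returned as both JPEG images and a single PDF.
        val options = GmsDocumentScannerOptions.Builder()
            .setScannerMode(SCANNER_MODE_FULL)
            .setGalleryImportAllowed(true)
            .setPageLimit(10)
            .setResultFormats(RESULT_FORMAT_JPEG, RESULT_FORMAT_PDF)
            .build()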

        val scanner = GmsDocumentScanning.getClient(options)

        setContent {
            DocsScannerTheme {
                Surface(
                    modifier = Modifier.fillMaxSize(), color = MaterialTheme.colorScheme.background
                ) {
                    var imageUris by remember {
                        mutableStateOf<List<Uri>>(emptyList())
                    }
                    val scannerLauncher =
                        rememberLauncherForActivityResult(
                            contract = ActivityResultContracts.StartIntentSenderForResult(),
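                            onResult = { activityResult ->
                                if (activityResult.resultCode == RESULT_OK) {
                                    val result =
                                        GmsDocumentScanningResult.fromActivityResultIntent(
                                            activityResult.data
                                        )
                                    // Collect the URI of every scanned page for display.
                                    imageUris = result?.pages?.map { page ->
                                        page.imageUri
                                    }
                                        ?: emptyList()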
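                                    // Copy the generated PDF into app-private storage;
                                    // use {} closes both streams even if the copy fails.
                                    result?.pdf?.let { pdf ->
                                        contentResolver.openInputStream(pdf.uri)?.use { input ->
                                            FileOutputStream(File(filesDir, "scan.pdf")).use { output ->
                                                input.copyTo(output)
                                            }
                                        }
                                    }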
                                }
                            })
                    Column(
                        modifier = Modifier
                            .fillMaxSize()
                            .verticalScroll(rememberScrollState()),
                        verticalArrangement = Arrangement.Center,
                        horizontalAlignment = Alignment.CenterHorizontally
                    ) {
                        imageUris.forEach { uri ->
                            AsyncImage(
                                model = uri, contentDescription = null,
                                contentScale = ContentScale.FillWidth,
                                modifier = Modifier.fillMaxWidth()
                            )
                        }
                        Button(onClick = {
                            scanner.getStartScanIntent(this@MainActivity)
                                .addOnSuccessListener {
                                    scannerLauncher.launch(
                                        IntentSenderRequest.Builder(it).build()
                                    )
                                }
                                .addOnFailureListener {
                                    Toast.makeText(
                                        applicationContext,
                                        it.message,
                                        Toast.LENGTH_SHORT
                                    ).show()
                                }
                        }) {
                            Text(text = "Scan Document")
                        }
                    }
                }
            }
        }
    }
}
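
The activity above always writes the PDF to scan.pdf, so each new scan overwrites the previous one. One possible way around this, sketched below with plain java.io so that it can be tried outside Android, is to copy the stream into a file with a timestamp in its name. The function name saveScan and the scan_<timestamp>.pdf naming pattern are hypothetical and not part of the ML Kit API.

```kotlin
import java.io.File
import java.io.InputStream

// Hypothetical helper: copies a scan into `dir` under a timestamp-based file name.
fun saveScan(
    input: InputStream,
    dir: File,
    timestampMillis: Long = System.currentTimeMillis()
): File {
    val target = File(dir, "scan_$timestampMillis.pdf")
    // use {} closes both streams even if the copy throws.
    input.use { source ->
        target.outputStream().use { sink -> source.copyTo(sink) }
    }
    return target
}
```

Inside the result callback, this could be called as contentResolver.openInputStream(pdf.uri)?.let { saveScan(it, filesDir) }, assuming the same pdf value as in the activity.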

Conclusion

I have covered only the basic functionality of this API; you can explore more by reading the documentation. This app’s code is available on my GitHub: https://github.com/Ghanshyam32/docs-scanner-ml-kit/

Official Documentation: https://developers.google.com/ml-kit
