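Preface
A while ago I was experimenting with the design ideas behind a robot RPA (robotic process automation) tool. One requirement was to quickly compare two images and extract the content of the region where they differ.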
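I. Approach
- Use opencv to read each image as RGB data (note that opencv's imread returns the channels in BGR order)
- Use the RGB coordinates (width, height, channel) to determine each channel value
- Mark each value with a hash so that it can be looked up quickly later
- Compare by width, height and channel to find the parts that do not match, and work out the difference rectangle
- Crop the difference rectangle based on the differing pixels
- Send the image stream of the difference rectangle to Tesseract for recognition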
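The comparison and rectangle steps above can be sketched without opencv. The following is only an illustration under stated assumptions: it uses java.awt.image.BufferedImage instead of an opencv Mat, it assumes both images already have the same size, and DiffRectSketch and diffRect are hypothetical names that are not part of the original project.

```java
import java.awt.Rectangle;
import java.awt.image.BufferedImage;

/** Hypothetical helper for illustration; not part of the original project. */
class DiffRectSketch {

    /**
     * Returns the smallest rectangle that contains every differing pixel,
     * or null when the images are identical.
     * Assumes both images have the same width and height.
     */
    static Rectangle diffRect(BufferedImage before, BufferedImage current) {
        int minRow = Integer.MAX_VALUE, minCol = Integer.MAX_VALUE;
        int maxRow = -1, maxCol = -1;
        for (int row = 0; row < before.getHeight(); row++) {
            for (int col = 0; col < before.getWidth(); col++) {
                // getRGB packs alpha, red, green and blue into a single int
                if (before.getRGB(col, row) != current.getRGB(col, row)) {
                    minRow = Math.min(minRow, row);
                    minCol = Math.min(minCol, col);
                    maxRow = Math.max(maxRow, row);
                    maxCol = Math.max(maxCol, col);
                }
            }
        }
        if (maxRow < 0) {
            return null; // no pixel differs
        }
        // The max row and column are inclusive, hence the +1
        return new Rectangle(minCol, minRow, maxCol - minCol + 1, maxRow - minRow + 1);
    }

    public static void main(String[] args) {
        BufferedImage a = new BufferedImage(10, 8, BufferedImage.TYPE_INT_RGB);
        BufferedImage b = new BufferedImage(10, 8, BufferedImage.TYPE_INT_RGB);
        b.setRGB(2, 3, 0xFFFFFF);
        b.setRGB(6, 5, 0x00FF00);
        System.out.println(diffRect(a, b));
    }
}
```

Tracking the minimum and maximum while scanning avoids storing every differing pixel, which is a design choice of this sketch rather than what the code later in this post does.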
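Note: if the images being compared have different resolutions, do the following preprocessing first:
- Compare the resolutions of the 2 images and find the minimum resolution
- Resize both images to that minimum resolution, so that all later steps work at one fixed resolution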
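As a rough illustration of this preprocessing, the sketch below picks the smaller width and the smaller height and scales an image to that size. It uses BufferedImage and Graphics2D instead of opencv, and ResolutionSketch, minResolution and scaleTo are hypothetical names introduced only for this example.

```java
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.image.BufferedImage;

/** Hypothetical helper for illustration; not part of the original project. */
class ResolutionSketch {

    /** Returns [width, height], taking the smaller value of each dimension. */
    static int[] minResolution(BufferedImage a, BufferedImage b) {
        return new int[]{
                Math.min(a.getWidth(), b.getWidth()),
                Math.min(a.getHeight(), b.getHeight())
        };
    }

    /** Scales src to exactly width x height using bilinear interpolation. */
    static BufferedImage scaleTo(BufferedImage src, int width, int height) {
        BufferedImage out = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = out.createGraphics();
        try {
            g.setRenderingHint(RenderingHints.KEY_INTERPOLATION,
                    RenderingHints.VALUE_INTERPOLATION_BILINEAR);
            g.drawImage(src, 0, 0, width, height, null);
        } finally {
            g.dispose();
        }
        return out;
    }

    public static void main(String[] args) {
        BufferedImage a = new BufferedImage(1920, 1080, BufferedImage.TYPE_INT_RGB);
        BufferedImage b = new BufferedImage(1280, 1024, BufferedImage.TYPE_INT_RGB);
        int[] wh = minResolution(a, b);
        BufferedImage scaled = scaleTo(a, wh[0], wh[1]);
        System.out.println(scaled.getWidth() + "x" + scaled.getHeight());
    }
}
```

Because width and height are minimized independently, the aspect ratio is not preserved when the two images have different proportions. The approach described in this post behaves the same way.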
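II. Getting opencv
- Visit opencv releases and download the package that matches your platform. I use windows as the example here (the official site does not provide a linux installer; on linux you download the source code and build it with cmake)
- After installation, the lib location is:
  - windows: build\java\x64 under the installation directory
  - linux: build/lib/ under the installation directory
- Put opencv-470.jar (I downloaded version 4.7) into your project and add a reference to it with your package manager
- Put opencv_java470.dll (I downloaded version 4.7) into your project
At this point the project has everything it needs to use opencv.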
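The native library file name differs per platform (a .dll on windows, a lib*.so on linux). As a small sketch, the JDK's System.mapLibraryName can derive the platform-specific file name from the base name. NativeLibSketch and nativeLibFile are hypothetical names, and the libs directory is only an assumed location.

```java
import java.io.File;

/** Hypothetical helper for illustration; not part of the original project. */
class NativeLibSketch {

    /**
     * Builds the platform-specific library file for a base name,
     * e.g. "opencv_java470" becomes opencv_java470.dll on windows.
     */
    static File nativeLibFile(File libDir, String baseName) {
        return new File(libDir, System.mapLibraryName(baseName));
    }

    public static void main(String[] args) {
        // "libs" is an assumed directory; adjust it to where the library was copied
        File lib = nativeLibFile(new File("libs"), "opencv_java470");
        System.out.println(lib.getAbsolutePath());
        // System.load requires an absolute path:
        // System.load(lib.getAbsolutePath());
    }
}
```

The opencv Java bindings also expose the base name as Core.NATIVE_LIBRARY_NAME, which can be passed instead of the hard-coded string.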
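III. Getting tesseract
I am not using native Tesseract here; I use tess4j, which is a thin wrapper around it. I would still recommend using native Tesseract, though. For the reasons, have a look at the tess4j issues.
- Visit tess4j releases to get it
- It can also be found by searching the Maven Central repository
Get the language packs that tesseract needs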
IV. Writing the opencv utility class
ImageContrastUtils.java
package com.lc.image;

import org.opencv.core.*;

import java.util.HashMap;

import static org.opencv.highgui.HighGui.imshow;
import static org.opencv.imgcodecs.Imgcodecs.imread;
import static org.opencv.imgcodecs.Imgcodecs.imwrite;
import static org.opencv.imgproc.Imgproc.*;

/**
 * @author cheng.liu
 * @version 1.0
 * @description: opencv helpers for reading, resizing, cropping and sampling images
 * @date 2023/1/31 11:43
 */
public class ImageContrastUtils {
    static {
        // Load the native library from the classpath. getPath() returns a URL-encoded path,
        // so this can fail when the project path contains spaces or non-ASCII characters.
        System.load(ClassLoader.getSystemResource("libs/opencv_java470.dll").getPath());
    }

    /**
     * Get the image resolution.
     *
     * @param imagePath path of the image
     * @return [width, height]
     */
    public static Integer[] getResolution(String imagePath) {
        Mat imageMat = imread(imagePath);
        if (imageMat.empty()) {
            System.out.println("Image not found: " + imagePath);
        }
        // width, height
        return new Integer[]{imageMat.cols(), imageMat.rows()};
    }
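
    /**
     * Crop an image.
     *
     * @param sourceImage source image to crop
     * @param outPath     base path for the cropped image (a "_cutOut" suffix is added to the file name)
     * @param minRow      smallest pixel row (inclusive)
     * @param minCol      smallest pixel column (inclusive)
     * @param maxRow      largest pixel row (inclusive)
     * @param maxCol      largest pixel column (inclusive)
     * @return output path of the cropped image
     */
    public static String cutOut(Mat sourceImage, String outPath, Integer minRow, Integer minCol, Integer maxRow, Integer maxCol) {
        // Crop range. Rect(Point, Point) excludes the second corner, which would drop the last
        // row and column (and give an empty Mat for a single-pixel change), so use width/height with +1.
        Rect rect = new Rect(minCol, minRow, maxCol - minCol + 1, maxRow - minRow + 1);
        // Set the ROI
        Mat imageROI = new Mat(sourceImage, rect);
        // Copy the ROI into its own Mat
        Mat cutImage = new Mat();
        imageROI.copyTo(cutImage);
        String path = getCreateImageName(outPath, "cutOut");
        imshow("cutOut", cutImage);
        imwrite(path, cutImage);
        return path;
    }

    private static String getPixelKey(Integer row, Integer col) {
        return String.join(",", String.valueOf(row), String.valueOf(col));
    }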

    /**
     * Resize an image to the given resolution.
     *
     * @param imagePath path of the image
     * @param width     target width, defaults to 200
     * @param height    target height, defaults to 200
     * @return the resized image
     */
    public static Mat resizeImage(String imagePath, Double width, Double height) {
        Mat imageMat = imread(imagePath);
        if (imageMat.empty()) {
            System.out.println("Image not found: " + imagePath);
        }
        if (width == null || width <= 0) {
            width = 200d;
        }
        if (height == null || height <= 0) {
            height = 200d;
        }
        Mat mat = new Mat(); // the resized image
        // Shrink the image. In the opencv 4.x Java bindings the 4-argument overload is
        // resize(src, dst, dsize, fx), so INTER_AREA has to go through the 6-argument form;
        // fx and fy are ignored when dsize is non-zero.
        resize(imageMat, mat, new Size(width, height), 0, 0, INTER_AREA);
        imshow("resize", mat);
        return mat;
    }

    /**
     * Get the pixel values of an image.
     *
     * @param imagePath path of the image
     * @return map from pixel coordinate (see #getPixelKey) to the channel values
     */
    public static HashMap<String, double[]> getPixelValue(String imagePath) {
        Mat imageMat = imread(imagePath);
        if (imageMat.empty()) {
            System.out.println("Image not found: " + imagePath);
        }
        return getPixelValue(imageMat);
    }

    /**
     * Get the pixel values of an image.
     *
     * @param imageMat the image
     * @return map from pixel coordinate (see #getPixelKey) to the channel values
     */
    public static HashMap<String, double[]> getPixelValue(Mat imageMat) {
        HashMap<String, double[]> map = new HashMap<>();
        // rows = image height
        int rows = imageMat.rows();
        // cols = image width
        int cols = imageMat.cols();
        // number of channels
        // int channels = imageMat.channels();
        for (int i = 0; i < rows; i++) {
            for (int j = 0; j < cols; j++) {
                map.put(getPixelKey(i, j), imageMat.get(i, j));
            }
        }
        return map;
    }
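
    /**
     * Convert an image to grayscale (keeps a single channel).
     *
     * @param imagePath path of the image
     * @return a Mat that opencv can operate on
     */
    public static Mat grayImage(String imagePath) {
        Mat image = imread(imagePath);
        if (image.empty()) {
            System.out.println("Image not found: " + imagePath);
        }
        // 8-bit unsigned, single channel. imread returns BGR, so convert with COLOR_BGR2GRAY.
        Mat grayImage = new Mat(image.rows(), image.cols(), CvType.CV_8UC1);
        cvtColor(image, grayImage, COLOR_BGR2GRAY);
        // imwrite(getCreateImageName(imagePath, "gray"), grayImage);
        imshow("gray", grayImage);
        return grayImage;
    }

    private static String getCreateImageName(String imagePath, String suffix) {
        // Insert the suffix before the last dot only; replace(".") would also rewrite
        // dots that appear in directory names.
        int dot = imagePath.lastIndexOf('.');
        if (dot < 0) {
            return imagePath + "_" + suffix;
        }
        return imagePath.substring(0, dot) + "_" + suffix + imagePath.substring(dot);
    }
}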
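The utility class keys each pixel by a String such as "12,34", which means one string allocation per pixel and string parsing to get the coordinates back. One possible alternative, shown only as a sketch, is to pack the row and column into a single long. PixelKeySketch and its methods are hypothetical names and are not used by the code in this post.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper for illustration; not part of the original project. */
class PixelKeySketch {

    /** Packs row into the high 32 bits and col into the low 32 bits. */
    static long key(int row, int col) {
        return ((long) row << 32) | (col & 0xFFFFFFFFL);
    }

    /** Recovers the row from a packed key. */
    static int row(long key) {
        return (int) (key >> 32);
    }

    /** Recovers the column from a packed key. */
    static int col(long key) {
        return (int) key;
    }

    public static void main(String[] args) {
        Map<Long, double[]> pixels = new HashMap<>();
        pixels.put(key(12, 34), new double[]{255, 0, 0});
        long k = pixels.keySet().iterator().next();
        System.out.println(row(k) + "," + col(k));
    }
}
```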
V. Writing the OCR utility class
ImageOcrUtils.java
package com.lc.image;

import net.sourceforge.tess4j.ITesseract;
import net.sourceforge.tess4j.Tesseract;
import net.sourceforge.tess4j.TesseractException;

import javax.imageio.ImageIO;
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;

/**
 * @author cheng.liu
 * @version 1.0
 * @description: OCR helper built on tess4j
 * @date 2023/1/31 11:43
 */
public class ImageOcrUtils {
    // Download language packs from https://github.com/tesseract-ocr/tessdata and put them in the project's tessdata directory
    // Language pack documentation: https://tesseract-ocr.github.io/tessdoc/Data-Files.html
    private static ITesseract tesseract;

    static {
        tesseract = new Tesseract();
        tesseract.setDatapath(getTessData());
    }

    private static String getTessData() {
        String path = String.join(File.separator, System.getProperty("user.dir"), "tessdata");
        File file = new File(path);
        if (!file.exists()) file.mkdirs();
        return file.getPath();
    }

    public static String ocr(String imagePath, String language) {
        File file = new File(imagePath);
        BufferedImage bufferedImage = null;
        try {
            bufferedImage = ImageIO.read(file);
        } catch (IOException e) {
            e.printStackTrace();
        }
        if (bufferedImage == null) {
            // Reading failed, or no ImageIO reader could decode the file
            return "";
        }
        String text = "";
        try {
            tesseract.setLanguage(language);
            // tesseract.setLanguage("eng");
            // tesseract.setHocr(true);
            text = tesseract.doOCR(bufferedImage);
        } catch (TesseractException e) {
            e.printStackTrace();
        }
        return text;
    }
}
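OCR fails when the requested language pack is not in the tessdata directory, so it can help to check up front. Tesseract language packs are files named <language>.traineddata, and several languages can be combined with "+" (for example eng+chi_sim). The sketch below only checks for the files; TessDataSketch and missingLanguages are hypothetical names and are not part of tess4j.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration; not part of tess4j or the original project. */
class TessDataSketch {

    /**
     * Returns the languages whose .traineddata file is missing from tessdataDir.
     * The languages argument may combine several codes with '+', e.g. "eng+chi_sim".
     */
    static List<String> missingLanguages(File tessdataDir, String languages) {
        List<String> missing = new ArrayList<>();
        for (String lang : languages.split("\\+")) {
            if (!new File(tessdataDir, lang + ".traineddata").isFile()) {
                missing.add(lang);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Same directory layout as getTessData(): <working dir>/tessdata
        File tessdata = new File(System.getProperty("user.dir"), "tessdata");
        System.out.println("missing: " + missingLanguages(tessdata, "eng+chi_sim"));
    }
}
```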
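VI. Final implementation
I do not want to split the code up to explain it. Every method has a comment, so read through it yourself.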
ServiceAppTestMain.java
package com.lc.image;

import org.opencv.core.Mat;

import javax.imageio.ImageIO;
import java.awt.*;
import java.awt.image.MultiResolutionImage;
import java.awt.image.RenderedImage;
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.stream.Collectors;

/**
 * @author cheng.liu
 * @version 1.0
 * @description: captures two screenshots, crops the region that changed and runs OCR on it
 * @date 2023/1/31 11:41
 */
public class ServiceAppTestMain {
    public static void main(String[] args) {
        taskArrangement();
    }

    private static void taskArrangement() {
        try {
            System.out.println("Waiting 3 seconds before capturing the first screenshot");
            Thread.sleep(3000);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
        System.out.println("Capturing the first screenshot");
        String beforeScreen = screenshot();
        try {
            System.out.println("Waiting 3 seconds before capturing the second screenshot");
            Thread.sleep(3000);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
        String currentScreen = screenshot();
        String imagePath = contrastImage(beforeScreen, currentScreen);
        // String imagePath = contrastImage(String.join(File.separator, getBasePath(), "test1.png"),
        //         String.join(File.separator, getBasePath(), "test2.png"));
        if (imagePath == null || imagePath.trim().isEmpty()) {
            return;
        }
        long start = System.currentTimeMillis();
        String ocr = ImageOcrUtils.ocr(imagePath, "eng");
        // String ocr = ImageOcrUtils.ocr(String.join(File.separator, getBasePath(), "1675747922902_cutOut.png"), "eng");
        System.out.println("ocr time (ms): " + (System.currentTimeMillis() - start));
        System.out.println("ocr content: " + ocr);
    }

    /**
     * Get the common (minimum) resolution of two images.
     *
     * @param beforeScreen  image 1
     * @param currentScreen image 2
     * @return resolution as [width, height]
     */
    private static Integer[] getMinImage(String beforeScreen, String currentScreen) {
        Integer[] beforeScreenResolution = ImageContrastUtils.getResolution(beforeScreen);
        Integer[] currentScreenResolution = ImageContrastUtils.getResolution(currentScreen);
        // Take the smaller width and the smaller height
        int w = Math.min(beforeScreenResolution[0], currentScreenResolution[0]);
        int h = Math.min(beforeScreenResolution[1], currentScreenResolution[1]);
        return new Integer[]{w, h};
    }

    /**
     * Compare two images.
     *
     * @param beforeScreen  image 1
     * @param currentScreen image 2
     * @return temporary path of the image cropped at the differing region
     */
    private static String contrastImage(String beforeScreen, String currentScreen) {
        // Mat beforeScreenMat = ImageContrastUtils.grayImage(beforeScreen);
        // Map<String, Double> beforeScreenMap = ImageContrastUtils.getPixelValue(beforeScreenMat);
        // Mat currentScreenMat = ImageContrastUtils.grayImage(currentScreen);
        // Map<String, Double> currentScreenMap = ImageContrastUtils.getPixelValue(currentScreenMat);
        // HashMap<String, double[]> beforeScreenMap = ImageContrastUtils.getPixelValue(beforeScreen);
        // HashMap<String, double[]> currentScreenMap = ImageContrastUtils.getPixelValue(currentScreen);
        Integer[] minWH = getMinImage(beforeScreen, currentScreen);
        Mat beforeScreenResize = ImageContrastUtils.resizeImage(beforeScreen, Double.valueOf(minWH[0]), Double.valueOf(minWH[1]));
        Mat currentScreenResize = ImageContrastUtils.resizeImage(currentScreen, Double.valueOf(minWH[0]), Double.valueOf(minWH[1]));
        HashMap<String, double[]> beforeScreenMap = ImageContrastUtils.getPixelValue(beforeScreenResize);
        HashMap<String, double[]> currentScreenMap = ImageContrastUtils.getPixelValue(currentScreenResize);
        // Compare the pixel counts
        if (beforeScreenMap.size() != currentScreenMap.size()) {
            System.out.println("Compare stage: the two images have different resolutions");
        }
        System.out.println("Total number of pixels at full size: " + beforeScreenMap.size());
        System.out.println("============== test data: start =======================");
        List<Pixel> beforeScreenPixels = beforeScreenMap.entrySet().stream().map(m -> {
            Pixel pixel = new Pixel(m.getKey());
            pixel.setValues(m.getValue());
            return pixel;
        }).collect(Collectors.toList());
        List<Pixel> currentScreenPixels = currentScreenMap.entrySet().stream().map(m -> {
            Pixel pixel = new Pixel(m.getKey());
            pixel.setValues(m.getValue());
            return pixel;
        }).collect(Collectors.toList());
        Integer beforeMinCol = beforeScreenPixels.stream().min(Comparator.comparing(Pixel::getCol)).get().col;
        Integer beforeMinRow = beforeScreenPixels.stream().min(Comparator.comparing(Pixel::getRow)).get().row;
        Integer beforeMaxCol = beforeScreenPixels.stream().max(Comparator.comparing(Pixel::getCol)).get().col;
        Integer beforeMaxRow = beforeScreenPixels.stream().max(Comparator.comparing(Pixel::getRow)).get().row;
        Integer currentMinCol = currentScreenPixels.stream().min(Comparator.comparing(Pixel::getCol)).get().col;
        Integer currentMinRow = currentScreenPixels.stream().min(Comparator.comparing(Pixel::getRow)).get().row;
        Integer currentMaxCol = currentScreenPixels.stream().max(Comparator.comparing(Pixel::getCol)).get().col;
        Integer currentMaxRow = currentScreenPixels.stream().max(Comparator.comparing(Pixel::getRow)).get().row;
        System.out.println("Full-size minimum: " + beforeMinRow + "," + beforeMinCol + " === " + currentMinRow + "," + currentMinCol);
        System.out.println("Full-size maximum: " + beforeMaxRow + "," + beforeMaxCol + " === " + currentMaxRow + "," + currentMaxCol);
        System.out.println("============== test data: end =======================");
        // Compare the pixels and collect the ones whose values differ
        List<Pixel> varyPixels = new ArrayList<>();
        beforeScreenMap.forEach((key, values) -> {
            double[] currentValues = currentScreenMap.get(key);
            if (currentValues.length == values.length) {
                for (int i = 0; i < currentValues.length; i++) {
                    if (currentValues[i] != values[i]) {
                        varyPixels.add(new Pixel(key));
                        break;
                    }
                }
            } else {
                // A different channel count also counts as a change
                varyPixels.add(new Pixel(key));
            }
        });
        // Use varyPixels to work out the changed region
        System.out.println("Number of changed pixels: " + varyPixels.size());
        if (varyPixels.isEmpty()) {
            System.out.println("No change");
            return null;
        }
        Integer minCol = varyPixels.stream().min(Comparator.comparing(Pixel::getCol)).get().col;
        Integer minRow = varyPixels.stream().min(Comparator.comparing(Pixel::getRow)).get().row;
        Integer maxCol = varyPixels.stream().max(Comparator.comparing(Pixel::getCol)).get().col;
        Integer maxRow = varyPixels.stream().max(Comparator.comparing(Pixel::getRow)).get().row;
        System.out.println("Changed region start point:" + minRow + "," + minCol + " end point:" + maxRow + "," + maxCol);
        String beforeCutOutPath = ImageContrastUtils.cutOut(beforeScreenResize, beforeScreen, minRow, minCol, maxRow, maxCol);
        System.out.println("beforeScreen cutOutPath:" + beforeCutOutPath);
        String currentCutOutPath = ImageContrastUtils.cutOut(currentScreenResize, currentScreen, minRow, minCol, maxRow, maxCol);
        System.out.println("currentScreen cutOutPath:" + currentCutOutPath);
        return currentCutOutPath;
    }

    /**
     * RGB pixel model: a coordinate plus its channel values.
     */
    public static class Pixel {
        private Integer row;
        private Integer col;
        private double[] values;

        /** Parses a "row,col" key as produced by ImageContrastUtils. */
        public Pixel(String pixel) {
            int index = pixel.indexOf(",");
            this.row = Integer.valueOf(pixel.substring(0, index));
            this.col = Integer.valueOf(pixel.substring(index + 1));
        }

        public Pixel(Integer row, Integer col, double[] values) {
            this.row = row;
            this.col = col;
            this.values = values;
        }

        public Integer getRow() {
            return row;
        }

        public void setRow(Integer row) {
            this.row = row;
        }

        public Integer getCol() {
            return col;
        }

        public void setCol(Integer col) {
            this.col = col;
        }

        public double[] getValues() {
            return values;
        }

        public void setValues(double[] values) {
            this.values = values;
        }
    }

    /**
     * Get the directory that the opencv steps work in.
     * The directory is created automatically if it does not exist.
     *
     * @return path of the working directory
     */
    private static String getBasePath() {
        String path = String.join(File.separator, System.getProperty("user.dir"), "opencv");
        File file = new File(path);
        if (!file.exists()) file.mkdirs();
        return file.getPath();
    }

    /**
     * Capture the current screen.
     *
     * @return where the screenshot was stored
     */
    private static String screenshot() {
        Robot robot = null;
        try {
            robot = new Robot();
        } catch (AWTException e) {
            e.printStackTrace();
        }
        Toolkit toolkit = Toolkit.getDefaultToolkit();
        Dimension dimension = toolkit.getScreenSize();
        assert robot != null;
        // createMultiResolutionScreenCapture needs Java 9+, the no-argument orElseThrow() Java 10+.
        // Take the last resolution variant of the capture.
        MultiResolutionImage mrImage = robot.createMultiResolutionScreenCapture(new Rectangle(dimension));
        Image image = mrImage.getResolutionVariants()
                .stream()
                .reduce((first, second) -> second)
                .orElseThrow();
        // Write the image to disk
        String path = String.join(File.separator, getBasePath(), System.currentTimeMillis() + ".png");
        File file = new File(path);
        try {
            ImageIO.write((RenderedImage) image, "png", file);
        } catch (IOException e) {
            e.printStackTrace();
        }
        return file.getPath();
    }
}
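When started, the code waits 3 seconds and captures the current screen, then waits another 3 seconds and captures a second screenshot. It then computes the region that differs between the two screenshots, crops it out, and sends it to ocr to extract the image content.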
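One detail of the crop step is easy to get wrong: the minimum and maximum rows and columns of the changed pixels are both inclusive, so the width and height need a +1. The sketch below shows this with BufferedImage.getSubimage instead of opencv. CropSketch and cropInclusive are hypothetical names used only for illustration.

```java
import java.awt.image.BufferedImage;

/** Hypothetical helper for illustration; not part of the original project. */
class CropSketch {

    /** Crops the region whose corner rows and columns are all inclusive. */
    static BufferedImage cropInclusive(BufferedImage src, int minRow, int minCol, int maxRow, int maxCol) {
        int width = maxCol - minCol + 1;
        int height = maxRow - minRow + 1;
        // getSubimage takes x (column) first, then y (row)
        return src.getSubimage(minCol, minRow, width, height);
    }

    public static void main(String[] args) {
        BufferedImage screen = new BufferedImage(10, 8, BufferedImage.TYPE_INT_RGB);
        BufferedImage region = cropInclusive(screen, 3, 2, 5, 6);
        System.out.println(region.getWidth() + "x" + region.getHeight());
    }
}
```

Since doOCR accepts a BufferedImage, a region cropped this way could be passed to OCR without first writing it to disk.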
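VII. Source code
Sample source code: image-process
Notes:
- The code only includes an example that runs on windows. To run it in another environment, adjust it according to the opencv documentation (how to download the sdk was covered above)
- The code does not include the opencv_java470.dll file (I downloaded version 4.7) that opencv needs, because github limits a single file to 100MB (how to download it was covered above)
- The code does not include the project's tessdata folder and language packs. They are too large to upload, so you need to download them yourself (how to download them was covered above)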