scratchJR_OCRversion Secondary Development Notes

Published: 2024-09-07

Last updated: 2024-09-07


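A secondary-development exercise carried out with AI assistance, built mainly on the Electron and SQLite technology stack.

Notes on customizing the ScratchJr software for a specific requirement

Purpose

Physical blocks let students see the structure and logic of code directly, which helps them grasp basic programming concepts. Handling the blocks draws students into the lesson and makes it more interactive and more fun. Programming can feel abstract and complicated to beginners, and physical blocks lower that barrier. By combining blocks into a program, students learn basic logic and sequential structure and develop logical thinking.

For these reasons, this project adds a feature to the ScratchJr teaching software: photograph specific physical code blocks, recognize them, and convert them into the corresponding runnable project.

Requirements Analysis

The customized ScratchJr build (hereafter "sjr") must add, on top of the original software, the ability to take a photo, recognize the physical code blocks in it, convert them into a file that sjr can read, and open that file.
sjr must run correctly on Windows (win32_x64) and on platforms such as Seewo teaching whiteboards.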

Design Overview

Scratch Jr. Architecture Diagram

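Because sjr is built on Electron, the work is split across two buttons: one for capture and image recognition, and one for processing and importing the recognized data. Each button carries an icon and a number label.

Implementation details are given under "Detailed Design".

Capture and Image Recognition

Each kind of code block is assigned a code and labeled with it. Python with OpenCV controls the camera, the captured photo is run through OCR, and the recognition result is passed to the next step via IPC, where the user chooses whether to import it or retake the photo.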
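
The recognition step comes down to pulling block codes out of free-form OCR text. A minimal sketch, assuming the same pattern used later in this article (a capital letter followed by one digit, a lone digit, or `*`); the function name `extract_tokens` and the sample text are made up for illustration:

```python
import re

# Pattern from this article: a block code such as "B3", a single digit, or "*"
TOKEN_PATTERN = re.compile(r'[A-Z]\d|\d|\*')

def extract_tokens(ocr_text: str) -> list:
    """Return block codes, digits and loop terminators in reading order."""
    return TOKEN_PATTERN.findall(ocr_text)

# Hypothetical OCR output containing a stray character and line breaks
sample = "B3 6\nF4 2 | B7 2\n* B8 3"
print(extract_tokens(sample))
# ['B3', '6', 'F4', '2', 'B7', '2', '*', 'B8', '3']
```

Note that this pattern reads a multi-digit number such as 12 as two separate digits, so arguments are effectively limited to 0 through 9.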

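Umi-OCR offline text recognition: a local OCR engine is called so that sjr works without a network connection. The engine is bundled into the sjr package and is ready to use on launch. The user only needs to start the OCR service before using sjr's recognition feature.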
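
Since recognition depends on the OCR service already running, a quick reachability check can save a confusing failure later. A minimal sketch using only the standard library; the host and port are the ones that appear in the request URL later in this article, and the function name `ocr_service_running` is hypothetical:

```python
import socket

def ocr_service_running(host: str = "127.0.0.1", port: int = 1224,
                        timeout: float = 1.0) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable
        return False

if __name__ == "__main__":
    if ocr_service_running():
        print("OCR service reachable")
    else:
        print("OCR service not reachable; start Umi-OCR first")
```

This only confirms that a program is listening on the port, not that it is Umi-OCR, so it is a pre-flight hint rather than a guarantee.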

Open-source project: hiroi-sora/Umi-OCR: OCR software, free and offline, with built-in multilingual libraries. (github.com)

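Processing and Data Import

When the user presses the import button, the recognition result is written to the sjr database and the interface refreshes automatically. A new project named after the current time appears on the home screen, which indicates that the import succeeded.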

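Detailed Design

Development and builds follow the official ScratchJr development instructions.

Front-End Page Design

Add the required buttons and their functions to home.html: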

<ul>
	<button id="myButton" class="shifted-image">
		<img src="./pic.png" alt="Capture and recognize">
	</button>

	<button id="loadJsonButton">
		<img src="./picture.png" alt="Import code">
	</button>

	<!-- Place the buttons above at a suitable position inside <body> -->

	<script src="./QSCode/QScript.js"></script>

	<div></div>
</ul>

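The button images and styles are configured as follows. The selectors that begin with a dot (class selectors) adjust the position of the buttons.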

<!--pic button style--->
<style>
	button {
		border: none;
		outline: none;
		background: none;
		padding: 0;
		cursor: pointer;
	}

	button img {
		display: block;
	}

	.shifted-image {
		margin-left: 59%;
		/* Shift the button to the right by 59% of the container width */
	}

	.scaled-image {
		width: 70%;
		height: auto;
	}

	.both-images {
		margin-top: 30%;
		width: 80%;
		height: auto;
	}

	.button-container {
		display: flex;
	}
</style>
<!--pic button style--->

The JavaScript bound to the capture button calls the Python program and gives the user feedback:

// Runs when the capture button is clicked
const { spawn } = require('child_process');
const path = require('path');

document.addEventListener('DOMContentLoaded', function () {
    // Look up the button element by ID
    const button = document.getElementById('myButton');

    // Register the click handler
    button.addEventListener('click', function () {
        // Launch the packaged Python program with spawn
        // const pythonProcess = spawn('python', ['./src/app/QSCode/a().py']);
        const pypath = path.join(__dirname, './QSCode/pic2json/dist/pic2code.exe');
        const pythonProcess = spawn(pypath);

        // Log the program's standard output
        pythonProcess.stdout.on('data', (data) => {
            console.log(`Python script output: ${data}`);
        });

        // Log the program's error output
        pythonProcess.stderr.on('data', (data) => {
            console.error(`Python script error: ${data}`);
        });

        // Runs when the program exits
        pythonProcess.on('close', (code) => {
            console.log(`Python script closed with code ${code}`);
            alert('Photo captured. Make sure the OCR service is running, then click button "2".'); // show a message box
        });
    });
});

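Data Processing and Import

The JavaScript above calls the image-processing script written in Python:

# On startup the program calls capture_image() to save a photo into the pic folder, then convert_image_to_base64() to encode the photo as base64 and send it to the backend. The backend calls the OCR interface, recognizes the text in the image, and returns the result to the frontend, which displays it on the canvas.

# The OCR program must be started beforehand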

import json
import cv2
import os
import base64
import requests
import sqlite3  
import time
import re
import sys
# Existing code

result = {
    "pages": [
        "page 1"
    ],
    "currentPage": "page 1",
    "page 1": {
        "textstartat": 36,
        "sprites": [
            "Tic 1"
        ],
        "num": 1,
        "lastSprite": "Tic 1",
        "Tic 1": {
            "shown": True,
            "type": "sprite",
            "md5": "Blue.svg",
            "id": "Tic 1",
            "flip": True,
            "name": "Tic",
            "angle": 0,
            "scale": 0.5,
            "speed": 2,
            "defaultScale": 0.5,
            "sounds": [
                "pop.mp3"
            ],
            "xcoor": 123,
            "ycoor": 180,
            "cx": 73,
            "cy": 123,
            "w": 147,
            "h": 247,
            "homex": 123,
            "homey": 180,
            "homescale": 0.5,
            "homeshown": True,
            "homeflip": True,
            "scripts": [
                []
            ]
        },
        "layers": [
            "Tic 1"
        ]
    }
}

BDict = {
    "A1": ["onflag", "null",10,10],
    "A2": ["onclick", "null",10,10],
    "A3": ["ontouch", "null",10,10],
    "A4": ["onmessage", "null",10,10],
    "A5": ["message", "Orange",10,10],

    "B1": ["home", "null",10,10],
    "B2": ["hop", "1",10,10],
    "B3": ["left", "1",10,10],
    "B4": ["right", "1",10,10],
    "B5": ["down", "1",10,10],
    "B6": ["up", "1",10,10],
    "B7": ["forward", "1",10,10],
    "B8": ["back", "1",10,10],

    "C1": ["say", "hi",10,10],
    "C2": ["grow", "1",10,10],
    "C3": ["shrink", "1",10,10],
    "C4": ["same", "null",10,10],
    "C5": ["hide", "null",10,10],
    "C6": ["show", "null",10,10],

    "D1": ["endstack", "null",10,10],
    "D2": ["forever", "null",10,10],

    "E1": ["playsnd", "pop.mp3",10,10],

    "F1": ["wait", "1",10,10],
    "F2": ["stopmine", "null",10,10],
    "F3": ["setspeed", "1",10,10],
    "F4": ["repeat", "1",10,10,[]],
}

# The photo is stored in a pic folder next to the running executable
executable_path = os.path.dirname(sys.executable)
pic_folder_path = os.path.join(executable_path, 'pic')
imagepath = os.path.join(pic_folder_path, 'capture.jpg')

def capture_image():
    # Create the pic folder if it doesn't exist (the same folder imagepath points into)
    os.makedirs(pic_folder_path, exist_ok=True)

    # Capture image from camera
    cap = cv2.VideoCapture(0)
    ret, frame = cap.read()

    # Save the captured image
    if ret:
        cv2.imwrite(imagepath, frame)
        print("Image captured and saved successfully.")
    else:
        print("Failed to capture image.")

    # Release the camera
    cap.release()

# Convert image to base64
def convert_image_to_base64(image_path):
    with open(image_path, "rb") as image_file:
        encoded_image = base64.b64encode(image_file.read()).decode("utf-8")
        print("Image converted to base64 successfully.")
    return encoded_image


# Call the OCR HTTP interface
def ocr(encoded_image):
    # Umi-OCR (umi.exe) must already be running and serving HTTP on this port
    url = "http://127.0.0.1:1224/api/ocr"
    data = {
        "base64": encoded_image,
        "options": {
            "data.format": "text"
        }
    }

    headers = {"Content-Type": "application/json"}
    data_str = json.dumps(data)
    response = requests.post(url, data=data_str, headers=headers)
    if response.status_code == 200:
        res_dict = json.loads(response.text)
        print("ocr result:\n", res_dict.get("data"))
        matches = re.findall(r'([A-Z]\d|\d|\*)', res_dict.get("data"))
        formatted_result = ' '.join(matches)
        print("formatted result:\n", formatted_result)
        return formatted_result
    else:
        print("Failed to call OCR API.")
        return ""

# Loop support: F4 is followed by the repeat count, then the blocks inside the loop; "*" marks the end of the loop
def word2Json(userInput):
    inputList = userInput.split()
    script = result["page 1"]["Tic 1"]["scripts"][0]
    loop_start = False
    loop_count = 0
    loop_instructions = []
    loop_x = 10

    def make_block(key, value):
        # Copy the template so repeated blocks do not share one list object
        block = list(BDict[key])
        if value.isdigit():
            # 0 stands for "null" (no argument); any other digit becomes the argument
            block[1] = "null" if int(value) == 0 else value
        return block

    for i, token in enumerate(inputList):
        # The token after a block code, if any, is that block's argument
        value = inputList[i + 1] if i + 1 < len(inputList) else ""
        if loop_start:
            if token == "*":
                loop_start = False
                script.append(["repeat", loop_count, 0, 0, loop_instructions])
                loop_instructions = []
            elif token in BDict:
                loop_instructions.append(make_block(token, value))
        elif token == "F4":
            if value.isdigit():
                loop_start = True
                loop_count = int(value)
        elif token in BDict:
            block = make_block(token, value)
            # Place each block 40 units to the right of the previous one
            loop_x += 40
            block[2] = loop_x
            script.append(block)

    with open('./a.json', 'w') as file:
        json.dump(result, file, indent=4)
    return result


def mainP2J():
    # Take a photo and save it
    capture_image()
    # Encode the photo as base64
    encoded_image = convert_image_to_base64(imagepath)
    # print(encoded_image)

    # Call the OCR interface
    userinput = ocr(encoded_image)

    # userinput = "B3 6 F4 2 B7 2 D1 null B8 3 * B8 3"
    word2Json(userinput)
    # linkSQL() is not included in this listing
    linkSQL(result)

def debugMain():
    # Take a photo and save it
    capture_image()
    # Encode the photo as base64
    encoded_image = convert_image_to_base64(imagepath)
    # print(encoded_image)

    # Call the OCR interface
    userinput = ocr(encoded_image)

    #userinput = "B3 6 F4 2 B7 2 D1 null B8 3 * B8 3"
    word2Json(userinput)
    #linkSQL(result)

debugMain()
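
To make the token format concrete, here is a small self-contained sketch of the conversion from a token string to a script list. It is not the script above: the table `BLOCKS` is a four-entry subset of the block table, the function name `tokens_to_script` is hypothetical, and the coordinates are placeholder layout values.

```python
import json

# Subset of the block table from this article: code -> [name, default argument]
BLOCKS = {
    "B3": ["left", "1"],
    "B7": ["forward", "1"],
    "B8": ["back", "1"],
    "D1": ["endstack", "null"],
}

def tokens_to_script(tokens, x_step=40, y=10):
    """Convert tokens such as ["B3", "6", "F4", "2", ..., "*"] into a script list."""
    script, loop_body, loop_count = [], None, 0
    x = 10
    for i, token in enumerate(tokens):
        nxt = tokens[i + 1] if i + 1 < len(tokens) else ""
        if token == "F4" and nxt.isdigit():
            loop_body, loop_count = [], int(nxt)      # open a repeat block
        elif token == "*" and loop_body is not None:
            x += x_step
            script.append(["repeat", loop_count, x, y, loop_body])
            loop_body = None                          # close the repeat block
        elif token in BLOCKS:
            name, arg = BLOCKS[token]
            if nxt.isdigit() and nxt != "0":
                arg = nxt                             # numeric argument follows the code
            x += x_step
            block = [name, arg, x, y]
            (loop_body if loop_body is not None else script).append(block)
    return script

tokens = "B3 6 F4 2 B7 2 D1 0 * B8 3".split()
print(json.dumps(tokens_to_script(tokens)))
# [["left", "6", 50, 10], ["repeat", 2, 170, 10, [["forward", "2", 90, 10], ["endstack", "null", 130, 10]]], ["back", "3", 210, 10]]
```

Digits that follow a block code are consumed as its argument and then skipped on the next iteration because they are not block codes themselves.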

The Python program is packaged with PyInstaller so that it runs on an ordinary Windows machine.
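
Once packaged, the program has to locate its files relative to the bundled executable rather than the source tree. A minimal sketch of one common way to do that, relying on the `sys.frozen` attribute that PyInstaller sets at run time; the helper names `app_base_dir` and `capture_path` are hypothetical:

```python
import os
import sys

def app_base_dir() -> str:
    """Folder of the bundled executable when frozen, else of the launched script."""
    if getattr(sys, "frozen", False):
        # PyInstaller sets sys.frozen; sys.executable is then the bundled program
        return os.path.dirname(os.path.abspath(sys.executable))
    # Plain "python script.py" run: sys.argv[0] is the script path
    return os.path.dirname(os.path.abspath(sys.argv[0]))

def capture_path(base_dir: str, filename: str = "capture.jpg") -> str:
    """Build <base_dir>/pic/<filename> without touching the file system."""
    return os.path.join(base_dir, "pic", filename)

print(capture_path(app_base_dir()))
```

Without the frozen check, `sys.executable` points at the Python interpreter during development, so the photo would land next to python.exe instead of next to the script.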

The other button writes the data into sjr. When it is pressed, the renderer-process script electronClient.js sends the main process a message asking it to load the data:

// When the loadJsonButton on home.html is clicked, send the main process a 'load-json-file' message whose payload is 'a.json'.
document.addEventListener('DOMContentLoaded', () => {
    const loadJsonButton = document.getElementById('loadJsonButton');
    loadJsonButton.addEventListener('click', () => {
        // Ask the main process to read the file; the renderer does not read it itself
        ipcRenderer.send('load-json-file', 'a.json');
    });
});

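Importing data into sjr requires the SQLite database manager instance that the program provides. A new handler is added to main.js to write the data. Reading the main-process source reveals the following database method: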

/**
  runs a sql query on the database, returns the number of rows from the result
  @param {json} json object with stmt and values filled out
  @returns lastRowId
  */
  stmt(jsonStrOrJsonObj) {


    try {
      // {"stmt":"select name,thumbnail,id,isgift from projects where deleted = ? AND version = ? AND gallery IS NULL order by ctime desc","values":["NO","iOSv01"]}

      // if it's a string, parse it.  if not, use it if it's not null.
      const json = (typeof jsonStrOrJsonObj === 'string') ? JSON.parse(jsonStrOrJsonObj) : jsonStrOrJsonObj || {};
      const stmt = json.stmt;
      const values = json.values;



      if (DEBUG_DATABASE) debugLog('DatabaseManager executing stmt', stmt, values);

      const statement = this.db.prepare(stmt, values);

      while (statement.step()) statement.get();
      // return JSON.stringify(statement.getAsObject());

      const result = this.db.exec('select last_insert_rowid();');

      const lastRowId = result[0].values[0][0];

      return lastRowId;
    } catch (e) {
      if (DEBUG_DATABASE) debugLog('stmt failed', jsonStrOrJsonObj, e);
      console.error('stmt failed', jsonStrOrJsonObj, e);
      return -1;
    }
  }

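This method is used to run the insert statement. The main process subscribes to the import message from the renderer and runs the following handler when the message arrives: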

ipcMain.on('load-json-file', (event, filename) => {
  console.log('load-json-file');
  const db = dataStore.getDatabaseManager();

  // Read the JSON file (the path is fixed; filename is only used for logging)
  const filePath = './a.json';
  if (!fs.existsSync(filePath)) {
    console.log('load-json-file: File could not be resolved.', filename);
    event.returnValue = null;
    return;
  }
  const data = fs.readFileSync(filePath, 'utf8');
  const newJsonData = JSON.parse(data);

  const currentTime = new Date().toISOString();
  const timestamp = new Date().getTime();

  // Full test record
  const newProject = {
    // ID: newID,
    CTIME: currentTime,
    MTIME: timestamp,
    ALTMD5: null, // no alternate MD5, so NULL
    POS: null,     // no position information, so NULL
    NAME: `P${currentTime.slice(2, 15)}`, // project name: 'P' plus part of the ISO timestamp
    JSON: JSON.stringify(newJsonData), // serialize the JavaScript object to a JSON string
    THUMBNAIL: JSON.stringify({ "pagecount": 1, "md5": "13_823d327a5d13e21aea5d06732a3c329d.png" }), // placeholder thumbnail info
    OWNER: null,   // no owner, so NULL
    GALLERY: null, // not tied to a gallery, so NULL
    DELETED: 'NO', // the project is not deleted
    VERSION: 'iOSv01', // assumed project version
    ISGIFT: 0, // the project is not a gift
  };

  const newCMD = {
    stmt: "insert into projects (ctime, mtime, altmd5, pos, name,json, thumbnail, owner, gallery, deleted, version, isgift) values (?,?,?,?,?,?,?,?,?,?,?,?)",
    values: [newProject.CTIME, newProject.MTIME, newProject.ALTMD5, newProject.POS, newProject.NAME, newProject.JSON, newProject.THUMBNAIL, newProject.OWNER, newProject.GALLERY, newProject.DELETED, newProject.VERSION, newProject.ISGIFT]
  };
  event.returnValue = db.stmt(newCMD);
  if (win) {
    win.reload();
  }
  console.log('load-json-file result:', event.returnValue);
});
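
The same parameterized insert can be tried outside Electron. A minimal sketch with Python's built-in `sqlite3` module and an in-memory database; the table schema is an assumption (only the column names come from the insert statement above), and `insert_project` is a hypothetical helper:

```python
import json
import sqlite3
from datetime import datetime, timezone

COLUMNS = ["ctime", "mtime", "altmd5", "pos", "name", "json", "thumbnail",
           "owner", "gallery", "deleted", "version", "isgift"]

def insert_project(conn: sqlite3.Connection, project_json: dict, now: datetime) -> int:
    """Insert one project row and return its row id."""
    # Same shape as JavaScript's toISOString(): 2024-09-07T12:34:56.000Z
    iso = now.strftime("%Y-%m-%dT%H:%M:%S.") + f"{now.microsecond // 1000:03d}Z"
    row = {
        "ctime": iso,
        "mtime": int(now.timestamp() * 1000),
        "name": "P" + iso[2:15],          # same slice as the JavaScript handler
        "json": json.dumps(project_json),
        "thumbnail": json.dumps({"pagecount": 1}),
        "deleted": "NO",
        "version": "iOSv01",
        "isgift": 0,
    }
    placeholders = ",".join("?" for _ in COLUMNS)
    sql = f"insert into projects ({','.join(COLUMNS)}) values ({placeholders})"
    # Columns missing from row (altmd5, pos, owner, gallery) are bound as NULL
    cur = conn.execute(sql, [row.get(col) for col in COLUMNS])
    conn.commit()
    return cur.lastrowid

conn = sqlite3.connect(":memory:")
# Assumed schema for illustration only
conn.execute("create table projects (id integer primary key autoincrement, "
             + ", ".join(f"{c} text" for c in COLUMNS) + ")")
when = datetime(2024, 9, 7, 12, 34, 56, tzinfo=timezone.utc)
row_id = insert_project(conn, {"pages": ["page 1"]}, when)
print(row_id, conn.execute("select name from projects").fetchone()[0])
# 1 P24-09-07T12:3
```

The slice keeps 13 characters of the timestamp, so project names are unique only to within ten minutes.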

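Once the data is written, the page reloads automatically so that the new project becomes visible. The feature was tested with the following test data: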


// Test JSON data
const newJsonData = { "pages": ["page 1"], "currentPage": "page 1", "page 1": { "textstartat": 36, "sprites": ["Tic 1"], "num": 1, "lastSprite": "Tic 1", "Tic 1": { "shown": true, "type": "sprite", "md5": "Blue.svg", "id": "Tic 1", "flip": true, "name": "Tic", "angle": 0, "scale": 0.5, "speed": 2, "defaultScale": 0.5, "sounds": ["pop.mp3"], "xcoor": 123, "ycoor": 180, "cx": 75, "cy": 126, "w": 151, "h": 253, "homex": 123, "homey": 180, "homescale": 0.5, "homeshown": true, "homeflip": true, "scripts": [[["back", "1", 96, 44], ["left", "1", 161, 44], ["repeat", 2, 226, 28, [["forward", "1", 261, 44], ["endstack", "null", 326, 44]]]]] }, "layers": ["Tic 1"] } };
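
Before importing generated data, it can help to check that it has the same shape as the test data above. A minimal sketch of such a check; the rules are inferred from this one sample rather than from any ScratchJr specification, and `check_project` is a hypothetical helper:

```python
def check_project(data: dict) -> list:
    """Return a list of problems; an empty list means the structure looks consistent."""
    problems = []
    pages = data.get("pages", [])
    if data.get("currentPage") not in pages:
        problems.append("currentPage is not listed in pages")
    for page_name in pages:
        page = data.get(page_name)
        if not isinstance(page, dict):
            problems.append(f"missing page object: {page_name}")
            continue
        for sprite_name in page.get("sprites", []):
            sprite = page.get(sprite_name)
            if not isinstance(sprite, dict):
                problems.append(f"missing sprite object: {sprite_name}")
            elif not isinstance(sprite.get("scripts"), list):
                problems.append(f"sprite has no scripts list: {sprite_name}")
    return problems

sample = {
    "pages": ["page 1"], "currentPage": "page 1",
    "page 1": {"sprites": ["Tic 1"],
               "Tic 1": {"scripts": [[["back", "1", 96, 44]]]}},
}
print(check_project(sample))
# []
print(check_project({"pages": ["page 1"], "currentPage": "page 1"}))
# ['missing page object: page 1']
```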

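Packaging

The application is built with the packaging tool that Electron provides or with the commands that ship with the original sjr project. The configured scripts are: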

"scripts": {
    "start": "electron-forge start",
    "debugMain": "electron-forge start -i",
    "package": "electron-forge package",
    "make": "electron-forge make",
    "make64": "electron-forge make --arch=x64",
    "make32": "electron-forge make --arch=ia32",
    "makeAll": "electron-forge make --arch=ia32,x64",
    "publish": "electron-forge publish",
    "lint": "eslint src --color"
},

Executables can be generated for each target platform.

References

Umi-OCR GitHub Repository
- Author: hiroi-sora
- Title: Umi-OCR: OCR software, free and offline
- URL: https://github.com/hiroi-sora/Umi-OCR
Umi-OCR Documentation
- Author: hiroi-sora
- Title: Umi-OCR Documentation
- URL: https://github.com/hiroi-sora/Umi-OCR/blob/main/docs/http/README.md
How to Write a Highly Readable Software Engineering Design Document
- Author: 古道轻风
- Title: How to Write a Highly Readable Software Engineering Design Document (original title: 如何写一份高可读性的软件工程设计文档)
- URL: https://www.cnblogs.com/88223100/p/How-to-write-a-highly-readable-software-engineering-design-document.html
ScratchJr Desktop README
- Author: krayon
- Title: ScratchJr Desktop README
- URL: https://github.com/krayon/scratchjr-desktop/blob/master/README.md
my-electron-app GitHub Repository
- Author: sui5yue6
- Title: my-electron-app Source Code
- URL: https://github.com/sui5yue6/my-electron-app/blob/main/main.js
Electron Quick Start Guide
- Author: Electron Team
- Title: Quick Start | Electron
- URL: https://www.electronjs.org/zh/docs/latest/tutorial/quick-start
