A Pen Plotter Powered by Artificial Intelligence
This article has been contributed by Lin Ma, Software Engineer and KVM Virtualization Specialist at SUSE. If you want to read more from him about virtualization, machine learning and artificial intelligence, have a look at the following articles:
How to Do Deep Machine Learning Tasks Inside KVM Guests with a Passed-through NVIDIA GPU
Machine Learning: Oriented Text Detection from Natural Scene
Face Recognition Based on *dlib* in a KVM Guest
Today I’m going to briefly introduce a home-made pen plotter that is based on machine learning. It can detect, recognize and answer very simple arithmetic exam questions.
Components
The following sections detail the technical requirements and specifications of our setup.
Host Components
- Raspberry Pi 3
- openSUSE Tumbleweed raspberrypi3 aarch64 2019.05.17 image
- TensorFlow 1.13.1 from the science:machinelearning repository in the Open Build Service (OBS)
Artificial Intelligence (AI) Components
- Pre-trained Connectionist Text Proposal Network (CTPN) neural network model for oriented text detection in natural scenes (for more details, please read https://www.suse.com/c/machine-learning-oriented-text-detection-from-natural-scene/)
- Baidu optical character recognition (OCR) service
Upper Computer Component
- Arduino UNO R3 plus Grbl firmware
Machine Components
- CNC Shield V3 Expansion Board (1x)
- A4988 stepper motor driver module (2x)
- NEMA 17 stepper motor (2x)
- Servo motor (1x)
- Camera (1x)
- Limit Switch (2x)
- Plain shaft with slider (2x)
- Threaded shaft with coupling & nut (2x)
- Aluminum alloy plate
Workflow
1. Write your arithmetic questions on a piece of paper.
2. The AI pen plotter takes a picture through the camera and resizes it to 1280 * 1690, like in the example below:
— Neural Network Job START—
3. The picture will be handled by the pre-trained CTPN neural network model on the Raspberry Pi 3 to figure out where the text is and to record the relative positions.
4. A script crops the picture at the recorded positions and sends the resulting sub-pictures, each of which contains text, to the Baidu OCR service. See the example below:
5. The Baidu OCR service processes these sub-pictures. Then it recognizes and returns the text.
— Neural Network Job DONE—
6. A script checks the returned text, searches for keywords, and tries to understand the questions. Then it parses the questions and generates a picture of the worked answers. See the example below:
7. A script calculates the relative positions and converts the picture into a gcode file, as in the example below:
8. Run the command <gcode sender> to send these gcode instructions to the upper computer.
9. The Grbl firmware generates the step and direction signals according to the gcode instructions and hands them on to the CNC Shield V3.
10. The CNC Shield V3 controls the linear XY motion with the stepper motors and the movement of the pen (up/down) with the servo motor.
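The scripts themselves are not shown in this article, so the following is only a minimal sketch of what step 6 could look like, assuming the OCR service returns a plain string such as "12 + 30 = ?". The function names `normalize`, `evaluate` and `solve` are hypothetical and are not taken from the plotter's real code. The sketch walks Python's `ast` tree instead of calling `eval`, so that unexpected OCR output cannot run arbitrary code.

```python
import ast
import operator
import re

# Binary operators this sketch accepts; anything else is rejected.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def normalize(ocr_text):
    """Turn OCR output like '6 × 7 = ?' into a Python expression like '6*7'."""
    # Assumption: OCR may return '×', 'x' or 'X' for multiplication, '÷' for division.
    text = ocr_text.replace("×", "*").replace("x", "*").replace("X", "*")
    text = text.replace("÷", "/")
    # Keep only the left-hand side of the equation.
    text = text.split("=")[0]
    # Drop everything that is not a digit, an operator, a dot or a parenthesis.
    return re.sub(r"[^0-9+\-*/().]", "", text)


def evaluate(node):
    """Evaluate a parsed arithmetic expression, allowing only numbers and + - * /."""
    if isinstance(node, ast.Expression):
        return evaluate(node.body)
    if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
        return _OPS[type(node.op)](evaluate(node.left), evaluate(node.right))
    if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
        return -evaluate(node.operand)
    raise ValueError("unsupported expression")


def solve(ocr_text):
    """Return the numeric answer to a recognized arithmetic question."""
    return evaluate(ast.parse(normalize(ocr_text), mode="eval"))
```

With these definitions, `solve("12 + 30 = ?")` returns 42. Text that contains no usable expression raises an exception, which a real script would have to catch and report.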
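Step 7 can be illustrated in a similar way. The sketch below assumes that the answer picture has already been reduced to pen strokes, that is, lists of (x, y) pixel coordinates; how the real script does this is not described here. The names `px_to_mm` and `strokes_to_gcode`, the feed rate and the scale factor are hypothetical. It also assumes a Grbl setup in which the pen servo is driven by the spindle commands (`M3 S90` for pen down, `M5` for pen up), which is common for servo-based plotters but may differ from the firmware used in this build.

```python
def px_to_mm(point, image_height_px, mm_per_px):
    """Convert an image pixel (origin top-left) to plotter millimetres (origin bottom-left)."""
    x_px, y_px = point
    return x_px * mm_per_px, (image_height_px - y_px) * mm_per_px


def strokes_to_gcode(strokes, image_height_px=1690, mm_per_px=0.1,
                     feed=1000, pen_down="M3 S90", pen_up="M5"):
    """Turn a list of strokes (lists of pixel points) into a gcode program."""
    lines = [
        "G21",   # units: millimetres
        "G90",   # absolute positioning
        pen_up,  # start with the pen lifted
    ]
    for stroke in strokes:
        if len(stroke) < 2:
            continue  # a single point cannot be drawn as a line
        points = [px_to_mm(p, image_height_px, mm_per_px) for p in stroke]
        x0, y0 = points[0]
        lines.append(f"G0 X{x0:.2f} Y{y0:.2f}")  # rapid move to the stroke start
        lines.append(pen_down)
        for x, y in points[1:]:
            lines.append(f"G1 X{x:.2f} Y{y:.2f} F{feed}")  # draw at the feed rate
        lines.append(pen_up)
    lines.append("G0 X0.00 Y0.00")  # return to the origin
    return "\n".join(lines) + "\n"
```

The returned string can be written to a file and passed to whichever gcode sender is used in step 8. Flipping the Y axis in `px_to_mm` is a design choice for a plotter whose origin is at the bottom-left corner; a machine that homes to the top-left corner would not need it.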
Live Demonstration
If you want to see a live demo of what is described above, just have a look at the videos below.
Summary
As you can see, processing a picture of this size that contains only one short mathematical question takes around 11 minutes. The total time could very likely be reduced by 50 percent if the code sent the text detection job to the cloud instead of running it locally on the Raspberry Pi 3, or if the Raspberry Pi 3 used one or more Neural Compute Sticks to accelerate the inference. But this assumption still has to be proven :-).
By the way, the latest TensorFlow version 1.13.1, as available in openSUSE Factory, is more functional and more stable.
Thank you very much for your attention!