A Pen Plotter Powered by Artificial Intelligence


This article has been contributed by Lin Ma, Software Engineer and KVM Virtualization Specialist at SUSE. If you want to read more from him about virtualization, machine learning and artificial intelligence, have a look at the following articles:

How to Do Deep Machine Learning Tasks Inside KVM Guests with a Passed-through NVIDIA GPU

Machine Learning: Oriented Text Detection from Natural Scene

Face Recognition Based on dlib in a KVM Guest

 

Today I’m going to briefly introduce a home-made pen plotter based on machine learning. It can detect, recognize, and answer very simple mathematical exam questions.

Components

The following sections detail the technical requirements and specifications of our setup.

 

 

Host Components

  • Raspberry Pi 3
  • openSUSE Tumbleweed raspberrypi3 aarch64 2019.05.17 image
  • TensorFlow 1.13.1 from the science:machinelearning repository in the Open Build Service (OBS)

 

Artificial Intelligence (AI) Components

  • Pre-trained Connectionist Text Proposal Network (CTPN) neural network model for oriented text detection in natural scenes (for more details, please read https://www.suse.com/c/machine-learning-oriented-text-detection-from-natural-scene/)
  • Baidu optical character recognition (OCR) service

 

Upper Computer Component

  • Arduino UNO R3 plus Grbl firmware

 

Machine Components

  • CNC Shield V3 Expansion Board (1x)
  • A4988 stepper motor driver module (2x)
  • NEMA 17 stepper motor (2x)
  • Servo motor (1x)
  • Camera (1x)
  • Limit switch (2x)
  • Plain shaft with slider (2x)
  • Threaded shaft with coupling & nut (2x)
  • Aluminum alloy plate

 

   

Workflow

1. Prepare your mathematical questions on a piece of paper.

2. The AI pen plotter takes a picture through the camera and resizes it to 1280 x 1690 pixels, like in the example below:
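A minimal sketch of this capture-and-resize step, assuming the camera is reachable through OpenCV (the actual capture code may differ):

import cv2

# Grab one frame from the first camera device.
cap = cv2.VideoCapture(0)
ok, frame = cap.read()
cap.release()
if not ok:
    raise RuntimeError("camera capture failed")

# Resize to the fixed size expected by the text detection step.
resized = cv2.resize(frame, (1280, 1690))
cv2.imwrite("question.jpg", resized)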

— Neural Network Job START —

3. The pre-trained CTPN neural network model running on the Raspberry Pi 3 processes the picture to figure out where the text is and records the relative positions.
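Running the detection with a frozen TensorFlow 1.x graph could look roughly like the sketch below. The file name and tensor names are assumptions (they depend on how the pre-trained CTPN model was exported), and the CTPN-specific post-processing that connects proposals into text lines is omitted:

import cv2
import numpy as np
import tensorflow as tf

# Load the frozen CTPN graph (file and tensor names are assumptions).
with tf.gfile.GFile("ctpn.pb", "rb") as f:
    graph_def = tf.GraphDef()
    graph_def.ParseFromString(f.read())

with tf.Graph().as_default() as graph:
    tf.import_graph_def(graph_def, name="")

image = cv2.imread("question.jpg")

with tf.Session(graph=graph) as sess:
    inputs = graph.get_tensor_by_name("input_image:0")
    boxes = graph.get_tensor_by_name("text_boxes:0")
    # Each returned box marks one strip of text on the page.
    detected = sess.run(boxes, feed_dict={inputs: image[np.newaxis, ...]})

for box in detected:
    print(box)  # relative position of one text region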

4. A script crops sub-pictures at the recorded positions and sends these text regions to the Baidu OCR service. See the example below:
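A sketch of the cropping and OCR call. The endpoint and request format follow Baidu's public OCR REST API; the access token has to be obtained from the Baidu AI console beforehand, and the box coordinates are assumed to come from the detection step above:

import base64

import cv2
import requests

OCR_URL = "https://aip.baidubce.com/rest/2.0/ocr/v1/general_basic"
ACCESS_TOKEN = "..."  # obtained from the Baidu AI console

image = cv2.imread("question.jpg")

def recognize(box):
    """Crop one detected text region and send it to the OCR service."""
    x1, y1, x2, y2 = box
    ok, buf = cv2.imencode(".jpg", image[y1:y2, x1:x2])
    resp = requests.post(
        OCR_URL,
        params={"access_token": ACCESS_TOKEN},
        data={"image": base64.b64encode(buf.tobytes())},
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )
    return " ".join(w["words"] for w in resp.json().get("words_result", []))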

5. The Baidu OCR service processes these sub-pictures, then recognizes and returns the text.

— Neural Network Job DONE —

6. A script checks the returned text, searches for keywords, and tries to understand the questions. It then parses the questions, works out the answers, and generates the picture to be drawn. See the example below:
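A deliberately naive sketch of such a parser, limited to plain arithmetic expressions that end with '=' (the real script presumably recognizes more question patterns):

import re

def answer(question_text):
    """Look for an arithmetic expression followed by '=' and compute the result."""
    match = re.search(r"([0-9+\-*/. ]+)=", question_text)
    if not match:
        return None
    expression = match.group(1).strip()
    # Only digits and basic operators reach eval(), keeping this safe for exam-style input.
    if not re.fullmatch(r"[0-9+\-*/. ]+", expression):
        return None
    return eval(expression)

print(answer("12 + 34 ="))   # prints 46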

7. A script calculates the relative positions and converts the picture into a gcode file, as in the example below:
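A simplified sketch of the gcode generation. Driving the pen-lift servo through Grbl's spindle PWM output (M3 with an S value) is a common approach for this kind of plotter and an assumption here, as are the concrete S values and feed rate:

PEN_UP = "M3 S90"     # assumption: servo wired to Grbl's spindle PWM pin
PEN_DOWN = "M3 S30"

def strokes_to_gcode(strokes, feed=1500):
    """Turn a list of polylines (in mm, relative to the plotter origin) into gcode lines."""
    lines = ["G21", "G90", PEN_UP]            # millimetres, absolute positioning, pen up
    for stroke in strokes:
        x0, y0 = stroke[0]
        lines.append("G0 X{:.2f} Y{:.2f}".format(x0, y0))   # rapid move to stroke start
        lines.append(PEN_DOWN)
        for x, y in stroke[1:]:
            lines.append("G1 X{:.2f} Y{:.2f} F{}".format(x, y, feed))
        lines.append(PEN_UP)
    lines.append("G0 X0 Y0")                  # park the pen back at the origin
    return "\n".join(lines) + "\n"

with open("answer.gcode", "w") as f:
    f.write(strokes_to_gcode([[(0, 0), (10, 0), (10, 10)]]))   # a small test shape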

8. Run the command <gcode sender> to send these gcode instructions to the upper computer.
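The gcode sender itself can stay very small. Below is a sketch of simple send-response streaming over the serial port with pyserial, assuming the Arduino running Grbl shows up as /dev/ttyUSB0 at 115200 baud:

import time

import serial  # pyserial

grbl = serial.Serial("/dev/ttyUSB0", 115200)
grbl.write(b"\r\n\r\n")        # wake up Grbl
time.sleep(2)
grbl.reset_input_buffer()

with open("answer.gcode") as f:
    for line in f:
        line = line.strip()
        if not line:
            continue
        grbl.write((line + "\n").encode())
        # Wait for Grbl's "ok"/"error" response before sending the next instruction.
        print(line, "->", grbl.readline().decode().strip())

grbl.close()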

9. The Grbl firmware generates the step and direction signals according to the gcode instructions and passes them on to the CNC Shield V3.

10. The CNC Shield V3 controls the linear XY motion with the stepper motors and the up/down movement of the pen with the servo motor.

Live Demonstration

If you want to see a live demo of what is described above, just have a look at the videos below.

 

 

Summary

As you can see, processing a picture of this size, which contains only one short mathematical question, takes around 11 minutes. The time consumed for the entire process could very likely be cut by 50 percent if the code were changed to send the text detection job to the cloud instead of running it natively on the Raspberry Pi 3, or if the inference were accelerated with one or more Neural Compute Sticks attached to the Raspberry Pi 3. But this assumption still would have to be proven :-).

By the way, the latest TensorFlow version 1.13.1, as available in openSUSE Factory, is more functional and more stable.

Thank you very much for your attention!
