In this project, we study how to enable a robot to master Chinese calligraphy. The goal is an intelligent brush-control agent, trained with RL, with high generalization capability, i.e. able to write any character in any style. To make the agent useful in the real world, we aim to deploy the writing controller on a robot arm whose end effector holds an ink-dipped calligraphy brush: when commanded with a character or a sequence of characters, the robot should write them out faithfully, in a chosen style, on a piece of paper.
Background
Writing (forming characters on a piece of paper, not composing essays) is one of the defining abilities that distinguish us from other species, and it is non-trivial: students have to practice extensively in order to write beautifully. Writing can be even harder in languages such as Chinese, which has thousands of characters, some of them quite complex.
Writing requires different functions of the brain to work together: memorizing coarse trajectories so that others can recognize what we write, MPC-like manipulation of the writing tool with an internal simulation running in the brain, and aesthetic judgment of the final character image.
Chinese calligraphy is the writing of Chinese characters with a specialized brush (somewhat like a watercolor brush) in an aesthetically meaningful way. It usually takes a practitioner years of training to dexterously manipulate an inky, soft brush tuft against water-absorbing paper so as to render a specific character in an aesthetically valuable way (one has to attend to both the overall style and the local fine-grained details). It is a challenging control task due to hard-to-predict 3D soft-body dynamics, friction, adhesion, and the fluid dynamics of ink spreading in the paper.
In recent years, we have seen tremendous advances in AI content generation across various domains, including text, audio, image, and video. Chinese calligraphy, with a history of over 2,000 years, is an integral part of traditional Chinese art. Given how difficult calligraphy is to master, and with the advent of deep reinforcement learning and robotics, it is tempting to ask: can a robot learn to master calligraphy at a human level?
Related work (this part is incomplete)
Generating digital Chinese calligraphy has been studied extensively since the dawn of the information age.
hand-crafted rules
There were many works on handwriting generation before deep learning became prevalent.
RNN
These methods lack the ability to generate images with highly varying stroke boldness and fine-grained texture, which give calligraphy its unique artistic character compared to ordinary handwriting.
generative models with neural nets
Many works use VAEs or GANs to generate calligraphy in a specific style. However, these methods use complex neural networks to output image pixels directly, which is unnatural given how calligraphy is actually created by humans. Our method instead trains a controller that outputs brush motion sequences, and thus bears more resemblance to the natural handwriting process.
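To make the contrast concrete, a minimal sketch of such a motion-sequence controller is given below; all module names and dimensions are illustrative assumptions, not a finalized design:

import torch
import torch.nn as nn

class BrushController(nn.Module):
    """Illustrative sketch: rather than generating pixels, the agent
    maps a character/style conditioning vector and the current canvas
    observation to the next brush motion (dx, dy, dz)."""

    def __init__(self, obs_dim=256, cond_dim=64, hidden_dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim + cond_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, 3),  # brush motion: (dx, dy, dz)
        )

    def forward(self, obs, cond):
        # Rolling this out step by step yields a brush motion
        # sequence instead of a one-shot image.
        return self.net(torch.cat([obs, cond], dim=-1))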
style transfer using pix2pix-like methods
trajectory optimization / optimal control
These methods perform per-character optimization and lack the generalization ability of ML algorithms.
GA Tech: “using vector-based character database, which provides a quick and accurate way to extract strokes, as well as stroke order”
TODO: use a variational model as in VMAIL; use a learned model instead of system identification; use SysID + learning as in SimGAN. “The dynamic virtual brush model has two components: a drawing component, and a dynamic update component. The drawing component describes how the brush leaves a mark on paper depending on its parameters. The updating component then describes how the brush parameters are updated due to deformations when executing an open-loop trajectory [x(t), y(t), z(t)].”
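A minimal sketch of how such a two-component model could be organized is given below; every class name, parameter, and update rule here is an illustrative assumption, not the original paper's formulation:

import numpy as np

class DynamicVirtualBrush:
    """Sketch of a two-component virtual brush model: a drawing
    component renders a mark from the current brush parameters, and
    an update component evolves those parameters along an open-loop
    trajectory [x(t), y(t), z(t)]."""

    def __init__(self):
        self.width = 0.0  # how far the tuft has spread
        self.bend = 0.0   # how much the tuft is bent by dragging

    def draw(self, canvas, x, y, z):
        """Drawing component: stamp a mark whose size depends on the
        pressing depth z and the current brush state (a real model
        would render an anisotropic footprint shaped by the bend)."""
        radius = max(0.0, z) * (1.0 + self.width)
        yy, xx = np.ogrid[:canvas.shape[0], :canvas.shape[1]]
        canvas[(xx - x) ** 2 + (yy - y) ** 2 <= radius ** 2] = 1.0
        return canvas

    def update(self, dx, dy, z):
        """Update component: deform the brush as it is dragged by
        (dx, dy) while pressed to depth z (illustrative dynamics)."""
        self.width = 0.9 * self.width + 0.1 * max(0.0, z)
        self.bend = 0.9 * self.bend + 0.1 * np.hypot(dx, dy)

    def write(self, canvas, trajectory):
        """Execute an open-loop trajectory of (x, y, z) waypoints."""
        for (x0, y0, _), (x1, y1, z1) in zip(trajectory, trajectory[1:]):
            canvas = self.draw(canvas, x1, y1, z1)
            self.update(x1 - x0, y1 - y0, z1)
        return canvas

The separation matters for control: draw is what an optimizer scores against the target image, while update is where system identification or a learned model (per the TODO above) would plug in.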
Extract strokes and run optimization on individual strokes.
Keep end-effector velocity slow to prevent excessive jerk and vibrations.
Accurately predicting the state of the brush and finding a feasible control trajectory is difficult and unreliable. To avoid accumulating prediction error as more strokes are written, the robot dips ink after each stroke, which restores the brush to a predictable state. A hand-crafted control algorithm accomplishes this: given a circular inkstone, the brush is first pushed down heavily to flatten the tip, and is then slowly moved toward the edge of the inkstone in different directions with gradually smaller extent.
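A rough sketch of this dipping routine, assuming a hypothetical Cartesian end-effector command move_to(x, y, z) (the actual robot interface is not given above):

import numpy as np

def dip_ink(move_to, center, press_depth, rim_radius, n_directions=8):
    """Hand-crafted dipping sketch: press the brush down hard at the
    inkstone center to flatten the tip, then sweep it toward the rim
    in several directions with gradually smaller extent. All depths
    and ratios here are illustrative assumptions."""
    cx, cy = center
    # 1. Press down heavily at the center to flatten the tip.
    move_to(cx, cy, -press_depth)
    # 2. Sweep toward the rim in evenly spaced directions, shrinking
    #    the extent of each sweep.
    angles = np.linspace(0.0, 2.0 * np.pi, n_directions, endpoint=False)
    for i, angle in enumerate(angles):
        extent = rim_radius * (1.0 - i / n_directions)
        move_to(cx + extent * np.cos(angle),
                cy + extent * np.sin(angle),
                -0.3 * press_depth)          # lighter press while sweeping
        move_to(cx, cy, -0.3 * press_depth)  # return to the center
    # 3. Lift the brush clear of the inkstone.
    move_to(cx, cy, press_depth)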
DRL robotics calligraphy
These works only consider writing a single parameterized stroke at low granularity (interpolation between adjacent trajectory points, 6 points in total). The objective is the cosine distance between two images, and the training image samples are low resolution (28x28).
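For reference, the cosine-distance image objective used in these works amounts to something like the following sketch (their exact preprocessing may differ):

import numpy as np

def cosine_distance(img_a, img_b):
    """Cosine distance between two flattened grayscale images
    (e.g. 28x28 samples): 0 for identical directions, 1 for
    orthogonal ones."""
    a = img_a.reshape(-1).astype(np.float64)
    b = img_b.reshape(-1).astype(np.float64)
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    if denom == 0.0:
        return 1.0  # treat an empty image as maximally distant
    return 1.0 - float(a @ b / denom)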