How to use the Handwriter package


Interested in trying it out yourself?

While we work on something interactive and web-based, you can download the package yourself and give it a try. The following is a step-by-step tutorial to help you along the way.

We have recently begun incorporating research pipelines built on top of handwriter so that you can use them too. See the Research Pipelines section for more.


Things you'll need:

  • The R software environment, downloadable from The R Project. You may use this mirror from Iowa State if you wish.
  • RStudio Desktop, an IDE that should simplify the experience.
  • A sample of handwriting in .png format. You can:
    • Write something up (black and white works best) and scan it digitally.
    • Use an online tool like Sketch.io to create and export some handwriting easily.
    • Use one of our images to get started.

Terms to know

  • Glyphs | The units the writing is split into. Often letters, but not always, due to the separation algorithm used.
  • Index | Top to bottom and left to right, our way of keeping track of where a pixel sits on the document as a whole.
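
To make the index idea concrete, here is a minimal base R sketch. It assumes the index counts pixels down each column first (top to bottom) and then moves across the columns (left to right), which is also how R numbers the cells of a matrix. The helper names index_to_rc and rc_to_index are hypothetical and are not part of handwriter.

```r
# Hypothetical helpers (not part of handwriter): convert between a
# column-major pixel index and (row, col) coordinates.
# Assumption: index 1 is the top-left pixel, and indices run down each
# column before moving one column to the right.
index_to_rc <- function(index, dims) {
  n_rows <- dims[1]
  row <- (index - 1) %% n_rows + 1
  col <- (index - 1) %/% n_rows + 1
  cbind(row = row, col = col)
}

rc_to_index <- function(row, col, dims) {
  (col - 1) * dims[1] + row
}

dims <- c(4, 6)                           # a toy 4-row by 6-column "document"
rc <- index_to_rc(c(1, 4, 5, 24), dims)   # one (row, col) pair per index
print(rc)

# Converting back should give the indices we started with
idx <- rc_to_index(rc[, "row"], rc[, "col"], dims)
```

With a real sample, the dims argument would be dim(csafe$image) from the steps below.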

Getting started

Once you have R installed, you'll want to install and load our package from CRAN using:

                        install.packages("handwriter")
                        library(handwriter)
                    

Get your image as a .png (we'll use this one, available here):

Writing_csafe_single

Once you have that, read in the image. The image is also cropped as part of this process:

                        csafe = list()
                        csafe$image = readPNGBinary("path/to/the/picture.png")
                      


Preparing the image for processing

Plot the original, cropped image:

                        plotImage(csafe$image)
                      
Writing_csafe_single


Thin the image, then plot it again:

                        csafe$thin = thinImage(csafe$image)
                        plotImageThinned(csafe$image, csafe$thin)
                      
Writing_csafe_single

Processing the image and exploring the results

Process the image

                        csafe_processlist = processHandwriting(csafe$thin, dim(csafe$image))
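
If you plan to process several samples, the read, thin, and process steps above can be wrapped in one function. The following is only a sketch: process_sample is a hypothetical helper name, not something handwriter provides, and it assumes the three handwriter functions behave as used in this tutorial.

```r
# Hypothetical wrapper (not part of handwriter): run the read, thin, and
# process steps from this tutorial in one call.
process_sample <- function(path) {
  if (!file.exists(path)) {
    stop("Image not found: ", path)
  }
  if (!requireNamespace("handwriter", quietly = TRUE)) {
    stop("Please install the handwriter package first.")
  }
  image <- handwriter::readPNGBinary(path)
  thin <- handwriter::thinImage(image)
  processed <- handwriter::processHandwriting(thin, dim(image))
  # Keep the intermediate images, since the plotting functions need them
  list(image = image, thin = thin, processed = processed)
}

# Usage (replace the placeholder path with your own file):
# sample <- process_sample("path/to/the/picture.png")
# names(sample$processed)
```

Returning the image and thinned image alongside the processed list keeps everything the later plotting steps need in one place.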
                      

processHandwriting() returns a large amount of information about the document. It is worth exploring, and we've provided a list below of what each element means.

On a document level:

  • nodes | A list of all 'points of interest'
  • connectingNodes | A list of all nodes where glyphs connect
  • terminalNodes | A list of all nodes where a path in a glyph ends
  • breakPoints | A list of calculated points to break glyphs apart (based on connectingNodes)

As well as a letterList with an entry for each letter/glyph that includes:

  • path | A list of all points
  • nodes | A list of all 'points of interest'
  • allPaths | A list of lists of calculated 'paths'
  • adjMatrix | Adjacency matrix
  • letterCode | A unique letter code
  • connectingNodes | A list of all nodes where the glyph connects
  • terminalNodes | A list of all nodes where a path in the glyph ends
  • characterFeatures | A list of features calculated for the glyph:
    • aspect_ratio | Height to width ratio
    • height | Height of the glyph, measured in pixels
    • width | Width of the glyph, measured in pixels
    • topmost_row | The top-most row, as its y coordinate
    • bottom_row | The bottom-most row, as its y coordinate
    • leftmost_col | The left-most column, as its x coordinate
    • rightmost_col | The right-most column, as its x coordinate
    • centroid_index | The centroid of the glyph, as its index
    • centroid_y | The y coordinate of the centroid
    • centroid_x | The x coordinate of the centroid
    • centroid_horiz_location |
    • centroid_vert_location |
    • lHalf | List of all points on the left half of the glyph
    • rHalf | List of all points on the right half of the glyph
    • disjoint_centroids | The centroids of the left and right halves, as their indices
    • slope | The slope of the glyph as it runs through the centroid
    • pixel_density |
    • box_density |
    • uniqueid | A unique numerical identifier for the glyph
    • down_dist | Distance from the lowest point of a glyph to the next glyph, measured in pixels
    • line_number | The line number the glyph falls in
    • order_within_line | The position of the glyph within its line
    • l_neighbor_dist | Distance from the left-most point in the glyph to its left neighbor, measured in pixels
    • r_neighbor_dist | Distance from the right-most point in the glyph to its right neighbor, measured in pixels
    • xvar | Variance of X, used in calculating the covariance
    • yvar | Variance of Y, used in calculating the covariance
    • covar | Covariance of the glyph
    • wordIndex | Word number the glyph belongs to
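
Because characterFeatures is nested inside each glyph, it can be convenient to flatten a few fields into a data frame with one row per glyph. The following is a minimal sketch, not the package's own tooling. So that it runs without an image, it builds a small mock letterList by hand (the numbers are made up for illustration); with real output you would pass csafe_processlist$letterList instead. The helper name features_to_df is hypothetical.

```r
# Mock letterList with made-up values, mirroring only field names from
# the list above. With real data, use csafe_processlist$letterList.
mock_letterList <- list(
  list(characterFeatures = list(height = 40, width = 20, line_number = 1, wordIndex = 1)),
  list(characterFeatures = list(height = 30, width = 30, line_number = 1, wordIndex = 1)),
  list(characterFeatures = list(height = 25, width = 50, line_number = 2, wordIndex = 2))
)

# Hypothetical helper (not part of handwriter): pull the named scalar
# features out of every glyph and return one row per glyph.
features_to_df <- function(letterList, fields) {
  columns <- lapply(fields, function(field) {
    vapply(letterList, function(glyph) {
      as.numeric(glyph$characterFeatures[[field]])
    }, numeric(1))
  })
  names(columns) <- fields
  data.frame(glyph = seq_along(letterList), columns)
}

glyph_features <- features_to_df(
  mock_letterList,
  c("height", "width", "line_number", "wordIndex")
)
print(glyph_features)

# Count how many glyphs fall on each line
glyphs_per_line <- table(glyph_features$line_number)
```

The sketch only handles features that hold a single number; list-valued entries such as lHalf and rHalf would need separate handling.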

Exploring processed writing

Using the information returned from processHandwriting(), handwriter can plot a single letter (or glyph), a word, a sentence, or the entire document.

Be sure to save this information so that plotting works correctly:

                        csafe$nodes = csafe_processlist$nodes
                        csafe$breaks = csafe_processlist$breakPoints
                        dims = dim(csafe$image)
                      

Also included is the ability to plot individual glyphs from the sample of writing, using plotLetter(). First, let's look at the parameters and options, and then run through a few examples.

The parameters include:

  • letterList Object
  • The index of the glyph you wish to plot
  • dims Object
  • OPTIONAL: Boolean - Number the paths within the glyph
  • OPTIONAL: Boolean - Plot the centroid of the glyph
  • OPTIONAL: Boolean - Plot the slope of the glyph

The following plots the first glyph with all optional parameters enabled:

                        plotLetter(csafe_processlist$letterList, 1, dims)

                        # Note: leaving out the optional parameters is the same as:
                        # plotLetter(csafe_processlist$letterList, 1, dims, TRUE, TRUE, TRUE)
                      
c_all_features

This will plot the fifth glyph with just the slope and centroid:

                        plotLetter(csafe_processlist$letterList, 5, dims, FALSE, TRUE, TRUE)
                      
c_all_features

To plot words, a little bit of extra processing must be done:

                        words = create_words(csafe_processlist)
                        words_after_processing = process_words(words, dim(csafe$image), TRUE)
                      

Then you can plot just the word with plotWord():

                        plotWord(csafe_processlist$letterList, 1, dims)
                        

Or, optionally, use the plotColorNodes() function to show some additional information:

                        plotColorNodes(csafe_processlist$letterList, 1, dims, words_after_processing)
                        
Writing_csafe_single

Plot a line, where the second parameter is the line_number:

                        plotLine(csafe_processlist$letterList, 1, dims)
                      

Plot the original, cropped image

                        plotImage(csafe$image)
                      
Writing_csafe_single

Plot the thinned image

                        plotImageThinned(csafe$image, csafe$thin)
                      
Writing_csafe_single

Plot all nodes found during processing

                        plotNodes(csafe$image, csafe$thin, csafe$nodes)
                      
Writing_csafe_single

Plot all glyph breaks found during processing

                        plotNodes(csafe$image, csafe$thin, csafe$breaks)
                      
Writing_csafe_single

Research Pipelines

K-means Clustering | Perform K-means clustering on a glyph level

Triangle Decomposition | Perform triangle decomposition on a word level