Description:
In the domain of copyright law that deals with fictional works, issues of nonliteral
copying have been quite contentious. The focus has been on how to protect the public
domain against monopolies of ideas that serve as fodder for creative writings, while
also providing adequate protection for authors' expressions of ideas in order to incentivize future work. Several judges have developed tests that can be applied to fictional
works; however, these tests are rather abstract and rely on the discretion of those involved
in individual court cases. With this in mind, I set out to develop an automated
method that seeks to identify unique expressions of ideas in literary works. Drawing
from discussions of nonliteral copying in the context of copyright infringement, I define expressions hereafter as patterns composed of the following literary components:
writing style, character personalities, plot themes, plot developments, and relationships
between characters. The method I propose as a tool for detecting nonliteral copying
is a data visualization. This method relies on computational linguistics and also on
the power of data visualization to uncover otherwise obscured patterns of expression
through the use of color, layers, and small multiples. The efficacy of the linguistic
analysis and data visualization is judged by its ability to accurately identify important
characters, concepts, and plot developments in works analyzed in isolation. Additionally, the
efficacy of the data visualization as a tool for identifying nonliteral copying is analyzed
using works written by the same author, followed by a discussion of its application to works
by distinct authors.