stoRy is a Tidyverse-friendly package for downloading, exploring, and analyzing Literary Theme Ontology (LTO) data in R.
# Install the released version of stoRy from CRAN with:
install.packages("stoRy")
# Or the developmental version from GitHub:
# install.packages("devtools")
devtools::install_github("theme-ontology/stoRy")
The easiest way to get started with stoRy is to use the LTO demo version data. It consists of the themes and the 335 thematically annotated stories of the American media franchise The Twilight Zone, taken from the latest LTO version.
Begin by loading the stoRy package:
library(stoRy)
The LTO demo version is loaded by default:
which_lto()
Get a feel for the demo data by printing some basic information about it to the console:
print_lto()
See the demo data help page for a more in-depth description:
?`lto-demo`
Thematically annotated stories are initialized by story ID. For example, run
story <- Story$new(story_id = "tz1959e1x22")
to initialize a Story object representing the classic The Twilight Zone (1959) television series episode The Monsters Are Due on Maple Street.
Story thematic annotations, along with episode-identifying metadata, are printed to the console in either the default or the standard .st.txt format:
story
story$print(canonical = TRUE)
There are two complementary ways of going about finding story IDs. First, the LTO website story search box offers a quick-and-dirty way of locating LTO developmental version story IDs of interest. Since story IDs are stable, developmental version The Twilight Zone story IDs can be expected to agree with their demo data counterparts. Alternatively, a demo data story ID is directly obtained from an episode title as follows:
# install.packages("dplyr")
library(dplyr)
title <- "The Monsters Are Due on Maple Street"
demo_stories_tbl <- clone_active_stories_tbl()
story_id <- demo_stories_tbl %>% filter(title == !!title) %>% pull(story_id)
story_id
The dplyr package is required to run the %>%-mediated pipeline.
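The !! in the filter() call matters because the local variable title shares its name with the title column; it forces the variable's value to be used in the comparison. The same lookup can also be sketched in base R without dplyr. The data frame below is a hypothetical two-row stand-in for the tibble returned by clone_active_stories_tbl(), and its second row is invented for illustration:

```r
# Hypothetical stand-in for the demo stories tibble; the second row is made up.
demo_stories_tbl <- data.frame(
  story_id = c("tz1959e1x22", "example_id"),
  title = c("The Monsters Are Due on Maple Street", "An Invented Title"),
  stringsAsFactors = FALSE
)

# A distinct variable name avoids any clash with the title column.
wanted_title <- "The Monsters Are Due on Maple Street"

# Base R equivalent of filter(title == !!title) %>% pull(story_id).
story_id <- demo_stories_tbl$story_id[demo_stories_tbl$title == wanted_title]
story_id
```

Using a variable name that differs from every column name sidesteps the need for !! altogether.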
A tibble of thematic annotations is obtained by running:
themes <- story$themes()
themes
The Monsters Are Due on Maple Street is a story about how mass hysteria can transform otherwise normal people into an angry mob. To view the mass hysteria theme entry, initialize a Theme object with the theme_name argument defined accordingly:
theme <- Theme$new(theme_name = "mass hysteria")
theme
theme$print(canonical = TRUE)
To view a tibble of all demo data stories featuring mass hysteria, run:
theme$annotations()
As with story IDs, there are two ways to look for themes of interest. Developmental version themes are searchable from the LTO website theme search box. Demo version themes can be explored in tibble format. For example, here is one way to search for mass hysteria directly in the demo themes:
# install.packages("stringr")
library(stringr)
demo_themes_tbl <- clone_active_themes_tbl()
demo_themes_tbl %>% filter(str_detect(theme_name, "mass"))
Notice that all themes containing the substring "mass" are returned.
Each story belongs to at least one collection (i.e., a set of related stories). The Monsters Are Due on Maple Street, for instance, belongs to two collections:
story$collections()
To initialize a Collection object for The Twilight Zone (1959) television series, of which The Monsters Are Due on Maple Street is an episode, run:
collection <- Collection$new(collection_id = "Collection: tvseries: The Twilight Zone (1959)")
Collection info is printed to console in the same way as with stories and themes:
collection
collection$print(canonical = TRUE)
In general, developmental version collections can be explored from the LTO website story search box or through the package in the usual way:
demo_collections_tbl <- clone_active_collections_tbl()
demo_collections_tbl
The LTO thematically annotated story data can be analyzed in various ways.
To view the top 10 most featured themes in The Twilight Zone (1959) series, run:
collection <- Collection$new(collection_id = "Collection: tvseries: The Twilight Zone (1959)")
result_tbl <- get_featured_themes(collection)
result_tbl
To view the top 10 most featured themes in the demo data as a whole, run:
result_tbl <- get_featured_themes()
result_tbl
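As a rough illustration of what "most featured" can mean, the base R sketch below counts how many stories feature each theme and keeps the top entries. It is not taken from the package source: the story IDs, the themes assigned to them, and the counting rule (each theme counted at most once per story) are assumptions made for illustration.

```r
# Toy annotations; story IDs and theme assignments are invented.
annotations <- data.frame(
  story_id = c("s1", "s1", "s2", "s2", "s3", "s3", "s3"),
  theme = c("mass hysteria", "fear", "fear", "time travel",
            "fear", "mass hysteria", "fear"),
  stringsAsFactors = FALSE
)

# Drop repeated (story, theme) pairs so a theme counts once per story.
unique_pairs <- unique(annotations)

# Tally stories per theme and keep the two most featured themes.
theme_counts <- sort(table(unique_pairs$theme), decreasing = TRUE)
top_themes <- head(theme_counts, 2)
top_themes
```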
To view the top 10 most enriched, or over-represented, themes in The Twilight Zone (1959) series with all The Twilight Zone stories as background, run:
test_collection <- Collection$new(collection_id = "Collection: tvseries: The Twilight Zone (1959)")
result_tbl <- get_enriched_themes(test_collection)
result_tbl
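Over-representation of a theme is commonly scored with a hypergeometric test, which asks how surprising it is to see a theme so often in the test collection given its frequency in the background. The base R sketch below illustrates that idea only; whether get_enriched_themes() computes exactly this statistic is an assumption, and all counts are invented.

```r
# All counts below are invented for illustration.
background_size <- 100  # stories in the background collection
theme_in_bg     <- 10   # background stories featuring the theme
test_size       <- 20   # stories in the test collection
theme_in_test   <- 6    # test collection stories featuring the theme

# Number of test stories expected to feature the theme by chance alone.
expected <- test_size * theme_in_bg / background_size

# P(X >= theme_in_test) when test_size stories are drawn from the
# background without replacement.
p_value <- phyper(theme_in_test - 1, theme_in_bg,
                  background_size - theme_in_bg, test_size,
                  lower.tail = FALSE)
p_value
```

A small p-value indicates that the theme appears in the test collection more often than its background frequency would suggest.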
To run the same analysis without counting minor level themes, run:
result_tbl <- get_enriched_themes(test_collection, weights = list(choice = 1, major = 1, minor = 0))
result_tbl
To view the top 10 The Twilight Zone franchise stories most thematically similar to The Monsters Are Due on Maple Street, run:
query_story <- Story$new(story_id = "tz1959e1x22")
result_tbl <- get_similar_stories(query_story)
result_tbl
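Thematic similarity between stories can be illustrated with cosine similarity over binary theme vectors. The base R sketch below is a toy example under that assumption: the story names and theme sets are invented, and the similarity measure actually used by get_similar_stories() may differ.

```r
# Toy theme sets; story names and themes are invented for illustration.
story_themes <- list(
  query = c("mass hysteria", "fear", "alien invasion"),
  a = c("mass hysteria", "fear"),
  b = c("time travel", "nostalgia")
)

# Build a binary story-by-theme matrix over the shared vocabulary.
vocab <- sort(unique(unlist(story_themes)))
m <- t(sapply(story_themes, function(x) as.numeric(vocab %in% x)))
colnames(m) <- vocab

# Cosine similarity between two numeric vectors.
cosine <- function(u, v) sum(u * v) / sqrt(sum(u^2) * sum(v^2))

# Score every candidate story against the query and rank them.
scores <- sapply(c("a", "b"), function(id) cosine(m["query", ], m[id, ]))
ranked <- sort(scores, decreasing = TRUE)
ranked
```

Story a shares two of the query's three themes and ranks first, while story b shares none and scores 0.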
Cluster The Twilight Zone franchise stories according to thematic similarity:
# install.packages("isa2")
library(isa2)
set.seed(123)
result_tbl <- get_story_clusters()
result_tbl
The command set.seed(123) is run here for the sake of reproducibility.
Explore a cluster of stories related to traveling back in time:
cluster_id <- 3
pull(result_tbl, stories)[[cluster_id]]
pull(result_tbl, themes)[[cluster_id]]
Explore a cluster of stories related to mass panics:
cluster_id <- 5
pull(result_tbl, stories)[[cluster_id]]
pull(result_tbl, themes)[[cluster_id]]
Explore a cluster of stories related to executions:
cluster_id <- 7
pull(result_tbl, stories)[[cluster_id]]
pull(result_tbl, themes)[[cluster_id]]
Explore a cluster of stories related to space aliens:
cluster_id <- 10
pull(result_tbl, stories)[[cluster_id]]
pull(result_tbl, themes)[[cluster_id]]
Explore a cluster of stories related to old people wanting to be young:
cluster_id <- 11
pull(result_tbl, stories)[[cluster_id]]
pull(result_tbl, themes)[[cluster_id]]
Explore a cluster of stories related to wish making:
cluster_id <- 13
pull(result_tbl, stories)[[cluster_id]]
pull(result_tbl, themes)[[cluster_id]]
The package works with data from these LTO versions:
lto_version_statuses()
To download and cache the latest versioned LTO release, run:
configure_lto(version = "latest")
This can take a while.
Load the newly configured LTO version as the active version in the R session:
set_lto(version = "latest")
To double-check that it has been loaded successfully, run:
which_lto()
Now that the latest LTO version is loaded into the R session, its stories and themes can be analyzed in the same way as the demo LTO version data shown above.
If you encounter a bug, please file a minimal reproducible example on GitHub issues. For questions and other discussion, please post on the GitHub discussions board.
All code in this repository is published with the GPL v3 license.