Introduction

Chris Claremont began writing the Uncanny X-Men series with issue 94 and wrote the popular Marvel comic from 1975 to 1991 (Chris Claremont). The X-Men are mutant superheroes who traverse worlds, fighting to protect humans and mutants alike. The geographic setting shifts constantly across Claremont’s issues, but he frequently reused the same cities, states, countries, or parts of the world. I was interested in exploring which of these areas appear most often in his issues. Using GIS software, this project creates a heat map that visualizes the (Earth) locations the X-Men visited most often during Claremont’s run.
Sources
Claremontrun is an open-source GitHub repository that holds numerous datasets compiled as part of a project tracking Claremont’s X-Men publications. The project has collected seven datasets, each tracking a different aspect of his issues, for example: character information, such as their actions; the locations that appear in each issue; data about issue covers; and Bechdel test data. My project made use of the Locations dataset. The Location data originally had only four columns: “# Issue”, “Location”, “Context” (e.g. “Present”, “Dream”, or “Past”), and “Notes”. It was formatted as an XLSX file containing 1,592 rows. The “Location” values are strings that either describe a specific location (e.g. “Jean Grey’s Apartment, Greenwich Village, NYC”) or only vaguely give the setting (or object) that the characters are in (e.g. “Asteroid M, Space” or “Plane flying over Canada”). Strings like these were too difficult to parse in Excel or OpenRefine, so I converted the file to a CSV and wrangled it using R/RStudio. I used ArcGIS Online to create my heat map and supplemental pie chart.
Presentation
I decided to use a simple website to share my information. I created an Instant App through ArcGIS and embedded it in my website. For some reason, the app shows only the chart at first, but viewers can navigate to the map themselves or click the provided link to an external view. I placed my map and text side by side, since it might be helpful to read about the process while glancing at the map to see the direct results of my decisions. I’ve also included my data at the bottom of the page so that others can use it themselves. The original data can be found on GitHub.
Data Processes
There was a lot to do when it came to cleaning up this data. For starters, I removed the “Notes” column, as it contained no information relevant to my goals, and narrowed the scope of “Context” so that I was focusing only on “Present” locations. I reasoned that past or dream contexts might refer to locations that belong to the X-Men series but not to the issues Claremont wrote. Furthermore, I came up with a few criteria for which location data I could keep, based on clarity and on whether a location could actually be recognized and mapped in ArcGIS Online. I used keywords and string detection to process the data in this way. I removed any location description containing a word related to Space, Planet, Orbit, Spaceship, Aircraft, Plane, Dimension, or some sort of motion. Next, I wanted to standardize most of the locations so that a location such as “Stornoway, Scotland’s North-West Coast” would become “Stornoway, Scotland”. For this, I used spacyr, an R package with Named Entity Recognition (NER), which recognized and extracted the location names of cities, states, or countries. This step was where I ran into problems with OpenRefine and decided to switch to a tool I was more familiar with. These choices left me with a much smaller but cleaner dataset: 582 rows of mappable data. My final dataset had four columns: “Issue #”, “Location”, “Context”, and “Place”, which stores all the standardized location data.
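The context filter and keyword removal described above can be sketched roughly as follows. This is an illustration in Python, not the R code used for the project. The function name, the issue numbers in the sample rows, and the use of "flying" as a stand-in for "some sort of motion" are all assumptions. The spacyr NER step is not shown.

```python
import re

# Keywords named in the text. "flying" is an assumed stand-in for
# "some sort of motion"; the real keyword list may have differed.
UNMAPPABLE = re.compile(
    r"\b(?:space|planet|orbit|spaceship|aircraft|plane|dimension|flying)s?\b",
    re.IGNORECASE,
)


def keep_mappable(rows):
    """Keep rows set in the present whose location has no unmappable keyword."""
    kept = []
    for row in rows:
        # Strip whitespace as a precaution in case values carry stray spaces.
        if row["Context"].strip() != "Present":
            continue
        if UNMAPPABLE.search(row["Location"]):
            continue
        kept.append(row)
    return kept


# Hypothetical sample rows; the issue numbers are made up for illustration.
sample = [
    {"Issue": 1, "Location": "Jean Grey's Apartment, Greenwich Village, NYC", "Context": "Present "},
    {"Issue": 2, "Location": "Asteroid M, Space", "Context": "Present"},
    {"Issue": 3, "Location": "Plane flying over Canada", "Context": "Present"},
    {"Issue": 4, "Location": "Stornoway, Scotland's North-West Coast", "Context": "Past"},
]
cleaned = keep_mappable(sample)
```

Matching on whole words keeps "plane" from also catching unrelated words that merely contain those letters, at the cost of missing unusual spellings.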
If only the pie chart is showing, find the small map viewer button in the top right corner. Or, click this.
View my cleaned datasets below.
Final Location was used for mapping purposes, and Location Count was used for the chart.
Mapping Processes
With ArcGIS Online, I chose to create a heat map because I thought it would be the best way to visualize and represent the most frequent location settings. Additionally, I kept an individual point layer that overlays the heat map so that low-frequency locations could still be seen easily. It’s important to acknowledge that, due to some of my earlier data-sorting decisions, the locations are more generalized than they originally were, so unique locations (like a specific city park) might have been grouped into their city. Very high point accuracy was not as important for my goal; rather, I made sure that the frequency of an area is visualized well. My pie chart also groups locations whose frequency is 1 into an “Other” category.
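The pie chart grouping can be illustrated with a short Python sketch. It assumes that “Other” holds the number of locations that appear exactly once; the function name and the sample place values are hypothetical, and the actual chart was built in ArcGIS Online.

```python
from collections import Counter


def pie_slices(places):
    """Count each place and fold places that occur once into "Other"."""
    counts = Counter(places)
    # Places seen more than once keep their own slice.
    slices = {place: n for place, n in counts.items() if n > 1}
    # Each single-occurrence place contributes 1 to the "Other" slice.
    singles = sum(1 for n in counts.values() if n == 1)
    if singles:
        slices["Other"] = singles
    return slices


# Hypothetical standardized "Place" values, not taken from the dataset.
places = ["New York", "New York", "Stornoway, Scotland", "Tokyo",
          "New York", "Tokyo", "Cairo"]
slices = pie_slices(places)
```

Folding the single-occurrence places together keeps the chart readable while the point layer on the map still shows each of them individually.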
Significance
Visualizing the locations from X-Men issues allows readers to perceive information that’s not necessarily clear when reading issue by issue. The larger scope that this map covers highlights some of Claremont’s patterns and lets fans understand the writer and the series better. It could also make issues with unique locations more interesting or mysterious. In the claremontrun repository, there’s an example of a bar chart that attempts to visualize location counts. Although the chart holds some information, it feels underwhelming, and it is hard to tell where some of these locations even are. Using digital tools to map this sort of information is not only more engaging but also much more effective at representing the data and informing the viewer.