The second cohort of data visualization interns is off and running here at the Digital Projects Studio. They will be sharing the projects they are working on very soon. But, as they are getting up to speed, I want to take a minute to reflect on learning about data visualization and technology in general. Recently, in our Tech and Texts seminar series, we read some selections of Wilkinson’s The Grammar of Graphics (an interesting book which formed the basis for the R plotting library ggplot). He begins with an insightful reflection on the difference between graphics and charts:
Opinions tend to reflect feelings as well as beliefs. Sentiment analysis, also known as opinion mining, is a technique used today for generating data on trends in people’s attitudes and feelings on anything from products and services to current events. This data is created by calculating sentiment scores using what people have said or written. Despite the efforts of computer scientists, semanticists and statisticians to figure out ways to program computers to identify the feelings expressed in words, the technique of sentiment analysis is still at best only reliable as a starting point for closer readings.
The results of sentiment analysis can quickly become misleading if presented without any reference to the actual passages of text that were analyzed. Nevertheless, it is helpful as a technique for delving into large corpora and collections of unstructured texts to capture trends and shifts in sentiment intensity.
For a final collaborative project of the academic year 2015-2016, our team at the Digital Projects Studio decided to take on the challenge of visualizing the intensity of emotions and opinions expressed during the 2016 primary election debates. (Click here to see the final product). Our dataset was a set of complete transcripts for twelve Republican and eight Democratic debates. To process the data, we filtered out interventions of moderators and interjections from the audience, ran the statements of each candidate through a sentiment analyzer from Python’s NLTK (Natural Language ToolKit) library, and indexed the statements of each candidate by debate number, numeric sentiment score, and sentiment category.
A recent project I’ve been working on in the Digital Projects Studio has been moving a website built in PHP to Django. To understand why we’d go through the headache of moving the site into Django it’d be good to first understand some of the scope of the project.
This post is a follow-up to the introduction to the Field Notebook and the demo notebook, ‘Monumental Gifts’. I will go over how to install the app and start customizing your own web-based Field Notebook. This post will focus on how to start tailoring the models and appearance of your Notebook to suit your research needs. If you are interested (or discover later that you are interested) in building your own original application from scratch, I recommend working through the Beginner’s Tutorial on Django’s website. In fact, even if you don’t plan on building your own application, I still recommend the tutorial. You’ll have a better understanding of how to modify and use your Field Notebook if you become familiar with how Django works as a framework.
Installing the app
Our team at the Digital Projects Studio is excited to present our web-based ‘field notebook’, designed with humanities and social science researchers in mind. We wanted to offer field researchers a reusable application with enough structure for ease of use but also with options for further customization according to individual needs. The main purpose of the application is to permit researchers to continually add digital objects, then retrieve and automatically group these objects in different ways as their field collections grow larger. Automated grouping is made possible by requiring the user to add some basic metadata to a digital object as soon as they enter it into the application’s database. The kind of metadata used can also be customized according to the user’s needs.
When first learning how to integrate Bootstrap and Django, I wasn’t able to find a quick cheat sheet to reference without visiting several different documentation pages. To help others, I’ve put together a list below of the tags I used most often. A full list of Django tags and filters can be found here.
In this blog, we will cover the basics of creating a bar chart using a given set of data in D3.js.
What you will need:
Step 1: Create CSV file
Prepare some sample data in Excel and save it as a CSV file. Here is some unemployment data for the US in the month of January from 2005 to 2015.
Here, Year and Jan, the headers of the two columns, will act as the property names of each data row when you load the file.
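To make the idea of headers becoming properties concrete, here is a minimal sketch in plain JavaScript. It is not how D3 parses files internally; `parseSimpleCsv` is a hypothetical helper that only handles simple, unquoted CSV, and the sample values are placeholders rather than real unemployment figures.

```javascript
// Placeholder CSV text with the same headers as the tutorial file.
// The numbers are illustrative only, not real data.
const csvText = "Year,Jan\n2005,1\n2006,2";

// Hypothetical helper: turns simple, unquoted CSV text into an array of objects.
// The header row supplies the property names for every row object.
function parseSimpleCsv(text) {
  const [headerLine, ...rowLines] = text.trim().split("\n");
  const headers = headerLine.split(",");
  return rowLines.map((line) => {
    const cells = line.split(",");
    const row = {};
    headers.forEach((header, i) => {
      row[header] = cells[i];
    });
    return row;
  });
}

const rows = parseSimpleCsv(csvText);
// Each row can now be read by header name.
console.log(rows[0].Year, rows[0].Jan);
```

Note that the values come back as strings, which is also what d3.csv gives you by default, so convert them with `Number(...)` or a leading `+` before doing arithmetic.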
Step 2: Create HTML file
Here is the basic template to start off your HTML file. Make sure to save this HTML file in the same folder as your CSV file.[gist https://gist.github.com/noureend/a4687f25d5c0021d63ad]
To load the data, you will need to serve these files from a web server, because most browsers will not load a local CSV file when the page is opened directly from disk. I am using XAMPP.
The function to fetch the data to D3 is:[gist https://gist.github.com/noureend/3af22c1e6f00bee9d12c]
Because our data is in a CSV file, we call d3.csv. If your data is in JSON format, you could call d3.json instead.
Then, we specify our arguments.
- The first argument is the path to the data file. Since my data file is in the same folder as the HTML file, I can just specify the name of the data file.
- The second argument is a callback function.
Step 4: Create an SVG container
We specify the basic size of our SVG container using the attr function.
We create a variable called canvas, which becomes a shortcut for the selection built by the code on the right of the equals sign.[gist https://gist.github.com/noureend/2967002b5eac58921e13]
Step 5: Creating Bars
It’s now time to add our bars for the bar graph.[gist https://gist.github.com/noureend/38e6f123a9a0cbd8e3e1]
We refer back to the data that was passed as an argument to our callback function, which in turn holds the rows stored in our file.
Next, using the enter method we will append a rectangle for each data element and give it some properties (width, height, y position, and color).
You will notice that the width and y position are functions. The width function receives each data element as the ‘d’ variable, so you can specify which data property to use. The “* 10” multiplies the value by 10 so that the bars are large enough to see.
The ‘y’ attribute is a function of the index. We return the index of each data element multiplied by 50, so each bar sits 50 units below the previous one.
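The arithmetic behind those two functions can be sketched without D3 at all. The snippet below is an illustration, not the code from the gist: `barGeometry` is a hypothetical helper, the sample values are placeholders, and it assumes the bar width is read from the `Jan` property.

```javascript
// Placeholder rows shaped like the loaded CSV (values are illustrative only).
const sampleData = [
  { Year: "2005", Jan: "1" },
  { Year: "2006", Jan: "2" },
  { Year: "2007", Jan: "3" },
];

// Hypothetical helper mirroring the two accessor functions:
// width depends on the data value (d), y depends on the index (i).
function barGeometry(data, scale = 10, spacing = 50) {
  return data.map((d, i) => ({
    width: Number(d.Jan) * scale, // assumed property: Jan, scaled by 10
    y: i * spacing,               // one bar every 50 units down the page
  }));
}

console.log(barGeometry(sampleData));
```

In the D3 version, the same two expressions would be the bodies of the functions passed to attr for "width" and "y".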
Step 6: Adding text to the bars[gist https://gist.github.com/noureend/0c4c53d728117cc6a90c]
To add text, we will append “text” and specify the color of the text. The most important thing here is perhaps the ‘y’ attribute. You want the text for each bar to be at the same position as the bar so that you can see which text belongs where. Therefore, similar to the above, copy the ‘y’ attribute (the vertical position for the text).
Lastly, you want to specify what text you want to have. So, let the text property be a function of data and return the ‘Month’ property.
Now, open the file in your browser and you should see this: