scatterplot with crossbars reflecting the mean


Here's a scatterplot representing the distribution of pupil diameters for a sample of young adults exposed to different light intensities. The three conditions are darkness, mid-range illuminance, and very bright ambient light. I've added a black crossbar representing the mean pupil diameter for each condition.

ggplot(scat.raw.melt, aes(variable, value, shape=variable, fill=variable)) + 
  geom_point(shape=21, color="black", size=2.3, alpha=.6, position = position_jitter(w = 0.03, h = 0.0)) + 
  scale_fill_manual(values=newvec) + theme_bw() + 
  theme(panel.grid.minor = element_blank(), panel.grid.major = element_blank(), panel.background = element_blank())+ 
  ylab("Raw Pupil Size (mm)") + theme(legend.position="none") +
  theme(axis.title.x=element_blank()) + 
  # fun / fun.min / fun.max replace the deprecated fun.y / fun.ymin / fun.ymax (ggplot2 >= 3.3.0)
  stat_summary(fun = mean, fun.min = mean, fun.max = mean, geom = "crossbar", color = "black", size = 0.4, width=0.3)  +
  theme(panel.border = element_blank()) + theme(axis.line = element_line(colour = "black")) 


Scaling axes and axis ticks 


I admit it. I struggle with rescaling axes (limits, tick marks, whatnot) in ggplot2.  Here's an effort to demystify the process.  Let's first create some bogus data. To the right is a code snippet that does just that by creating a dataframe (dat) composed of 2 columns, each populated with 100 observations sampled between 0 and 100 with replacement.  If the stars have aligned, you'll have created a dataframe that looks something like the one on the right.
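In case the screenshot doesn't come through, here's a minimal sketch of what that snippet might look like. The column names X1 and X2 come from the text; the seed and the use of sample() are my assumptions, not necessarily what the original code did.

```r
# Assumption: a fixed seed so the bogus data are reproducible
set.seed(42)

# Two columns, each with 100 integers drawn from 0-100 with replacement
dat <- data.frame(
  X1 = sample(0:100, 100, replace = TRUE),
  X2 = sample(0:100, 100, replace = TRUE)
)

head(dat)  # peek at the first few rows
```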


Now let's generate a scatterplot of X1 by X2.  Write the scatterplot to the variable name,  'futzing'.
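A minimal sketch of that step, assuming the bogus dataframe is called dat with columns X1 and X2 as described above. The seed and the plain geom_point() defaults are my guesses rather than the original code.

```r
library(ggplot2)

# Recreate the bogus data so this block runs on its own (seed is an assumption)
set.seed(42)
dat <- data.frame(
  X1 = sample(0:100, 100, replace = TRUE),
  X2 = sample(0:100, 100, replace = TRUE)
)

# Store the scatterplot of X1 by X2 under the name 'futzing'
futzing <- ggplot(dat, aes(x = X1, y = X2)) +
  geom_point()

futzing  # printing the object draws the plot
```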

You run the script, and the plot looks just like you would expect ... simple random scatter. Suddenly, your boss calls you into her office.  "What did I tell you about your axes?" she queries with thinly veiled contempt.   

You gasp. Your x-axis breaks are 25 units apart.  The horror!  


What you really want is an x-axis with breaks every 10 units, starting at the origin (0,0) and stopping at the maximum value. Here's a code snippet that will do that. Before and after scatterplots are below.
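One way to get there is scale_x_continuous() with explicit breaks and limits. This is a sketch under the same assumptions as before (bogus data in dat, plot stored in futzing); the names x_breaks and futzing_rescaled are mine.

```r
library(ggplot2)

# Bogus data and base plot, rebuilt here so the block is self-contained
set.seed(42)
dat <- data.frame(
  X1 = sample(0:100, 100, replace = TRUE),
  X2 = sample(0:100, 100, replace = TRUE)
)
futzing <- ggplot(dat, aes(x = X1, y = X2)) + geom_point()

# Breaks every 10 units from 0 up to the largest observed X1 value
x_breaks <- seq(0, max(dat$X1), by = 10)

futzing_rescaled <- futzing +
  scale_x_continuous(breaks = x_breaks, limits = c(0, max(dat$X1)))

futzing_rescaled
```

Setting limits as well as breaks matters here: breaks only control where the ticks go, while limits control where the axis starts and stops.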


just a tinge of jitter


I'm fond of this simple little facet_wrap scatterplot. I think it's much more informative than a bar chart or boxplot. The y-axes are free to scale independently in each facet (scales="free_y"). I also passed a custom color vector to ggplot2.

ggplot(scat.melt, aes(variable, value, shape=variable, fill=variable)) + geom_point(shape=21, color="black", size=2.3, alpha=.6, position = position_jitter(w = 0.17, h = 0.0)) + 
  scale_fill_manual(values=coltry) + facet_wrap(~Condition, scales="free_y") + theme_bw() + 
  theme(panel.grid.minor = element_blank(), panel.grid.major = element_blank(), panel.background = element_blank())+ ylab("Pupil Dilation (mm)") 


A simple dendrogram


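Scrunch the branches of a cluster dendrogram:

library(dendextend)  # supplies set(), the %>% pipe, and cutree() for dendrograms
# NOTE: the name of the dendrogram object was lost from the original post; 'dend' is a placeholder
dend <- as.dendrogram(t1.clust)
cutree(dend, k=7) 
t1.test <- dend %>% 
  set("labels_cex", 0.75) %>%
  set("labels_col", k=7, value = cust.r.7) %>%
  set("branches_lwd", 1) %>%
  set("branches_k_color", k=7, value = cust.r.7)
par(mar=c(12, 0.5, 0.5, 1))
t1.plot <- plot(t1.test, horiz=F, axes=F)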


Scatterplot with a gam smoothing function


Here's a clean scatterplot of naming ability vs. global cognition in a group of neurotypical older adults. The R Script is here. This was relatively straightforward using base R's 'pretty' function inside ggplot2 to automagically scale the x-axis (Montreal Cognitive Assessment Score). I manually scaled the y-axis and changed the default colors of the points, fill, and trendline.
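Here's a rough sketch of that recipe on simulated data. The column names (moca, naming), the colors, the axis labels, the y-axis limits, and the k = 5 basis size are all stand-ins I made up for illustration; the linked script is the real thing.

```r
library(ggplot2)

# Simulated stand-in data; column names and values are hypothetical
set.seed(1)
d <- data.frame(moca = sample(20:30, 60, replace = TRUE))
d$naming <- 40 + 0.6 * d$moca + rnorm(60, sd = 2)

p <- ggplot(d, aes(x = moca, y = naming)) +
  geom_point(shape = 21, fill = "steelblue", color = "black", size = 2.5) +
  # GAM trendline (needs the mgcv package); k = 5 keeps the spline basis small
  geom_smooth(method = "gam", formula = y ~ s(x, k = 5),
              color = "firebrick", fill = "grey80") +
  # pretty() picks round-number breaks spanning the observed scores
  scale_x_continuous(breaks = pretty(d$moca)) +
  # y-axis scaled by hand
  scale_y_continuous(limits = c(40, 70), breaks = seq(40, 70, by = 5)) +
  xlab("Montreal Cognitive Assessment Score") +
  ylab("Naming score") +
  theme_bw()

p
```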





Building a multiplot correlation matrix

Sometimes I take the easy way out and export plots out of R to do finer-grained aesthetics in Adobe Illustrator. Mock me if you must. I just built up a multi-panel correlation matrix using four separate calls to the 'corrplot' function. Step 1: create raw correlation matrices. They look ugly right out of R, like the matrix on the bottom left. Step 2: export into Illustrator, where there are 2 key steps (embed and ungroup). These will allow you to edit the correlation matrix, change parameters, etc. Eventually, you can build the plot up to a nice(r) looking matrix like the one below.


facet multiplot of semantic scale ratings for bilingual Spanish-English vs monolingual English speakers using a custom theme in ggplot2

Here are plots for color, sound, etc for words rated by two groups (bilingual/monolingual). This uses a custom theme I just finagled. It's sort of pretty in how minimal it is.  Here's the script.

The theme part  is:  theme(legend.title=element_blank(), axis.line = element_line(colour = "black"), panel.background = element_blank(), panel.grid.major = element_line(colour = "gray91", size=0.1))


Interpolating & smoothing a continuous time series of pupillary dilation with switch events annotated


Here's one minute of continuous recording of pupillary dilation as the participant hears tones that shift in frequency at the markers. This nicely illustrates pupil spikes when change occurs.  Courtesy of Ally Dworetsky - our summer intern phenom. Script here -- includes interpolation, smoothing, and plotting parameters.



This monstrosity is called a tanglegram. It contrasts two hierarchical cluster dendrograms. In this case, the dendrograms represent English and Spanish clusters generated for translation equivalents among bilinguals (N=20) when rating the same set of words on color, size, emotion, distance, sound.  The 'tangles' show how meaning "remaps" when switching between languages.

I generated the clusters using K-means partitioning and colored the branches of the dendrogram by clusters. Here's the script.  Here are the data.




Human pupillary response functions to "dirty" words

Here's the result of a time series analysis reflecting dilation of the pupil for a sample of 21 adults as they heard neutral words, technical terms for body parts, or profanity. There was some slight funkiness with this in terms of plotting a range and getting ggplot to recognize custom colors.

R-script here


Correlogram of Ratings and Reaction Times to English Profanity

Here's a correlogram. This is simply a visual depiction of a correlation matrix. These are Pearson correlations. The variables are ordered by similarity using the hclust ordering option (order = "hclust") of the corrplot package. This was a little funky because I didn't like the built-in color scale (1 was blue), so I reversed it by manually passing a new color palette. Here's the R script.
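The palette flip can be sketched like this. The data are simulated and the variable names are hypothetical; the palette endpoints (blue for -1, red for +1) are my choice of a reversed scale, not necessarily the colors in the linked script.

```r
# Simulated stand-in ratings; variable names are hypothetical
set.seed(3)
ratings <- as.data.frame(matrix(rnorm(200), ncol = 5))
names(ratings) <- c("valence", "arousal", "offensiveness", "familiarity", "rt")

# Pearson correlation matrix
cors <- cor(ratings, method = "pearson")

# Custom palette running from blue (r = -1) through white to red (r = +1),
# i.e. the reverse of a scale where +1 is blue
rev_pal <- colorRampPalette(c("blue", "white", "red"))(200)

# Draw the correlogram only if corrplot is installed
if (requireNamespace("corrplot", quietly = TRUE)) {
  corrplot::corrplot(cors, method = "color", order = "hclust", col = rev_pal)
}
```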


Histogram of Common Noun Ratings as Candidates for Novel English Profanity

Here's a fun little histogram. This reflects counts for the distribution of Likert-scale ratings (x-axis) for 21 adults who judged whether a common noun combines well with existing English profanity to form a novel emergent profane term. I changed the bin width here and specified counts on the y-axis. Here are the data and the script.


3d Scatterplot using the rgl package

Here's a 3d plot representing how the meanings of abstract and concrete nouns cluster in a semantic space bounded by three dimensions.  

I used the rgl package.   Once the plot is generated it allows the user to rotate to an optimal plane. 


Annotated R code here


scatterplot with Loess smoothing function


Here's a scatterplot of concreteness by imageability ratings from the MRC Psycholinguistic database with a Loess smoothing function.  

Annotated R code here


Using the facet wrap function for multiple plots


Here we have multiple plots. ggplot2 uses the facet_wrap function to arrange plots this way. The program breaks the data into subplots based on the factor a user specifies (in this case language). These data are from a study we are on the verge of submitting. Participants made forced-choice guesses about whether aurally presented words in unfamiliar languages (e.g., Arabic, Dutch, Hebrew, Hindi, Korean, Russian) represented abstract or concrete concepts. Most people were remarkably above chance even after we eliminated cognates from the mix. The shaded rectangle represents a range of approximate chance responding.
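For the curious, here's a bare-bones sketch of faceting by language with a shaded chance band. Everything numeric here is invented for illustration: the accuracies are simulated and the 0.45-0.55 band is a placeholder, not the chance range from the study.

```r
library(ggplot2)

# Simulated accuracy data, one row per participant per language (hypothetical)
set.seed(2)
langs <- c("Arabic", "Dutch", "Hebrew", "Hindi", "Korean", "Russian")
acc <- data.frame(
  language    = rep(langs, each = 20),
  participant = rep(1:20, times = length(langs)),
  accuracy    = pmin(1, pmax(0, rnorm(120, mean = 0.6, sd = 0.08)))
)

p <- ggplot(acc, aes(x = participant, y = accuracy)) +
  # Shaded band for approximate chance responding (placeholder limits);
  # -Inf/Inf stretch it across the full width of every panel
  annotate("rect", xmin = -Inf, xmax = Inf, ymin = 0.45, ymax = 0.55,
           fill = "grey70", alpha = 0.5) +
  geom_point(alpha = 0.7) +
  # One panel per level of the language factor
  facet_wrap(~ language) +
  ylab("Proportion correct") +
  theme_bw()

p
```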

Annotated R code here


Lonely old bar graph

Here's one from an eyetracking study we just completed, plotting average response latencies for the word and picture versions of the Pyramids and Palm Trees Test (objects) relative to the Kissing and Dancing Test (actions). We removed the top and right borders and set the y-axis minimum to .75. R code here. Dataframe here.


Time series plot of continuous sampling of pupil diameter during a visual symbol cancellation task in a person with post concussive syndrome

R Code Here. It's sort of a bear to get R to recognize a column of numbers as a time series when it wants them to be a factor. I struggled to get ggplot2 to plot the time series. Instead, I reverted to R's base plotting function after recoding the data as a time series. This graph represents very rapid fluctuations in the diameter of a pupil (the black part of your eye, not a student) for a person who is experiencing post concussive symptoms during a symbol cancellation task (i.e., many visually similar distractor symbols).
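The factor-to-time-series recoding goes roughly like this. The values are made up, and the 120Hz sampling rate is an assumption borrowed from our eyetracker's usual setting; the linked script has the actual steps.

```r
# A column of pupil diameters that R has read in as a factor (made-up values)
raw <- factor(c("3.91", "3.88", "3.95", "4.02", "3.97", "3.90"))

# Convert via character first; as.numeric(raw) alone would return
# the internal level codes (1, 2, 3, ...) rather than the diameters
diam <- as.numeric(as.character(raw))

# Recode as a time series; frequency = 120 assumes 120 samples per second
pupil_ts <- ts(diam, start = 0, frequency = 120)

# Base R plotting dispatches to plot.ts for time series objects
plot(pupil_ts, xlab = "Time (s)", ylab = "Pupil diameter (mm)")
```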


Multiple time series overlaid on the same plot. These data reflect continuous sampling of pupil dilation during the same visual symbol cancellation task for two people. 

R Code Here      Dataset Here

Elizabeth Brophy spent the greater part of today learning how to overlay two discrete time series. Was it worth it in the grand scheme of her limited time on this earth?  You'd have to ask her, but my feeling is that the plots look great.

This reflects pupillary fluctuations measured at 120Hz. The nice thing about this graph is the axis cutting and the comparison of two people's time series. She needs to rescale the y-axis a bit, and we should also add in the event markers. That's for next time. 


Heatmap of our abstract word topography data

Here's a heatmap plot that reflects a hypothetical semantic space wherein 400 highly abstract and concrete English nouns are situated.  The R-code is here. The dimensions across the bottom reflect domains where >350 participants rated the 400 English nouns on their emotional valence, visual salience, etc. The vertical axis reflects increasing word concreteness beginning with abstract words such as justice increasing to concrete words such as dog. The database and all associated word ratings are here. These are the data we reported recently in our Frontiers in Human Neuroscience article.   Hotter areas of white indicate "higher" ratings on a particular domain. This plot is interesting because it shows some nice latent structure of abstract and concrete words in terms of emotion, polarity, and sensory salience.


Much fancier version of the last heatmap

This $#*@ took me about 40 hours to nail down. This heatmap reflects the same data as the previous plot but with many more bells and whistles. I used the gplots package in R and its heatmap.2 plotting function, of which I had no working knowledge until about 40 hours ago. These things are really obsessive little puzzles. This involved restructuring my original spreadsheet, coercing R into handling the data table as a matrix with column 1 as row names, and then moving stuff around in Illustrator. Here's the annotated R-code.
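The matrix-coercion step is the part that tripped me up, so here's a toy version of it. The words and ratings are invented placeholders (only the idea of words-as-row-names comes from the real data), and the heatmap.2 arguments are just a plain starting point.

```r
# Toy spreadsheet: column 1 holds the words, the rest hold ratings (invented values)
sheet <- data.frame(
  word    = c("justice", "truth", "dog", "hammer"),
  emotion = c(5.1, 4.8, 3.9, 2.0),
  vision  = c(1.2, 1.5, 6.3, 6.0),
  sound   = c(1.0, 1.4, 5.2, 4.7),
  stringsAsFactors = FALSE
)

# Drop the word column, coerce the ratings to a numeric matrix,
# then attach the words as row names
m <- as.matrix(sheet[, -1])
rownames(m) <- sheet$word

# Draw the heatmap only if gplots is installed
if (requireNamespace("gplots", quietly = TRUE)) {
  gplots::heatmap.2(m, trace = "none", col = heat.colors(50), margins = c(6, 8))
}
```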


Facet wrap scatterplots

Here are scatterplots for 14 dimensions arrayed using ggplot2's facet_wrap function. Here's the spreadsheet (in long form). Here's the R-code for plotting in ggplot2.


Correlogram reflecting bivariate correlations between odor, motion, visual form, space, emotional valence, and other variables for 750 English Nouns

Here's a correlogram for an article we're writing up now. For those unfamiliar with this format, it is simply a visual depiction of a standard bivariate correlation matrix. The color map is scaled to Pearson r values. I created this using the corrplot package in R. Here's the spreadsheet and the code.


Interpolation and application of a moving average smoothing algorithm to pupil dilation data

So here's an interesting little time series plot. This reflects the dilation of a single person's pupil over the course of a few seconds when a monitor rapidly flashes from white to black (the flash point is the orange dotted line). Here are the data and the R-script. Our eyetracker samples at 120Hz, so there are blink trials that need to be interpolated across. The data off the tracker are jolty and noisy, so we applied a moving average smoothing algorithm of 8 places. This illustrates the time course of the pupil dilation nicely.
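Here's a base-R sketch of the two steps on simulated data. I'm assuming linear interpolation across the blink and a centered 8-sample window; the linked script may do either step differently. The signal itself is fake.

```r
# Simulate 3 seconds of pupil diameter sampled at 120Hz (fake signal plus noise)
hz <- 120
t <- seq(0, 3, by = 1 / hz)
set.seed(1)
pupil <- 4 + 0.5 * sin(t) + rnorm(length(t), sd = 0.05)

# Knock out a run of samples to mimic a blink
pupil[100:115] <- NA

# Step 1: linear interpolation across the missing samples (assumed method)
ok <- !is.na(pupil)
pupil_interp <- approx(x = t[ok], y = pupil[ok], xout = t)$y

# Step 2: 8-sample moving average; the first and last few values come back NA
# because the window runs off the ends of the series
pupil_smooth <- stats::filter(pupil_interp, rep(1 / 8, 8), sides = 2)

# Raw-but-interpolated trace in grey, smoothed trace on top
plot(t, pupil_interp, type = "l", col = "grey60",
     xlab = "Time (s)", ylab = "Pupil diameter (mm)")
lines(t, pupil_smooth, lwd = 2)
```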


The previous figures with some Adobe Illustrator clean-ups.

This came out pretty nicely. This was the figure that ultimately made its way into this article in Brain and Language:

Reilly J, *Garcia A, & Binney RJ (2016). Does the sound of a barking dog activate its corresponding visual form? An fMRI investigation of modality-specific semantic access. Brain and Language, 159, 45-59. doi: 10.1016/j.bandl.2016.05.006








Time series:  Pupil dilation for imagining a sunny day in response to Yes versus looking into a dark room in response to NO.

Here are two time series snaking within one another with error bars created using the pointrange geom in ggplot2.  Here's the R script.  


Ribbon plot

Bonnie Zuckerman created this nice little ribbon plot demonstrating changes in pupil diameter as participants produced different semantic clusters over a one minute period in a verbal fluency task (i.e., Name as many animals as you can in one minute). She cleverly color-coded the time series by cluster (e.g., sea animals, house pets, etc). R Code here.


Manually passing a vector of standard errors to a simple bar chart... with some lazy Photoshopping

The trick with the position dodge function for error bars is that it must match the width of the bars specified in the geom_bar aesthetic (in this case .5). 
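A stripped-down sketch of the idea, with invented means and standard errors and hypothetical column names. The key bit is that the same width (.5) goes to both the bars and position_dodge() for the error bars.

```r
library(ggplot2)

# Invented means and standard errors, passed in by hand (hypothetical names)
df <- data.frame(
  task  = rep(c("Words", "Pictures"), each = 2),
  group = rep(c("Objects", "Actions"), times = 2),
  mean  = c(1.10, 1.25, 0.95, 1.05),
  se    = c(0.05, 0.06, 0.04, 0.05)
)

bar_w <- 0.5  # one width shared by the bars and the dodge

p <- ggplot(df, aes(x = task, y = mean, fill = group)) +
  geom_bar(stat = "identity", width = bar_w,
           position = position_dodge(width = bar_w)) +
  # Dodge the error bars by the same width so they sit centered on their bars
  geom_errorbar(aes(ymin = mean - se, ymax = mean + se), width = 0.15,
                position = position_dodge(width = bar_w)) +
  theme_bw()

p
```

If the two widths differ, the error bars drift off-center relative to the bars.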

Here's the R-script for making this happen. 








Pupillary dilation/constriction for two time series alternating Dark-Bright

So impressive... Ally Dworetsky, after one week in the lab, produced this nice little plot using ggplot. She measured her own pupil dilation dynamics over a minute as she viewed a black screen that switched at the 30 second point to yellow (causing a pupillary constriction). She also overlaid a time series from another summer intern (Rena) for yellow to black with a switch at the 30s point (causing a pupillary dilation). Great work, Ally! Download the script here.


3d scatterplot of profanity versus taboo words in a semantic space constrained by valence, physiological arousal, and social acceptability

Here's a fun little 3d scatterplot using the scatterplot3d package. This plot represents subjective ratings of emotional valence, social acceptability, and physiological arousal for a series of profane words relative to matched "taboo" body part words. There were a few tricky parts to executing this block of R-code (download here), and here are the data (download csv here). Still to come is a plot reflecting peak pupil amplitudes when hearing profane vs. taboo but not profane words.