⌨️ Simple HTML code highlighting workflow with vim
Lately I've wanted to post code snippets to my website, but I want them to look pretty and highlighted instead of just using boring old <pre> tags. A client-side highlighting library could do the job, but is all that effort and computation really needed for some colored text? So I decided to find a workflow that lets me export my code as pre-formatted HTML.
I recently put some effort into customizing vim, and after trying a bunch of themes settled on Edge.
Staring at my beautifully-highlighted code in vim, I wondered if I could just replicate the same effect on my website.
I quickly found the vim TOhtml function, which does exactly that!
All you have to do is run TOhtml in vim, and it outputs a formatted HTML file, complete with CSS rules that color the different types of syntax exactly as they're displayed in the editor. Hoorah!
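For reference, the whole export is just a few ex commands (vim opens the generated HTML in a new buffer, named after the source file):

```vim
:edit snippet.py   " open the file you want to export
:TOhtml            " generate the highlighted HTML in a new buffer
:w                 " write the HTML buffer to disk
:q                 " close it and return to your code
```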
After messing with TOhtml for a while, I realized that it was not actually outputting the same syntax highlighting that I was seeing in my terminal.
For example, functions appeared light blue to me in vim, but the exported HTML didn't distinguish them at all.
It took me a long time to realize that the issue was with my color scheme, Edge!
I have no idea why, but TOhtml does not play well with Edge and ignores some of its syntax highlighting rules.
It also seems to forget that I have true colors turned on!
I tried running TOhtml with the popular Solarized color scheme, and it worked without a hitch!
...but I want to use the Edge color scheme.
And besides, if I ever want to switch color schemes in the future, I don't want to worry about their compatibility with my current syntax highlighting workflow.
In my .vimrc I've created a variable that defines my current color scheme of choice:
let g:usual_colorscheme = "edge"
execute "colorscheme ".g:usual_colorscheme
By doing that, I can declare a function that temporarily sets my color scheme to solarized, runs TOhtml, and restores my usual scheme afterward, and then define a command to run it:

function SolarizedTOhtml()
    colorscheme solarized
    TOhtml
    execute "colorscheme ".g:usual_colorscheme
endfunction
command NEWhtml call SolarizedTOhtml()
It was humorously difficult to define suitable CSS rules for these vimscript snippets, but I think they look very nice considering how simple this system is.
Since I'm only exporting code using Solarized now, the same CSS rules should work on future code snippets without much modification, and I'll be able to adjust them all uniformly with ease.
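TOhtml names its CSS classes after vim's highlight groups, so the stylesheet for the snippets ends up looking roughly like this, a sketch using Solarized's dark palette (the exact set of classes depends on the filetype being exported):

```css
pre { background-color: #002b36; color: #839496; }
.Comment    { color: #586e75; font-style: italic; }
.Statement  { color: #859900; }
.Identifier { color: #268bd2; }
.Constant   { color: #2aa198; }
.Type       { color: #b58900; }
```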
The only other adjustments I've made to this setup are the following two settings in my .vimrc:
let g:html_number_lines = 0
let g:html_ignore_folding = 1
Have a beautiful day!
📍 Mapping Education Development Projects in Afghanistan
This is the first time I've used R and LaTeX together, using a package called knitr. I believe papers that include statistical analyses should be fully reproducible, and part of my objective for this project is to learn how to accomplish that. I was partly inspired by this article on Chinese investments (also using AidData data), available as an R notebook 🔓 here. I also used this guide to help navigate the weird world of the maps R package. As I got further into the weeds of plotting the population density data, this Data Carpentry lesson was indispensable.
I requested a bunch of data about Afghanistan's education and forestry development from AidData. The download link for my query is 🔓 here. I decided to focus on the education data for this project, and experiment with overlaying the education project locations over a population density map of Afghanistan. I hope for this map to effectively convey how well-distributed these projects have been relative to populous areas. Have a majority of these projects been in cities, or in a specific region?
Directly converting the population density raster to a data frame ate my laptop's memory and caused R to crash, simply outputting "Killed" in my terminal. I ended up asking for help, and it was suggested that I aggregate the data before conversion. I'm annoyed that R can't handle this data file with better memory management (and I dream of pixel-perfect output!), but the aggregated data still looks very nice.
library(dplyr)   # for filter()

inscription_data <- read.csv("results.csv", header = TRUE)
# (the filter condition is truncated here; it narrows the AidData results
# down to the education-sector projects)
education_data <- filter(inscription_data,
library(raster)

afg_pop  <- raster("afg_ppp_2020.tif")   # WorldPop population counts
afg_pop2 <- aggregate(afg_pop, 10)       # coarsen 10x to fit in memory
afg_pop_df <- as.data.frame(afg_pop2, xy = TRUE)
library(ggplot2)
library(mapdata)   # provides the "worldHires" map used by map_data()

afg <- map_data("worldHires", "Afghanistan")

pop_map <- ggplot() +
  theme(axis.title.x = element_blank(),
        axis.title.y = element_blank()) +
  geom_raster(data = afg_pop_df,
              aes(x = x, y = y, fill = afg_ppp_2020)) +
  scale_fill_continuous(name = "Population\nDensity",
                        na.value = NA,
                        type = "viridis",
                        trans = "log10",
                        breaks = c(0.1, 1, 5, 20, 100)) +
  geom_polygon(data = afg,
               aes(x = long, y = lat, group = group),
               fill = NA,
               color = "black") +
  geom_point(data = education_data,
             aes(x = longitude, y = latitude),
             shape = 21,
             size = 1,
             color = "white",
             fill = NA)
That looks great! I know it's wrong to not have units for the population density scale, and that's something I'd like to fix. The value of each pixel in the original image represented the number of people in the area that it covered, so the numbers on this scale should represent the number of people who live within each of the aggregated chunks. I think this issue would be best solved by calculating the distance each chunk covers in mi/km, and then adjusting the values in the legend to represent people per square mi/km.
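That conversion could be sketched with the objects above. Since the WorldPop raster is in lon/lat, raster's area() returns each cell's footprint in km², and dividing the counts by it yields people per km² (a sketch, not tested against the real data):

```r
# Convert per-cell people counts to people per square km.
cell_km2    <- area(afg_pop2)        # km^2 covered by each aggregated cell
afg_density <- afg_pop2 / cell_km2   # people per km^2
afg_density_df <- as.data.frame(afg_density, xy = TRUE)
# afg_density_df could then replace afg_pop_df in geom_raster(), with the
# legend relabelled to "People per km^2".
```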
That cluster of projects (and people!) in the Northeast is Kabul. How many projects are occurring in the greater Kabul area?
# Kabul's coordinates, and a bounding-box half-width in degrees
# (0.3 degrees is roughly 33 km of latitude)
kabul_lat  <- 34.525278
kabul_long <- 69.178333
radius <- 0.3

near_kabul <- filter(education_data,
                     between(latitude, kabul_lat - radius, kabul_lat + radius),
                     between(longitude, kabul_long - radius, kabul_long + radius))
I think that looks a lot nicer, although it might benefit from more detailed info in the legend about what different sizes of circles indicate. I don't notice anything groundbreaking on this map. There might be fewer projects on the Western side of the country, but I guess that has more to do with the population around Kabul than any other bias. Kabul is nearly 10 times as populous as the second-biggest city, Kandahar, so it is no surprise to see a majority of projects in that region.
Is this really all of the projects? Our education_data variable has `r nrow(education_data)` rows, and even with the `r nrow(near_kabul)` projects around Kabul removed, are all of them actually being displayed? I noticed that the dataset has rounded numbers for a lot of the latitude and longitude values, which could mean that many projects in the same town are drawn on top of each other. For example, there are three projects in Badakhshan that are all listed as having the exact same coordinates, despite differing descriptions, years, and dollars spent. There are two methods I'd like to use to mitigate this problem. The first is to cluster points that are within a radius of each other, and draw circles with sizes proportional to each cluster. The second is to calculate the average population density within a radius of each project, and then create a bar chart with each bar representing a range of population densities, and its length representing the number of projects that fall within an area of that density. It might be useful to create a few charts using different radii of average density.
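A cheap approximation of the first idea is to snap coordinates to a grid and count the projects in each cell (the `projects` data frame here is a tiny hypothetical stand-in for the real AidData table):

```r
library(dplyr)

# Hypothetical stand-in for a few rows of the project table:
projects <- data.frame(
  latitude  = c(36.73, 36.73, 36.74, 34.52),
  longitude = c(70.81, 70.81, 70.81, 69.18))

# Round to one decimal degree (~11 km) so co-located projects collapse
# into a single row with a count; the map can then use size = n_projects.
clustered <- projects %>%
  mutate(lat_bin  = round(latitude, 1),
         long_bin = round(longitude, 1)) %>%
  count(lat_bin, long_bin, name = "n_projects")
```

The three Badakhshan-style points at (36.7, 70.8) collapse into one row with a count of 3, which geom_point can draw as a proportionally larger circle.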
Other variables could be controlled to better represent the differences in aid allocation across Afghanistan. For example, the size of each point on the map could be scaled based on the monetary value of the project it represents, giving the viewer a better idea of where the majority of funds are allocated geographically. Projects in urban areas might receive more funding to serve more people, or maybe the cost of building a school in a rural area would be significantly more expensive, for example. The source of funding, years active, and preexisting school enrollment are all potentially insightful variables to incorporate in a visualization like this.
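Swapping the fixed point size for an aesthetic tied to a dollar column would look roughly like this (`commitments` is a hypothetical column name; the real AidData field may be named differently):

```r
pop_map +
  geom_point(data = education_data,
             aes(x = longitude, y = latitude, size = commitments),
             shape = 21,
             color = "white",
             fill = NA) +
  scale_size_continuous(name = "Dollars\nCommitted", trans = "log10")
```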
Goodman, S., BenYishay, A., Lv, Z., & Runfola, D. (2019). GeoQuery: Integrating HPC systems and public web-based geospatial data tools. Computers & Geosciences, 122, 103-112.
The data from AidData used in this project were downloaded from 🔓 this page.
WorldPop (www.worldpop.org - School of Geography and Environmental Science, University of Southampton; Department of Geography and Geosciences, University of Louisville; Departement de Geographie, Universite de Namur) and Center for International Earth Science Information Network (CIESIN), Columbia University (2018). Global High Resolution Population Denominators Project - Funded by The Bill and Melinda Gates Foundation (OPP1134076). https://dx.doi.org/10.5258/SOTON/WP00645
The data from WorldPop used in this project were downloaded from this page.