Which Greek letters are most frequently used in scientific papers?

tl;dr: In Astrophysics, Alpha seems to be the most used; in Computer Science, Theta and Alpha; and in Economics, Theta.

A recent conversation at work led me to take a closer look at the most commonly used Norwegian characters in written texts, such as books and articles. I set out to find them with a program, since this is a very good example of a task where a computer can examine a large volume of text in minutes, while a human would take ages.

The conversation continued, and two physicists at work started discussing: “What about the usage of Greek letters in scientific papers?”

ΑΒΓΔΕΖΗΘΙΚΛΜΝΞΟΠΡΣΤΥΦΧΨΩ
αβγδεζηθικλμνξοπρσςτυφχψω
Upper- and lower-case Greek letters

There are some conventional uses within mathematics, science, and engineering, but you are free to choose which letter represents any given entity. It would be interesting to see whether there is a trend or pattern in which letters are used most frequently in different fields. I chose to look at Computer Science, Astrophysics and Economics.

One useful outcome is that we as teachers can make the seemingly scary symbols in primary school mathematics harmless. Showing that they are used relatively arbitrarily, as one study suggests, makes it clear that the symbols in formulas and equations are nothing but arbitrary letters that happen to be Greek.

A colleague at work

So, I changed my Python script to look for both upper- and lower-case Greek letters. But what would be a good source? arxiv.org was a good place to start: it can list hundreds or even thousands of scientific papers in an archive, with easy access to PDF versions. This helps when you want to access lots of data programmatically.

How I collected the data

In short: I used the BeautifulSoup Python module to grab each PDF link from the listing page. For each link, the script downloaded the PDF, converted it into text and compared every single character against the Greek alphabet. If it encountered a Greek letter, it would add the letter and its occurrence to a JSON file. At the end, the JSON file would contain all the Greek letters used as keys and the number of occurrences as values.
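To make the steps concrete, here is a minimal sketch of the link-grabbing and counting parts. It is not the original script: it swaps BeautifulSoup for the standard library's html.parser so that it runs without extra packages, it assumes PDF links can be recognised by "/pdf/" in the URL, and it leaves out the download and PDF-to-text steps entirely. The names find_pdf_links, count_greek and update_counts_file are made up for this illustration.

```python
import json
from collections import Counter
from html.parser import HTMLParser
from pathlib import Path
from urllib.parse import urljoin

# Upper- and lower-case Greek letters, including the final sigma (ς).
GREEK_LETTERS = set("ΑΒΓΔΕΖΗΘΙΚΛΜΝΞΟΠΡΣΤΥΦΧΨΩαβγδεζηθικλμνξοπρσςτυφχψω")


class PdfLinkParser(HTMLParser):
    """Collects <a href> values containing '/pdf/' (an assumed URL pattern)."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href and "/pdf/" in href:
            # Turn relative links into absolute URLs.
            self.links.append(urljoin(self.base_url, href))


def find_pdf_links(html, base_url):
    """Return absolute URLs of all PDF-looking links in an HTML page."""
    parser = PdfLinkParser(base_url)
    parser.feed(html)
    return parser.links


def count_greek(text):
    """Count every Greek letter in the text, one character at a time."""
    return Counter(ch for ch in text if ch in GREEK_LETTERS)


def update_counts_file(path, new_counts):
    """Add new_counts to the totals stored in a JSON file and save them."""
    path = Path(path)
    totals = Counter()
    if path.exists():
        totals.update(json.loads(path.read_text(encoding="utf-8")))
    totals.update(new_counts)
    # ensure_ascii=False keeps the Greek letters readable in the file.
    path.write_text(
        json.dumps(totals, ensure_ascii=False, indent=2), encoding="utf-8"
    )
    return totals
```

With something like this, each paper's text would go through count_greek, and the result would be merged into one JSON file per category with update_counts_file.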

I used Google Sheets to visualize the JSON files as charts, although it could also be done with Python and the Matplotlib module. Here are the results: first the top five in each category, and then, at the bottom, a complete list for all three categories.
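For anyone who would rather stay in Python, the Matplotlib route could look roughly like the sketch below. It is an assumption about how one might do it, not what was used for the charts here; top_n and plot_top are hypothetical helper names, and the counts are expected in the same letter-to-number form as the JSON files.

```python
from collections import Counter


def top_n(counts, n=5):
    """Return the n most frequent letters as (letter, count) pairs."""
    return Counter(counts).most_common(n)


def plot_top(counts, title, out_path, n=5):
    """Save a bar chart of the n most used letters (counts must be non-empty)."""
    # Imported lazily so that top_n works even without Matplotlib installed.
    import matplotlib
    matplotlib.use("Agg")  # render straight to a file, no display needed
    import matplotlib.pyplot as plt

    letters, values = zip(*top_n(counts, n))
    fig, ax = plt.subplots()
    ax.bar(letters, values)
    ax.set_title(title)
    ax.set_ylabel("Occurrences")
    fig.savefig(out_path)
    plt.close(fig)
```

Loading one of the JSON files with json.load and passing the resulting dictionary to plot_top, together with a title and an output file name, would produce one chart per category.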

Greek letters in Computer Science papers

The top five most used Greek letters in Computer Science articles, based on 1500 papers:

Greek letters in Astrophysics papers

The top five most used Greek letters in Astrophysics articles, based on 1000 papers:

Greek letters in Economics papers

The top five most used Greek letters in Economics articles, based on 250 papers:

Complete lists

Here are the complete lists of Greek letters used in the three different fields: