Category: Uncategorized

Fetching, wrangling and visualising sunrise and sunset data using Python

Previously I showed how it was possible to obtain sunrise and sunset times for a whole year at any location on Earth, from a public source.

This time I am going to explain how to fetch that data, clean it up and create graphical visualizations like the one below, all using Python. A Jupyter Notebook is available on GitHub.  Such data might even be useful in, for example, simulation of solar power generation.

Sunlight hours for Derry vs Greenwich

Fetching the data

Looking at the URL string of the returned data query on the USNO website we can see that it contains all of the required values for longitude and latitude, as well as the place name label that we defined ourselves:

We can try replacing those values and expect to be able to retrieve any data set without going through the web interface. This makes it really easy to do what is known as web scraping, i.e. fetching the remote data table for any location automatically from our Python script. All we have to do is supply a label, and the coordinate values, and we can crudely modify the original URL to get the desired result.

```url = "http://aa.usno.navy.mil/cgi-bin/aa_rstablew.pl?ID=AA&year=2018&task=0&place=%s&lon_sign=-1&lon_deg=%s&lon_min=%s&lat_sign=1&lat_deg=%s&lat_min=%s&tz=&tz_sign=-1" % (label, longitude[0], longitude[1], latitude[0], latitude[1])
urllib.urlretrieve (url, inFile)

```

When you fetch in this way a HTML file is returned, containing some html tags and some css code. But luckily the data we want is in plain text starting and ending at consistent line numbers each time, so we can read in the file, slice the lines that we want and then re-save the local file wth only the trimmed data.

```# Dirty and dumb way to get rid of the html and css before the data table:
f = open(inFile, 'r')
# Gets a list of all the lines in the file, with each line as an un-split string:
# Crucially, the line endings remain intact so the local file will be identical to the original raw text!
f.close()
raw = html[25:66] # Slice only the lines that we need. This is hardcoded but was the best method I found.

# Create/OverWrite the local file:
with open(inFile, "w") as text_file:
for line in raw:
text_file.write(line)```

Every time you run this code it fetches from the USNO website. If you plan to run this code many times for the same geographic location it is worth saving to a local file to give their web server a break!

Data wrangling

Now that we have a local data file, let’s read it in to Python. Not so fast! It took me a while to find the best method here. Normally, as a scientist, I work with tabular data containing floats, not a mix of data types. So I often use numpy.loadtxt() as my go to method for reading data files. However, if we try this we get an error:

`A = np.loadtxt(inFile, skiprows=9) # Gives "ValueError: Wrong number of columns at line 38"`

Numpy’s loadtxt() function does not like it when a file contains columns of different lengths. Let’s try a different way.

Let’s try a pandas-based approach first.

```df = pd.read_csv(inFile, skiprows=range(9), header=None, infer_datetime_format=True)
print df, df.shape```

This almost works, but we get a collection of rows where each row is a long string. It has not split the row strings into columns yet!

```                                                    0
0   01  0805 1602  0738 1649  0645 1740  0536 1833...
1   02  0805 1603  0737 1651  0643 1742  0533 1835...
2   03  0805 1604  0735 1653  0641 1744  0531 1836...
3   04  0805 1605  0734 1655  0639 1745  0529 1838...
4   05  0805 1606  0732 1657  0637 1747  0527 1840...
5   06  0804 1608  0730 1659  0635 1749  0524 1841...
6   07  0804 1609  0729 1700  0632 1751  0522 1843...
7   08  0803 1610  0727 1702  0630 1752  0520 1845...
8   09  0803 1612  0725 1704  0628 1754  0518 1846...
9   10  0802 1613  0723 1706  0626 1756  0516 1848...
10  11  0802 1614  0721 1708  0623 1758  0513 1850...
11  12  0801 1616  0720 1710  0621 1759  0511 1851...
12  13  0800 1617  0718 1711  0619 1801  0509 1853...
13  14  0800 1619  0716 1713  0617 1803  0507 1855...
14  15  0759 1620  0714 1715  0614 1804  0505 1856...
15  16  0758 1622  0712 1717  0612 1806  0503 1858...
16  17  0757 1624  0710 1719  0610 1808  0500 1900...
17  18  0756 1625  0708 1720  0608 1810  0458 1901...
18  19  0755 1627  0706 1722  0605 1811  0456 1903...
19  20  0754 1628  0704 1724  0603 1813  0454 1905...
20  21  0753 1630  0702 1726  0601 1815  0452 1906...
21  22  0752 1632  0700 1728  0559 1816  0450 1908...
22  23  0751 1634  0658 1729  0556 1818  0448 1910...
23  24  0749 1635  0656 1731  0554 1820  0446 1911...
24  25  0748 1637  0654 1733  0552 1821  0444 1913...
25  26  0747 1639  0652 1735  0549 1823  0442 1915...
26  27  0746 1641  0650 1737  0547 1825  0440 1916...
27  28  0744 1642  0648 1738  0545 1826  0438 1918...
28  29  0743 1644             0543 1828  0436 1920...
29  30  0741 1646             0540 1830  0434 1921...
30  31  0740 1648             0538 1831           ...
(31, 1)```

In order to manipulate the data we would like Pandas to delimit the data correctly into rows and columns, and in this case we have whitespace between them.

`df = pd.read_csv(inFile, skiprows=range(9), header=None, delim_whitespace=True, infer_datetime_format=True)`

This does not produce the desired results, because the columns get shifted to the left to fill in the empty gaps of shorter months, e.g. now February has 31 days!

```    0    1     2    3     4    5     6    7     8    9    ...       15  \
0    1  805  1602  738  1649  645  1740  536  1833  432   ...    424.0
1    2  805  1603  737  1651  643  1742  533  1835  430   ...    425.0
2    3  805  1604  735  1653  641  1744  531  1836  429   ...    427.0
3    4  805  1605  734  1655  639  1745  529  1838  427   ...    428.0
4    5  805  1606  732  1657  637  1747  527  1840  425   ...    430.0
5    6  804  1608  730  1659  635  1749  524  1841  423   ...    431.0
6    7  804  1609  729  1700  632  1751  522  1843  421   ...    433.0
7    8  803  1610  727  1702  630  1752  520  1845  420   ...    434.0
8    9  803  1612  725  1704  628  1754  518  1846  418   ...    436.0
9   10  802  1613  723  1706  626  1756  516  1848  416   ...    438.0
10  11  802  1614  721  1708  623  1758  513  1850  415   ...    439.0
11  12  801  1616  720  1710  621  1759  511  1851  413   ...    441.0
12  13  800  1617  718  1711  619  1801  509  1853  412   ...    442.0
13  14  800  1619  716  1713  617  1803  507  1855  410   ...    444.0
14  15  759  1620  714  1715  614  1804  505  1856  409   ...    445.0
15  16  758  1622  712  1717  612  1806  503  1858  407   ...    447.0
16  17  757  1624  710  1719  610  1808  500  1900  406   ...    449.0
17  18  756  1625  708  1720  608  1810  458  1901  404   ...    450.0
18  19  755  1627  706  1722  605  1811  456  1903  403   ...    452.0
19  20  754  1628  704  1724  603  1813  454  1905  402   ...    453.0
20  21  753  1630  702  1726  601  1815  452  1906  400   ...    455.0
21  22  752  1632  700  1728  559  1816  450  1908  359   ...    457.0
22  23  751  1634  658  1729  556  1818  448  1910  358   ...    458.0
23  24  749  1635  656  1731  554  1820  446  1911  357   ...    500.0
24  25  748  1637  654  1733  552  1821  444  1913  356   ...    501.0
25  26  747  1639  652  1735  549  1823  442  1915  354   ...    503.0
26  27  746  1641  650  1737  547  1825  440  1916  353   ...    504.0
27  28  744  1642  648  1738  545  1826  438  1918  352   ...    506.0
28  29  743  1644  543  1828  436  1920  351  2004  346   ...    557.0
29  30  741  1646  540  1830  434  1921  351  2005  347   ...    559.0
30  31  740  1648  538  1831  350  2006  422  1950  511   ...      NaN

16     17      18     19      20     21      22     23      24
0   1948.0  512.0  1847.0  600.0  1738.0  653.0  1633.0  743.0  1555.0
1   1946.0  514.0  1844.0  602.0  1736.0  655.0  1631.0  744.0  1554.0
2   1945.0  516.0  1842.0  604.0  1733.0  657.0  1630.0  746.0  1554.0
3   1943.0  517.0  1840.0  605.0  1731.0  658.0  1628.0  747.0  1553.0
4   1941.0  519.0  1838.0  607.0  1729.0  700.0  1626.0  748.0  1553.0
5   1939.0  520.0  1835.0  609.0  1727.0  702.0  1625.0  750.0  1552.0
6   1938.0  522.0  1833.0  610.0  1724.0  704.0  1623.0  751.0  1552.0
7   1936.0  524.0  1831.0  612.0  1722.0  706.0  1621.0  752.0  1552.0
8   1934.0  525.0  1828.0  614.0  1720.0  707.0  1620.0  753.0  1551.0
9   1932.0  527.0  1826.0  615.0  1718.0  709.0  1618.0  754.0  1551.0
10  1930.0  528.0  1824.0  617.0  1716.0  711.0  1617.0  755.0  1551.0
11  1928.0  530.0  1822.0  619.0  1713.0  712.0  1615.0  756.0  1551.0
12  1926.0  532.0  1819.0  620.0  1711.0  714.0  1614.0  757.0  1551.0
13  1924.0  533.0  1817.0  622.0  1709.0  716.0  1612.0  758.0  1551.0
14  1923.0  535.0  1815.0  624.0  1707.0  718.0  1611.0  759.0  1551.0
15  1921.0  536.0  1812.0  625.0  1705.0  719.0  1610.0  800.0  1551.0
16  1918.0  538.0  1810.0  627.0  1703.0  721.0  1608.0  800.0  1552.0
17  1916.0  539.0  1808.0  629.0  1701.0  723.0  1607.0  801.0  1552.0
18  1914.0  541.0  1806.0  631.0  1659.0  724.0  1606.0  802.0  1552.0
19  1912.0  543.0  1803.0  632.0  1657.0  726.0  1605.0  802.0  1553.0
20  1910.0  544.0  1801.0  634.0  1655.0  728.0  1603.0  803.0  1553.0
21  1908.0  546.0  1759.0  636.0  1652.0  729.0  1602.0  803.0  1554.0
22  1906.0  547.0  1756.0  637.0  1650.0  731.0  1601.0  804.0  1554.0
23  1904.0  549.0  1754.0  639.0  1648.0  733.0  1600.0  804.0  1555.0
24  1902.0  551.0  1752.0  641.0  1647.0  734.0  1559.0  805.0  1555.0
25  1900.0  552.0  1749.0  643.0  1645.0  736.0  1559.0  805.0  1556.0
26  1858.0  554.0  1747.0  644.0  1643.0  737.0  1558.0  805.0  1557.0
27  1855.0  556.0  1745.0  646.0  1641.0  739.0  1557.0  805.0  1558.0
28  1743.0  648.0  1639.0  740.0  1556.0  805.0  1559.0    NaN     NaN
29  1740.0  650.0  1637.0  742.0  1555.0  806.0  1600.0    NaN     NaN
30     NaN    NaN     NaN    NaN     NaN    NaN     NaN    NaN     NaN

[31 rows x 25 columns]```

Since we notice that the tabular data has regular, repeating columnar widths, we can use the Pandas fixed width file function read_fwf(). Here is an example for two columns.

`eg = pd.read_fwf(inFile, colspecs=[(4,8), (9,13)], skiprows=9, header=None)`

We can adapt this approach to get all columns. Notice that not each column width is identical but they are bunched in a way that we can use. We can straight away loop through the relevant columns for sunrise and sunset and store them in separate Pandas dataframes.

```sunrise = pd.read_fwf(inFile, colspecs=[(i, i+4) for i in range(4,132,11)], skiprows=9, header=None).values.T.flatten()
# Remove all NaNs from the flattened lists to leave just the days that exist: https://stackoverflow.com/questions/11620914/removing-nan-values-from-an-array
sunrise = sunrise[~np.isnan(sunrise)].astype('int').tolist()
sunset = sunset[~np.isnan(sunset)].astype('int').tolist()
```

I have also transposed and flattened the data, then removed any NaN values and converted to lists, so that we have a list of 365 sunrise times and a list of 365 sunset times. Notice that all of the methods presented here should account for leap years, which contain 366 days, automatically!

Now we want to convert our float values to datetimes. We can do this using Python’s datetime module but we will need to give it integer values for hours, minutes and seconds.

```srise = [datetime.time(int(str(a)[:-2]), int(str(a)[-2:]), 0) for a in sunrise]
sset = [datetime.time(int(str(a)[:-2]), int(str(a)[-2:]), 0) for a in sunset]```

Creating a date list

Our method for extracting the sunrise and sunset times didn’t preserve the dates on which those values occurred, but we can easily create one. Pandas can create a period list using the date_range function, and this can be converted to datetimes.

```periodlist = pd.date_range(start='%s-01-01' % year, end='%s-12-31' % year, freq='D')
print periodlist

datelist = [pd.to_datetime(date, format='%m-%d').date() for date in periodlist]
print "Datelist:\n", datelist[:10]```

This datelist will allow us to plot our sunrise and sunset times against the day of the year easily.

Visualization

Now we can simply invoke matplotlib, creating line plots for the sunrise and sunset data series’ and filling in between using the fill_between() method.

```# Create a nice plot:
plt.close()
plt.figure()

# Plot the sunrise and sunset series as lineplots:
plt.plot(datelist, srise, label='sunrise')
plt.plot(datelist, sset, label='sunset')

# Let's put a fill between the two line plots:
plt.fill_between(datelist, srise, sset, facecolor='yellow', alpha=0.2, interpolate=True)

plt.title('Sunlight hours')
plt.xlabel('date')

# Make sure the y axis covers the (almost) full range of 24 hours:
plt.ylim([datetime.time.min, datetime.time.max])
# The above y limits do not force the y axis to run a full 24 hours, so force manual axes tick values.
# Give a range of values in seconds and matplotlib is clever enough to work it out:
plt.yticks(range(0, 60*60*25, 2*60*60))

plt.legend()
plt.savefig('working.png', dpi=90)
plt.show()```

Data for Greenwich, UK, visualized from USNO data.

Final code

In my final code I added some additional features such as a location dictionary, that allows me to compare a daylight times for different locations on a single plot. My final code can be found in a Jupyter Notebook, which is available on GitHub. Here is a comparison between Greenwich, UK and Derry, UK.

Sunlight hours for Derry vs Greenwich

As we can see, in winter the two locations have identical sunset times but the sunrise in Derry is significantly later. In summer The sunrise times are similar, but Derry has a much later sunset. The days in Derry are really short in winter and really long in summer, as the locals will tell you! As one of the farthest North-Westerly cities in the UK, Derry is considerably further north and further west than Greenwich (London), but it is still in the same time zone. Note that I have not included Daylight saving times in my code. Times are all in GMT.

Map showing Derry and Greenwich, UK.

Successfully clearing ports in Salome (Code ASTER)

Building a geometry in the Salome graphical user interface (GUI).

How Salome tracks ports

When Salome is starting up, it checks for free ports on your system using a few built-in Python scripts. Then when you close Salome those ports should be freed up again for the next one. This has a number of uses, but one reason is to stop multiple instances of Salome trying to use the same port at once.

Those Python scripts keep track of the port numbers that are currently in use by storing the numbers in some configuration files (*.cfg) that are saved on your system. When Salome exits, those configuration files should be updated to recognize that the current port is being freed up again.

A possible problem with port tracking

Sometimes, however, those configuration files do not get updated. For example, if you are running Salome using a script in batch mode you can include a command to kill Salome properly, giving the correct port number. I have found in the past that this method has not been very reliable and so the configuration file keeps being updated with port numbers that are in use, but those numbers are never removed from the “in use” list even if they have actually been freed up on the system.

If you do a lot of scripting in salome you will find that when writing/testing your scripts, if salome crashes a lot then often the ports being used don’t get released and so stay as “being used” in the port log file.

The result is that after a while a maximum number of ports is reached and Salome thinks that there are no ports free, so it will not start successfully, giving the following error message:

RuntimeError:

Can’t find a free port to launch omniNames

Try to kill the running servers and then launch SALOME again.

Perhaps you will check for salome instances running using. There may or may not be lots of Salome processes running on your system. In this post I am going to assume that Salome has closed properly. You can check if salome is running:

ps -x | grep salome

Salome provides some Python scripts that should kill any running instances in a well-behaved way. For example, killSalome.py kills all of the instances running on your system, so you should use it with care:

~/bin/SALOME-8.2.0-UB16.04/BINARIES-UB16.04/KERNEL/bin/salome/killSalome.py

But if you already know the port number of a specific instance, killSalomeWithPort.py can be invoked to kill just that one, without affecting other instances that are currently running:

~/bin/SALOME-8.2.0-UB16.04/BINARIES-UB16.04/KERNEL/bin/salome/killSalomeWithPort.py 21116

As a last resort, you can kill all processes in your system that mention salome:
ps x | grep salome | awk ‘{print \$1}’ | xargs -n1 kill
OK, so now the ports should be freed up, right? Well maybe not! Your problem might indeed be that Salome is not updating the port config files correctly. Just killing the processes does not help because the next instance of Salome you launch will still check those files and think that there are no free ports. If this describes your current situation, don’t worry! I will now explain how to fix it.

Clearing the port config files

For my version of Salome (see footnotes), hidden configuration files were being created in a number of locations.
For example, the .omniORB_PortManager.cfg file, which in my case is located in my home directory at ~/.omniORB_PortManager.cfg
However, deleting this file did not solve the problem.
I then searched through my home drive for all *.cfg files, but none of the ones that came back were related to ports.
\$ find . -name *.cfg
./bin/SALOME-8.2.0-UB16.04/BINARIES-UB16.04/KERNEL/share/salome/resources/kernel/channel.cfg
… plus a load of non-salome-related stuff…
This would also have found any “.omniORB_*_2888.cfg” (where 2888 is the port number) as mentioned here but those did not show up. There is a USERS directory within my salome installation directory structure at ~/bin/SALOME-8.2.0-UB16.04/BINARIES-UB16.04/SALOME/USERS, however it is empty and so does not contain any such .cfg files.
~/bin/SALOME-8.2.0-UB16.04/BINARIES-UB16.04/SALOME/USERS\$ ls -a
.  ..
Finally, I found that for my installation Salome was using /tmp to store these hidden *.cfg files. The /tmp/ directory (and maybe other directories – see below) contained the following files:
• .omniORB_PortManager.cfg
• Stores a list of the busy ports
• .omniORB_PortManager.lock
• Locks the .omniORB_PortManager.cfg from being edited? I’m not sure exactly what it locks.
• Should be deleted each time but if Salome is not doing this you will have many of these files with different port numbers.
• I guess this probably stores the last port that was used, although I deleted it already before confirming this hunch.

On one of my systems (a HPC cluster) Salome was not storing the .omniORB_PortManager.cfg file in /tmp/ . Instead it was located in /home/<username>/bin/salome/appli_V7_6_0/USERS
You can check which path is being used by looking in /home/<username>/bin/salome/appli_V7_6_0/bin/salome/PortManager.py  In there is a variable named “omniorbUserPath”, which is obtained from an environment variable that I could not see. Nonetheless, I modified PortManager.py to print this variable to screen, which told me that it was looking for /home/<username>/bin/salome/appli_V7_6_0/USERS/.omniORB_PortManager.cfg . Believe me, this was very frustrating to identify as I really thought I had deleted all necessary files, only for salome to continue not finding a free port!

You can delete all of these files, and now when you run Salome it will start fresh, creating new files as it needs. Problem fixed! But…

Stop it happening again

The above fix will only help if we don’t cause the problem again. If you are creating many models or running many simulations from a controller script you do not want to keep reaching a hard limit of consecutive salome calls you can make, only to have to manually delete the omniORB config files again. What we really want is to make sure that Salome will update the config files correctly in future.
In the past I tried many times to use killSalomeWithPort.py. I did this by running salome with the –ns-port-log argument and providing a log file to store the port number.
```<salome_distro>/salome --ns-port-log="somefolder/salomePort.log" -t -b script.py
port_file = open('somefolder/salomePort.log' , 'r')
<salome_distro>/bin/salome/killSalomeWithPort.py %s' % killPort```
For some reason I could never get this to work successfully. I always ended up building a call to killSalome.py in my script, which kills all running Salome instances and meant that I had to build models consecutively, never in parallel. It also meant that if I had a script running I could not really use the Salome GUI because it too could be killed at any moment!
Here is the correct way to do it, which I only recently discovered through some trawling of the web (unfortunately I can no longer found the page where I saw it and so I can’t give credit to the author).
```if not salome.sg.hasDesktop():
from killSalomeWithPort import killMyPort
killMyPort(os.getenv('NSPORT'))```
The important part is inside the “if” clause. killSalomeWithPort contains a function called killMyPort and the current port used in our Salome instance is stored in an environment variable named “NSPORT”. So by passing that port number to the function we can kill Salome cleanly!
The salome.sg.hasDesktop() just returns True if we are in the Salome GUI. Because if we were, we would not want our script to kill Salome for us. We only want it to happen if we are running inside a batch script.
I’m wondering why I never found this solution before, as it would have saved me a lot of frustration, but there you go, that’s life! Now I am passing it onto you, have fun!

Footnotes

• I am using the Salome version 8.2.0 for Ubuntu 16.04 x64 precompiled binaries. Different versions have different file structures and so your binary folder path might be different. If you search on the command line for e.g. runSalome.py, you should be able to identify where your salome binaries and Python scripts are located.
• A lot of this info was gleaned from the Salome user forum, particularly from this 2015 post: http://www.salome-platform.org/forum/forum_10/519093933
• In Linux, hidden files have a full stop (US: “period”) in front of their filename. To see them when listing a directory use “ls -a”.

Parallelization in Python example with joblib

It can be ridiculously easy to parallelize code in Python. Check out the following simple example:

```import time
from joblib import Parallel, delayed

# A function that can be called to do work:
def work(arg):
print "Function receives the arguments as a list:", arg
# Split the list to individual variables:
i, j = arg
# All this work function does is wait 1 second...
time.sleep(1)
# ... and prints a string containing the inputs:
print "%s_%s" % (i, j)
return "%s_%s" % (i, j)

# List of arguments to pass to work():
arg_instances = [(1, 1), (1, 2), (1, 3), (1, 4)]
# Anything returned by work() can be stored:
results = Parallel(n_jobs=4, verbose=1, backend="threading")(map(delayed(work), arg_instances))
print results```

Output:

```Function receives the arguments as a list: (1, 1)
1_1
Function receives the arguments as a list: (1, 2)
1_2
Function receives the arguments as a list: (1, 3)
1_3
Function receives the arguments as a list: (1, 4)
1_4
['1_1', '1_2', '1_3', '1_4']
[Parallel(n_jobs=1)]: Done   4 out of   4 | elapsed:    3.9s finished```

As you can see, this simple program executed all four of our argument instances sequentially, because we chose n_jobs = 1, i.e. we told it to use 1 CPU, which means it runs in series. The total time to run is reported as approximately 4 sec (it is actually less than 4 sec, but we won’t concern ourselves with this here!).

Now, we run it again in parallel, but this time with n_jobs = 4:

```Function receives the arguments as a list:Function receives the arguments as a list: Function receives the arguments as a list:  Function receives the arguments as a list: (1, 1)(1, 2) (1, 4)

(1, 3)

1_1
1_2
1_41_3

['1_1', '1_2', '1_3', '1_4']
[Parallel(n_jobs=4)]: Done   4 out of   4 | elapsed:    0.9s finished```

As you can see, the internal print commands from all four jobs are being printed to screen more-or-less simultaneously, and not in the original order. Whatever thread finished first gets printed first! The time to finish is now around 1/4 the original time, as expected.

The new default colormap for matplotlib is called “viridis” and it’s great!

It’s probably not news to anyone in data visualization that the most-used “jet” colormap (sic) (sometimes referred to as “rainbow”) is a bad choice for many reasons.

• Doesn’t work when printed black & white
• Doesn’t work well for colourblind people
• Not linear in colour space, so it’s hard to estimate numerical values from the resulting image

The Matlab team recently developed a new colormap called “parula” but amazingly because Matlab is commercially-licensed software no-one else is allowed to use it!
The guys at Matplotlib have therefore developed their own version, based on the principles of colour theory (covered in my own BSc lecture courses on Visualization 🙂 ) that is actually an improvement on parula. The new Matplotlib default colormap is named “viridis” and you can learn all about it in the following lecture from the SciPy 2015 conference (YouTube ):

Viridis will be the new default colour map from Matplotlib 2.0 onwards, but users of v1.5.1 can also choose to use it using the `cmap=plt.cm.viridis` command.
I don’t know about you, but I like it a lot and will start using it immediately!

Science via email

One thing scientists and engineers have to do daily is discuss collaborative work via email exchanges. This often includes the need to share and discuss mathematical equations and to represent variables with subscripts and superscripts or special characters; something that is tricky when you are emailing in plain text.

WikiImages / Pixabay

Of course it is possible to work around this problem! Email was invented by scientists, and for decades they have been communicating in this manner, using various conventions to convey the correct information using plaintext. However, if you are a Gmail user there is a nice extension that will make your equations look proper good.

Tex for Gmail

TeX for Gmail is a Chrome browser extension that checks a Gmail email that you are writing for LaTeX markup and converts the markup to a visually prettier equation, using one of two modes. In Simple Math mode, subscripts and superscripts are correctly formatted but the current font is maintained and text remains ediatble. In Rich Math mode, the equation is rendered into TeX and replaced by an embedded image.  The email recipient doesn’t need the extension installed on their browser in order to read your nice equations!

Example

Original markup:

\$E = mc^{2}\$

Simple Math mode:

E = mc2

Rich Math mode:
$\dpi{300}\inline E = mc^{2}$

Issues

One problem; once the extension has converted my markup to formatted text, I cannot get the markup back. So editing a small mistake usually means re-doing all the curly brackets and other stuff that a TeX equation requires. The only workaround seems to be to stay vigilant and use Undo (Ctrl-z), but this doesn’t work when you notice a mistake in an equation that you wrote a while ago. One improvement could be the option to restore any equation to the original markup.

Conclusions

Overall, a great little tool to improve the clarity of science and maths communications over email. With a few small improvements it could be even better but it is already very usable.