# Category: Software

### Successfully clearing ports in Salome (Code ASTER)

Building a geometry in the Salome graphical user interface (GUI).

## How Salome tracks ports

When Salome is starting up, it checks for free ports on your system using a few built-in Python scripts. Then when you close Salome those ports should be freed up again for the next one. This has a number of uses, but one reason is to stop multiple instances of Salome trying to use the same port at once.

Those Python scripts keep track of the port numbers that are currently in use by storing the numbers in some configuration files (*.cfg) that are saved on your system. When Salome exits, those configuration files should be updated to recognize that the current port is being freed up again.

## A possible problem with port tracking

Sometimes, however, those configuration files do not get updated. For example, if you are running Salome using a script in batch mode you can include a command to kill Salome properly, giving the correct port number. I have found in the past that this method has not been very reliable and so the configuration file keeps being updated with port numbers that are in use, but those numbers are never removed from the “in use” list even if they have actually been freed up on the system.

If you do a lot of scripting in salome you will find that when writing/testing your scripts, if salome crashes a lot then often the ports being used don’t get released and so stay as “being used” in the port log file.

The result is that after a while a maximum number of ports is reached and Salome thinks that there are no ports free, so it will not start successfully, giving the following error message:

RuntimeError:

Can’t find a free port to launch omniNames

Try to kill the running servers and then launch SALOME again.

Perhaps you will check for salome instances running using. There may or may not be lots of Salome processes running on your system. In this post I am going to assume that Salome has closed properly. You can check if salome is running:

ps -x | grep salome

Salome provides some Python scripts that should kill any running instances in a well-behaved way. For example, killSalome.py kills all of the instances running on your system, so you should use it with care:

~/bin/SALOME-8.2.0-UB16.04/BINARIES-UB16.04/KERNEL/bin/salome/killSalome.py

But if you already know the port number of a specific instance, killSalomeWithPort.py can be invoked to kill just that one, without affecting other instances that are currently running:

~/bin/SALOME-8.2.0-UB16.04/BINARIES-UB16.04/KERNEL/bin/salome/killSalomeWithPort.py 21116

As a last resort, you can kill all processes in your system that mention salome:
ps x | grep salome | awk ‘{print \$1}’ | xargs -n1 kill
OK, so now the ports should be freed up, right? Well maybe not! Your problem might indeed be that Salome is not updating the port config files correctly. Just killing the processes does not help because the next instance of Salome you launch will still check those files and think that there are no free ports. If this describes your current situation, don’t worry! I will now explain how to fix it.

## Clearing the port config files

For my version of Salome (see footnotes), hidden configuration files were being created in a number of locations.
For example, the .omniORB_PortManager.cfg file, which in my case is located in my home directory at ~/.omniORB_PortManager.cfg
However, deleting this file did not solve the problem.
I then searched through my home drive for all *.cfg files, but none of the ones that came back were related to ports.
\$ find . -name *.cfg
./bin/SALOME-8.2.0-UB16.04/BINARIES-UB16.04/KERNEL/share/salome/resources/kernel/channel.cfg
… plus a load of non-salome-related stuff…
This would also have found any “.omniORB_*_2888.cfg” (where 2888 is the port number) as mentioned here but those did not show up. There is a USERS directory within my salome installation directory structure at ~/bin/SALOME-8.2.0-UB16.04/BINARIES-UB16.04/SALOME/USERS, however it is empty and so does not contain any such .cfg files.
~/bin/SALOME-8.2.0-UB16.04/BINARIES-UB16.04/SALOME/USERS\$ ls -a
.  ..
Finally, I found that for my installation Salome was using /tmp to store these hidden *.cfg files. The /tmp/ directory (and maybe other directories – see below) contained the following files:
• .omniORB_PortManager.cfg
• Stores a list of the busy ports
• .omniORB_PortManager.lock
• Locks the .omniORB_PortManager.cfg from being edited? I’m not sure exactly what it locks.
• Should be deleted each time but if Salome is not doing this you will have many of these files with different port numbers.
• I guess this probably stores the last port that was used, although I deleted it already before confirming this hunch.

On one of my systems (a HPC cluster) Salome was not storing the .omniORB_PortManager.cfg file in /tmp/ . Instead it was located in /home/<username>/bin/salome/appli_V7_6_0/USERS
You can check which path is being used by looking in /home/<username>/bin/salome/appli_V7_6_0/bin/salome/PortManager.py  In there is a variable named “omniorbUserPath”, which is obtained from an environment variable that I could not see. Nonetheless, I modified PortManager.py to print this variable to screen, which told me that it was looking for /home/<username>/bin/salome/appli_V7_6_0/USERS/.omniORB_PortManager.cfg . Believe me, this was very frustrating to identify as I really thought I had deleted all necessary files, only for salome to continue not finding a free port!

You can delete all of these files, and now when you run Salome it will start fresh, creating new files as it needs. Problem fixed! But…

## Stop it happening again

The above fix will only help if we don’t cause the problem again. If you are creating many models or running many simulations from a controller script you do not want to keep reaching a hard limit of consecutive salome calls you can make, only to have to manually delete the omniORB config files again. What we really want is to make sure that Salome will update the config files correctly in future.
In the past I tried many times to use killSalomeWithPort.py. I did this by running salome with the –ns-port-log argument and providing a log file to store the port number.
<salome_distro>/salome --ns-port-log="somefolder/salomePort.log" -t -b script.py
port_file = open('somefolder/salomePort.log' , 'r')
<salome_distro>/bin/salome/killSalomeWithPort.py %s' % killPort
For some reason I could never get this to work successfully. I always ended up building a call to killSalome.py in my script, which kills all running Salome instances and meant that I had to build models consecutively, never in parallel. It also meant that if I had a script running I could not really use the Salome GUI because it too could be killed at any moment!
Here is the correct way to do it, which I only recently discovered through some trawling of the web (unfortunately I can no longer found the page where I saw it and so I can’t give credit to the author).
if not salome.sg.hasDesktop():
from killSalomeWithPort import killMyPort
killMyPort(os.getenv('NSPORT'))
The important part is inside the “if” clause. killSalomeWithPort contains a function called killMyPort and the current port used in our Salome instance is stored in an environment variable named “NSPORT”. So by passing that port number to the function we can kill Salome cleanly!
The salome.sg.hasDesktop() just returns True if we are in the Salome GUI. Because if we were, we would not want our script to kill Salome for us. We only want it to happen if we are running inside a batch script.
I’m wondering why I never found this solution before, as it would have saved me a lot of frustration, but there you go, that’s life! Now I am passing it onto you, have fun!

### Footnotes

• I am using the Salome version 8.2.0 for Ubuntu 16.04 x64 precompiled binaries. Different versions have different file structures and so your binary folder path might be different. If you search on the command line for e.g. runSalome.py, you should be able to identify where your salome binaries and Python scripts are located.
• A lot of this info was gleaned from the Salome user forum, particularly from this 2015 post: http://www.salome-platform.org/forum/forum_10/519093933
• In Linux, hidden files have a full stop (US: “period”) in front of their filename. To see them when listing a directory use “ls -a”.

### How to get up-to-date Python packages without bothering your cluster admin

If you have ever been stuck as a user on an out-of-date cluster without root access it can be frustrating to ask the admin guy to install packages for you. Even if they respond, by the time they get round to it you might have moved onto something else. The moment could be gone.

Luckily, as far as Python is concerned, the pyenv project allows users to install their own local Python version or even assign different versions to different directories/projects.

Public domain image.

João Moreira has written a great beginner’s guide on the Amaral Lab homepage in order to get started. I now have the latest version of Python 2 (v2.7.12) installed along with essential packages like Scipy and Pandas, which I added using pip.

Installation of pyenv is easy.

curl -L https://raw.githubusercontent.com/yyuu/pyenv-installer/master/bin/pyenv-installer | bash

Different versions of python can then be installed with

pyenv install 3.4.0

Switching your global Python version is then as simple as typing

pyenv global 3.4.0

From first impressions I can say I highly recommend pyenv, and will continue to learn about it over the coming days through using it. Please refer to João’s excellent post for more details.

### Plotting multivariate data with Matplotlib/Pylab: Edgar Anderson’s Iris flower data set

The problem of how to visualize multivariate data sets is something I often face in my work. When using numerical optimization we might have a single objective function and multiple design variables that can be represented by columnar data in the form {x1, x2, x3, … xn, y} a.k.a. NXY. With design spaces of more than a few dimensions it is difficult to visualize them in order to estimate the relationship between each independent variable and the objective, or perform a sensitivity study.

JensG / Pixabay

While perusing recent work in and tools for visualizing such data I stumbled across some nice examples of multivariate data plotting using a famous data set known as the “Iris data set”, also known as Fisher’s Iris data set or Edgar Anderson’s Iris flower data set. It contains data from 50 flowers each of three different flower species, collected in the Gaspé Peninsula. This set is not in the NXY form typical of optimization routines, but instead each flower has a number of parameters measured and tabulated; namely sepal length, sepal width, petal length and petal width. In other words there is no resultant Y data that is a function of the design space vector. Instead, it is interesting to plot relationships between the measured parameters to determine if they correlate with each other.

A quick internet search brings up a number of examples where the set has been plotted as a gridded set of subplots, using various software tools. For example, Mike Bostock’s blog post demonstrating his D3.js package, and the version on the wikipedia page.

I decided to try and code a Matplotlib script to generate a similar gridded multiplot from the data set. I did so within a Jupyter Notebook (formerly known as iPython Notebook) running Python 2.7. The data was imported using Pandas and made use of Matplotlib’s Pyplot module. Pandas was used to import the data but it could have been done in a number of different ways; it is just that Pandas is designed to work with csv files containing a mix of types.

The resulting image can be seen below.

Fisher’s Iris data set sometimes known as Anderson’s Iris data set, visualization by Simon Bance using Matplotlib/Pyplot. A multivariate data set introduced by Ronald Fisher in 1936 from data collected by Edgar Anderson on Iris flowers in the Gaspé Peninsula.

Here is the script:

"""
https://en.wikipedia.org/wiki/Iris_flower_data_set
A script for plotting multivariate tabular data as gridded scatter plots.
"""
import os
import pandas as pd
import matplotlib.pyplot as plt

inFile = r'iris.dat'

# Check if data file exists:
if not os.path.exists(inFile): sys.exit("File %s does not exist" % inFile)

rootFolder = os.path.dirname(os.path.abspath(inFile))

# Read in the data file
df.head(5) # Prints first n lines to check if we loaded the data file as expected.

# We also have n=4 distinct species in the Species column and I will
# list the species names so we can distinguish them later for plotting:
species = list(df.Species.unique()) # normal python list, thank you very much!
print type(species)

# Here we specify how many columns prepend and append the columns that we want to use.
# For Dakota this would include the objective function(s) column(s) appended to the end.
num_precols = 0
num_obj_fn = 1

# Work out the number of dimensions in each design vector:
num_dims = df.shape[1] - num_obj_fn # We know that there are 3 additional columns (and hope that it stays consistent in future)!
print "Our design vector has %s dimensions: %s" % (num_dims, headers[num_precols:-1])
gridshape = (num_dims, num_dims)
num_plots = num_dims**2
print "Our multivariate grid will therefore be of shape", gridshape, "with a total of", num_plots, "plots"

# Plot the data in a grid of subplots.
fig = plt.figure(figsize=(12, 12))

# Iterate over the correct number of plots.
n = 1

# Create an empty 2D list to store created axes. This alows us to edit them somehow.
axes = [[False for i in range(num_dims)] for j in range(num_dims)]

for j in range(num_dims):
for i in range(num_dims):

# e.g. plt.subplot(nx, ny, plotnumber)
ax = fig.add_subplot(num_dims, num_dims, n) # Plot numbering in this case starts from 1 not zero (MATLAB style indexing)!

# Choose your list of colours
colors = ['red', 'green', 'blue']

for index, s in enumerate(species):

# x axis: For each in the species list look at all rows with that value in the Species column.
# Use the ith column of that subset as the x series.
# y axis: Likewisem, but use the jth column.

if i != j:
ax.scatter(df.where(df['Species'] == s).ix[:,i], df.where(df['Species'] == s).ix[:,j], color=colors[index], label=s)
else:
# Put the variable name on the i=j subplots:
pass

# Set axis labels:

# Hide axes for all but the plots on the edge:
if j &lt; num_dims - 1: ax.xaxis.set_visible(False) if i &gt; 0:
ax.yaxis.set_visible(False)

if i == 1 and j == 0:
ax.legend(bbox_to_anchor=(3.5, 1), loc=2, borderaxespad=0., title="Species name:")

# Add this axis to the list.
axes[j][i] = ax

n += 1

plt.savefig("%s/iris.png" % rootFolder, dpi=300)
plt.show()

Further so-called “classic data sets” are listed at https://en.wikipedia.org/wiki/Data_set#Classic_data_sets.

### The new default colormap for matplotlib is called “viridis” and it’s great!

It’s probably not news to anyone in data visualization that the most-used “jet” colormap (sic) (sometimes referred to as “rainbow”) is a bad choice for many reasons.

• Doesn’t work when printed black & white
• Doesn’t work well for colourblind people
• Not linear in colour space, so it’s hard to estimate numerical values from the resulting image

The Matlab team recently developed a new colormap called “parula” but amazingly because Matlab is commercially-licensed software no-one else is allowed to use it!
The guys at Matplotlib have therefore developed their own version, based on the principles of colour theory (covered in my own BSc lecture courses on Visualization 🙂 ) that is actually an improvement on parula. The new Matplotlib default colormap is named “viridis” and you can learn all about it in the following lecture from the SciPy 2015 conference (YouTube ):

Viridis will be the new default colour map from Matplotlib 2.0 onwards, but users of v1.5.1 can also choose to use it using the cmap=plt.cm.viridis command.
I don’t know about you, but I like it a lot and will start using it immediately!

## Science via email

One thing scientists and engineers have to do daily is discuss collaborative work via email exchanges. This often includes the need to share and discuss mathematical equations and to represent variables with subscripts and superscripts or special characters; something that is tricky when you are emailing in plain text.

WikiImages / Pixabay

Of course it is possible to work around this problem! Email was invented by scientists, and for decades they have been communicating in this manner, using various conventions to convey the correct information using plaintext. However, if you are a Gmail user there is a nice extension that will make your equations look proper good.

## Tex for Gmail

TeX for Gmail is a Chrome browser extension that checks a Gmail email that you are writing for LaTeX markup and converts the markup to a visually prettier equation, using one of two modes. In Simple Math mode, subscripts and superscripts are correctly formatted but the current font is maintained and text remains ediatble. In Rich Math mode, the equation is rendered into TeX and replaced by an embedded image.  The email recipient doesn’t need the extension installed on their browser in order to read your nice equations!

### Example

Original markup:

\$E = mc^{2}\$

Simple Math mode:

E = mc2

Rich Math mode:
$\dpi{300}\inline E = mc^{2}$

## Issues

One problem; once the extension has converted my markup to formatted text, I cannot get the markup back. So editing a small mistake usually means re-doing all the curly brackets and other stuff that a TeX equation requires. The only workaround seems to be to stay vigilant and use Undo (Ctrl-z), but this doesn’t work when you notice a mistake in an equation that you wrote a while ago. One improvement could be the option to restore any equation to the original markup.

## Conclusions

Overall, a great little tool to improve the clarity of science and maths communications over email. With a few small improvements it could be even better but it is already very usable.