Introduction

Summary

The model parameterisations refer to the set of parameters that are varied and estimated in a retrieval. Given that the computational time of retrievals largely depend on the number of fitted parameters, the input information in the atmospheric classes is often parameterised to reduce this number. ArchNEMESIS includes a large set of model parameterisations heritage from the original NEMESIS code and which might be useful for many applications. Nevertheless, new parameterisations can also be easily introduced to develop new methodologies for the analysis of planetary spectra.

The information about the model parameterisations is stored in the Variables class. Given the different forms parameterisations may take, this class needs to be versatile to store all the relevant information about the model parameters and the indications of what information must be changed in the forward model based on a particular parameterisation.

The type of parameterisation is indicated by the VARIDENT parameter of the Variables class. For each parameterisation included in the model, VARIDENT is a three-element integer array. The logic behind this array is as follows:

Let’s call the three elements of the array I$_1$, I$_2$, I$_3$.
I$_3$ indicates the ID for the particular parameterisation, following the identifiers listed in the next section.
If the selected model ID does not correspond to an atmospheric model, then I$_1$ and I$_2$ do not provide any useful information, and they can get arbitrary numbers. However, if the model ID corresponds to an atmospheric parameterisation, then I$_1$ and I$_2$ indicate the atmospheric parameter that must be modified. In particular:
- If I$_1$ = 0, then the model parameterisation applies to the temperature profile.
- If I$_1$ > 0, then the model parameterisation applies to the volume mixing ratio of a particular gas, with I$_1$ and I$_2$ representing the gas ID and isotope ID (e.g., I$_1$=1, I$_2$=0 for H$_2$O; I$_1$=1, I$_2$=4 for HDO).
- If I$_1$ < 0, then we can have:
  - If 1 $\leq$ |I$_1$| $\leq$ NDUST, then the model parameterisation applies to the I$_1$ aerosol population.
  - If |I$_1$| = NDUST + 1, then the model parameterisation applies to the para-H$_2$ fraction profile.
  - If |I$_1$| = NDUST + 1, then the model parameterisation applies to the fractional cloud cover profile.

Examples of VARIDENT

Let’s assume we have an atmosphere with 3 aerosol populations. We include 5 model parameterisations, defined by:

I$_1$=2, I$_2$=0, I$_3$=0: Continuous vertical profile of CO$_2$ volume mixing ratio.
I$_1$=-2, I$_2$=0, I$_3$=2: Scaling factor for abundance of the second aerosol population in the atmosphere.
I$_1$=-4, I$_2$=0, I$_3$=0: Scaling factor for para-H2 fraction profile.
I$_1$=-1, I$_2$=0, I$_3$=32: Abundance of first aerosol population represented by base pressure, optical depth and fractional scale height.
I$_1$=0, I$_2$=0, I$_3$=231: Multiplication of spectrum by polynomial function

Available models

The model parameterisations have the purpose of modifying the information in the reference classes according to a smaller number of parameters. This is especially important for retrievals, when these parameters are modified in each iteration to find the best fit between the modelled and measured spectra. Each of the model parameterisations is defined by some Variable IDs and Variable parameters, which are defined in the .apr file. This file can be read using the read_apr function on the Variables class.

You can run the command python -c 'from archnemesis.Models import Models; print(Models.info())' to display some summary information for each model.

At this stage, only a number of models have been implemented in archNEMESIS with respect to NEMESIS. Detailed information about each of the model parameterisations is included in the Examples. In the following, a brief summary describing each of these models is included:

Model 0: Continuous atmospheric vertical profile.

No. of parameters: NP
No. of extra parameters: 0
Short description: Specify the values of all vertical levels of the atmospheric profiles.
Format in .apr file:
- Number of vertical levels (NP) | Correlation length
- Pressure (level 1) | Value (level 1) | Uncertainty (level 1)
- ...
- Pressure (level NP) | Value (level NP) | Uncertainty (level NP)

Model 2: Scaling of an atmospheric vertical profile.

No. of parameters: 1
No. of extra parameters: 0
Short description: Multiply vertical profile with a scaling factor.
Format in .apr file:
- Scaling factor | Uncertainty

Model 3: Scaling of an atmospheric vertical profile (carried in log-space).

No. of parameters: 1
No. of extra parameters: 0
Short description: Multiply vertical profile with a scaling factor. Internally within the code this scaling factor is converted to log-space. This is suitable for quantities that can vary over orders of magnitudes and that cannot go negative.
Format in .apr file:
- Scaling factor | Uncertainty

Model 9: Cloud represented by base height, optical depth and fractional scale height.

No. of parameters: 3
No. of extra parameters: 0
Short description: The vertical distribution of the cloud is parameterised by its base altitude and a fractional scale height. The total vertically-integrated optical depth of the cloud is also specified.
Format in .apr file:
- Base altitude (km) | Uncertainty
- Optical depth | Uncertainty
- Fractional scale height (km) | Uncertainty

Model 32: Cloud represented by base pressure, optical depth and fractional scale height.

No. of parameters: 3
No. of extra parameters: 0
Short description: The vertical distribution of the cloud is parameterised by its base pressure and a fractional scale height. Below the base altitude the cloud decreases exponentially with a fractional scale height of 1 km. The total vertically-integrated optical depth of the cloud is also specified.
Format in .apr file:
- Base pressure (atm) | Uncertainty
- Optical depth | Uncertainty
- Fractional scale height (km) | Uncertainty

Model 43: Temperature profile from double grey analytic formulation from Parmentier and Guillot (2014).

No. of parameters: 5
No. of extra parameters: 4
Short description: In this model, the temperature profile is parameterised using the double grey analytic formulation derived in Parmentier and Guillot (2014).
Format in .apr file:
- α | Uncertainty in α
- β | Uncertainty in β
- k_ir | Uncertainty in k_ir
- γ₁ | Uncertainty in γ₁
- γ₂ | Uncertainty in γ₂
- Star temperature (K) | Star radius (km) | Star-planet distance (km) | Temperature of planet interior (K)

Model 45: Irwin Ice Giant CH4 model.

No. of parameters: 3
No. of extra parameters: 0
Short description: The vertical distribution of methane is specified with a deep abundance, a relative humidity in the troposphere, and a stratospheric abundance.
Format in .apr file:
- Deep VMR | Uncertainty
- Relative Humidity | Uncertainty
- Stratospheric VMR | Uncertainty

Model 47: Gaussian cloud profile.

No. of parameters: 3
No. of extra parameters: 0
Short description: A cloud modelled with a gaussian opacity profile in log(pressure), with a specified integrated opacity, central pressure, and standard deviation.
Format in .apr file:
- Integrated Opacity | Uncertainty
- Mean Pressure | Uncertainty
- Standard Deviation (units of log(pressure)) | Uncertainty

Model 51: Scaling of another profile.

No. of parameters: 1
No. of extra parameters: 0
Short description: This is parameterisation 49 in NEMESIS. Sets a specified profile equal to a simple scaling of another profile. Useful for varying isotopic abundances of a gas.
Format in .apr file:
- Gas ID to be scaled | Isotope ID to be scaled
- Scaling Factor | Uncertainty

Model 110: Model for Venus cloud from Haus et al. (2016) with an altitude offset.

No. of parameters: 1
No. of extra parameters: 0
Short description: In this model, the Venus cloud is made of four particle modes from a solution of H2SO4 and H2O. The vertical distribution of each mode is parameterised following Haus et al. (2016). In this implementation, we describe the cloud as in the Haus model, but allow an altitude offset to retrieve the altitude of the cloud.
Format in .apr file:
- Altitude offset (km) | Uncertainty in altitude offset (km)

Model 111: Model for Venus cloud from Haus et al. (2016) with an altitude offset and coupled SO2 volume mixing ratio profile.

No. of parameters: 3
No. of extra parameters: 0
Short description: In this model, the Venus cloud is made of four particle modes from a solution of H2SO4 and H2O. The vertical distribution of each mode is parameterised following Haus et al. (2016). In this implementation, we describe the cloud as in the Haus model, but allow an altitude offset to retrieve the altitude of the cloud. In addition, we also include a parameterisation for the volume mixing ratio profile of SO2 by indicating its value below and above the cloud. The mixing ratio is assumed to decrease exponentially within the cloud.
Format in .apr file:
- Altitude offset (km) | Uncertainty in altitude offset (km)
- SO2 mixing ratio below the cloud | Uncertainty in SO2 mixing ratio below the cloud
- SO2 mixing ratio above the cloud | Uncertainty in SO2 mixing ratio above the cloud

Model 229: Model for double-Gaussian instrument line shape for TGO/ACS (Alday et al. 2019).

No. of parameters: 7
No. of extra parameters: 0
Short description: In this model, the instrument lineshape is modelled using a varying double-Gaussian function across the wavelength range. This model was introduced to mimic the instrument lineshape from the ACS instrument aboard the ExoMars Trace Gas Orbiter, and is defined by seven parameters. The first three parameters (P1, P2, P3) indicate the relative offset of the main Gaussian with respect to the wavelengths of the first, mid and last bins in the spectral range (aiming to account for small deviations in the wavelength calibration), with the offset in the rest of the bins being calculated by linear interpolation. The fourth parameter (P4) indicates the relative offset of the second Gaussian with respect to the main one. The fifth parameter (P5) represents the full-width at half-maximum (FWHM) of the two Gaussians, which are assumed to have the same FWHM. The sixth and seventh parameters (P6, P7) indicate the relative amplitude of the second Gaussian with respect to the main one at the first and last bins of the wavelength array, being linearly interpolated for the rest of the bins.
Format in .apr file:
- P1 | Uncertainty
- P2 | Uncertainty
- P3 | Uncertainty
- P4 | Uncertainty
- P5 | Uncertainty
- P6 | Uncertainty
- P7 | Uncertainty

Model 231: Multiplication of spectrum by polynomial function.

No. of parameters: NGEOM $\times$ (NDEGREE+1)
No. of extra parameters: 2 (NGEOM and NDEGREE)
Short description: In this model, we multiply each spectrum (NGEOM spectra) by a polynomial function of degree (NDEGREE). The polynomial function can be different for each of the spectra. The polynomial function is parameterised as $p(\lambda, \lambda_0) = A_0 + A_1(\lambda-\lambda_0) + A_2(\lambda-\lambda_0)^2 +$ ... $+ A_N(\lambda-\lambda_0)^N$, where $\lambda_0$ represents the first wavelength in the vector.
Format in .apr file: The next line in the file is a string containing the name of another file where this parametersiation is specified. The structure of the other file must follow:
- NGEOM | NDEGREE
- A0 (Geometry 1) | Uncertainty in A0 | A1 (Geometry 1) | Uncertainty in A1 | ... | AN (Geometry 1) | Uncertainty in AN
- A0 (Geometry 2) | Uncertainty in A0 | A1 (Geometry 2) | Uncertainty in A1 | ... | AN (Geometry 2) | Uncertainty in AN
- ...
- A0 (Geometry NGEOM) | Uncertainty in A0 | A1 (Geometry NGEOM) | Uncertainty in A1 | ... | AN (Geometry NGEOM) | Uncertainty in AN

Model 444: Size distribution and imaginary refractive index spectrum of aerosol species.

No. of parameters: 2 + NWAVE
No. of extra parameters: 0
Short description: This model is used to vary the size distribution and/or the imaginary refractive index spectrum of an aerosol species. Setting a negative correlation length sets the imaginary refractive index to be constant with wavelength.
Format in .apr file: The next line in the file is a string containing the name of another file where this parametersiation is specified. The structure of the other file must follow:
- Mean size of species | Uncertainty
- NWAVE (number of n_imag wavelengths) | Correlation length
- Reference wavelength | Real refractive index at reference wavelength
- Cross-section normalisation wavelength
- Wavelength 1 | Imaginary refractive index 1 | Uncertainty
- Wavelength 2 | Imaginary refractive index 2 | Uncertainty
- ... | ... | ...
- Wavelength NWAVE | Imaginary refractive index NWAVE | Uncertainty

Model 447: Specification of Doppler velocity.

No. of parameters: 1
No. of extra parameters: 0
Short description: In this model, we specify the Doppler velocity between the target planet and the observer. Currently, the Doppler velocity is common for all geometries in the Measurement class, but in the future this will be updated for different Doppler velocities in different geometries, potentially enabling the retrieval of atmospheric winds.
Format in .apr file:
- Doppler velocity (km/s) | Uncertainty

Model 666: Retrieval of pressure at specified height.

No. of parameters: 1
No. of extra parameters: 1
Short description: In this model, the atmospheric pressure is specified at a given fixed tangent height (extra parameter). Then, the rest of the pressure levels will be adjusted based on the hydrostatic equilibrium relation (which depends on the temperature altitudinal profile).
Format in .apr file:
- Tangent height (km)
- Pressure (atm) | Uncertainty

Model 667: Scaling of spectrum by a dilution factor.

No. of parameters: 1
No. of extra parameters: 1
Short description: In this model, the computed spectrum is multiplied by a dilution factor (i.e., scaling factor). This modelled was initially introduced to account for strong temperature gradients in exoplanets, but it might be useful for other cases.
Format in .apr file:
- Dilution factor | Uncertainty

Model 999: Retrieval of surface temperature.

No. of parameters: 1
No. of extra parameters: 0
Short description: In this model, the surface temperature (K) is modified.
Format in .apr file:
- Surface temperature | Uncertainty

Model ID	No. of parameters	No. of extra parameters	Description
49	NP	0	Continuous atmospheric vertical profile in linear scale (i.e., allowing negative abundances)
50	NP	0	Continuous profile of scaling factors in linear scale
2310	NGEOM $\times$ (NDEGREE+1) $\times$ NWINDOWS	3	Same as model 231 but applying it to different spectral windows

Custom Model Class

Users may want to define their own custom parameterisations ande create new retrieval schemes for their specific purposes. Implementing new parameterisations in archNEMESIS is relatively straightforward, at the simplest level you just need to add a new class to one of the Models/PreRTModels.py, or Models/PostRTModels.py files.

We are in the process of rationalising the model classes, so we recommend using the below model classes as a guide:

In the Models/PreRTModels.py file:
- Model2
- Model3
- Model9
- Model32
- Model45
- Model47
- Model444
In the Models/PostRTModels.py file:
- Model231

NOTE: In each file there is a template model class, that is the result of the steps (2) and (3) (detailed below), that has the methods that need to be implemented. The body of each method body consists of raise NotImplementedError('This is a template model and should never be used') as the first expression and some example code with explanatory comments, just copy the template model class, rename it, give it a unique ID, and replace the method bodies with the required code for your model.

The templates model classes are:

TemplatePostRTModel in the Models/PreRTModels.py file
TemplatePreRTModel in the Models/PostRTModels.py file

General Steps

Work out what kind of model you are creating, this informs where you put your new model class.

Models applied before the radiative transfer calculation are in Models/PreRTModels.py, there are generally two types:
- Atmospheric models affect one of the following atmosphere profiles:
  - gas volume mixing ratio
  - aerosol species (dust) density
  - temperature
  - para-h2 fraction
  - fractional cloud cover
- Non-Atmospheric models affect other things, e.g.:
  - Optical properties of an aerosol species
  - Instrument lineshape
  - Doppler shift of the observer w.r.t the target
Models applied after the radiative transfer calculation are in Models/PostRTModels.py, currently there is only one type:
- Spectral models affect the the modelled output spectrum directly.

Look at the ModelBase class in the Models/ModelBase.py file. List any abstract methods on the class (they are marked with @abstractmethod)
In the file that corresponds to the type of model you are creating, look at the “model type base class” (at the top of the file) : PreRTModelBase for Models/PreRTModels.py; PostRTModelBase for Models/PostRTModels.py. Do the following:
- Any methods with the same name as an abstract method of the ModelBase class does NOT have to be implemented, as the “model type base class” already provides an implementation for you. You can cross these off your list.
- Any abstract methods on the “model type base class” (they are marked with @abstractmethod), DO have to be implemented. Add these to your list.
Write a new class that implements all of the abstract methods on your list, and give it an integer class attribute id that is a unique ID number that will be used to identify your custom model, there is a test to ensure all models have a unique ID number so run the tests after adding a model to ensure everything is set up properly. Use the other implementations in the file as a guide.

Example Custom Model Class

As it is much easier to see the above process in action, we present a walkthrough example of implementing a custom model. The example we are using is not going to be useful, but it serves as a guide that shows how to do something non-trivial.

Our custom model will parameterise the volume mixing ratio of a gas as a stepwise function. This is not really that useful, but it gives us the opportunity to explore how to do a few different things.

Following the above procedure:

Step 1 is to work out what kind of model we are creating. In this case the model is an Atmospheric model (as it alters a gas volume mixing ratio), so we will put the model in the Models/PreRTModels.py file.
Step 2 is to look at ModelBase and PreRTModelBase and write down the abstract methods that require an implementation. At the time of writng (30/06/2025), these are the relevant methods on the two classes:
- ModelBase abstract methods:
  - is_varident_valid
  - from_apr_to_state_vector
  - calculate
  - calculate_from_subprofretg
  - patch_from_subprofretg
  - calculate_from_subspecret
- PreRTModelBase concrete methods (i.e. non-abstract):
  - is_varident_valid
  - patch_from_subprofretg
  - calculate_from_subspecret
- PreRTModelBase abstract methods:
  - calculate_from_subprofretg

Therefore, if we combine the abstract methods from both classes, and then remove the concrete methods of PreRTModelBase, we have the following list of methods that our new class needs to implement: from_apr_to_state_vector, calculate, calculate_from_subprofretg. This is step 3.

We can use the template class in Models/PreRTModels.py to speed up steps (2) and (3), we just copy the template class, rename it, and re-write its methods to do what we need.

Step 4 is to actually implement the methods, and is performed in the following sections.

first definitions of the model class

First off, we need a name. The current model names are things like Model45, they are holdovers from the FORTRAN code and are not very descriptive. Ideally we want something that describes what the model represents, “PiecewiseGasVMR” is a decent choice so we will use that.

Secondly, we need an ID number, we should choose something that is somewhat consistent with the current ID numbers if possible. I will use 10 for this example.

Thirdly, we should add a docstring to the class that describes it.

At this point our model class looks like (some code omitted for clarity):

class PiecewiseGasVMR(PreRTModelBase):
   """
   Parameterises the gas volume mixing ratio as a piecewise function with `n_chunk` chunks.
   The chunk values are retrieved.
   """
   id : int = 10

   def __init__(
      # OMITTED CODE #

   @classmethod
   def calculate(
      # OMITTED CODE #

   @classmethod
   def from_apr_to_state_vector(
      # OMITTED CODE #

   def calculate_from_subprofretg(
      # OMITTED CODE #

defining the model scope

The model class will need to know the following pieces of information

The number of chunks of the piecewise function
the gas ID we want to apply the profile to
the apriori values of the chunks

I find it easiest to work in the methods in the order they are called by the program, the methods are called in the order:

from_apr_to_state_vector - is called first from Variables_0::read_apr when the .apr file is read.
__init__ - is called inside from_apr_to_state_vector as that method needs to construct the model class instance.
calculate_from_subprofretg is called when forward modelling, inside ForwardModel_0::subprofretg.
calculate is called inside calculate_from_subprofretg as that method performs the actual calculation.

writing `from_apr_to_state_vector`

Reading the docstring of the from_apr_to_state_vector method shows what arguments are passed to us from the Variables_0::read_apr method.

unpacking values from VARIDENT

Considering the varident for the model. Reading the documentation, PreRTModelBase::is_varident_valid method, and other model class’s from_apr_to_state_vector implementations shows that, when we are dealing with an atmospheric model, the three integers in varident are defined as:

varident[0]
- == 0 denotes a temperature profile
- > 0 denotes a gas ID that is equal to varident[0].
- < 0
- is an aerosol species profile when the magnitude is less than or equal to the number of aerosol species
- a para-h2 fraction profile when the magnitude is equal to the number of aerosol species + 1
- a fractional cloud cover profile when the magnitude is equal to the number of aerosol species + 2
varident[1] is the ID of the gas isotope if varident[0] is a gas ID.
model_id denotes the ID number of the model to be used.

Therefore,

we can assume that varident[2] is equal to the cls.id of our model. This is what PreRTModelBase::is_varident_valid ensures.
our model requires varident[0] to be positive as it must be a gas ID
varident[1] is an isotope ID.

We can add the following code to the from_apr_to_state_vector method:

# unpack values from VARIDENT
gas_id = varident[0]
assert gas_id > 0, f'model {cls.__name__} with id={cls.id} must have a gas ID as the first VARIDENT value'
iso_id = varident[1]

reading apriori parameters

We want to read the apriori coefficients from the .apr file. The simplest way to do this is to add some extra lines to the .apr file after the VARIDENT for our model. The first line will be the number of chunks, the next n_chunk lines be a space separated list of the chunk values and their errors.

The f argument is the file descriptor of the open .apr file at the line after the VARIDENT for our model. Therefore we just need to read n_chunk+1 lines from the file and unpack the values in the way we just said.

The npro argument is the number of points in each profile, we want our piecewise function to have at least two points shared in each chunk, so we want to make sure we have enough points in the profile.

We can add the following code to the from_apr_to_state_vector method:

# read in apriori parameters
# read number of chunks
n_chunk = int(f.readline().strip())

# ensure we have at least 2 profile points per chunk
assert npro >= 2*n_chunk, f'{cls.__name__} id={cls.id} must have at most half as many chunks as there are points in a profile, have {n_chunk} vs {npro}'

# create holder arrays
chunk = np.zeros((n_chunk,),float)
chunk_err = np.zeros((n_chunk,),float)

# read value and error of each chunk
for i in range(coeff.size):
   chunk[i], chunk_err[i] = tuple(map(float, f.readline().strip().split()))

packing the state vector and the covariance matrix

We need to pack the apriori parameter values into the statevector and pack the parameter errors into the covariance vector. Note, as we have not set any correlation between our chunks, we can just set the diagonal covariance matrix elements.

The ix argument is the next unused index of the state vector. We will need ix along with:

lx - the log flag for the state vector, denotes if a parameter is stored as a log of its “real” value.
x0 - the actual state vector
sx - the covariance matrix
inum - flag for the state vector, denotes if the derivative of a parameter should be numerically calculated or not.

Therefore, we want to:

store the parameters in the state vector. NOTE: we should store the log of their value as the chunk values cannot be negative.
store the errors in the covariance matrix.
set the log flag for our parameters
for simplicity, set to numerically calculate the derivatives of our parameters

We can add the following code to the from_apr_to_state_vector method:

# Packing the state vector and covariance matrix
ix_0 = ix # store the initial value
for i in range(n_chunk):
   # store the log(chunk) values
   lx[ix] = 1
   inum[ix] = 1
   x0[ix] = np.log(chunk[i])

   # store the chunk variance, must use relative errors as we are storing log(value)
   sx[ix,ix] = (chunk_err[i]/chunk[i])**2

   ix += 1

construct the model class instance and return

Finally, we need to construct the model instance. To enable us to easily select the model when (for example) creating a plot, it is a good idea to classify the model this is generally only done for “Atmospheric” models (ones that influence the Atmosphere attribute of the forward model). We should also pass any constants we will need later to the constructor so we can store them on the instance.

We always need to pass the index of the state vector where the parameters of the model start, and the number of parameters the model stores in the state vector. Then we can pass any constants. As they might be useful later, we will also pass n_chunk, gas_id, and iso_id to be stored on the model.

We can add the following code to the from_apr_to_state_vector method:

# Get the model classification
model_classification = variables.classify_model_type_from_varident(varident, ngas, ndust)
assert issubclass(cls, model_classification[0]), "Model base class must agree with the classification from Variables_0::classify_model_type_from_varident"

# return the constructed model instance
return cls(ix_0, ix-ix_0, model_classification[1], n_chunk, gas_id, iso_id)

code listing for `from_apr_to_state_vector`

The altered code is below, I have omitted docstrings and comments etc. for clarity.

class PiecewiseGasVMR(PreRTModelBase):
   # CODE UNCHANGED FROM EARLIER #

   def __init__(
      # OMITTED CODE #

   @classmethod
   def calculate(
      # OMITTED CODE #

   @classmethod
   def from_apr_to_state_vector(
         cls,
         variables : "Variables_0",
         f : IO,
         varident : np.ndarray[[3],int],
         varparam : np.ndarray[["mparam"],float],
         ix : int,
         lx : np.ndarray[["mx"],int],
         x0 : np.ndarray[["mx"],float],
         sx : np.ndarray[["mx","mx"],float],
         inum : np.ndarray[["mx"],int],
         npro : int,
         ngas : int,
         ndust : int,
         nlocations : int,
         runname : str,
         sxminfac : float,
      ) -> Self:

      # unpack values from VARIDENT
      gas_id = varident[0]
      assert gas_id > 0, f'model {cls.__name__} with id={cls.id} must have a gas ID as the first VARIDENT value'
      iso_id = varident[1]

      # read in apriori parameters
      # read number of chunks
      n_chunk = int(f.readline().strip())

      # ensure we have at least 2 profile points per chunk
      assert npro >= 2*n_chunk, f'{cls.__name__} id={cls.id} must have at most half as many chunks as there are points in a profile, have {n_chunk} vs {npro}'

      # create holder arrays
      chunk = np.zeros((n_chunk,),float)
      chunk_err = np.zeros((n_chunk,),float)

      # read value and error of each chunk
      for i in range(coeff.size):
         chunk[i], chunk_err[i] = tuple(map(float, f.readline().strip().split()))

      # Packing the state vector and covariance matrix
      ix_0 = ix # store the initial value
      for i in range(n_chunk):
         # store the log(chunk) values
         lx[ix] = 1
         inum[ix] = 1
         x0[ix] = np.log(chunk[i])

         # store the chunk variance, must use relative errors as we are storing log(value)
         sx[ix,ix] = (chunk_err[i]/chunk[i])**2

         ix += 1

      # Classify the model
      model_classification = variables.classify_model_type_from_varident(varident, ngas, ndust)
      assert issubclass(cls, model_classification[0]), "Model base class must agree with the classification from Variables_0::classify_model_type_from_varident"


      # return the constructed model instance
      return cls(ix_0, ix-ix_0, model_classification[1], n_chunk, gas_id, iso_id)

   def calculate_from_subprofretg(
      # OMITTED CODE #

writing `init(...)`

This method calls the parent class constructor, stores any data we want on the model instance, and defines the parameters of the model to make it easy to pull the model’s information from the state vector and display it.

calling the parent class constructor

The template gives us the code for this, we just need to a copy it. NOTE: If we were not making an “Atmospheric” model, we would probably not bother forwarding atm_profile_type to the parent class, however it should not be a problem to do so.

Add the following code to the __init__ method:

# initialise the parent class
super().__init__(i_state_vector_start, n_state_vector_entries, atm_profile_type)

store data on the model instance

Earlier we sent n_chunk, gas_id, and iso_id to the class constructor. We should make sure we accept these arguments, then store their values on the class instance.

Alter the definition of the __init__ method to read:

def __init__(
   self,
   i_state_vector_start : int,
   n_state_vector_entries : int,
   atm_profile_type : AtmosphericProfileTypeEnum,
   n_chunk : int,
   gas_id : int,
   iso_id : int,
):

Add the following code to the __init__ method:

# storing constants
self.n_chunk = n_chunk
self.gas_id = gas_id
self.iso_id = iso_id

define model parameters

Finally, we define the model parameters.

The piecewise model is fairly simple, we only actually have a single parameter chunk, but is is a multi-value parameter (i.e. it is an array not a single number). The template model class gives us an example of how to define this kind of parameter. As the chunk parameter takes up all of the space in the sub-section of the state-vector assigned to the model, we can just use slice(None) to select the entire sub-section of the state vector. We should also give a description, and the units of the parameter.

Add the following code to the __init__ method:

# define the model parameters, specifically how they are stored in the state vector
self.parameters : tuple[ModelParameter,...] = (
   ModelParameter(
      'chunk',
      slice(None),
      f'the {n_chunk} values of a piecewise function that define a volume mixing ratio for gas : {gas_id} isotope : {iso_id}',
      'RATIO'
   ), # NOTE: the comma here is required.
)

code listing for `init`

The altered code is below, I have omitted docstrings and comments etc. for clarity.

class PiecewiseGasVMR(PreRTModelBase):
   # CODE UNCHANGED FROM EARLIER #

   def __init__(
         self,
         i_state_vector_start : int,
         n_state_vector_entries : int,
         atm_profile_type : AtmosphericProfileTypeEnum,
         n_chunk : int,
         gas_id : int,
         iso_id : int,
      ):
      # initialise the parent class
      super().__init__(i_state_vector_start, n_state_vector_entries, atm_profile_type)

      # storing constants
      self.n_chunk = n_chunk
      self.gas_id = gas_id
      self.iso_id = iso_id

      # define the model parameters, specifically how they are stored in the state vector
      self.parameters : tuple[ModelParameter,...] = (
         ModelParameter(
            'chunk',
            slice(None),
            f'the {n_chunk} values of a piecewise function that define a volume mixing ratio for gas : {gas_id} isotope : {iso_id}',
            'RATIO'
         ), # NOTE: the comma here is required.
      )


   @classmethod
   def calculate(
      # OMITTED CODE #

   @classmethod
   def from_apr_to_state_vector(
      # OMITTED CODE #

   def calculate_from_subprofretg(
      # OMITTED CODE #

writing `calculate_from_subprofretg`

This method is called from ForwardModel_0::subprofretg and is how an atmospheric model influences the state of the retrieval. We have to perform a few tasks in this method.

Unpack parameters from the state vector and get other values we will need
Use the self.calculate(...) method (see later) to calculate the gas volume mixing ratio according to our model
Put the results of the calculation in their correct palce.

The forward_model argument is the current ForwardModel_0 instance, we can access all of its components using this argument. It is very useful as in principle a model class may alter practically anything about a retrieval by interacting with this argument.

Unpack parameters and values

In our case we only have one parameter, and there is a helpful method self.get_parameter_values_from_state_vector (defined on the ModelBase class that we have indirectly inherited from) that we can use to get the values of all parameters simultaniously. We need to pass the state vector and the state vector log flags to this method. Those are stored on the Variables component of the forward_model as forward_model.Variables.XN and forward_model.Variables.LX respectively.

We can also find the index of the gas volume mixing ratio we need to alter. There are actually two ways to do this. The first uses the ipar argument, which encodes the profile type and profile index that an atmospheric profile should alter. We can get the profile type and index from ipar using the Atmosphere_0::ipar_to_atm_profile_type(...) method on the forward_model.AtmosphereX object. We can also get it by using the Atmosphere_0::locate_gas(...) method on the same object as we have stored the gas_id and iso_id values on the model instance. In this case we should do both and check they agree.

We can also make sure we have been given the correct profile type. Checks like this may seem superfulous, but they make sure that if something does go wrong the error is caught as early as possible, which makes debugging much easier. The profile types are defined in the enums.py file.

Add the following code to the calculate_from_subprofretg method:

# unpack parameters from state vector
chunk = self.get_parameter_values_from_state_vector(forward_model.Variables.XN, forward_model.Variables.LX)

# get the profile type and index, and check the values
atm = forward_model.AtmosphereX
atm_profile_type, atm_profile_idx = atm.ipar_to_atm_profile_type(ipar)

assert atm_profile_type == AtmosphericProfileTypeEnum.GAS_VOLUME_MIXING_RATIO, f'model {self.__class__.__name__} with id={self.id}. Only accepts gas volume mixing ratio profiles'

assert atm_profile_idx == atm.locate_gas(self.gas_id, self.iso_id), f'model {self.__class__.__name__} with id={self.id}. The two different ways of getting the gas_vmr_idx must agree'

send to `self.calculate(...)`

We have a lot of freedom to define how the class method self.calculate works. However, the conventions used by the other models are a good place to start. We can always alter the method later if we need to. For now we will send the atmosphere class, the profiel type and index, and the model parameters. We will assume that we will get back an updated atmosphere class and an array of functional derivatives.

Add the following code to the calculate_from_subprofretg method:

atm, xmap1 = self.calculate(
   atm,
   atm_profile_type,
   atm_profile_idx,
   chunk
)

store the results

We need to put the return Atmosphere_0 class into the same bit of the forward_model argument we got it from. And we need to update the xmap argument with the calculated values in xmap1.

xmap is a numpy array of shape (nx, NVMR+2+NDUST, NP, NLOCATIONS) and holds the functional derivatives of the nx values of the state vector, w.r.t the NVMR+2+NDUST profiles, NP profile points, and NLOCATIONS locations. As we are only considering one location right now, and we know we will only be altering one profile (in this case indexed by ipar) we can set the correct section of xmap from xmap1.

Add the following code to the calculate_from_subprofretg method:

forward_model.AtmosphereX = atm
xmap[self.state_vector_slice, ipar, 0:atm.NP] = xmap1

code listing for `calculate_from_subprofretg`

The altered code is below, I have omitted docstrings and comments etc. for clarity.

class PiecewiseGasVMR(PreRTModelBase):
   # CODE UNCHANGED FROM EARLIER #

   def __init__(
      # OMITTED CODE #

   @classmethod
   def calculate(
      # OMITTED CODE #

   @classmethod
   def from_apr_to_state_vector(
      # OMITTED CODE #

   def calculate_from_subprofretg(
         self,
         forward_model : "ForwardModel_0",
         ix : int,
         ipar : int,
         ivar : int,
         xmap : np.ndarray,
      ) -> None:

      # unpack parameters from state vector
      chunk = self.get_parameter_values_from_state_vector(forward_model.Variables.XN, forward_model.Variables.LX)

      # get the profile type and index, and check the values
      atm = forward_model.AtmosphereX
      atm_profile_type, atm_profile_idx = atm.ipar_to_atm_profile_type(ipar)

      assert atm_profile_type == AtmosphericProfileTypeEnum.GAS_VOLUME_MIXING_RATIO, f'model {self.__class__.__name__} with id={self.id}. Only accepts gas volume mixing ratio profiles'

      assert atm_profile_idx == atm.locate_gas(self.gas_id, self.iso_id), f'model {self.__class__.__name__} with id={self.id}. The two different ways of getting the gas_vmr_idx must agree'

      atm, xmap1 = self.calculate(
         atm,
         atm_profile_type,
         atm_profile_idx,
         chunk
      )

      forward_model.AtmosphereX = atm
      xmap[self.state_vector_slice, ipar, 0:atm.NP] = xmap1

writing `calculate`

Finally we have to perform the actual calculation. All we have to do is work out the new profile values, and the functional derivatives. The interface for the self.calcualte(...) method has been effectively set for us in the previous section. Therefore we can go right ahead and define it.

Alter the calculate class method definition to be:

@classmethod
def calculate(
      cls,
      atm : "Atmosphere_0",
      atm_profile_type : AtmosphericProfileTypeEnum,
      atm_profile_idx : int | None,
      chunk : np.ndarray[[int],float],
   ):

find new profile values

Our piecewise function sets the profile values in chunks. If we have 2 chunks for example, we set one half of the profile to the first chunk’s value, and the second half to the second chunk’s value.

Add the following code to the calculate method:

# update profile values
temp = np.array(atm.VMR)
fidx = np.linspace(0,chunk.size,temp.size)
j=0
for i in range(chunk.size):
   while fidx[j] < i+1:
      temp[j, atm_profile_idx] = chunk[i]
      j += 1
atm.edit_VMR(temp)

find functional derivatives

For functional derivatives xmap of shape (nx, NVMR+2+NDUST, NP, NLOCATIONS) assume the state vector values are labelled by ‘K’, and the profile points are labelled by ‘J’. Then xmap[K, :, J, :] is the rate of change in the ‘J^th’ profile point w.r.t the ‘K^th’ state vector entry for each profile and location. We are restricting outselves to 1 location, and we are operating on 1 specific gas profile, so xmap[K,ipar,J] = d(P[J])/d(X[K]) where P[J] is the J^th profile point and X[K] is the K^th state vector entry.

Therefore as our piecewise function just sets profile values to the same value as the parameter, d(P[J])/d(X[K]) = 1 when J influences K and zero otherwise.

After we have found xmap we can return the results

Add the following code to the calculate method:

# find the functional derivatives
xmap = np.zeros((chunk.size,atm.VMR.size[0]),float)
j=0
for i in range(chunk.size):
   while fidx[j] < i+1:
      xmap[i, j] = 1
      j += 1

# return results
return atm, xmap

code listing for `calculate`

The altered code is below, I have omitted docstrings and comments etc. for clarity.

class PiecewiseGasVMR(PreRTModelBase):
   # CODE UNCHANGED FROM EARLIER #

   def __init__(
      # OMITTED CODE #

   @classmethod
   def calculate(
         cls,
         atm : "Atmosphere_0",
         atm_profile_type : AtmosphericProfileTypeEnum,
         atm_profile_idx : int | None,
         chunk : np.ndarray[['mparam'],float],
      ):
      # update profile values
      temp = np.array(atm.VMR)
      fidx = np.linspace(0,chunk.size,temp.size)
      j=0
      for i in range(chunk.size):
         while fidx[j] < i+1:
            temp[j, atm_profile_idx] = chunk[i]
            j += 1
      atm.edit_VMR(temp)

      # find the functional derivatives
      xmap = np.zeros((chunk.size,atm.VMR.size[0]),float)
      j=0
      for i in range(chunk.size):
         while fidx[j] < i+1:
            xmap[i, j] = 1
            j += 1

      # return results
      return atm, xmap

   @classmethod
   def from_apr_to_state_vector(
      # OMITTED CODE #

   def calculate_from_subprofretg(
      # OMITTED CODE #

Full final code listing

Below is the full code listing for this example model class. You can add it to the Models/PreRTModels.py file and run python -c 'from archnemesis.Models import Models; print(Models.info(10))' to see a summary of this model.

NOTE: This is an example model created as an learning aid. It has not been tested so the exact implementation may have some bugs, but the process of creation is correct. Whenever creating a new model, think about adding a test to check the new model performs as expected.

class PiecewiseGasVMR(PreRTModelBase):
   """
   Parameterises the gas volume mixing ratio as a piecewise function with `n_chunk` chunks.
   The chunk values are retrieved.
   """
   id : int = 10

   def __init__(
         self,
         i_state_vector_start : int,
         n_state_vector_entries : int,
         n_chunk : int,
         gas_id : int,
         iso_id : int,
      ):
      # initialise the parent class
      super().__init__(i_state_vector_start, n_state_vector_entries)

      # storing constants
      self.n_chunk = n_chunk
      self.gas_id = gas_id
      self.iso_id = iso_id

      # define the model parameters, specifically how they are stored in the state vector
      self.parameters : tuple[ModelParameter,...] = (
         ModelParameter(
            'chunk',
            slice(None),
            f'the {n_chunk} values of a piecewise function that define a volume mixing ratio for gas : {gas_id} isotope : {iso_id}',
            'RATIO'
         ), # NOTE: the comma here is required.
      )


   @classmethod
   def calculate(
         cls,
         atm : "Atmosphere_0",
         atm_profile_type : AtmosphericProfileTypeEnum,
         atm_profile_idx : int | None,
         chunk : np.ndarray[['mparam'],float],
      ):
      # update profile values
      temp = np.array(atm.VMR)
      fidx = np.linspace(0,chunk.size,temp.size)
      j=0
      for i in range(chunk.size):
         while fidx[j] < i+1:
            temp[j, atm_profile_idx] = chunk[i]
            j += 1
      atm.edit_VMR(temp)

      # find the functional derivatives
      xmap = np.zeros((chunk.size,atm.VMR.size[0]),float)
      j=0
      for i in range(chunk.size):
         while fidx[j] < i+1:
            xmap[i, j] = 1
            j += 1

      # return results
      return atm, xmap


   @classmethod
   def from_apr_to_state_vector(
         cls,
         variables : "Variables_0",
         f : IO,
         varident : np.ndarray[[3],int],
         varparam : np.ndarray[["mparam"],float],
         ix : int,
         lx : np.ndarray[["mx"],int],
         x0 : np.ndarray[["mx"],float],
         sx : np.ndarray[["mx","mx"],float],
         inum : np.ndarray[["mx"],int],
         npro : int,
         nlocations : int,
         runname : str,
         sxminfac : float,
      ) -> Self:

      # unpack values from VARIDENT
      gas_id = varident[0]
      assert gas_id > 0, f'model {cls.__name__} with id={cls.id} must have a gas ID as the first VARIDENT value'
      iso_id = varident[1]

      # read in apriori parameters
      # read number of chunks
      n_chunk = int(f.readline().strip())

      # ensure we have at least 2 profile points per chunk
      assert npro >= 2*n_chunk, f'{cls.__name__} id={cls.id} must have at most half as many chunks as there are points in a profile, have {n_chunk} vs {npro}'

      # create holder arrays
      chunk = np.zeros((n_chunk,),float)
      chunk_err = np.zeros((n_chunk,),float)

      # read value and error of each chunk
      for i in range(coeff.size):
         chunk[i], chunk_err[i] = tuple(map(float, f.readline().strip().split()))

      # Packing the state vector and covariance matrix
      ix_0 = ix # store the initial value
      for i in range(n_chunk):
         # store the log(chunk) values
         lx[ix] = 1
         inum[ix] = 1
         x0[ix] = np.log(chunk[i])

         # store the chunk variance, must use relative errors as we are storing log(value)
         sx[ix,ix] = (chunk_err[i]/chunk[i])**2

         ix += 1

      # return the constructed model instance
      return cls(ix_0, ix-ix_0, n_chunk, gas_id, iso_id)


   def calculate_from_subprofretg(
         self,
         forward_model : "ForwardModel_0",
         ix : int,
         ipar : int,
         ivar : int,
         xmap : np.ndarray,
      ) -> None:

      # unpack parameters from state vector
      chunk = self.get_parameter_values_from_state_vector(forward_model.Variables.XN, forward_model.Variables.LX)

      # get the profile type and index, and check the values
      atm = forward_model.AtmosphereX
      atm_profile_type, atm_profile_idx = atm.ipar_to_atm_profile_type(ipar)

      assert atm_profile_type == AtmosphericProfileTypeEnum.GAS_VOLUME_MIXING_RATIO, f'model {self.__class__.__name__} with id={self.id}. Only accepts gas volume mixing ratio profiles'

      assert atm_profile_idx == atm.locate_gas(self.gas_id, self.iso_id), f'model {self.__class__.__name__} with id={self.id}. The two different ways of getting the gas_vmr_idx must agree'

      atm, xmap1 = self.calculate(
         atm,
         atm_profile_type,
         atm_profile_idx,
         chunk
      )

      forward_model.AtmosphereX = atm
      xmap[self.state_vector_slice, ipar, 0:atm.NP] = xmap1

Introduction

Available models

Custom Model Class

General Steps

Example Custom Model Class

first definitions of the model class

defining the model scope

writing from_apr_to_state_vector

unpacking values from VARIDENT

reading apriori parameters

packing the state vector and the covariance matrix

construct the model class instance and return

code listing for from_apr_to_state_vector

writing __init__(...)

calling the parent class constructor

store data on the model instance

define model parameters

code listing for __init__

writing calculate_from_subprofretg

Unpack parameters and values

send to self.calculate(...)

store the results

code listing for calculate_from_subprofretg

writing calculate

find new profile values

find functional derivatives

code listing for calculate

Full final code listing

writing `from_apr_to_state_vector`

code listing for `from_apr_to_state_vector`

writing `init(...)`

code listing for `init`

writing `calculate_from_subprofretg`

send to `self.calculate(...)`

code listing for `calculate_from_subprofretg`

writing `calculate`

code listing for `calculate`