Adding graph formats¶
The format_translators
module stores all the classes
and functions to make translation between
graph formats. If you are planning to include a new format,
this is where you will need to start.
Below we firstly give a brief description of the module’s architecture and
then we show a practical example
for including a new input format and one for including an output format.
Translate overview¶
The module format_translators
contains a simple factory architecture, in which the subclasses of the abstract class
GraphFormatTranslator
implement the concrete translators to transform a graph object into
and from an Iron instance.
For each subclass the concrete implementation of at least one of two abstract methods
from_iron()
and
linchemin.cgu.graph_transformations.format_translators.GraphFormatTranslator.to_iron()
is developed.
The GraphFormatCatalog
class
is responsible to keep a registry of the implemented translators and
to instantiate the appropriate concrete class based on its name.
On the other hand, the conversion between data models is handled in the
data_model_converters module,
which is also composed of a simple factory structure.
The conversion occurs through the concrete subclasses of the
DataModelConverter
abstract class. The
SynGraph format is used as carrier of
the data model information and thus
each concrete factory must implement both
the
iron_to_syngraph()
and the
syngraph_to_iron() methods.
Also in this case,
a DataModelCatalog class
is used to register and call the appropriate concrete converters.
In the translate module, a series of handlers compose a
chain of responsibility the uses the Translators and Converters to enforce
a sequence of translations
from the selected input format to the output format. The steps of the sequence are:
Input format is translated to Iron
Iron is translated to SynGraph in the selected data model
SynGraph is translated to Iron
Iron is translated to the output format
Forcing the translation to pass through SynGraph ensures that the chemical information is handled
correctly, as Molecule and/or
ChemicalEquation instances are built while constructing
the SynGraph objects. Moreover, the conversion between data models (i.e., bipartite graph, monopartite
graph with only reactions or monopartite graph with only molecules) is handled exclusively by
SynGraph. This avoids the combinatorial explosion of possibilities to mix and match
data formats and data models.
Lastly, everything is wrapped by the facade function translator().
It takes the ‘name’ of the input format, the graph object in the input format,
the ‘name’ of the output format and the ‘name’ of the output data model and returns the graph
translated in the output format and in the selected data model.
Implementing a new input format¶
In order to include a new input format among those that LinChemIn can ‘read’, you
firstly need to create a new subclass of the abstract class
GraphFormatTranslator
in the format_translators module.
The new subclass should also be decorated with the @GraphFormatCatalog.register_format
decorator: it is to register
the new format among the available ones.
The decorator takes two arguments: the name that will be used
to select the format and a brief description that will appear in the helper functions.
@GraphFormatCatalog.register_format("new_input", "brief description")
class TranslatorNewInputFormat(GraphFormatTranslator):
""" Translator subclass to handle translations from NewInputFormat objects """
as_input = None
as_output = None
def from_iron(self, graph: Iron):
pass
def to_iron(self, route) -> Iron:
pass
What you are interested in registering the new translator as “input”, you will need to implement the
linchemin.cgu.graph_transformations.format_translators.GraphFormatTranslator.to_iron()
method, while the
from_iron()
can be left aside for the moment.
Now you need to take your time to develop the actual code that, starting from a graph object
in the format you are trying to add, returns an Iron instance. We recommend to add a ‘node_smiles’
key in the properties dictionary of the Iron nodes, so that the Iron object is suitable
to be translated into a SynGraph instance. Also, remember that the translators work with single graph
objects, not list of objects.
If you need more information regarding the Iron format, you can have a look at the
Iron description.
You should also set the as_input attribute of the subclass to implemented, so that
the new format will appear as available input format in the helper function.
When the code will be implemented you will have something similar to this:
@GraphFormatCatalog.register_format("new_input", "brief description")
class TranslatorNewInputFormat(GraphFormatTranslator):
""" Translator subclass to handle translations from NewInputFormat objects """
as_input = 'implemented'
as_output = None
def from_iron(self, graph: Iron):
pass
def to_iron(self, route) -> Iron:
# some super cool code
return iron_graph
All the available formats are stored in the _registered_graph_formats attributes of the
GraphFormatCatalog` class.
As previously mentioned, the factory can
self-register new options via the decorator.
That’s it, you are all done! Now your newly developed format is available to all the LinChemIn
functionalities that use the
translator() function. For example, you can use it to translate a
single route with the translator() function:
from linchemin.cgu.translate import translator
syngraph = translator('new_input', new_input_graph, 'syngraph', 'bipartite')
or you can work with a list of routes through the facade() function.
from linchemin.interfaces.facade import facade
routes, metadata = facade('translate', 'new_input', new_input_graph, 'syngraph', 'bipartite')
Implementing a new output format¶
The procedure to add a new output format is the same as the one described above,
with the only difference that you now need to implement the
from_iron()
method.
In this case, your code should take an Iron instance as input and,
after the appropriate transformations,
return a graph object in the new format.
@GraphFormatCatalog.register_format("new_output", "brief description")
class TranslatorNewOutputFormat(GraphFormatTranslator):
""" Translator to handle translations from NewOutputFormat objects """
as_input = None
as_output = 'implemented'
def from_iron(self, graph: Iron):
# some super cool code
return graph
def to_iron(self, route) -> Iron:
pass
Of course, if you want your format to be available as both input and output,
you will need to implement both methods and to set as 'implemented' both the
as_input and as_output attributes.