Developer information
This section collects information for developers. It will be extended in the future.
Reading output formats
While TomoTwin writes files with various extensions (.temb, .tmap, .tloc, .tumap), they are basically all pickled pandas dataframes.
They can all be read by:
import pandas as pd
df = pd.read_pickle("path/to/a/tomotwin/output/file")
If you modify a dataframe, please also check its df.attrs dictionary and copy it to the modified dataframe if necessary. It contains important metadata that is used by TomoTwin.
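The following is a minimal sketch of one way to keep the metadata when filtering a dataframe. The helper name filter_keep_attrs and the column name "score" in the usage note are hypothetical and not part of TomoTwin; only pd.read_pickle, to_pickle and df.attrs are standard pandas features.

```python
import copy

import pandas as pd


def filter_keep_attrs(df: pd.DataFrame, mask) -> pd.DataFrame:
    """Return the rows of df selected by mask, with df.attrs copied over.

    Hypothetical helper for illustration; it is not part of TomoTwin.
    """
    out = df.loc[mask].copy()
    # Deep-copy so later changes to out.attrs do not leak back into df.attrs.
    out.attrs = copy.deepcopy(df.attrs)
    return out
```

Assuming a file with a numeric column named "score" (an assumed name, check df.columns for the real ones), it could be used like this: read the file with df = pd.read_pickle("path/to/a/tomotwin/output/file"), call filtered = filter_keep_attrs(df, df["score"] > 0.5), and write the result back with filtered.to_pickle("path/to/filtered/file").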
Implementing new architectures
Adding new CNN architectures is straightforward in TomoTwin.
1. Add a class for your network in modules/networks/ and implement the interface defined by modules/networks/torchmodel.py.
2. Add your new network to the network_identifier_map dictionary in the modules/networks/networkmanager.py module.
3. Create a network configuration file like resources/configs/config_siamese.json. The network_config entry should match the __init__ method of your new network.
Now you are in principle set to train your network (see How to train TomoTwin).
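To illustrate what "the network_config entry should match the __init__ method" means in practice, here is a small self-contained sketch. It does not use TomoTwin's code: MyNet, its parameters, the "identifier" key and build_network are all hypothetical stand-ins, and the dictionary below only mimics the role of network_identifier_map. A real network class would additionally implement the interface from modules/networks/torchmodel.py.

```python
import inspect


class MyNet:
    """Hypothetical network class used only for this illustration."""

    def __init__(self, output_channels: int, dropout: float = 0.2):
        self.output_channels = output_channels
        self.dropout = dropout


# Stand-in for the network_identifier_map dictionary (assumed layout).
network_identifier_map = {"mynet": MyNet}


def build_network(config: dict):
    """Look up the class by identifier and construct it from network_config."""
    cls = network_identifier_map[config["identifier"].lower()]
    kwargs = config["network_config"]
    # bind() raises TypeError if a key is not an __init__ parameter
    # or if a required parameter is missing from the configuration.
    inspect.signature(cls).bind(**kwargs)
    return cls(**kwargs)
```

With this sketch, a configuration such as {"identifier": "MyNet", "network_config": {"output_channels": 32}} constructs the network, while a misspelled key in network_config fails early with a TypeError instead of surfacing later during training.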
How to train TomoTwin
Here we describe how to train the SiameseNet (a misnomer, as it is actually a triplet network). Hardware-wise, 12 GB of GPU memory should be enough.
1. Download training and validation data
The training and validation sets can be found here:
https://zenodo.org/record/6637456
Download and untar the training and validation data.
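If you prefer to unpack the archives from Python rather than with the tar command, a minimal sketch could look like the following. The function name untar_all and the directory layout are assumptions; the actual archive names on Zenodo are not listed here, so the sketch simply extracts every tar archive it finds in a directory.

```python
import tarfile
from pathlib import Path


def untar_all(archive_dir: str, out_dir: str) -> list:
    """Extract every *.tar* archive in archive_dir into out_dir.

    Hypothetical helper for illustration; returns the archive names processed.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    extracted = []
    for archive in sorted(Path(archive_dir).glob("*.tar*")):
        # Mode "r:*" lets tarfile detect gzip/bzip2/xz compression itself.
        with tarfile.open(archive, "r:*") as tar:
            tar.extractall(out)
        extracted.append(archive.name)
    return extracted
```

Only extract archives from sources you trust, as tar files can contain paths that point outside the target directory.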
2. Download siamese network config
You can find the configuration file here:
https://github.com/MPI-Dortmund/tomotwin-cryoet/blob/main/resources/configs/config_siamese.json
3. Start the training
To train on one GPU for 300 epochs, run:
CUDA_VISIBLE_DEVICES=0 tomotwin_train.py -v path/train/volumes/ --validvolumes path/valid/volumes/ -o out_train -nc path/to/siamese_network.json --epochs 300
How to evaluate TomoTwin
Will follow soon :-)