Network data

›Network files

This file type is designed to store any kind of unstructured network (such as Delaunay tessellations or, more generally, cell complexes). In DisPerSE, however, its usage is restricted to networks of simplices: it is mainly used to store ascending and descending manifolds (voids and walls, for instance) and persistence pairs as output by mse, or Delaunay tessellations as output by delaunay_nD (skeleton files can also be converted to networks using skelconv -to NDnet). Within network files, networks are represented by sets of vertices and cells of any dimension. An n-cell is a cell of dimension n, which, in the case of a simplicial network, is described by a set of n+1 vertex indices. In the case of a simplicial complex, only the highest-dimensional cells have to be given explicitly, but other types of cells may also be specified. Indeed, extended manifolds, for instance, are not described as complexes: an ascending 0-manifold in 3D is a set of tetrahedra (3-cells), triangles (representing ascending 1-manifolds on its boundary), segments (ascending 2-manifolds on its boundary) and vertices (ascending 3-manifolds / critical points). Note that additional information can also be associated with each type of cell (see below).
The base network format is NDnet, which is used internally, but this format may be converted to several other network file formats of varying complexity, adapted to different applications (see option -to in program netconv; a list of available formats is displayed when running the program without arguments).
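
The remark that only the highest-dimensional cells of a simplicial complex need to be stored rests on the fact that every lower-dimensional face of a simplex is obtained by dropping vertices from it. The following is a minimal sketch of that idea, not DisPerSE code: the helper name simplex_facet is hypothetical and vertex indices are assumed to fit in an unsigned int.

```c
#include <assert.h>

/* Hypothetical helper (not part of DisPerSE): writes into `facet` the n
 * vertex indices of the (n-1)-face of an n-simplex obtained by dropping the
 * vertex stored at position `skip`. `simplex` holds the n+1 vertex indices. */
static void simplex_facet(const unsigned int *simplex, int n, int skip,
                          unsigned int *facet)
{
    assert(n >= 1 && skip >= 0 && skip <= n);
    int out = 0;
    for (int i = 0; i <= n; i++) {
        if (i != skip)
            facet[out++] = simplex[i];
    }
}
```

Calling the helper with skip = 0, ..., n enumerates the n+1 facets of one simplex; applying it recursively yields all the implied lower-dimensional cells.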

Available formats:

-NDnet (Read / Write):
This is the format of the network files created or read by mse. It is a relatively complex binary format (it is actually more complex than needed, as it is designed to store generic non-simplicial networks) that contains all the information on the geometry and topology of unstructured networks as well as additional data associated with each type of cell.

-NDnet_ascii (Read / Write):
This ASCII format contains the same amount of information as NDnet files, but restricted to simplicial networks. It is easy to read and write so it may be used to write reasonably sized networks used as input for mse.

-PLY and PLY_ascii (Read binary only / Write):
This is a rather popular and simple binary or ASCII format that can be used as an interface with other software.

-vtk, vtk_ascii, vtu and vtu_ascii (Write only):
These are the binary and ASCII versions of the legacy and XML VTK formats, which are readable by several 3D visualization tools such as VisIt or ParaView.

Additional data: In addition to the topology and geometry of the network, arbitrary additional information may be associated with each type of cell. Run netconv filename -info for a list of the additional data available in the network file filename. By default, the names of the additional data added by mse are relatively explicit; they include (see also the additional data section of the skeleton file format description):

-field_value / log_field_value :
The value of the field and its logarithm. The tag field_value corresponds to the input function for mse, whose Morse-Smale complex is to be computed.

-cell:
The type and index of a cell in the original network (a prefix may be added). The value is a double precision floating point number whose integer part is the index of the cell and whose decimal part is its type. For instance, the 156th vertex (i.e. 0-cell) in the cell complex is represented as 156.0, while the 123rd tetrahedron is 123.3. Note that the index of a 0-cell corresponds to the index of the pixel / vertex in the original network from which the skeleton was computed.

-type:
This usually corresponds to the critical index of a critical point (for instance, for vertices of persistence pair networks), or to the type of a persistence pair (i.e. the minimum critical index of the critical points in the pair, for segments of persistence pair networks).

-index:
Usually the index of a vertex (e.g. for persistence pairs, additional segment data tagged up_index and down_index correspond to the indices of the vertices with lowest and highest critical index in the persistence pair respectively).

-persistence / persistence_ratio / persistence_nsigmas :
The persistence (expressed as a difference, a ratio or a number of sigmas) of the persistence pair containing the corresponding critical point. A negative or zero value indicates that persistence is not relevant to this particular cell.

-parent_index / parent_log_index (vertices only):
For persistence pair networks, for each vertex representing an extremum (i.e. a minimum or a maximum), the index of the vertex that corresponds to the other extremum into which it would be merged if its persistence pair were cancelled (indices start at 0). This can be used to reconstruct the tree of the hierarchy of maxima and minima. The value is -1 for critical points that are not extrema. The difference between the two versions is that the second (parent_log_index) is the hierarchy computed from the logarithm of the field. The second version is useful only for discrete point samples whose MS-complex is obtained from the Delaunay tessellation computed with delaunay_nD. In practice, parent_log_index can be used whenever persistence pairs are cancelled in order of increasing ratio (option -nsig in mse), and parent_index whenever persistence pairs are cancelled in order of increasing difference (option -cut in mse).

-source_cell / source_index:
For networks representing manifolds (voids, walls, ... obtained with option -dumpManifolds of mse), this represents, for each simplex, the critical point from which the manifold it belongs to originates (for instance, the minimum corresponding to a void, or the saddle point corresponding to a filament). In source_cell, the critical point is represented by its cell in the initial cell complex (see cell above), while source_index gives the index of the critical point in the skeleton file or persistence pair network obtained with mse (options -dumpArcs and -ppairs). See also the tutorial section.
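
Two of the fields above encode structure inside plain double values: cell packs an index and a type into one number, and parent_index stores links between vertices. The sketch below shows one way to decode them. Both helper names are hypothetical and not part of the DisPerSE API. It assumes that the cell type is a single decimal digit, and that a hierarchy chain ends at an entry that is negative or points to itself (the text above only states that non-extrema hold -1).

```c
#include <assert.h>

/* Hypothetical helper: splits a `cell` value such as 123.3 into its index
 * (123) and its type (3). Assumes the type is a single decimal digit. */
static void decode_cell(double value, long *index, int *type)
{
    assert(value >= 0.0);
    *index = (long)value;                                  /* integer part */
    /* first decimal digit, rounded to absorb floating point error */
    *type = (int)((value - (double)*index) * 10.0 + 0.5);
}

/* Hypothetical helper: follows parent_index links starting at vertex v and
 * returns the last vertex reached. The chain is assumed (not documented) to
 * end at an entry that is negative, out of range, or equal to its own index.
 * The loop is bounded by nvertex steps to guard against cycles. */
static long hierarchy_root(const double *parent_index, long nvertex, long v)
{
    assert(v >= 0 && v < nvertex);
    for (long step = 0; step < nvertex; step++) {
        long p = (long)parent_index[v];
        if (p < 0 || p >= nvertex || p == v)
            break;
        v = p;
    }
    return v;
}
```

Here parent_index is the array of values read from the additional data field of the same name; a vertex that is not an extremum is simply returned unchanged.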

›NDnet format

This is the native binary format of DisPerSE. Functions for reading and writing the NDnet format in C can be found in the file ${DISPERSE_SRC}/src/C/NDnetwork.c (see functions Load_NDnetwork and Save_NDnetwork). The format may seem relatively complex, but most of it is actually optional and not used in DisPerSE (only simplicial complexes are used in DisPerSE). To create DisPerSE input files, it is only necessary to define the highest-dimensional n-simplices as a list of (n+1) vertices (see also function CreateNetwork).

Note: The scalar function whose MS-complex is computed by mse can be stored as an additional data field named 'field_value' (case sensitive).
Warning: in the following, for legacy reasons, the terms n-face and n-cell are used interchangeably to designate cells of dimension n (which are always simplices in DisPerSE).

When using the C functions from DisPerSE, data is loaded into the following C structure, which is close to the actual structure of the file (see file ${DISPERSE_SRC}/src/C/NDnetwork.h):

typedef struct
 {
   int type; // the cell-type
   char name[255];  // name of the field
   double *data;  // value for each of the nfaces[n] n-cells
} NDnetwork_Data;

// NDnetwork_SupData is not used in disperse ...
typedef struct
{
   int type; 
   char name[255];
   int datasize;
   char datatype[255]; // a string identifying how data should be cast
   void *data;
} NDnetwork_SupData;

typedef struct 
{
   char comment[80];
   int periodicity;
   int ndims; // the number of spatial dimensions
   int ndims_net; // number of dimension of the network itself (e.g. 2 for a sphere embedded in 3D)
   int isSimpComplex;  // 1 if network is a simplicial complex (always true in disperse)
   double *x0;  // origin of the bounding box
   double *delta;  // size of the bounding box
   int indexSize; // size of NDNET_UINT type in Bytes
   int cumIndexSize; // size of NDNET_IDCUMT type in Bytes
   char dummy[160-4*2]; // dummy data reserved for future extensions
 
   NDNET_UINT nvertex;  // total number of vertices
   float *v_coord; // vertex coordinates (X_0,Y_0,Z_0,X_1,Y_1,...,Z_nvertex-1)
   
   NDNET_UINT *nfaces; // number of cells of a given type t is given by nfaces[t]
 
   int *haveVertexFromFace; // haveVertexFromFace[n] is 1 if we have an explicit definition of the n-cells (at least one type of cell must be defined).
   NDNET_IDCUMT **f_numVertexIndexCum; // cumulative number of vertices in the t-cells, NULL when cells are simplexes (isSimpComplex=1)
   NDNET_UINT **f_vertexIndex; // list of vertices defining the n-cells is stored in f_vertexIndex[n], all vertices being enumerated for each cell (the indices of the vertices in the kth n-cell start at f_vertexIndex[n][(n+1)*k] )
   // see also macro  NUM_VERTEX_IN_FACE(net,type,face) and VERTEX_IN_FACE(net,type,face)
 
   // This may be computed internally within DisPerSE but does not need to be defined explicitly
   int *haveFaceFromVertex; // haveFaceFromVertex[n] is 1 if we have an explicit list of all the n-cells that contain each vertex (used to navigate within the network)
   NDNET_IDCUMT **v_numFaceIndexCum; // cumulative number of t-cells a vertex v belongs to
   NDNET_UINT **v_faceIndex; // indices of the t-cells in the co-boundary of v ( the list of n-cells of vertex k starts at v_faceIndex[n][net->v_numFaceIndexCum[n][k]] and ends at v_faceIndex[n][net->v_numFaceIndexCum[n][k+1]] )
   // see also macro  NUM_FACE_IN_VERTEX(net,type,vertex) and  FACE_IN_VERTEX(net,type,vertex)
   
   // This can become extremely memory heavy ... NOT used in DisPerSE
   int **haveFaceFromFace; // haveFaceFromFace[k][n] is 1 if we have an explicit list of all the n-cells that have a boundary/co-boundary relation with each k-cell (used to navigate within the network)
   NDNET_IDCUMT ***f_numFaceIndexCum; //  cumulative number of n-cells having a boundary / co-boundary relation with each k-cell: f_numFaceIndexCum[k][n]
   NDNET_UINT ***f_faceIndex; // indices of the cells (similar to v_faceIndex)
   // see also macro NUM_FACE_IN_FACE(net,ref_type,ref_face,type) and FACE_IN_FACE(net,ref_type,ref_face,type)
   
   int haveVFlags;  // do we have flags associated to each vertex ?
   int *haveFFlags;  // do we have flags associated to each n-cell ?
   unsigned char *v_flag; // nvertex flag values (1 for each vertex) or NULL 
   unsigned char **f_flag; // nfaces[n] flag values (1 of each n-cell) or NULL

   int ndata; // number of additional data fields.
   NDnetwork_Data *data; // array of all additional data fields (ndata in total)
   
   int nsupData;
   NDnetwork_SupData *supData;

} NDnetwork;
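
The comments on f_vertexIndex, v_numFaceIndexCum and v_faceIndex describe a flat array plus a cumulative offset table. As an illustration of that layout, here is a self-contained sketch that builds the vertex-to-cell lists for simplices of one dimension from their vertex lists. It does not use the DisPerSE headers: the two typedefs are stand-ins whose widths are assumptions (the real ones are given by indexSize and cumIndexSize), and build_coboundary is a hypothetical helper.

```c
#include <assert.h>
#include <stdlib.h>

/* Stand-ins for the DisPerSE integer types (widths assumed for this sketch). */
typedef unsigned int NDNET_UINT;
typedef unsigned long long NDNET_IDCUMT;

/* Hypothetical helper. vertexIndex uses the layout of f_vertexIndex[n]:
 * the vertices of the kth n-simplex start at vertexIndex[(n+1)*k].
 * On return, cum (nvertex+1 entries, caller-allocated) plays the role of
 * v_numFaceIndexCum[n] and the returned malloc'ed array that of
 * v_faceIndex[n]: the cells of vertex v are result[cum[v]] up to, but not
 * including, result[cum[v+1]]. Returns NULL on allocation failure. */
static NDNET_UINT *build_coboundary(const NDNET_UINT *vertexIndex, int n,
                                    NDNET_UINT ncells, NDNET_UINT nvertex,
                                    NDNET_IDCUMT *cum)
{
    NDNET_IDCUMT total = (NDNET_IDCUMT)(n + 1) * ncells;

    /* Count the cells touching each vertex, shifted by one slot. */
    for (NDNET_UINT v = 0; v <= nvertex; v++)
        cum[v] = 0;
    for (NDNET_IDCUMT i = 0; i < total; i++) {
        assert(vertexIndex[i] < nvertex);
        cum[vertexIndex[i] + 1]++;
    }
    /* Prefix sum turns the counts into start offsets. */
    for (NDNET_UINT v = 0; v < nvertex; v++)
        cum[v + 1] += cum[v];

    NDNET_UINT *faceIndex = malloc((size_t)(total ? total : 1) * sizeof *faceIndex);
    NDNET_IDCUMT *next = malloc(((size_t)nvertex + 1) * sizeof *next);
    if (faceIndex == NULL || next == NULL) {
        free(faceIndex);
        free(next);
        return NULL;
    }
    /* Fill each vertex's slice with the indices of the cells that contain it. */
    for (NDNET_UINT v = 0; v < nvertex; v++)
        next[v] = cum[v];
    for (NDNET_UINT k = 0; k < ncells; k++) {
        for (int j = 0; j <= n; j++) {
            NDNET_UINT v = vertexIndex[(NDNET_IDCUMT)(n + 1) * k + j];
            faceIndex[next[v]++] = k;
        }
    }
    free(next);
    return faceIndex;
}
```

For two triangles (0,1,2) and (1,3,2) on four vertices, the offsets come out as 0, 1, 3, 5, 6, so vertex 1 belongs to the two cells stored at positions 1 and 2 of the returned array.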

The NDnet binary format is organized as follows (blocks are delimited by dummy variables indicating the size of the blocks for FORTRAN compatibility, but they are ignored in C):

NDnet binary format
field	type	size	comment
dummy	int(4B)	1	for FORTRAN compatibility
tag	char(1B)	16	identifies the file type. Value : "NDNETWORK"
dummy	int(4B)	1
dummy	int(4B)	1
ndims	int(4B)	1	number of dimensions of the embedding space
ndims_net	int(4B)	1	number of dimensions spanned by the network (=ndims by default)
dummy	int(4B)	1
dummy	int(4B)	1
comment	char(1B)	80	a comment on the file (string)
periodicity	int(4B)	1	0=non periodic; if the p-th bit is set, boundaries are periodic along dimension p
isSimpComplex	int(4B)	1	1 if network is made of simplices (must be 1 for DisPerSE)
x0	double(8B)	ndims	origin of bounding box
delta	double(8B)	ndims	size of bounding box
index_size	int(4B)	1	size of NDNET_UINT integer format in Bytes
cumindex_size	int(4B)	1	size of NDNET_IDCUMT integer format in Bytes
dummy_ext	char(1B)	152	dummy data reserved for future extensions
nvertex	NDNET_UINT	1	number of vertices
dummy	int(4B)	1
dummy	int(4B)	1
v_coords	float(4B)	ndims×nvertex	coordinates of the vertices [X0,Y0, ...]
dummy	int(4B)	1
dummy	int(4B)	1
nfaces	NDNET_UINT	ndims+1	number of cells of each type (N0,N1,...)
dummy	int(4B)	1
dummy	int(4B)	1
haveVertexFromFace	int(4B)	ndims+1	are n-cells explicitly defined ? (0=no, 1=yes)
dummy	int(4B)	1
*	*	*	next 3 lines are repeated for each of the (ndims+1) possible cell types, only if haveVertexFromFace[n] is true.
dummy	int(4B)	1
f_vertexIndex[n]	NDNET_UINT	(n+1)×nfaces[n]	list of (n+1) vertex indices for each n-cell
dummy	int(4B)	1
dummy	int(4B)	1
haveFaceFromVertex	int(4B)	ndims+1	are n-cells in the co-boundary of each vertex explicitly defined ? (0=no, 1=yes)
dummy	int(4B)	1
*	*	*	next 6 lines are repeated for each of the (ndims+1) possible cell types, only if haveFaceFromVertex[n] is true.
dummy	int(4B)	1
numFaceIndexCum[n]	NDNET_IDCUMT	nvertex+1	cumulative count of n-cells on vertices co-boundary
dummy	int(4B)	1
dummy	int(4B)	1
v_faceIndex[n]	NDNET_UINT	numFaceIndexCum[n][nvertex]	list of n-cells on the co-boundary of each vertex (vertex i has numFaceIndexCum[n][i+1]-numFaceIndexCum[n][i] of them)
dummy	int(4B)	1
dummy	int(4B)	1
haveFaceFromFace	int(4B)	(ndims+1)^2	are n-cells in the co-boundary of each k-cell explicitly defined ? (0=no, 1=yes)
dummy	int(4B)	1
*	*	*	This section describes boundary relation between n-cells and k-cells. It is usually empty in DisPerSE (see NDnetwork.c for details) so SKIP IT :)
dummy	int(4B)	1
haveVFlags	int(4B)	1	1 if vertex flags are defined
dummy	int(4B)	1
*	*	*	next 3 lines are skipped if haveVFlags=0
dummy	int(4B)	1
v_flag	uchar(1B)	nvertex	value of the flags for vertices
dummy	int(4B)	1
dummy	int(4B)	1
haveFFlags	int(4B)	ndims+1	1 if flags are defined for n-cells
dummy	int(4B)	1
*	*	*	next 3 lines are repeated for each cell type n such that haveFFlags[n]=1
dummy	int(4B)	1
f_flag[n]	uchar(1B)	nfaces[n]	value of the flags for n-cells
dummy	int(4B)	1
dummy	int(4B)	1
ndata	int(4B)	1	total number of additional fields
dummy	int(4B)	1
*	*	*	next 7 lines are repeated for each additional field (×ndata)
dummy	int(4B)	1
type	int(4B)	1	the type of cells (0=vertex, n = n-simplex)
name	char(1B)	255	name of the supplementary data
dummy	int(4B)	1
dummy	int(4B)	1
data	double(8B)	N	data associated to cells or vertices. N is nfaces[n] or nvertex depending on the type value.
dummy	int(4B)	1
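
To make the block structure of the table concrete, the following sketch reads the leading part of an NDnet file, from the tag down to nvertex, skipping the 4-byte dummy delimiters. It is an illustration written from the table above, not the code of Load_NDnetwork. The type NDnetHeader, the functions and MAX_DIMS are hypothetical names, and the file is assumed to use the byte order of the machine running the code.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define MAX_DIMS 16 /* arbitrary limit chosen for this sketch */

/* Hypothetical container for the leading header fields (not a DisPerSE type). */
typedef struct {
    char tag[17], comment[81]; /* NUL-terminated copies */
    int32_t ndims, ndims_net, periodicity, isSimpComplex;
    double x0[MAX_DIMS], delta[MAX_DIMS];
    int32_t index_size, cumindex_size;
    uint64_t nvertex;
} NDnetHeader;

/* Reads and discards one 4-byte FORTRAN block delimiter. */
static int skip_dummy(FILE *f)
{
    int32_t d;
    return fread(&d, sizeof d, 1, f) == 1;
}

/* Reads one 4-byte integer field. */
static int read_i32(FILE *f, int32_t *v)
{
    return fread(v, sizeof *v, 1, f) == 1;
}

/* Reads the header up to and including nvertex; returns 1 on success. */
static int read_ndnet_header(FILE *f, NDnetHeader *h)
{
    char ext[152]; /* dummy_ext, ignored */
    assert(f != NULL && h != NULL);
    memset(h, 0, sizeof *h);
    /* Block 1: file tag. */
    if (!skip_dummy(f) || fread(h->tag, 1, 16, f) != 16 || !skip_dummy(f))
        return 0;
    if (strncmp(h->tag, "NDNETWORK", 9) != 0)
        return 0;
    /* Block 2: dimensions. */
    if (!skip_dummy(f) || !read_i32(f, &h->ndims) ||
        !read_i32(f, &h->ndims_net) || !skip_dummy(f))
        return 0;
    if (h->ndims < 1 || h->ndims > MAX_DIMS)
        return 0;
    size_t nd = (size_t)h->ndims;
    /* Block 3: comment, periodicity, bounding box, integer sizes, nvertex. */
    if (!skip_dummy(f) || fread(h->comment, 1, 80, f) != 80 ||
        !read_i32(f, &h->periodicity) || !read_i32(f, &h->isSimpComplex) ||
        fread(h->x0, sizeof(double), nd, f) != nd ||
        fread(h->delta, sizeof(double), nd, f) != nd ||
        !read_i32(f, &h->index_size) || !read_i32(f, &h->cumindex_size) ||
        fread(ext, 1, sizeof ext, f) != sizeof ext)
        return 0;
    /* nvertex is stored with index_size bytes. */
    if (h->index_size == 4) {
        uint32_t n;
        if (fread(&n, sizeof n, 1, f) != 1)
            return 0;
        h->nvertex = n;
    } else if (h->index_size == 8) {
        if (fread(&h->nvertex, sizeof h->nvertex, 1, f) != 1)
            return 0;
    } else {
        return 0;
    }
    return skip_dummy(f);
}

/* Returns 1 if boundaries are periodic along dimension p (bit p is set). */
static int is_periodic(const NDnetHeader *h, int p)
{
    return (h->periodicity >> p) & 1;
}
```

The remaining blocks (v_coords, nfaces, and so on) follow the same pattern: skip a dummy, read the payload with the size given in the table, skip the closing dummy.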

›NDnet_ascii format

This ASCII format is a simpler version of the NDnet format, designed to be fully compatible but restricted to simplicial networks. It is easy to read and write and should probably be used for reasonably sized data sets.

Note: The scalar function whose MS-complex is computed by mse can be stored as an additional data field named 'field_value' (case sensitive).

NDnet_ascii format

ANDNET	header
ndims	the number of dimensions
#comments go here	OPTIONAL: should start with '#' if present (the first 80 characters are read and stored).
BBOX [x0_1 .. x0_d] [delta_1 .. delta_d]	OPTIONAL: the bounding box, defined by the 'ndims' coordinates of the origin 'x0' and extent 'delta'.
nv	number of vertices
vx[0] vy[0] ...	the ndims coordinates of the first vertex
...	One line for each vertex
Simplices definition (0-cells up to ndims-cells). One block of the following three lines should be added for each type of explicitly defined simplex. Note that the highest-dimensional cells alone are sufficient to define a complex.
T N	network has N T-simplices (each T-simplex has (T+1) vertices).
i[0] j[0] ...	the T+1 indices (start at 0) of the vertices of the first T-simplex.
...	one line for each of the N T-simplices
[ADDITIONAL_DATA]	OPTIONAL: indicate the beginning of the additional data section
additional_data_name_1	name of the additional data (e.g. field_value for mse input files)
T	type of simplex it is associated to (T-simplex, 0 means vertices)
val[0]	value for the first T-simplex
...	One line for each T-simplex
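
As an illustration of the layout described above, here is a minimal sketch of a writer for one type of simplex and an optional field_value attached to the vertices. The function name write_ndnet_ascii and its parameters are hypothetical. The optional comment and BBOX lines are omitted, and values are printed with 17 significant digits so that doubles survive a round trip.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper (not part of DisPerSE): writes nv vertices with ndims
 * coordinates each, ncells T-simplices given as T+1 vertex indices each, and,
 * if field_value is not NULL, one field_value per vertex.
 * Returns 1 on success, 0 if a write error occurred. */
static int write_ndnet_ascii(FILE *f, int ndims, int nv, const double *coords,
                             int T, int ncells, const unsigned int *cells,
                             const double *field_value)
{
    assert(f != NULL && ndims >= 1 && T >= 0 && T <= ndims);

    /* Header, number of dimensions, number of vertices. */
    fprintf(f, "ANDNET\n%d\n%d\n", ndims, nv);
    for (int v = 0; v < nv; v++) {
        for (int d = 0; d < ndims; d++)
            fprintf(f, "%s%.17g", d ? " " : "", coords[v * ndims + d]);
        fputc('\n', f);
    }

    /* One block for the T-simplices: "T N", then one line per simplex. */
    fprintf(f, "%d %d\n", T, ncells);
    for (int k = 0; k < ncells; k++) {
        for (int j = 0; j <= T; j++)
            fprintf(f, "%s%u", j ? " " : "", cells[k * (T + 1) + j]);
        fputc('\n', f);
    }

    /* Optional additional data attached to the vertices (type 0). */
    if (field_value != NULL) {
        fprintf(f, "[ADDITIONAL_DATA]\nfield_value\n0\n");
        for (int v = 0; v < nv; v++)
            fprintf(f, "%.17g\n", field_value[v]);
    }
    return ferror(f) ? 0 : 1;
}
```

A unit square split into two triangles would be written with ndims = 2, nv = 4, T = 2 and ncells = 2; the resulting file can then be inspected with netconv filename -info.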

›ply format

The PLY format is a relatively generic file format designed to store three-dimensional data from 3D scanners, with the possibility of associating properties with the polygons. Information on this format can be found on Wikipedia (see also the External links section). A very efficient C library for reading and writing PLY files in ASCII or binary format is RPly (DisPerSE uses it for PLY file I/O, see files NDnet_PLY_IO.c and NDnet_PLY_IO.h).

A typical header for a PLY file readable by netconv or mse is as follows:

ply
format ascii 1.0
element bbox 1
property list uchar double x0 
property list uchar double delta
element vertex 77595
property float x
property float y
property float z
property double field_value
element face 482867
property list uchar uint vertex_indices
end_header

-format may be ASCII or little / big endian binary

-bbox element is used to define a bounding box if available (x0 is its origin and delta its extent)

-coordinates of the vertices are given as vertex properties labeled x, y, and z or x0, x1, ... (the number of these properties gives the dimension of the embedding space)

-faces are defined by the property vertex_indices, each element corresponding to a list of vertices. A cell is always supposed to be a simplex, so the number of vertices determines the dimension spanned by the complex (a 2D complex may be embedded in a 3D space).

-additional properties may be defined for cells and vertices. In particular, a vertex property labeled field_value would be used as input function in mse.
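
The sketch below writes a small ASCII PLY file whose header follows the example above, for triangles embedded in 3D with a field_value on each vertex. It is written directly with stdio rather than with RPly, the function name write_ply_ascii is hypothetical, and the optional bbox element is left out. The body lines follow the general PLY convention (one line per element, list properties prefixed by their length).

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper (not part of DisPerSE): writes nv vertices (x, y, z and
 * field_value) and nfaces triangles given as 3 vertex indices each.
 * Returns 1 on success, 0 if a write error occurred. */
static int write_ply_ascii(FILE *f, int nv, const float *xyz,
                           const double *field_value,
                           int nfaces, const unsigned int *tri)
{
    assert(f != NULL && nv >= 0 && nfaces >= 0);

    /* Header: same properties as in the example header, without bbox. */
    fprintf(f, "ply\nformat ascii 1.0\n");
    fprintf(f, "element vertex %d\n", nv);
    fprintf(f, "property float x\nproperty float y\nproperty float z\n");
    fprintf(f, "property double field_value\n");
    fprintf(f, "element face %d\n", nfaces);
    fprintf(f, "property list uchar uint vertex_indices\nend_header\n");

    /* One line per vertex: coordinates followed by the field value. */
    for (int v = 0; v < nv; v++)
        fprintf(f, "%g %g %g %.17g\n", (double)xyz[3 * v],
                (double)xyz[3 * v + 1], (double)xyz[3 * v + 2],
                field_value[v]);

    /* One line per face: the list length (3) followed by the indices. */
    for (int k = 0; k < nfaces; k++)
        fprintf(f, "3 %u %u %u\n", tri[3 * k], tri[3 * k + 1], tri[3 * k + 2]);

    return ferror(f) ? 0 : 1;
}
```

Writing tetrahedra instead would only change the face lines, which would then start with 4 and list four vertex indices.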

›vtk network format

VTK formats are developed for the Visualization Toolkit library (VTK) and can be used for 3D visualization with software such as VisIt or ParaView. Networks are stored as VTK unstructured network data and can be output in four different VTK formats:

-vtk: the legacy format (binary)
-vtk_ascii: ASCII version of the vtk format
-vtu: a more recently developed XML version of the vtk format
-vtu_ascii: ASCII version of the vtu format

The specifications for these formats can be found in the VTK file formats documentation.
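
For readers who want to see what such a file contains, here is a minimal sketch that writes a simplicial network as a legacy ASCII VTK unstructured grid. It is not the code used by netconv, and the output of netconv may differ in its details. The function names are hypothetical, the title line "network" is arbitrary, and coordinates are padded with zeros because the legacy format always stores three coordinates per point.

```c
#include <assert.h>
#include <stdio.h>

/* VTK legacy cell type codes for simplices of dimension n:
 * 1 = VTK_VERTEX, 3 = VTK_LINE, 5 = VTK_TRIANGLE, 10 = VTK_TETRA. */
static int vtk_simplex_type(int n)
{
    static const int types[4] = {1, 3, 5, 10};
    return (n >= 0 && n <= 3) ? types[n] : -1;
}

/* Hypothetical helper (not part of DisPerSE): writes nv vertices with ndims
 * (1 to 3) coordinates each and ncells n-simplices given as n+1 vertex
 * indices each. Returns 1 on success, 0 on error. */
static int write_vtk_ascii(FILE *f, int ndims, int nv, const float *coords,
                           int n, int ncells, const unsigned int *cells)
{
    int type = vtk_simplex_type(n);
    assert(f != NULL && ndims >= 1 && ndims <= 3);
    if (type < 0)
        return 0;

    fprintf(f, "# vtk DataFile Version 3.0\nnetwork\nASCII\n");
    fprintf(f, "DATASET UNSTRUCTURED_GRID\nPOINTS %d float\n", nv);
    for (int v = 0; v < nv; v++) {
        for (int d = 0; d < 3; d++) {
            /* Missing coordinates are written as 0. */
            double x = (d < ndims) ? (double)coords[v * ndims + d] : 0.0;
            fprintf(f, "%s%g", d ? " " : "", x);
        }
        fputc('\n', f);
    }

    /* CELLS: number of cells, then total count of integers that follow. */
    fprintf(f, "CELLS %d %d\n", ncells, ncells * (n + 2));
    for (int k = 0; k < ncells; k++) {
        fprintf(f, "%d", n + 1);
        for (int j = 0; j <= n; j++)
            fprintf(f, " %u", cells[k * (n + 1) + j]);
        fputc('\n', f);
    }

    /* CELL_TYPES: one type code per cell. */
    fprintf(f, "CELL_TYPES %d\n", ncells);
    for (int k = 0; k < ncells; k++)
        fprintf(f, "%d\n", type);

    return ferror(f) ? 0 : 1;
}
```

A file produced this way can be opened directly in ParaView or VisIt to check the geometry.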