SiGN-Proc Manual
Introduction
SiGN-Proc is a CUI-based (command-line) tool for processing gene network files: converting file formats,
extracting subnetworks, coloring specified nodes, and so on.
In SiGN-Proc, you process networks by specifying filters.
Many filters are available. For example, one filter reads a network from a file and another extracts a subnetwork.
You can specify multiple filters, in which case the network processed by one filter is passed to the next filter.
SiGN-Proc for Linux x86-64 is available on the Download page of SiGN-BN.
Note: This document is currently under development.
Synopsis
Direct execution on the computation node (interactive job) of the HGC supercomputer system
~tamada/sign/signproc [ options ] [ filters... ]
Execution as a Grid Engine job on the HGC supercomputer system
qsub [ GE options ] ~tamada/sign/signbn-hcbs.sh --bin signproc [ options ] [ filters ]
GE options
Options
Example
Here are some examples. Specify the following as filters above when you execute SiGN-Proc.
File format conversion
--read type=sgn3,file=network.sgn3 --output type=csml,file=network.csml
This example converts the network file network.sgn3 written in the SGN3 format
into the CSML format file network.csml. See File Formats
for details of the available file formats.
Subnetwork extraction
--read type=csml,file=network.csml --subnet node=IL6,dist=1 --output type=csml,file=subnet.csml
This extracts IL6 and its parents and children from network.csml.
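When SiGN-Proc is driven from a script, the filter chain can be assembled programmatically. The Python sketch below only builds the argument list for the subnetwork example above and executes nothing. The helper name build_command and the assumption that the executable is reachable as signproc are illustrative and not part of SiGN-Proc.

```python
def build_command(filters, executable="signproc"):
    """Return an argv list for a chain of (filter_name, {key: value}) pairs.

    Filters are emitted in the given order, because each filter passes
    its network to the next one.
    """
    argv = [executable]
    for name, args in filters:
        argv.append("--" + name)
        # Arguments are written as key=value pairs joined by commas.
        argv.append(",".join(f"{key}={value}" for key, value in args.items()))
    return argv


subnet_chain = [
    ("read", {"type": "csml", "file": "network.csml"}),
    ("subnet", {"node": "IL6", "dist": 1}),
    ("output", {"type": "csml", "file": "subnet.csml"}),
]
print(" ".join(build_command(subnet_chain)))
```

This sketch does not cover arguments given without a value, such as dynamic.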
Filters
A filter is specified by --filter_name followed
by its key=value style arguments, separated by commas.
White space may be inserted after a comma.
Execute "signproc --help filter" for the full list of available
filters, and "signproc --help filter,filter_name" for the
detailed description of filter_name.
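To make the argument syntax concrete, here is a minimal Python sketch of how such a key=value string could be split. It is not SiGN-Proc's own parser: the function name parse_filter_args is hypothetical, and the handling of commas inside braces and of bare keys such as dynamic is an assumption.

```python
def parse_filter_args(text):
    """Split 'key=value, key=value' into a dict.

    Commas inside {...} are kept as part of the value, and a bare key
    without '=' maps to True (both are assumptions of this sketch).
    """
    parts, depth, current = [], 0, ""
    for ch in text:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
        if ch == "," and depth == 0:
            parts.append(current)
            current = ""
        else:
            current += ch
    parts.append(current)

    result = {}
    for part in parts:
        part = part.strip()  # tolerate white space after a comma
        if not part:
            continue
        key, sep, value = part.partition("=")
        result[key] = value if sep else True
    return result


print(parse_filter_args("name=BS.Prob, type=double, op=gt, value=0.5"))
```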
Edge Property filter (--edgeprop)
This filter removes all edges except those whose property specified by
name satisfies the condition given by op and value.
name=property_name
type=property_type
Property type, such as int, double, or string.
op= { eq | ge | gt | le | lt }
Operator: the values correspond to equal, greater than or equal, greater than,
less than or equal, and less than.
value= value
String expression of a value to be compared.
noprop= { stop | remove | ignore }
What to do if the property is not found in an edge.
Examples:
--edgeprop name=BS.Prob,type=double,op=gt,value=0.5
This removes edges whose bootstrap probabilities are less than or equal
to 0.5.
--edgeprop name=up/down,type=string,op=eq,value=up
This leaves edges that are estimated as up-regulated ones.
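The selection logic can be illustrated with a short Python sketch. It assumes a toy edge representation of (parent, child, properties) tuples. The noprop behavior shown is also an assumption, since this manual does not define it: stop raises an error, remove drops the edge, and ignore keeps it. The function name filter_edges is hypothetical.

```python
import operator

OPS = {"eq": operator.eq, "ge": operator.ge, "gt": operator.gt,
       "le": operator.le, "lt": operator.lt}
CASTS = {"int": int, "double": float, "string": str}


def filter_edges(edges, name, prop_type, op, value, noprop="stop"):
    """Keep edges whose property `name` satisfies `op` against `value`."""
    cast, compare = CASTS[prop_type], OPS[op]
    target = cast(value)  # the value is given as a string expression
    kept = []
    for edge in edges:
        props = edge[2]
        if name not in props:
            if noprop == "stop":
                raise KeyError(f"edge {edge[0]}->{edge[1]} has no property {name!r}")
            if noprop == "ignore":
                kept.append(edge)  # assumed: keep the edge untouched
            continue  # noprop == "remove": assumed to drop the edge
        if compare(cast(props[name]), target):
            kept.append(edge)
    return kept


edges = [("A", "B", {"BS.Prob": 0.8}),
         ("B", "C", {"BS.Prob": 0.5}),
         ("C", "D", {"BS.Prob": 0.3})]
kept = filter_edges(edges, "BS.Prob", "double", "gt", "0.5")
print([(p, c) for p, c, _ in kept])  # prints [('A', 'B')]
```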
Read filter (--read)
This filter reads a network, data frame, or dataset from a file,
and then passes it to the next filter.
type= { edf | frame | network_format }
Type of a file to read.
See File Formats for the network file formats.
file= file_name
Subnet filter (--subnet)
The subnet filter extracts a subnetwork based on node names
given by a file or by a list of node names.
file=file_name
Name of a file containing a list of node names, one node name per row.
The file can be tab-separated, in which case the column position can be specified by the col
option below.
col=n
Column position of the node names in the node list file. The first (left-most) column is 1 (default).
node=node1:node2:...
Instead of specifying a list of nodes by a file, you can give them directly with this option.
If this is specified, the file option is ignored.
dist=n
Distance from the nodes of the above list within which nodes and edges are included.
Nodes adjacent to a specified node are at distance 1 (default).
type=induce
If this is specified, only edges connecting nodes that are in the list are returned.
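The distance-based selection can be sketched in Python as a breadth-first search that ignores edge direction, so that both parents and children are reached. This is an illustration under stated assumptions, not SiGN-Proc's implementation: the function names are hypothetical, the edges are toy data rather than a real network, and how edges between the selected nodes are chosen when type=induce is not given is left out.

```python
from collections import deque


def nodes_within(edges, seeds, dist=1):
    """Return {node: distance} for nodes within `dist` steps of any seed."""
    neighbors = {}
    for parent, child in edges:
        neighbors.setdefault(parent, set()).add(child)
        neighbors.setdefault(child, set()).add(parent)

    distance = {seed: 0 for seed in seeds}
    queue = deque(seeds)
    while queue:
        node = queue.popleft()
        if distance[node] == dist:
            continue  # do not expand beyond the requested distance
        for other in neighbors.get(node, ()):
            if other not in distance:
                distance[other] = distance[node] + 1
                queue.append(other)
    return distance


def induced_edges(edges, nodes):
    """Edges whose two end nodes are both in `nodes` (as with type=induce)."""
    nodes = set(nodes)
    return [(p, c) for p, c in edges if p in nodes and c in nodes]


# Toy edges (parent, child); only the node name IL6 comes from this manual.
edges = [("G1", "IL6"), ("IL6", "G2"), ("G2", "G3"), ("G4", "G5")]
print(sorted(nodes_within(edges, ["IL6"], dist=1)))  # prints ['G1', 'G2', 'IL6']
```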
Output filter (--output)
Outputs a network to a file. This filter passes the received network to the next filter without
modifying or processing it. Therefore, by using multiple output filters, you can write
a network to several files in different formats.
file=file_name
Output file name. The path can be included in file_name.
type=network_format
See File Formats for the available network file formats.
By default, csml is assumed.
args=\{key1=value1,...\}
Arguments of the file format.
BS filter (--bs)
The BS filter compiles the bootstrapped networks into a single
network. It expects the file names to be consecutively numbered with a fixed prefix.
By default, these are file_name_prefix.000001, file_name_prefix.000002, ...,
file_name_prefix.001000.
Files that are not found are simply ignored.
The edges whose bootstrap probabilities are greater than the threshold will be included in the
output network.
prefix=file_name_prefix
The prefix of the file names to process. This filter expects that the file names are in the form of
file_name_prefix.000000
where 000000 is a 6-digit bootstrap ID.
dynamic
If specified, the bootstrapped networks to process are expected to be of the dynamic model.
bg=n
The beginning index (ID) of the file names to process. By default, n = 1.
ed=n
The ending index (ID) of the file names to process. By default, n = 1000.
th=threshold
The threshold of the bootstrap probability. Edges whose bootstrap probabilities are greater than
the threshold will be included in the resultant network. By default, threshold = 0.05.
ver=n
Version. 1 and 2 are available. Do not mind details ;-). By default, n = 2.
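The following Python sketch illustrates the file naming scheme and the thresholding step. It rests on one assumption that this manual does not state: the bootstrap probability of an edge is taken to be the fraction of the networks actually found that contain it. The function names and the in-memory representation (one set of edges per network) are hypothetical.

```python
from collections import Counter


def bootstrap_file_names(prefix, begin=1, end=1000):
    """File names of the form prefix.000001, ..., prefix.001000."""
    return [f"{prefix}.{i:06d}" for i in range(begin, end + 1)]


def bootstrap_edges(networks, threshold=0.05):
    """Return {edge: probability} for edges strictly above `threshold`.

    `networks` holds one collection of edges per bootstrap network that
    was found; files that were not found are simply absent from it.
    """
    counts = Counter(edge for net in networks for edge in set(net))
    total = len(networks)
    return {edge: n / total for edge, n in counts.items()
            if n / total > threshold}


networks = [
    {("A", "B"), ("B", "C")},
    {("A", "B")},
    {("A", "B"), ("C", "D")},
    {("A", "B")},
]
print(bootstrap_file_names("result", 1, 3))
print(bootstrap_edges(networks, threshold=0.25))  # prints {('A', 'B'): 1.0}
```

The comparison is strict, matching the description that edges with probabilities greater than the threshold are included.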
Node Color filter (--nodecolor)
file=file_name
The line-by-line file containing the list of node names to color.
The column position of the node names in a line can be specified by the
col argument below.
col=n
color=r:g:b [:a ]
Color of the node in RGB. Integer values from 0 to 255 are accepted for
r (red), g (green), and b (blue). The alpha blending value (a)
can optionally be specified. If it is not specified, a=255 is assumed.
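As an illustration of the color syntax, this Python sketch parses an r:g:b[:a] string into four integers. The function name parse_color is hypothetical, and restricting the alpha value to the same 0 to 255 range as the other components is an assumption.

```python
def parse_color(text):
    """Parse 'r:g:b' or 'r:g:b:a' into an (r, g, b, a) tuple of ints."""
    parts = [int(p) for p in text.split(":")]
    if len(parts) == 3:
        parts.append(255)  # alpha defaults to 255 when omitted
    if len(parts) != 4:
        raise ValueError(f"expected r:g:b or r:g:b:a, got {text!r}")
    if any(not 0 <= p <= 255 for p in parts):
        raise ValueError(f"components must be in 0..255, got {text!r}")
    return tuple(parts)


print(parse_color("255:0:0"))       # prints (255, 0, 0, 255)
print(parse_color("0:128:255:64"))  # prints (0, 128, 255, 64)
```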
Comp filter (--comp)
The Comp filter compares two network structures.
This filter does not change the network. Instead, it compares the network with
another one specified by the arguments.
After the comparison, it prints the comparison result.
The network given by the arguments is regarded as the true network,
and the filter counts the numbers of true positive (TP), false positive (FP), true negative (TN), and
false negative (FN) edges of the network passed by SiGN-Proc.
The TP edges are those that exist in both networks.
The FP edges are those that exist only in the network
passed by SiGN-Proc. The FN edges are those that exist only in the
true network given by the arguments.
The TP, FP, FN edges can be saved as separate network files by specifying
the tp, fp, and fn arguments.
file=file_name
The file name of the true network to read and compare with.
type=file_type
The type of the file format to read. See
File Formats for the available file formats.
args=\{key1=value1,...\}
Arguments of the file format.
tp=file_name
If specified, TP edges, i.e., edges that exist in both networks, are saved as a
separate network in a file file_name.
tptype=file_type
File type of the TP network.
tpargs=\{key1=value1,...\}
Arguments of the file format for the TP network.
fp=file_name
If specified, FP edges, i.e., edges that exist only in the network passed by
SiGN-Proc, are saved as a separate network in a file file_name.
fptype=file_type
File type of the FP network.
fpargs=\{key1=value1,...\}
Arguments of the file format for the FP network.
fn=file_name
If specified, FN edges, i.e., edges that exist only in the network given by
the file argument above, are saved as a separate network in a file
file_name.
fntype=file_type
File type of the FN network.
fnargs=\{key1=value1,...\}
Arguments of the file format for the FN network.
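The TP, FP, and FN definitions above map directly onto set operations. The Python sketch below shows this on toy data, with each network represented as a set of (parent, child) edges. The function name compare_networks is hypothetical, and TN is omitted because counting it requires the set of all possible edges, which this manual does not define.

```python
def compare_networks(estimated, true):
    """Classify edges of `estimated` against the `true` network."""
    est, tru = set(estimated), set(true)
    return {
        "TP": est & tru,  # edges that exist in both networks
        "FP": est - tru,  # edges only in the network being processed
        "FN": tru - est,  # edges only in the true network
    }


estimated = {("A", "B"), ("B", "C"), ("C", "D")}
true = {("A", "B"), ("B", "C"), ("A", "D")}
result = compare_networks(estimated, true)
print({key: sorted(edges) for key, edges in result.items()})
```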
Score filter (--score)
The score filter calculates the network score.
This filter does not change the network structure at all.
This prints the calculated score to the standard output.
data=file_name
The file name of the input data matrix.
score_args=\{key1=value1,...\}
Arguments for the score function.
dynamic
Specifies that the dynamic model is used.
mem=memory_size
Specifies the memory size in MiB for the score calculation.
Change History
ver. 0.22.7 (2021-03-29 Mon)
- The mem option was added to the score filter.
ver. 0.22.6 (2020-10-08 Thu)
- The TXT format supports name option to include the unique edge name column in the output file.
ver. 0.22.5 (2019-07-12 Fri)
- Fixed a bug in the Score filter that caused the score calculation to fail with the dynamic model.
ver. 0.21.1 (2018-02-07 Wed)
- The dag filter was added.
ver. 0.19.0 (2014-08-21 Thu)
- The Comp filter supports saving TP, FP, and FN edges.
- The NODELIST format supports the calculation of the closeness centrality.
- The MFP mode is added to the BS filter.
ver. 0.16.0 (2014-01-21 Tue)
- Edge Property filter (--edgeprop) is newly added.
- The default value of the ver option of the BS filter is changed.
Copyright © 2012-2021
SiGN Project members.
All Rights Reserved.
Contact: Yoshinori Tamada <tamada ATMARK ytlab.jp>