NAME
cite-effect -- A static analysis tool to understand programs.
cite-effect is a command-line tool to help :
+ understand legacy programs,
+ validate programs formally,
+ and refactor code.
It reports :
+ call tree - from all entry points;
+ call paths to accesses of a particular variable (read and/or
write).
Because it stores references in a database, it is suitable for large
programs (30K+ lines). It limits the size of reported trees by finding
the so-called 'home' function of a symbol: the function during which all
references occur.
It is designed for integration into IDEs such as jEdit or emacs.
See also http://cite-effect.sourceforge.net
VERSION
This documentation applies to cite-effect version 1.4. It parses
programs written in C.
To check version, enter :
cite-effect --version
COPYRIGHT
Copyright (C) 2005-2009 Hubert Carbonnelle.
mailto:hcarbo@users.sourceforge.net
For use under the terms of the GNU General Public License and the GNU
Free Documentation License (from Free Software Foundation).
SYNOPSIS
Short form
First create cite-effect.lst, a list of source files to parse. Then,
in the same directory, execute
cite-effect <symbol>
This creates the internal database by parsing the source files, then
reports accesses to the <symbol> from within the call tree.
Subsequent executions reuse the existing database. On Unix systems,
the database is automatically recreated if any of the source files has
been modified.
Use
cite-effect <symbol>=
to report write accesses only.
Full form
First build the database, then query it any time, as follows. <dbPath>
must be the first argument (it is the 'object' on which cite-effect
operates).
cite-effect <dbPath> [--verbose] { <sourceFilesList> | -}
cite-effect <dbPath> [--fullPrint] [-from:<pattern>] [-to:<pattern>[=]] [-in:<sourceFile>]
Where <pattern> is a simplified regular expression of the form:
* matches all
<prefix>* matches symbols with given <prefix>
<symbol> matches <symbol>
Make sure you escape the '*' by using '\*' in the command line.
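The three pattern forms above can be illustrated with a minimal sketch. It is not taken from the cite-effect sources; the function name matches() is hypothetical, and the optional trailing '=' of -to: (write accesses only) is assumed to be stripped before the pattern is tested.

```python
def matches(pattern: str, symbol: str) -> bool:
    """Return True if `symbol` is selected by a simplified pattern.

    Hypothetical helper: mirrors the three documented forms only.
    """
    if pattern == "*":
        return True                              # '*' matches all
    if pattern.endswith("*"):
        return symbol.startswith(pattern[:-1])   # '<prefix>*'
    return symbol == pattern                     # '<symbol>' exact match


if __name__ == "__main__":
    for pat, sym in [("*", "x"), ("ERROR_*", "ERROR_IO"), ("x", "xy")]:
        print(pat, sym, matches(pat, sym))
```

Under this reading, a '*' anywhere other than the end of the pattern has no special meaning.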
EXAMPLES
Short form - Windows
dir /B *.h *.c > cite-effect.lst
cite-effect x=
This reports 'write' accesses to x from the C files in the current
directory.
Short form - Unix
ls *.h *.c > cite-effect.lst
cite-effect x
Typical output
main()
| foo()
| | x=
| | bar()
| | | foo()...
Function calls are listed one per line, indented to the right of the
caller function. In this example, function 'main' refers to (i.e.
calls) function 'foo' - other calls do not affect variable 'x' and are
not listed. Function 'foo' writes 'x' and refers to function 'bar',
which in turn refers to function 'foo' listed earlier (as noted by the
ellipsis '...').
Full form
To build database :
ls *.h *.c | cite-effect myDb -
To report accesses to variable x in file foo.c :
cite-effect myDb -to:x -in:foo.c
Full directory tree analysis
To build the database (each file name is printed as it is parsed):
find . -name '*.h' -print > files.txt
find . -name '*.c' -print >> files.txt
cite-effect myDb --verbose files.txt
To report write accesses on variable x from function foo :
cite-effect myDb -to:x= -from:foo
To report error values returned by function foo :
cite-effect myDb -to:ERROR_\* -from:foo
Editor integration
To list all accesses and their respective places :
cite-effect myDb --fullPrint -to:x
Output will look like :
main()
| foo() [main.c:9]
| | x= [main.c:20]
| | bar() [main.c:21,25]
| | | foo()... [main.c:31]
Filenames and line numbers allow jumping to the source line (if
suitable macros are provided in a customizable editor).
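An editor macro needs to split each report line into a symbol, a file name and line numbers. The sketch below shows one way to do so. The line format is inferred from the sample output above only, so the regular expression and the name parse_line() are assumptions, and file names containing ':' or ']' are not handled.

```python
import re

# Assumed layout: '| ' repeated per nesting level, the symbol,
# then an optional ' [file:line,line,...]' location.
LINE_RE = re.compile(
    r"^(?P<indent>(?:\| )*)"
    r"(?P<symbol>\S+?)"
    r"(?:\s+\[(?P<file>[^:\]]+):(?P<lines>\d+(?:,\d+)*)\])?\s*$"
)


def parse_line(line):
    """Return depth, symbol, file and line numbers, or None if no match."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    numbers = m.group("lines")
    return {
        "depth": len(m.group("indent")) // 2,   # two characters per level
        "symbol": m.group("symbol"),
        "file": m.group("file"),                # None when no location given
        "lines": [int(n) for n in numbers.split(",")] if numbers else [],
    }


if __name__ == "__main__":
    print(parse_line("| | bar() [main.c:21,25]"))
```

The first entry of "lines" is a natural jump target; the others can be offered as further matches.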
DESCRIPTION
Definitions
Reference (by a referer to a referee): a function call, a variable
access, or in general the use of a symbol.
Entry point: a function that is not called by any other function.
Home function of a specific variable or function : the function, lowest
in a call tree, within which all references to the specific referee
occur. There may be several home functions if there are several entry
points.
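The definitions above can be made concrete with a small sketch. It shows one possible reading of 'entry point' and 'home function', not the algorithm cite-effect actually uses. The data layout (a dict of callees per function and a dict of accessed symbols per function) and all function names are assumptions made for illustration.

```python
def entry_points(calls):
    """Functions that never appear as a callee."""
    called = {callee for callees in calls.values() for callee in callees}
    return sorted(f for f in calls if f not in called)


def reaches(calls, accesses, func, symbol, seen=None):
    """True if `func`, or anything it calls, references `symbol`."""
    seen = set() if seen is None else seen
    if func in seen:
        return False                      # already explored (or a cycle)
    seen.add(func)
    if symbol in accesses.get(func, ()):
        return True
    return any(reaches(calls, accesses, c, symbol, seen)
               for c in calls.get(func, ()))


def home_function(calls, accesses, entry, symbol):
    """Descend from `entry` while exactly one callee leads to `symbol`."""
    current, visited = entry, set()
    while current not in visited:
        visited.add(current)
        if symbol in accesses.get(current, ()):
            return current                # referenced directly here
        users = [c for c in dict.fromkeys(calls.get(current, ()))
                 if reaches(calls, accesses, c, symbol)]
        if len(users) != 1:
            return current if users else None
        current = users[0]
    return current


if __name__ == "__main__":
    calls = {"main": ["init", "run"], "init": [],
             "run": ["step", "log"], "step": [], "log": []}
    accesses = {"step": {"counter"}, "log": {"counter"}}
    print(entry_points(calls))
    print(home_function(calls, accesses, "main", "counter"))
```

In this example both step and log reference counter, so the lowest function that covers every reference is their common caller, run.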
Building the database of references
<dbPath> is the base name for database files.
Database files are created (or overwritten) by supplying a list of file
names to parse (one per line). In the short form, the list is implied
(cite-effect.lst), and output is verbose (file names are output on
stdout as they are processed). In the full form, the list is supplied
either in the <sourceFilesList> file or on stdin (by passing a dash
'-'). File names are output on stdout in --verbose mode only.
Symbols are recognized as function or variable with the help of .h
files, which should be listed before .c files. In case a symbol is
wrongly typed as function instead of variable (or the opposite), make
sure the header file declaring the symbol is listed before any file
referring to the symbol.
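Because the listing order matters, the file list can be generated so that every .h file precedes every .c file. The following sketch is one way to do this; ordered_sources() is a hypothetical helper, and it does not try to order headers among themselves by dependency.

```python
from pathlib import Path


def ordered_sources(paths):
    """Return the .h files first, then the .c files; drop everything else."""
    headers = sorted(p for p in paths if Path(p).suffix == ".h")
    sources = sorted(p for p in paths if Path(p).suffix == ".c")
    return headers + sources


if __name__ == "__main__":
    names = ["b.c", "a.h", "a.c", "z.h", "notes.txt"]
    # One file name per line, as expected in cite-effect.lst.
    print("\n".join(ordered_sources(names)))
```

The printed list can be redirected to cite-effect.lst (short form) or piped to cite-effect with '-' (full form).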
Parsing
The files are expected to be C-compliant (and compile without errors).
They are parsed without pre-processing; this means that
- files are not actually included by '#include' statements
- parentheses, curly braces and square brackets ((){}[]) must be
balanced within macro definitions and within clauses of conditional
compilation.
If you encounter parsing error(s), consider pre-processing the files
first.
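To locate a macro or conditional clause that breaks the balancing rule, a quick check such as the sketch below may help. It is a rough aid, not part of cite-effect: is_balanced() is a hypothetical name, and the check ignores string literals, character constants and comments, so a bracket inside a string is counted like any other.

```python
PAIRS = {")": "(", "}": "{", "]": "["}


def is_balanced(text):
    """True if (), {} and [] are properly nested and closed in `text`."""
    stack = []
    for ch in text:
        if ch in "({[":
            stack.append(ch)
        elif ch in PAIRS:
            # A closer must match the most recent unmatched opener.
            if not stack or stack.pop() != PAIRS[ch]:
                return False
    return not stack


if __name__ == "__main__":
    print(is_balanced("#define SQR(a) ((a) * (a))"))
    print(is_balanced("#define BEGIN if (ok) {"))
```

The second macro opens a curly brace that it never closes, which is the kind of definition the parser cannot handle without pre-processing.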
The fuzzy parser recognizes the following constructs :
o variable definition (outside function body)
<identifier> { , | = | ; }
<arrayItem> { , | = | ; }
#define <identifier> ... <non-escaped new-line>
Macros defined with this latter syntax are regarded as variables;
functions they call (if any) will not be reported as nested.
<arrayItem> describes indexed array :
<identifier of type 'variable'> { [ ... ] }*
o function definition (outside function body)
<identifier> ( ... ) ;
<identifier> ( ... ) { ... }
#define <identifier>( ... ) ... <non-escaped new-line>
o writing a variable
<arrayItem> { = | += | -= | *= | ... }
<arrayItem> { ++ | -- }
{ ++ | -- } <arrayItem>
& <arrayItem>
The latter form catches access to variables through a pointer
(actual assignments to the variable may occur only later, when the
pointer is dereferenced). In most cases, the unary operator & is
distinguished from the binary operator &.
o reading a variable
<identifier of type 'variable'>
unless a write access is recognized.
o function calls (only within function bodies)
<identifier of type 'function'> ( ... )
<identifier of type 'function'>
The latter catches assignments of a function pointer (the call
actually occurs only when the function pointer is dereferenced).
Parser warnings and errors are reported on stderr.
Reports
A command-line query generates a tree-like report that lists
references. The roots of the tree are either:
the symbols selected by -from:<symbol> or -from:<prefix>*
the entry points (if -from:* is provided)
the home of the -to: symbols.
The command-line parameter -to:<pattern>[=] selects the targets of the
search, which are the leaves of the output tree.
References are listed in the order they occur in a function, indented
right by two (additional) space characters, in order to depict the call
stack history as a tree.
A function appears only once in the tree. Additional references to the
function are simply noted with an ellipsis (...). In full mode
reporting (--fullPrint option), however, all references to a relevant
function are reported, but only once per function.
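The 'expand once, then ellipsis' rule can be sketched as follows. This is an illustration of the reporting rule, not cite-effect's code: the dict layout and the name render() are assumptions, and the '| ' level marker is copied from the sample output in EXAMPLES.

```python
def render(refs, root):
    """Render a reference tree; a function is expanded only the first time."""
    lines, seen = [], set()

    def visit(name, depth):
        prefix = "| " * depth
        if name not in refs:              # not a function: a variable access
            lines.append(prefix + name)
        elif name in seen:                # already expanded earlier
            lines.append(prefix + name + "()...")
        else:
            seen.add(name)
            lines.append(prefix + name + "()")
            for child in refs[name]:      # keep the order of occurrence
                visit(child, depth + 1)

    visit(root, 0)
    return lines


if __name__ == "__main__":
    refs = {"main": ["foo"], "foo": ["x=", "bar"], "bar": ["foo"]}
    print("\n".join(render(refs, "main")))
```

With the references used here (main calls foo, foo writes x and calls bar, bar calls foo), the five lines produced are the same as in the 'Typical output' example.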
USAGE
To validate a program formally, you may follow these steps:
1. Output the full call tree and check that it is complete; possibly
rework program to simplify call tree.
2. Select a global variable and output its accesses in the call tree;
possibly rework program to reduce number of accesses and undesired
side-effects.
3. Repeat step above for all sensitive global variables.
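Steps 2 and 3 lend themselves to scripting. The sketch below only builds the command lines, using the documented '-to:<symbol>=' form for write accesses; the helper name, the database name myDb and the variable names are placeholders, and running the commands requires cite-effect to be installed.

```python
def write_access_queries(db_path, variables):
    """Build one 'write accesses only' query per global variable."""
    return [["cite-effect", db_path, f"-to:{name}="] for name in variables]


if __name__ == "__main__":
    for argv in write_access_queries("myDb", ["x", "errorCount"]):
        print(" ".join(argv))
```

Each list can be passed unchanged to subprocess.run(); since no shell is involved there, no escaping is needed.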
IMPLEMENTATION NOTES
The following old-style function definition is not recognized :
<identifier>() ... { ... }
The parser does not handle name scopes and does not distinguish
conflicting names.
Entry points cannot be called recursively: such an entry point would no
longer be recognized as a call-tree root, and that entire part of the
call graph would be lost.
Function nesting is currently clipped to 60 levels during output.
'Tree too deep' is simply output instead of the list of callee
functions / symbol accesses.
Type definitions (with 'typedef') are not distinguished from variable
definitions.
The & token in front of an <identifier> is interpreted as the 'address
of' operator and not the 'and' operator, except in this simple case :
<arrayItem> & <arrayItem>
RETURN VALUES
cite-effect returns 0 unless an error occurred (independently of the
number of warnings).
The warning
assignment to complex target is ignored
means that the left side of an assignment is not an <arrayItem>.
Consequently, the parser cannot record the assignee.
FILES
cite-effect.lst text file listing the source files to parse.
cite-effect.sym or <dbPath>.sym
database file holding symbol definitions.
cite-effect.ref or <dbPath>.ref
database file holding details of symbol references.
BUGS
The program aborts without explanation when a memory overflow occurs
while reading the database. Try increasing the available memory or
using the -in: option.
Dec 4, 2009 version 1.4 cite-effect(1)