nsum is a rewrite of Suso Banderas' numsum with additional features. I found myself needing something to process plain text files containing measurement data. While numsum is nice, it did not quite do all I needed. On the other hand, many people might appreciate its simplicity, so I decided against adding the functionality and submitting a patch. I wrote nsum instead.
Its features in addition to numsum's:
Get nsum: Download
Make it executable, put it somewhere in your path, extract the manual page with pod2man or pod2html, and enjoy!
Update July 2010: The current version uses numerically accurate algorithms for computing the sum, the mean and the RMS deviation of floating-point data.
nsum - sum up numbers from text files
nsum [options] [files ...]
nsum is a more sophisticated replacement for Suso Banderas' numsum. It adds up numbers embedded in text files, discarding all non-numerical data. In particular it can add up tabulated ASCII numerical data separated by commas or white space.
nsum processes the files given on the command line, or standard input if the first file argument is "-" or none is given. The options -f, -g and -i determine which kind of numbers nsum looks for - floating point in decimal and optionally exponential notation or just integers. Most other options control its output - just the total, or row and/or column sums and optionally partial sums. The -s option gives you mean and standard deviation instead of sums.
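To make the three number modes concrete, here is a minimal Python sketch of how numbers might be picked out of a line of text. It is not nsum's code (nsum is a Perl script); the regular expressions, the `PATTERNS` table and the `extract_numbers` helper are assumptions made for illustration, and nsum's real patterns may handle signs or edge cases differently.

```python
import re

# Hypothetical patterns, one per number mode; nsum's actual patterns may differ.
PATTERNS = {
    # integers only: "." acts as a separator
    "i": re.compile(r"[-+]?\d+"),
    # decimal notation only: "e" and "E" act as separators
    "f": re.compile(r"[-+]?(?:\d+\.?\d*|\.\d+)"),
    # decimal or exponential notation
    "g": re.compile(r"[-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?"),
}


def extract_numbers(line, mode="g"):
    """Return the numbers found in one line of text, ignoring everything else."""
    convert = int if mode == "i" else float
    return [convert(token) for token in PATTERNS[mode].findall(line)]
```

With this sketch, the text `a 1.5e2, b 3` yields two numbers in mode `g`, three in mode `f` (the `e` splits `1.5e2` into `1.5` and `2`) and four in mode `i` (the period splits as well).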
The current version uses compensated (Kahan) summation for sums and the Knuth/Welford algorithm for computing the mean and standard deviation to minimise numerical errors.
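The two techniques named above can be sketched in Python as follows. This is a textbook illustration, not a transcription of nsum's Perl source; the function names are made up here, and dividing by n (RMS deviation) rather than n - 1 in the standard deviation is an assumption.

```python
import math


def kahan_sum(values):
    """Compensated (Kahan) summation of an iterable of floats."""
    total = 0.0
    comp = 0.0  # running estimate of the low-order bits lost so far
    for x in values:
        y = x - comp            # apply the correction to the next term
        t = total + y           # low-order bits of y may be lost here
        comp = (t - total) - y  # recover what was lost (negated)
        total = t
    return total


def welford(values):
    """One-pass mean and standard deviation (Knuth/Welford update).

    Returns (count, mean, standard deviation). The deviation divides by n,
    which is an assumption; a sample estimate would divide by n - 1.
    """
    n = 0
    mean = 0.0
    m2 = 0.0  # sum of squared deviations from the current mean
    for x in values:
        n += 1
        delta = x - mean
        mean += delta / n
        m2 += delta * (x - mean)
    if n == 0:
        return 0, math.nan, math.nan
    return n, mean, math.sqrt(m2 / n)
```

For instance, `kahan_sum([1e16, 1.0, 1.0, -1e16])` returns 2.0, whereas plain left-to-right floating-point addition of the same values gives 0.0 because each added 1.0 is rounded away.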
-a  Print requested sums for all files separately, instead of a grand total at the end. This applies to totals (-t) and column sums (-c). It has no effect when reading from stdin. +a disables this feature.
-c  Print column sums, separated by tab characters. +c disables this feature.
-f  Sum up floating-point numbers in decimal (not exponential) notation. "e" and "E" are regarded as part of record separators.
-g  Sum up floating-point numbers in decimal or exponential notation. This is the default.
-i  Sum up integers. Period characters are taken as part of record separators, not decimal points.
-I  Print input data, without chaff but in its original format. If -p is also given, input columns and cumulative sums alternate, starting with input data. Columns which are not present in a line but for which partial sums exist are printed as 0 to keep the alignment predictable, but do not count as a zero datum towards the mean for -s. +I disables this feature.
-p  Print partial column sums as columns are added up. If -a is given, the partial sums are file by file, otherwise they are cumulative across files. If -s is given, the mean and standard deviation of each column's data so far are printed instead of the partial sum. +p disables this feature.
-r  Print row sums. For rows which do not contain any numbers, an empty line (not 0) is output. If -p is also given, the row sum is the last column output (or for -s, the last two columns), after the partial column sums. +r disables this feature.
-s  Instead of each sum, print mean and standard deviation. Affects all of -c, -p, -r and -t. Two values will be printed everywhere in place of the single sum. +s disables this feature.
-t  Print overall total. This is enabled by default. +t disables it.
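How the different sums relate to each other can be sketched in Python. This is only a model of the quantities behind -c, -p, -r and -t, not nsum's implementation or its output format; the `tabulate_sums` helper and its return layout are hypothetical, and the input is assumed to be already parsed into one list of numbers per line.

```python
def tabulate_sums(rows):
    """Compute row sums, partial column sums, column sums and the total.

    rows: a list of lists of numbers, one inner list per input line.
    Rows may differ in length; a row without numbers is an empty list.
    """
    col_sums = []  # running column sums (final value: what -c reports)
    partials = []  # copy of col_sums after each row (like -p)
    row_sums = []  # one entry per row (like -r); None marks a row without numbers
    for row in rows:
        # Grow the column accumulators when a longer row appears.
        if len(row) > len(col_sums):
            col_sums.extend([0.0] * (len(row) - len(col_sums)))
        for i, x in enumerate(row):
            col_sums[i] += x
        partials.append(list(col_sums))
        row_sums.append(sum(row) if row else None)
    total = sum(col_sums)  # overall total (like -t)
    return row_sums, partials, col_sums, total
```

A row without numbers gets `None` rather than 0 here, mirroring the empty line that -r prints in that case.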
Because nsum ignores all non-numerical data anyway, this is superfluous.
Because you can use sed to pick out rows and awk or cut to select columns.
Because awk can do that after nsum has printed the sum (int() or ... % 1, respectively).
Use reservoir by Simon Tatham, http://www.chiark.greenend.org.uk/~sgtatham/utils/.
You get the drift.
numsum(1), numaverage(1), awk(1), sed(1), cut(1)
nsum is Copyright (c) 2009-2010 Volker Schatz. It may be copied and/or modified under the same terms as Perl.