Starting in 1985, this hypothetical model of a multitape Turing
machine has been developed by
A. Schönhage as a medium
for an efficient implementation of algorithms dealing with
multi-precision data. The TP model has seven tapes for storing
data, the underlying alphabet is the set of all 32-bit
words for a TP32 (or 16-bit words for a TP16), and
the finite control is embodied in some Turing program
for a little RISC processor written in assembly language.
The very first implementation was undertaken in 1985/86, running fast
integer multiplication on a small MC68000-based machine.
Later on, thanks to the programming skills of my coauthors
A.F.W. Grotefeld and E. Vetter, we could then
establish advanced versions running on SUN workstations and other
platforms, with our TP-book
plus more recent Addenda as the
main reference - also serving as a programmer's guide.
TP Source Modules
The assembly language TPAL is a low level programming
language well suited for multi-precision arithmetic modulo the
word base & = 232 of TP32
(= 216 for TP16), and to control every detail
down to the bit level.
Example source file primes.t
of a small self-contained program for listing primes may give a
first impression. Its compressed version
primes.dry without any comment and
using anonymous label naming illustrates that such `dry' plaintext
can provide some sort of encryption.
Libraries of such TP routines are organized in modules.
Chapters 6-9 of the TP-book describe the modules
INTARI for integer arithmetic, QARI for rational
arithmetic and integer gcd's, RECARI for computing with
r-codes and c-codes (binary rationals approximating real and
complex numbers), and CREPARI with routines for complex and
Meanwhile we have added many new routines. Due to increased size,
module CREPARI had to be split into separate modules
CREPAR + CREPEX, and there are now two additional modules
PORTA, PORTB containing our new routines for approximate
factorization of complex polynomials and approximate computation of
For documentation of the precise interfacing conditions of the
new routines and any updates for the old routines of
Chapters 6-9 we are now maintaining so-called
docfiles like intardoc etc. as
listed at the end of the Addenda. These are pure
ASCII-files to provide robust portability.
Availability of Implementations
After E. Vetter had left our team, we had to cope with his
tpc compiler (see TP-book Chapter 3)
as it was at that time. Because of several difficulties, for instance
arising with changes caused by new releases of the GNU compiler `gcc'
and other obstacles, the services of Section 3.1.1
(How to get this software) could no longer be maintained,
and since my retirement in early 2000 there was simply no
manpower to reinstall an operable version of tpc.
To resolve these problems, I decided to write a new implementation
of TP32 myself - then called TPS02
(written in 2001/02) - this time as a machine dependent
TP-system for Intel's IA-32 Architecture (Pentium II or
later, for use of MMX instructions) under GNU-Linux,
employing the gcc with its assembler and the
GNU Libc. Accordingly this is free software under the
terms of the GNU General Public License,
see TP-book page 89 for additional conditions as well.
After disposal of my ancient ATARI, at present there seems to be no
workable implementation of a TP16. On the other hand, one could wish
to use a TP64, in order to exploit the power of modern
64-bit processors, but that would also require substantial
modifications in many of our TP routines.
In case somebody wants to implement TP32 for some other platform,
s|he may take advantage of the fact that our assembler
tpa02, when applied to a TPAL source file
XYZ.t, performs a first pass translating this
into a module file XYZ.m32
being stored as a sequence of 32-bit words in a machine
independent intermediate code. We have designed that
m32 format such that
it may also serve as a basis for a TP32 in hardware.
TPS02 for IA-32 under GNU Linux
By downloading and unpacking
tps02.tar.gz you obtain a
directory tpdir/ containing an
ASCII-file readme about how to
install TPS02, a related makefile, and
tpdir/doc/, tpdir/dry/, tpdir/sou/
with these files:
with a brief outline of TPS02 in
intardoc, qardoc, recardoc, crepsdoc, portsdoc,
and License, a copy
of the GNU Library General Public License;
with `dry' compressed TP modules
recari.t, crepar.t, crepex.t, porta.t, portb.t;
tpdir/sou/ with the source files
for the TP-assembler tpa02,
tpm.c for the
startup part tpm.o of any Turing program,
tprun.s for the
runtime routines in tprun.o,
demo.t in TPAL,
for a little demo example,
tpinit.c for C-function
tpinit starting a "TP-server",
expl.t in TPAL,
for a little TP-server example.
In case of any problems with this distribution, please
contact A. Schönhage
Former and more Recent TPS02 (TPS04) news:
October 2005: That Summer I finished a new version
TPS04 furnishing special back-end coding for Pentium-4
processors, quite analogous to TPS02, obtainable by downloading
and unpacking tps04.tar.gz.
It improves the TPS02 performance by 30 to 40 percent; for more
details, see the corresponding docfile tps04.pdf .
September 2006: To illustrate our splitting circle method,
we have applied routine MFSP (monic linear factor splitting,
cf. item 2.12 of portsdoc) to the polynomial
x60 - 1 . File
split60.dvi provides a graphical
report of the main splitting steps and related performance data for
December 2006, and December 17, 2008: We have replaced
older preliminary routines for squaring of real and complex polynomials
(as well as enhanced routines for Graeffe's root squaring) with
improved versions (now asymptotically fast also for large degree).
To use them, one needs new crepar.t, crepex.t
(now versions 61,57, see below) from the new
tps04.tar.gz ; for
compatibility then also module portb.t must be a
newer one (now version no 21). For more details and other related
new routines, see the updated docfile crepsdoc .
February 2010: I have extended the basic module INTARI
by three little (fast) routines SMLTI, IMLTI, and CIMLTI for
(approximately) predicting the (somewhat erratic) timings of routines
SML, IML, and CIML, thus providing useful tools for finer tuning
when deciding in intermediate ranges, for inputs of medium
size, whether use of one of these (asymptotically fast)
routines might already be faster than some other elementary method.
June 24 and July 21, 2010: Recently we had to fix
several bugs in routines CPSQU/CP1SQU, in CDFT, and then also in
routines GAPFD and CRTKS. Accordingly, there are new modules
crepar.t (version 61), crepex.t (version 57),
porta.t (version 24), and portb.t (version 21),
available from the new tps02.tar.gz
or tps04.tar.gz , respectively.
May 2012/ January 2013: Since its version no 223,
module INTARI contains two little auxiliary routines SDBL,
SMLR2 for doubling an sml-code mod
(&m + 1), respectively to multiply it with
"sqrt(2)" = 224m - 28m . In its new
version 224, it now also offers a routine IPOWER
for integer exponentiation xn.
Moreover, quite recently we had to fix a bug in routine QSUB of our
Rational Arithmetic (cf. TP-book item 7.1.24). In one of its branches
it returned y-x instead of x-y. So use the new module
qari.t (version 20) available from the new
tps04.tar.gz , respectively.
July 2018: Most likely, these are final news
on our TP-project: For various reasons, a couple of years ago we
had to stop our continuous efforts to implement further of our
"Fast Algorithms", and meanwhile we have also realized
that completion of our ambitious book project (mentioned at the
beginning of the TP-book Preface) is impossible as it would require
too much further work.
Some features of TPS02, of TPS04:
- Assembly command tpa02 -?? XYZ.t
for translating TPAL source XYZ.t
admits options ?? to output the intermediate code
module XYZ.m32 (also as hex dump
XYZ.h32), or to retranslate it to compressed
format XYZ.dry; under TPS04,
tpa4 replaces tpa02 ;
- tpa02, tpa4 support the new directives and
commands related to table look-up, as specified
in Section 1.2 of the Addenda;
- debugging at runtime is supported by a rich repertoire of tracing
facilities, like stepping with source code display, setting
breakpoints, display of timer accounts, of the stack, dumps of
tape segments with decoding support, keyboard interrupts;
- large-scale memory supply: besides allocation of
m MB in main memory (with m·217
bytes per tape), any additional disk space can be accomodated
for the Turing tapes (or for the stack) via so-called
- to combine some Turing program with other software, typically
with some master program written in C or C++, it can be run
as a child process; such TP-server receives its
input via an INTAPE pipe from the parent process
and returns its results via another pipe written by OUTAPE.
Last modified: Sun Jul 1 17:11:34 CEST 2018