### Links to demos

1. Using Survo in Touch mode

2. Editorial computing in Survo

3. Plotting curves

4. Worm mode

5. Simulating a bivariate normal distribution

6. Table formatting and sorting (Origin of the editorial approach)

7. Simple bar chart

8. A closed curve

9. Mr. Cole gives a talk

10. Linear regression analysis

11. Chernoff's faces

12. Sum of independent random variables tends to normal distribution

13. Lissajous curve variation (Knitting a carpet)

14. Histogram

15. Factor analysis

16. Comparing two samples

17. Fisher's exact test for contingency tables

18. Miscellaneous conversions

19. Computus: calculating the date of Easter (by KV)

20. Pascal's triangle

21. Grid lines

22. Why 0.3-0.2-0.1 is not zero in PC's?

23. Arrow diagram of a correlation matrix

24. "Origin of Species"

25. Cooling of a coffee cup

26. Symbolic derivatives

27. Chords of ellipses

28. Linear dependencies in a matrix

29. Temperature in Helsinki

30. Example P160 from Survo Book (1992)

31. Fence lines

32. Small problem of Ramanujan

33. July mean temperature and rainfall in Helsinki

34. Approximate squaring of a circle

35. Symmetric random walk

36. Reversing

37. Birds (word and phrase continuation)

38. Graphical rotation in factor analysis

39. Pythagorean points on a green meadow

40. Unbiased coin-flips with a biased coin

41. Omega coin tossing

42. Rational approximations by listening

43. Color changing

44. Permutation test

45. HH - HT game (analysis)

46. HH - HT game (simulation)

47. Solving a Survo puzzle

48. Lines going through 3 points in a 9x9 grid

49. Solving a Survo puzzle by the swapping method

50. Prime numbers listed by a sucro

51. Ulam spiral in color

52. Testing the correlation coefficient

53. Age pyramid (Finland 2009)

54. Letter frequencies in Shakespeare's Sonnets

55. Shakespeare's Sonnets as a Markov chain

56. Linear regression analysis by orthogonalization

57. The most common words in Shakespeare's Sonnets

58. Discriminant analysis of Iris flower data set

59. Cluster analysis of Iris flower data set

60. F1 connections

61. "Rotated arrowheads"

62. Influence curves for the correlation coefficient

63. Colored texts in bar/pie charts

64. "Hello World!"

65. Virtual keyboard

66. Cycloid

67. Prime factors of numbers m^n-1

68. Some properties of Magic Squares (by KV)

69. Some further properties of Magic Squares (by KV)

70. Solving linear equations

71. Polynomial regression

72. Marking columns by SHADOW SET

73. Hunting quanta

74. Survopoint display mode

75. Four-dimensional cube

76. 'Word' processing by mouse

77. Edges and diagonals of a regular n-sided polygon

78. Examples from a presentation in 1987

79. Thurstone's box problem

80. Finding recursive formula for number of grid lines

81. Testing the correlation coefficient

82. Matrix interpreter (regression analysis)

83. "Origin of Species" 2

84. Simulating multivariate normal distribution

85. Genesis of multivariate normal distribution 1

86. Genesis of multivariate normal distribution 2

87. Monthly temperature and rainfall in Helsinki

88. Early sound experiment on Elliott 803 computer in 1962

89. Tracing a sucro program

90. Finding primes by the Sieve of Eratosthenes

91. Combining Survo operations by sucros

92. Tracing a sucro program (Finding prime numbers)

93. Multiple discriminant analysis in linguistic problems

94. Probability of Matching Column Drums (1/2)

95. Probability of Matching Column Drums (2/2)

96. Distance distributions in networks 1

97. Distance distributions in networks 2

98. Distance distributions in networks 3

99. Battle over Degrees of Freedom

100. Problem of minor chord in music

101. Dissonance functions

102. Random music with Slutzky-Youle effect

103. Synthetic bird song

104. Sounds of statistical data 1

105. Sounds of statistical data 2: "Cuckoos singing in the rain"

106. Resurrection of SURVO 66

107. Cross tabulations with SURVO66

108. Printing a small document

109. Circle estimation

110. Contour ellipses on a graph paper

111. Sampling from a discrete uniform distribution

112. Merits of slow plotting

113. Tuning roots of algebraic equations by "listening"

114. Equation for the sum of chord lengths in a regular polygon

115. Regular polygons: Solving riddle of q coefficients

116. Regular polygons: Testing roots

117. Loan payment calculator

118. Game of Life

119. Digression analysis

120. Digression analysis (Mixed oscillations)

121. Plotting solutions of Diophantine equations X^a+Y^b=cZ

122. Patterns of roots in Diophantine equations X^4+Y^4=cZ

123. Grids of roots in Diophantine equations X^4+Y^4=17*Z

124. Patterns of roots in Diophantine equations X^n+Y^n=cZ

125. Solutions (X,Y) of X^n+Y^n=cZ from minimal setup

126. Symmetries of roots in Diophantine equations X^n+Y^n=cZ

### Background of demos

Whenever the main web page is opened, one randomly selected Survo demo runs on that page
as a GIF animation.

### Demos in YouTube and as MP4 videos

Some demos are also available on YouTube, which offers **better possibilities for navigation**.
A demo can be paused for studying the
current situation more carefully and then continued. It is also easy
to jump forward or backward. These controls are also available
in the MP4 videos of examples from ex92 onwards (they work well at least in
Edge, Firefox, and Internet Explorer).

All these examples are created as pure Survo applications, as sucros
(Survo macros), by letting Survo save all actions of the user
in a sucro file. After possible editing of that file, the session
is repeated automatically by Survo and saved as a Flash file by the
**ScreenFlash**
program. The Flash file is finally saved as an animated
GIF picture by
ScreenFlash. That technique was applied to demos ex1 - ex91.

The latest items (ex92-) in this collection are also created
as sucros but converted to MP4 videos by
BB FlashBack Recorder.

By clicking the animation, background information about the
current topic will be displayed on another web page (this page)
containing short descriptions of all Survo animations.

**You can play any of the examples simply by clicking its sample picture.**

Sucros behind most of the gif animations are available
when using SURVO MM by the command

/LOAD <Survo>\U\EX\INDEX / or by soft buttons DEMO HIGHLIGHTS

This gives a list of these sucros and any of them may be run and
studied more closely.

The only web browser able to control GIF animations decently
seems to be **Mozilla Firefox** with the
SuperStop add-on.
When you select an example from the
list below by clicking its sample picture, it is possible to
pause the demo by pressing the **shift-ESC** key and continue thereafter
by the **F5** key.

In Internet Explorer, hitting ESC may stop the animation and you can examine the current situation more accurately but there are no means to continue and the demo has to be restarted.

Chrome allows no interventions from the user.

These demos are created by using
SURVO MM, the original Windows
version of Survo, now freely available.

Another free alternative is
Survo R (Muste)
available on all common operating systems.

Example #1 (Start the demo by clicking the picture!)

Touch mode is one of the smart calculation modes in Survo.

The function key F3 is the TOUCH key for entering the Touch mode of the Survo editor.

- See: Touch mode and Mathematical operations

Example #2 (Start the demo by clicking the picture!)

Editorial computing provides unique means for simple arithmetic and
for building extensive computation schemes.

In Survo, graphics is produced either in PostScript format (by PLOT commands)
or in EMF (Enhanced Meta File) format (by GPLOT commands).
GPLOT pictures appear automatically in their own windows and several
graphs may appear simultaneously on the screen according to
user-specific layouts.

Survo Graphics windows typically do not overlay the Survo main window. However, in these GIF-animations the graphs are placed on the main window.

- See also: Plotting curves and Graphics in Survo

Example #4 (Start the demo by clicking the picture!)

"Worm mode" is a special feature of
Touch mode that makes it possible to form sequences
in any direction from characters displayed in the window and to move
these sequences in any direction. The display thus created may be
made permanent.

When some computer specialists claimed that Survo is capable of 'simple text editing only', "Worm mode" was created in 1994 to demonstrate that they were wrong :)

- See also "Defining a `worm' in touch mode and moving it" in
Touch mode and

Comments on this sonnet.

Example #5 (Start the demo by clicking the picture!)

This demo in YouTube

MNSIMUL operation in Survo is a general tool for generating samples
from any multivariate normal distribution. The parameters of the
distribution are given by a correlation matrix and a matrix of
means and standard deviations. In this application standardized
variables (with means=0 and standard deviations=1) are created and
this is indicated by an asterisk (*) as the second parameter.
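
The idea of generating a standardized sample from a given correlation can be sketched outside Survo as follows. This is only an illustration in Python, not the MNSIMUL algorithm itself: the function names, the sample size, the seed and the correlation 0.8 are assumptions made for the example, and the two-variable Cholesky factor is written out by hand.

```python
import math
import random


def simulate_bivariate_normal(n, rho, seed=0):
    """Return n pairs (x, y) of standardized normal variates with correlation rho."""
    rng = random.Random(seed)
    # Cholesky factor of [[1, rho], [rho, 1]] is [[1, 0], [rho, sqrt(1 - rho^2)]].
    c = math.sqrt(1.0 - rho * rho)
    sample = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        sample.append((z1, rho * z1 + c * z2))
    return sample


def correlation(sample):
    """Sample correlation coefficient of a list of (x, y) pairs."""
    n = len(sample)
    mx = sum(x for x, _ in sample) / n
    my = sum(y for _, y in sample) / n
    sxy = sum((x - mx) * (y - my) for x, y in sample)
    sxx = sum((x - mx) ** 2 for x, _ in sample)
    syy = sum((y - my) ** 2 for _, y in sample)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical setup: 10000 observations, theoretical correlation 0.8.
sample = simulate_bivariate_normal(10000, rho=0.8)
print(round(correlation(sample), 3))
```

The sample correlation should land close to the requested value; a scatter plot of `sample` would show the elliptical cloud seen in the demo.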

As a technical detail, it is also shown how the graph of the sample is positioned in the display.

Screen graphics (created by GPLOT commands of Survo) are displayed by default in separate windows typically outside the Survo main window.

- See also: MNSIMUL operation, Multivariate analysis, and Scatter diagrams

Example #6 (Start the demo by clicking the picture!)

This demo in YouTube

The editorial approach was originally created for a musical application.

See the Flash demo

About the idea of editorial approach

and

Example of the first version of Survo Editor.

Soon after this experiment it was realized that the same approach
could be used for many more purposes, too.
For example, it was a pleasure to detect how easily a formatted table
of several columns could be sorted according to any column.

Plotting music in 1982 by using the first Survo Editor (Click the picture to see the plotter working)

Example #7 (Start the demo by clicking the picture!)

In Survo, graphics is produced either in PostScript format (by PLOT commands)
or in EMF (Enhanced Meta File) format (by GPLOT commands).
GPLOT pictures appear automatically in their own windows and several
graphs may appear simultaneously on the screen according to
user-specific layouts.

Survo Graphics windows typically do not overlay the Survo main window. However, in these GIF-animations the graphs are placed on the main window.

- See also: Bar charts etc. and Graphics in Survo

Example #8 (Start the demo by clicking the picture!)

This demo in YouTube

This is one of the oldest Survo applications (created in 1976) in this series.
The original graph was produced by using
SURVO 76
on a
Wang 2200
minicomputer.
Plotting that graph on a drum plotter took about one hour.

A closed curve defined by a plotting scheme

HEADER= FRAME=0 MODE=1024 XDIV=0,1,0 YDIV=0,1,0 T=0,2*pi,pi/4000 pi=3.14159265 XSCALE=-3.0,3.0 YSCALE=-1.5,1.5 R=cos(78*T)+cos(80*T) A=R*cos(T) B=R*sin(T) s=0.8 u=0.06 LINETYPE=[color(0.1,0.3,1,0.2)] COLORS=[/BLACK] SLOW=400 GPLOT X(T)=A+s*B+u*sin(5*B),Y(T)=B+u*sin(5*A)

is drawn. The graph consists of a single curved line traversing through
the origin 2x80=160 times. Plotting is here slowed down 400 times (see
the SLOW specification above). In the snapshot above the cycle is not
completed.
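
The parametric definitions in the plotting scheme can be traced numerically with a short Python sketch. It only evaluates the curve points with the same constants (s, u and the step pi/4000) and does no plotting; the names `point`, `curve` and `N_STEPS` are chosen for this illustration. Since R=cos(78*T)+cos(80*T) equals 2*cos(79*T)*cos(T), the curve visits the origin whenever that product vanishes, for example at T=pi/158.

```python
import math

S, U = 0.8, 0.06           # s and u from the plotting scheme
STEP = math.pi / 4000      # step of T from the plotting scheme
N_STEPS = 8000             # number of steps from 0 to 2*pi


def point(t):
    """Point (X(T), Y(T)) of the closed curve for parameter value t."""
    r = math.cos(78 * t) + math.cos(80 * t)
    a = r * math.cos(t)
    b = r * math.sin(t)
    return (a + S * b + U * math.sin(5 * b), b + U * math.sin(5 * a))


curve = [point(k * STEP) for k in range(N_STEPS + 1)]

# The curve is closed: the last point coincides with the first one.
print(curve[0], curve[-1])
# One of the passages through the origin:
print(point(math.pi / 158))
```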

- See also: Plotting curves and Graphics in Survo

Example #9 (Start the demo by clicking the picture!)

This demo in YouTube

The original version of this demonstration was made in 1990
in Finnish and it still belongs to a collection of tutorials made
in the Sucro language of Survo.

In fact, all these GIF animations were originally made as such tutorials by letting the ScreenFlash program 'watch' them and save them as Flash movies. These movies were then converted to GIF animations by the same program.

This example was also present in 1990 in a school version of Survo. The aim here was to point out how diligent people in old times - without computers and calculators - were ready and able to do very demanding numerical computing.

The arithmetical calculations presented here were carried out by a family of arithmetical sucros (<Survo>\OPETUS\AR) made just for this presentation.

A related demo: Prime factors of numbers m^n-1

Example #10 (Start the demo by clicking the picture!)

Several operations for regression analysis are available.
The oldest of them is LINREG, which is applied here to a 'historical'
data set DECA that has belonged to the repertoire since the 1970s.

- See also: Linear regression and Statistical operations

This demo in YouTube

Survo offers several means for illustration of multivariate statistical
data.

One of them is
Chernoff's faces.

The original numerical data was given
here

(almost 20 years before Chernoff invented his faces).

- See also: Displays of multivariate data

Example #12 (Start the demo by clicking the picture!)

This demo in YouTube

This demo illustrates the power of the Central limit theorem of probability and statistics.

It is a combination of two sucros, the first one for selecting one
of the given discrete distributions and the second one for computing
distribution of sums of
independent variates from the selected distribution.

In each stage it is shown graphically how close the standardized sum distribution is to the normal distribution. The gap between the sum and the normal distribution is also given numerically as a deviation corresponding to the standard Kolmogorov-Smirnov test statistic.

Two examples are shown. The first one tells how the binomial distribution tends quickly to the normal distribution. The second example (due to a very heavy tail on the right) has more dramatic features but eventually normalization is its inevitable destiny, too.

- This tutorial with all alternatives is available in SURVO MM as a sucro /DISTRSUM.
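
The convergence can also be checked with a few lines of Python. The sketch below is not the /DISTRSUM sucro: it assumes a fair coin (values 0 and 1) as the starting distribution, builds the distribution of the sum by repeated convolution, and measures a simplified gap, namely the largest difference between the distribution function of the sum and the normal distribution function at the support points. That gap is in the spirit of the Kolmogorov-Smirnov deviation but is not claimed to be the exact figure shown in the demo.

```python
import math


def convolve(p, q):
    """Distribution of X+Y for independent X, Y on 0,1,2,... with probabilities p, q."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out


def normal_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def max_cdf_gap(p):
    """Largest |F(k) - Phi((k - mean)/sd)| over the support points k."""
    mean = sum(k * pk for k, pk in enumerate(p))
    var = sum((k - mean) ** 2 * pk for k, pk in enumerate(p))
    sd = math.sqrt(var)
    gap, cdf = 0.0, 0.0
    for k, pk in enumerate(p):
        cdf += pk
        gap = max(gap, abs(cdf - normal_cdf((k - mean) / sd)))
    return gap


base = [0.5, 0.5]            # assumed example distribution: a fair coin
dist = base
gaps = []                    # gaps[n-1] is the gap for the sum of n variables
for n in range(1, 65):
    gaps.append(max_cdf_gap(dist))
    dist = convolve(dist, base)

print(round(gaps[0], 3), round(gaps[-1], 3))
```

Replacing `base` by a distribution with a heavy right tail makes the gaps shrink much more slowly, which is the contrast the two examples in the demo point out.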

Example #13 (Start the demo by clicking the picture!)

This demo in YouTube

This graph belongs to a series of cover pictures I made by Survo
for the magazine "Dimensio" of
"the Finnish Association of Mathematics and Science Education Research"
in 1990-91. The original graph was published in the 9/91 issue
of the magazine, which also contained a short article where
I described the facts and details related to graphs like this.

The graph here is slightly simplified, due to a limited resolution on the screen but given as a stepwise presentation revealing the complete symmetry finally at the last steps.

The basis of the graph is a Lissajous curve getting a more surprising appearance by "rounding" the function values to integers.

The entire setup in a Survo edit field for making the graph is

GPLOT X(T)=int(M*sin(N*T)+0.5), Y(T)=int(N*cos(M*T)+0.5) M=29 N=19 T=[line_width(4)],0,2*pi,pi/3100 pi=3.14159 HEADER= FRAME=0 XSCALE=-M,M YSCALE=-N,N XDIV=0,1,0 YDIV=0,1,0 MODE=652,381 WSIZE=652,381 WHOME=0,0 WSTYLE=0 SLOW=300 Slowing the speed by drawing each line segment 300 times
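
The effect of the "rounding" can be examined with a small Python sketch of the same two coordinate functions. It assumes that Survo's int() truncates toward zero in the same way as Python's int(); the names `rounded_point` and `path` are made up for this illustration, and no plotting is done.

```python
import math

M, N = 29, 19
STEP = math.pi / 3100        # step of T from the setup above
N_STEPS = 6200               # number of steps from 0 to 2*pi


def rounded_point(t):
    """Lattice point (X(T), Y(T)); int() is assumed to truncate toward zero."""
    return (int(M * math.sin(N * t) + 0.5), int(N * math.cos(M * t) + 0.5))


path = [rounded_point(k * STEP) for k in range(N_STEPS + 1)]
distinct = set(path)

# Every point lies on the integer lattice inside the plotting area.
print(len(path), len(distinct))
```

Because all coordinates are integers, many consecutive parameter values map to the same lattice point, and the line segments joining successive points form the "knitted" pattern.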

The corresponding Lissajous curve without "rounding" by int():

Example #14 (Start the demo by clicking the picture!)

- See also Plotting histograms

There are many options in Survo for factor analysis and related topics.

This is a straightforward example of the classical approach.

- See also: Factor analysis

The final output (with additional comments on lines 31-39) of the
COMPARE program of Survo is displayed above. This example was created
in 1986.

- See also: Compare operation, Permutation tests and Resampling

Computers are now so fast that P values for exact statistical tests
are obtained in reasonable time and with reasonable accuracy by simple simulation.
This approach has been used in Survo since 1986.

- See also: TABTEST operation and Fisher's exact test
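
The simulation idea can be sketched in Python for a 2x2 table. This is not the TABTEST implementation: the function names, the one-sided alternative, the number of replicates and the example table are assumptions made for the illustration. The exact hypergeometric P value is computed alongside so that the two can be compared.

```python
import math
import random


def exact_p_upper(a, b, c, d):
    """Exact one-sided P(A >= a) with all margins fixed (hypergeometric)."""
    r1, c1, n = a + b, a + c, a + b + c + d
    top = min(r1, c1)
    total = sum(math.comb(r1, k) * math.comb(n - r1, c1 - k)
                for k in range(a, top + 1))
    return total / math.comb(n, c1)


def simulated_p_upper(a, b, c, d, reps=20000, seed=1):
    """Monte Carlo estimate of the same P value by random re-allocation."""
    rng = random.Random(seed)
    r1, c1, n = a + b, a + c, a + b + c + d
    labels = [1] * r1 + [0] * (n - r1)    # 1 = unit belongs to row 1
    hits = 0
    for _ in range(reps):
        drawn = rng.sample(labels, c1)    # units falling into column 1
        if sum(drawn) >= a:
            hits += 1
    return hits / reps


# Hypothetical table [[8, 2], [1, 5]]
p_exact = exact_p_upper(8, 2, 1, 5)
p_sim = simulated_p_upper(8, 2, 1, 5)
print(round(p_exact, 4), round(p_sim, 4))
```

For larger tables the exact enumeration grows quickly, while the cost of the simulation depends mainly on the number of replicates.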

- See also: Numerical conversions

This demo is created by Kimmo Vehkalahti.

It is a good example of co-operation between Editorial computing and Survo data file operations.

- See also: Computus

Two ways for creating Pascal's triangle are presented.
In the first one, matrix commands and the library function C(n,m)
giving the binomial coefficients are used.
The second way is based entirely on efficient utilization of Touch mode.
The details of this construction are given in the
User Guide
(1992) on page 73.

- See also: Touch mode and Pascal's triangle
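
The two constructions have simple counterparts in Python, given here only as an analogy: the first uses binomial coefficients directly (`math.comb` plays the role of C(n,m)), the second builds each row by adding neighbouring entries of the previous row, which is roughly what the Touch mode construction does cell by cell. The function names are illustrative.

```python
import math


def pascal_by_binomials(rows):
    """Rows 0..rows-1 of Pascal's triangle from binomial coefficients."""
    return [[math.comb(n, m) for m in range(n + 1)] for n in range(rows)]


def pascal_by_addition(rows):
    """The same rows (rows >= 1) built by adding neighbours of the previous row."""
    triangle = [[1]]
    for _ in range(rows - 1):
        prev = triangle[-1]
        inner = [prev[i] + prev[i + 1] for i in range(len(prev) - 1)]
        triangle.append([1] + inner + [1])
    return triangle


for row in pascal_by_addition(6):
    print(" ".join(str(v) for v in row).center(20))
```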

This demo in YouTube

The formulas behind the computational setup

L(N)|=if(N<2)then(0)else(2*L1(N)-L(N-1)+R1(N)) L1(N)|=if(N<3)then(1)else(2*L(N-1)-L1(N-1)+R2(N)) R1(N):=4*S(N) S(N):=for(I=2)to(N)sum(totient(I-1)-e(I)) e(N):=if(mod(N,2)=0)then(0)else(totient((N-1)/2)) R2(N):=if(mod(N,2)=0)then((N-1)*totient(N-1))else(R21(N)) R21(N):=if(mod(N,4)=1)then((N-1)*totient(N-1)/2)else(0)

were found experimentally by using Survo as described in
my document (pages 11-15) and
shown step by step in
YouTube
and also as a flash demo.

More information about this topic in
Another formula in Sloane's Encyclopedia of Integer Sequences has been presented earlier but it is much slower in computations.

Before finding the fast recursive formulas, I could make a conjecture
that an accurate asymptotic expression for L(n) is

L(n)=[3/(2*pi)*n^2]^2+O(n^2.5)

based on calculations for values n<=15000 by the slow formula as told in my document on pages 8-9.

Later (by using the fast formulas) I have computed the L(N) values for all N values to 10^11 by Mathematica code (on page 27) controlled directly from Survo. The graph indicates that the accuracy of the asymptotic expression really seems to be of order O(n^2.5) and this conjecture has been validated in a paper by Ernvall-Hytönen, Matomäki, Haukkanen, and Merikoski provided that the Riemann hypothesis is true. In the same paper also my other empirical findings have been proved.
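
The recursive formulas listed above translate almost directly into Python. The sketch below assumes that L(N) counts the distinct straight lines passing through at least two points of an N x N grid (my reading of the demo title), and it checks the recursion against a brute-force count for small grids. The totient function is computed naively and the function names are illustrative, so this is suitable for small N only.

```python
from math import gcd


def totient(n):
    """Euler's totient function (naive version)."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)


def e(n):
    return 0 if n % 2 == 0 else totient((n - 1) // 2)


def R2(n):
    if n % 2 == 0:
        return (n - 1) * totient(n - 1)
    return (n - 1) * totient(n - 1) // 2 if n % 4 == 1 else 0


def grid_lines(n_max):
    """Return [L(1), ..., L(n_max)] by the recursion for L and L1."""
    values = [0]                       # L(1) = 0
    l_prev, l1_prev, s = 0, 1, 0       # L(1), L1(1), S(1)
    for n in range(2, n_max + 1):
        s += totient(n - 1) - e(n)     # S(n)
        l1 = 1 if n < 3 else 2 * l_prev - l1_prev + R2(n)
        l = 2 * l1 - l_prev + 4 * s    # R1(n) = 4*S(n)
        values.append(l)
        l_prev, l1_prev = l, l1
    return values


def brute_force(n):
    """Count distinct lines through at least two points of an n x n grid."""
    pts = [(x, y) for x in range(n) for y in range(n)]
    lines = set()
    for i, (x1, y1) in enumerate(pts):
        for x2, y2 in pts[i + 1:]:
            a, b = y2 - y1, x1 - x2    # line a*x + b*y = c
            g = gcd(a, b)
            a, b = a // g, b // g
            if a < 0 or (a == 0 and b < 0):
                a, b = -a, -b
            lines.add((a, b, a * x1 + b * y1))
    return len(lines)


print(grid_lines(6))
print([brute_force(n) for n in range(1, 7)])
```

The brute-force count needs on the order of N^4 point pairs, which is why a fast recursion matters for large N.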

Finding recursive formula for number of grid lines

Two routines for dealing with binary numbers were needed.
When using Survo, the quickest way to create such auxiliary tools
is to make them by using Survo's own macro language as
sucros.

Here are listings of those 'ad hoc' sucros (readily available for
all Survo users):

*TUTSAVE BIN-CONV / /BIN-CONV number,n * converts a positive decimal number <1 into binary form with n bits. / def Wx=W1 Wacc=W2 Wn=W3 Wint=W4 / *{init}{tempo -1}{Wn=0}{Wint=0.}{R}{erase} + A: {Wx=2*Wx}{ref}{line end}{print Wint}{R} *{erase}int({print Wx})={act}{l} {save word Wint}{Wx=Wx-Wint} *{Wn=Wn+1} - if Wn < Wacc then goto A *{line start}{erase} + E: {tempo +1}{end} * * *TUTSAVE BIN-SUB / /BIN-SUB makes the difference of two binary numbers, / either integers or fractions in (0,1) according to following setup: / / .. .... .... (borrowed bits appearing during calculation) / 1001110000110010 A / /BIN-SUB 0010010010100111 B (activate at the last bit) / 0111011110001011 A-B / *{tempo -1} + A: {ref set 1}{save char W1} - if W1 '=' {sp} then goto E - if W1 '=' . then goto B - if W1 '=' 0 then goto C / W1=1 *{u}{save char W2} - if W2 '=' 1 then goto D1 *{u}{save char W2}{d} - if W2 '=' . then goto D2 / + F: {l}{u}.{l}{d}{save char W2} - if W2 '=' 0 then goto F *{ref jump 1}{W3=1}{goto S} + D1: {u}{save char W2}{d} - if W2 '=' . then goto D3 *{d}{W3=0}{goto S} + D3: {goto F} + D2: {d}{W3=0}{goto S} / + C: {u}{save char W2} - if W2 = 1 then goto C1 *{u}{save char W2}{d} - if W2 '=' . then goto C2 *{d}{W3=0}{goto S} + C2: {d2}1{ref jump 1}{l}{goto A} + C1: {u}{save char W2}{d} - if W2 '=' . then goto C3 *{d}{W3=1}{goto S} + C3: {d}{W3=0}{goto S} + B: {W3=.} + S: {d}{print W3}{ref jump 1}{l}{goto A} + E: {tempo +1}{end}
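
The digit-by-digit doubling done by /BIN-CONV can be mirrored in Python. This is a sketch of the same idea, not a translation of the sucro: exact rational arithmetic (`fractions.Fraction`) is used so that the bits are those of the decimal number itself, and the name `bin_conv` is chosen only to echo the sucro.

```python
from fractions import Fraction


def bin_conv(x, n):
    """First n binary digits of a number 0 < x < 1 (digit-by-digit doubling)."""
    x = Fraction(x)
    bits = []
    for _ in range(n):
        x *= 2
        bit = int(x)          # the integer part is the next bit
        bits.append(str(bit))
        x -= bit
    return "".join(bits)


# None of 0.1, 0.2, 0.3 has a finite binary expansion:
for text in ("0.1", "0.2", "0.3"):
    print(text, "=", "0." + bin_conv(Fraction(text), 24) + "...")

# With 64-bit floating point numbers the residual is therefore not zero:
residual = 0.3 - 0.2 - 0.1
print(residual)
```

Comparing `Fraction(0.3)` (the value actually stored) with `Fraction(3, 10)` shows the rounding that has already taken place before any subtraction is done.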

- See also: Floating point numbers

This demo in YouTube

Relations between 12 variables are visualized by connecting any
pair by a line if the correlation is strong enough. Positive
correlations are indicated by a red line, negative by a blue line.
The line thickness reflects the size of the correlation coefficient.

Graphs of this kind cannot be made fully automatically since there are so many options. However, a ready-made template corresponding to this example exists for Survo users. It is easy to modify this template for correlation matrices with up to at least 30 variables and get a good general view of the relations at hand.

To gain enough accuracy, this arrow or vector diagram is drawn as a PostScript picture. The final picture is obtained by combining two graphs using the EPS JOIN command of Survo for PostScript files generated by Survo PLOT commands. The final display here is dramatically slowed down by a SLOW=3000 specification when making the arrow diagram.

- See also: Arrow diagrams, PostScript in Survo and Plotting multivariate data

This demo in YouTube

I made this graph for the first time using Survo
in 1976 on a drum plotter connected to a
Wang 2200
minicomputer.
It was plotted in separate parts so that its size was over one
square meter.

**The graph illustrates how little information is needed for
creating various forms** starting from a simple circle (ovum) at the center.
The plotting scheme

XDIV=0,1,0 YDIV=0,1,0 SIZE=1180,1180 HEADER= FRAME=3 HOME=300,500 A=-8,10,1 B=-8,10,1 T=0,2*pi,pi/30 pi=3.14159265 XSCALE=-9,11 YSCALE=-9,11 DEVICE=PS,SPECIES.PS PLOT X(T)=A+0.225*SIN(T)+0.139*SIN(A*T)+0.086*SIN(B*T), Y(T)=B+0.225*COS(T)+0.139*COS(A*T)+0.086*COS(B*T) /GS-PDF SPECIES.PS

reveals that the entire graph is created by activating a **single PLOT command** making
a family of curves depending on two parameters A and B, both varying
from -8 to 10 by step 1 and simultaneously affecting the location
of each partial graph and its form.

The graph is created as a PostScript file
SPECIES.PS and converted into the PDF format.
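
The family of curves can be generated point by point with a short Python sketch that follows the same X(T) and Y(T) expressions and the step pi/30. It does no plotting, and the names `species_curve` and `family` are made up for this illustration. For A=B=1 all three sine (and cosine) terms share the same angle, so that member of the family is a plain circle of radius 0.225+0.139+0.086=0.45 centered at (1,1).

```python
import math


def species_curve(a, b, steps=60):
    """Points of one curve of the family for parameter values a and b."""
    pts = []
    for k in range(steps + 1):
        t = k * math.pi / 30
        x = a + 0.225 * math.sin(t) + 0.139 * math.sin(a * t) + 0.086 * math.sin(b * t)
        y = b + 0.225 * math.cos(t) + 0.139 * math.cos(a * t) + 0.086 * math.cos(b * t)
        pts.append((x, y))
    return pts


# A and B both run from -8 to 10 by step 1.
family = {(a, b): species_curve(a, b)
          for a in range(-8, 11) for b in range(-8, 11)}

print(len(family))
radii = [math.hypot(x - 1, y - 1) for x, y in family[(1, 1)]]
print(min(radii), max(radii))
```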

A more accurate version (tenfold size and step length pi/300) is made as follows:

*XDIV=0,1,0 YDIV=0,1,0 SIZE=11800,11800 HEADER= FRAME=3 HOME=0,0 *A=-8,10,1 B=-8,10,1 T=[line_width(0.96)],0,2*pi,pi/300 *pi=3.141592653589793 *XSCALE=-9,11 YSCALE=-9,11 DEVICE=PS,SPECIES10.PS * *PLOT X(T)=A+0.225*SIN(T)+0.139*SIN(A*T)+0.086*SIN(B*T), * Y(T)=B+0.225*COS(T)+0.139*COS(A*T)+0.086*COS(B*T) * *..................................................................... *PRINT CUR+1,E TO K.PS / Reduction to original size % 1240 - [left_margin(1)] - picture species10.ps,*,*,0.1,0.1 E

- See also:

Another demo about this topic

Families of curves,

ESTIMATE was the first statistical program I created for the new version
of Survo (SURVO 84C in 1985) written in the C language.
It is still the general tool in Survo for nonlinear regression analysis
and maximum likelihood estimation, for example.
In fact, simultaneously I also wrote a C program (currently DER) for
computing symbolic derivatives of real functions since they are valuable
when forming the gradient and searching for the optimum of the object
function in nonlinear regression, etc.

Another demo about ESTIMATE

I had done the same things one year before for the Wang PC by using interpreted Basic. When I heard some programming experts in Finland say that "Basic spoils your brain!" :) I wanted to test my brain, when getting a chance to start learning C, by selecting these rather demanding targets as my first examples in C programming.

Already then I usually wrote all my programs at the computer without pen and paper, but all this happened during my summer vacation in 1985 in Central Finland, where I had no access to any computer. So I wrote these programs by hand and got the first chance to test them only after returning home in August, when I started using my brand new IBM PC (AT model) and the new Microsoft C compiler.

This tiny example tells how ESTIMATE is used in calculating parameters and related statistics of a nonlinear regression model. The predecessor of ESTIMATE (on Wang PC in 1984) was probably one of the first statistical programs able to evaluate symbolic derivatives automatically and see (by studying derivatives of the second degree of the model function) whether they all are zero or not and thus determine if the model is linear with respect to parameters to be estimated or not. Then the program could decide what kind of numerical algorithm to select.

The first model in this example was

MODEL CUP1 / Exponential decay T=T0+a*exp(-b*t)

ESTIMATE is able to distinguish which parameters are to be estimated (a,b) since it detects that T and t are variables in the data set CUP.
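
How analytic derivatives enter such an estimation can be sketched in Python for this particular model. The code is not the ESTIMATE algorithm: it assumes that T0 is a known constant, takes starting values from a regression of log(T-T0) on t, and then applies plain Gauss-Newton steps using the partial derivatives dT/da=exp(-b*t) and dT/db=-a*t*exp(-b*t). The data are synthetic and noiseless, and all names and numeric values are made up for the illustration.

```python
import math


def fit_decay(ts, temps, t0, iterations=20):
    """Least-squares estimates (a, b) for T = t0 + a*exp(-b*t) with known t0."""
    # Starting values from the linear regression of log(T - t0) on t.
    ys = [math.log(temp - t0) for temp in temps]
    n = len(ts)
    mt, my = sum(ts) / n, sum(ys) / n
    slope = (sum((t - mt) * (y - my) for t, y in zip(ts, ys))
             / sum((t - mt) ** 2 for t in ts))
    a, b = math.exp(my - slope * mt), -slope

    for _ in range(iterations):
        # Normal equations of one Gauss-Newton step.
        saa = sab = sbb = ga = gb = 0.0
        for t, temp in zip(ts, temps):
            ex = math.exp(-b * t)
            da, db = ex, -a * t * ex          # analytic partial derivatives
            res = temp - (t0 + a * ex)        # current residual
            saa += da * da
            sab += da * db
            sbb += db * db
            ga += da * res
            gb += db * res
        det = saa * sbb - sab * sab
        a += (sbb * ga - sab * gb) / det
        b += (saa * gb - sab * ga) / det
    return a, b


# Hypothetical cooling data: room temperature 20, a = 70, b = 0.05.
T0, A_TRUE, B_TRUE = 20.0, 70.0, 0.05
ts = [5.0 * k for k in range(13)]
temps = [T0 + A_TRUE * math.exp(-B_TRUE * t) for t in ts]
a_hat, b_hat = fit_decay(ts, temps, T0)
print(a_hat, b_hat)
```

The second derivative of the model with respect to b is not zero, so the model is nonlinear in its parameters and an iterative method of this kind is needed.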

I created the DER program together with the
ESTIMATE
program for the new version of Survo (SURVO 84C in 1985) written in the
C language. ESTIMATE uses the DER code for creating symbolic derivatives
of the object function and converts them into reverse
Polish notation.

I made the original Finnish version of this demo in 1990 in connection with a limited SURVOS version intended for use in Finnish schools.

- See also: DER command

This plot is a collection of 51x36x10=18360 chords inside 36x10=360 ellipses
created according to the plotting scheme

SIZE=681,381 XDIV=0,1,0 YDIV=0,1,0 MODE=681,381 SCALE=0,7 FRAME=0 SLOW=100 t=0,50,1 n=0,35,1 r=-1.0,-0.1,0.1 GPLOT X(t)=int(n/6)+1+r*cos((-7*r+n)*t), Y(t)=n+1-6*int(n/6)+r*sin((-7*r+n)*t) COLORS=[/BLACK] COLOR_CHANGE=n-10*r,16

The COLOR_CHANGE specification takes care of selecting one of 16 colors according to the value mod(n-10*r,16). SLOW=100 makes the output 100 times slower than normal.

- See also: COLOR_CHANGE and Families of curves

This demo in YouTube

This example is related to the problem of selecting variables (for
example, for multiple regression analysis). However, here the selection
is not based on any external information (like on the regressand) but
it must be done solely by internal criteria.

The example is an abridged version of the chapter 'Column space'
in my paper Matrix computations in Survo (1999).

I encountered this problem when making the first computer program
in 1962 for Cosine rotation in factor analysis.
This rotation technique was devised and applied as a hand calculation
and graphical procedure by Yrjö Ahmavaara and Touko Markkanen in the
1950s.

As far as I know, before 1962 no analytical approach to the problem of
selecting the 'factor variables' had been presented.
In the cosine rotation program the target is to select the factor
variables as the maximally orthogonal subset of variables by a determinant
criterion. That principle is demonstrated in this example.
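
A determinant criterion of this kind can be illustrated in Python. The sketch assumes one possible reading of the principle: among all subsets of k variables, pick the one whose correlation submatrix has the largest determinant (the determinant equals 1 for mutually orthogonal variables and approaches 0 under linear dependence). The exhaustive search, the function names and the small correlation matrix are assumptions for the illustration and are not taken from the cosine rotation program.

```python
from itertools import combinations


def determinant(m):
    """Determinant by Gaussian elimination with partial pivoting."""
    m = [row[:] for row in m]
    n = len(m)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            return 0.0
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det
        det *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return det


def most_orthogonal_subset(corr, k):
    """Indices of the k variables whose correlation submatrix has the largest determinant."""
    def sub_det(idx):
        return determinant([[corr[i][j] for j in idx] for i in idx])
    return max(combinations(range(len(corr)), k), key=sub_det)


# Hypothetical correlations: variables 0 and 1 are nearly collinear.
corr = [[1.00, 0.95, 0.10],
        [0.95, 1.00, 0.20],
        [0.10, 0.20, 1.00]]
print(most_orthogonal_subset(corr, 2))
```

An exhaustive search is feasible only for a small number of variables; a stepwise selection would be the natural alternative for larger matrices.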

- See also: Matrix operations in Survo

This demo in YouTube

This graph of a time series was created by the Survo plotting scheme:

YLABEL=[Arial(25)],Yearly_mean_temperature_in_Helsinki_(1829-2009) GPLOT HEL_MEAN,Year,Temp / SIZE=652,381 YDIV=50,291,40 XSCALE=1829(20)2009 YSCALE=1.5(0.5)8 TICK=5,1 TICK2=5,1 LINE=[WHITE],1 TREND=[BLACK],0 PEN=[BLACK] FILL=[RED],1,1,181,Trend,1 FILL-=[BLUE] XDIV=50,539,50 HEADER= WHOME=0,0 WSIZE=652-5,381-25

- See also: Graphs of time series, FILL specification, Combining observations and Linear regression analysis

This demo in YouTube

Here is the entire setup in the edit field for making this experiment:

FILE CREATE SIMUDATA,4,1,64,7,10000 Sample (N=10000) from a mixture of two normal distributions FIELDS: 1 N 4 X END VAR X TO SIMUDATA X=if(rnd(1)<0.7)then(X1)else(X2) X1=probit(rnd(1)) X2=0.5*probit(rnd(1))+2 ....................................................................... DENSITY MIXNORM(p,m1,s1,m2,s2) y(x)=c*(p/s1*exp(-0.5*((x-m1)/s1)^2)+(1-p)/s2*exp(-0.5*((x-m2)/s2)^2)) c=0.39894226 GHISTO SIMUDATA,X,22 X=-10(0.2)10 XSCALE=-10(2)10 YSCALE=0(100)600 FIT=MIXNORM INIT=0.5,0.5,1.5,2.5,0.7 HISTO: Estimated parameters of MIXNORM: p=0.7044 (0.0123) m1=0.0088 (0.0290) s1=1.0136 (0.0185) m2=2.0186 (0.0200) s2=0.5200 (0.0139) ...

Since in this demo a ready-made example given in
User Guide
(p.160) was employed, the graph it generated in a separate window had to
be dragged manually upon the main window.

- See also: Making histograms and User-defined distributions

Fence lines make it possible to create adaptive setups in the
Survo edit field so that the results of commands do not disturb
other contents.

This technique is available in SURVO MM versions 3.16+.

This example is taken from the book "The Man Who Knew Infinity, A Life
of the Genius Ramanujan" (1991) by Robert Kanigel.

The formulas are typeset by the PRINT operation of Survo as PostScript files, then converted to bitmap files by the ImageMagick program, then to EMF format by the Photoline program, and finally displayed on the main window of Survo by a GPLOT FILE command of Survo.

This demo in YouTube

The final scatter diagram is compiled by overlaying two plots.
In the first one each observation is represented by a dot and in the
second one by a year label. The first one is saved as an
EMF
file A.EMF
by a specification OUTFILE=A. The second plot then overlays it by
a specification INFILE=A.

Although the labels in the middle of the graph overlap heavily, the exceptional and thus the most interesting years can be clearly detected. The first versions of this graph were made in the late 1970s by using SURVO 76.

- See also: Plotting scatter diagrams, Temperature in Helsinki and Finland's Climate

It is a great pity that classical Euclidean plane geometry plays a minor
role in the curriculum of mathematics in high schools, for example.
Especially constructions with compass and straight edge could be used
to reinforce visual perception.

In Survo a special program called by a GEOM command is available for making such constructions in conjunction with some other Survo functions.

Here a construction for approximate circle squaring is presented.

It is based on a random search in a square grid as explained in my paper

Statistical accuracy of geometric constructions (2008)
on pages 35-37.

This construction is described in an edit field as follows:

*/GEOM
*GEOM CUR+1,E
*CL4
*O=point(2,2)
*A=point(2,0)
*_C1=circle(O,2)
*LX4=line(A,O)
*B=cross_cl(C1,LX4,2,4)
*c2=circle(B,*2)
*LY4=perpendicular(LX4,*B)
*C=cross_cl(C2,LY4,4,4)
*D=cross_cl(C2,LY4,0,4)
*LY2=perpendicular(LX4,O)
*LX2=perpendicular(LY4,*D)
*F=cross(LX2,LY2)
*G=midpoint(B,O,LX4)
*H=midpoint(D,F,LX2)
*C3=circle_p(O,G)
*J=cross_cl(C3,LY2,3,2)
*eag=edge(A,G)
*C4=circle(J,EAG)
*L=line(H,C)
*E=cross_cl(C4,L,0,3)
*Edge=edge(A,E)
*save edge(Edge)
E

GEOM is typically called by a sucro /GEOM which creates suitable Survo data files for various geometric objects appearing in the construction. Thus /GEOM also calls GEOM for making the construction so that points are saved in _POINTS.SVO, lines in _LINES.SVO, circles in _CIRCLES.SVO, and edges in _EDGES.SVO.

The construction can then be displayed by using various forms of the Survo operation PLOT. A ready-made template as a SURVO edit field is available so that the entire construction is saved as a PostScript file.

Everyone who has experience of making geometric constructions in practice knows how much attention must be paid to a careful placement of the compass and the straightedge in each step of the construction in order to achieve as accurate results as possible.

In my paper, the accuracy of these placements is described by a simple statistical model and the accuracy of the entire construction is estimated on this basis. Then it is natural to consider the accuracy of the construction as a measure of its complexity. This measure is expected to give better possibilities for comparing complexities of constructions than the characteristics of Lemoine's geometrography. My approach is mainly computational. Although the error distribution of placements is defined precisely, the error distributions related to entire constructions are so complicated that the only way is to use Monte Carlo simulation for estimating essential statistics.

When considering the accuracy of this approximate circle squaring construction, the nominal accuracy (pi-3.14152=0.00007) is not a sufficient measure since it can be attained only when there are no errors in construction steps.

For example, the relative root mean squared error (defined on page 24 and computed on page 37 of my paper) of this construction is 2.125 while, for example, that of Kochanski when extended to approximate construction of sqrt(pi)r (the side of the square) is 2.908, although the nominal accuracy of the latter is 0.00006 and thus slightly better.

This demo in YouTube

By detecting that (by rotation of 45 degrees)
the symmetric random walk in the plane can be seen
as a combination of two independent and simultaneous one-dimensional
random walks, it was easier to study asymptotic properties.

The background of this presentation is William Feller's
*An Introduction to Probability Theory and Its Applications, Vol.I*
(Second Edition 1957) Ch. XIV.7 "Random Walks in the plane and space".

As a student of mathematics and statistics I wrote an essay about this topic in 1959 after inventing a simpler formula for the transition probability (of moving from the origin to the point (x,y) in n steps) compared to that given as a double integral expression by Feller.

I sent a letter about my findings to Feller and immediately received a friendly answer in which he promised to use my result "if any" in the forthcoming edition of his book. However, to my disappointment, nothing had been changed in this respect in the subsequent editions.

A context-sensitive autotext feature has been available in the Survo Editor
since 1989 by means of the key combination F2 J.

F1 J is an extended alternative for F2 J for completing phrases found elsewhere in the current edit field. As in this example, a list containing 'all possible phrases' may be loaded to the end of the edit field. This list then serves as a source of information during the writing process by giving synonyms, technical terms, etc.

It is easy to create such lists on different topics by pasting them from websites, for example. The list used in this example originates from Birds of Sweden.

This demo in YouTube

This feature has been available already in SURVO 76.

The traditional graphical rotation is described e.g. in

Ledyard Tucker and Robert MacCallum:
Exploratory Factor Analysis,
Chapter 10.

The principles of Cosine rotation and Transformation analysis were
introduced in Yrjö Ahmavaara's dissertation

Transformation Analysis of Factorial Data, Helsinki Ann.
Acad. Sci. Fenn., B 88, 2, 1954.

The current algorithm for Cosine rotation was created in 1961 and described in Matrix computations in Survo.

This demo in YouTube

A comprehensive documentation is given in my paper

Visualization and characterization of Pythagorean triples.

Interactive 'graphical' identifying of Pythagorean triples is available
as a sucro /P_TRIPLE.

It is surprising that a biased coin (even without knowing its
probabilities p for HEADS and 1-p for TAILS) may be used like a fair
coin by observing coin-flips in pairs. The expected number of flips of
the biased coin for extracting one unbiased coin-flip is 1/(p*(1-p))
which gets its minimum value 4 for p=1/2, i.e. when the coin actually
is fair. Thus typically four or more flips are needed if we do not
believe that the coin is fair.
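
The pairing idea can be sketched in Python as follows. This is a minimal illustration, not the Survo setup of the demo: it assumes the usual rule (often attributed to von Neumann) that a pair of two different flips yields its first flip as the fair outcome while pairs of equal flips are discarded, and the function names are hypothetical.

```python
import random


def fair_bits(flips):
    """Extract unbiased bits from biased flips read in non-overlapping pairs.

    Assumed rule: a pair (a, b) with a != b yields a; equal pairs are dropped.
    """
    bits = []
    for first, second in zip(flips[0::2], flips[1::2]):
        if first != second:
            bits.append(first)
    return bits


def expected_flips_per_bit(p):
    """Expected number of biased flips needed for one fair bit: 1/(p*(1-p))."""
    return 1.0 / (p * (1.0 - p))


# Illustrative data only: 1200 flips where 0 has probability 1/3 and 1 has 2/3.
rng = random.Random(20106)
flips = [0 if rng.random() < 1 / 3 else 1 for _ in range(1200)]
bits = fair_bits(flips)
print(len(bits), "fair bits extracted; share of ones:", sum(bits) / len(bits))
print("expected flips per fair bit for p=1/3:", expected_flips_per_bit(1 / 3))
```

For p=1/3 the formula gives 4.5 flips per fair bit, so about 267 fair bits can be expected from 1200 flips.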

The original data of 1200 flips with a biased coin (p=1/3) was generated by Survo as follows:

FILE CREATE COIN,1,1
FIELDS:
1 N 1 X
END
FILE INIT COIN,1200
p=1/3
MATRIX P ///
0 p
1 1-p
MAT SAVE P
RND=URAND(20106)
TRANSFORM COIN BY #DISTR(P)
MAT SAVE DATA COIN AS COIN2
MAT COIN3=VEC(COIN2,20) / *COIN3~VEC(COIN2) 20*60
MAT LOAD COIN3,##,CUR+1
MATRIX COIN3
VEC(COIN2)
///  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 ...
1    1  0  0  1  1  1  1  1  0  1  1  1  0  0  1  1  1 ...
2    0  1  1  1  0  1  0  1  1  1  1  1  0  1  1  1  0 ...
3    1  1  1  1  0  1  1  1  1  0  1  1  1  1  1  1  1 ...
4    1  0  1  1  1  1  1  0  1  1  1  0  1  0  1  1  1 ...
5    1  1  1  1  1  1  1  1  0  1  1  1  1  0  1  1  1 ...
.    .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  .  . ...

- See also: Fair coin (in Wikipedia)

This approach is based on the following facts:

An integer can be decomposed into prime factors in only one way.

There is strong number-theoretic evidence for the fact that for (large) integers the number of prime factors is even or odd with equal probabilities. This is also intuitively obvious.

To give credence to the fact that the number of prime factors is even or odd with probability 1/2, I took a random sample of 10 million integers with 16 digits by the following Mathematica code

a=10^15
SeedRandom[1];
t1=TimeUsed[];
tab:=Table[PrimeOmega[RandomInteger[{a,a+8999999999999999}]],{n,1,10^7}];
Export["Sample.txt",tab,"Table"]
TimeUsed[]-t1

and converted the output file Sample.txt of 10^7 Omega values into
a Survo data file. By the STAT program of Survo the following
frequency distribution was obtained:

Omega        f     %   *=65536 obs.
    1   277575   2.8   ****
    2  1053208  10.5   ****************
    3  1860960  18.6   ****************************
    4  2105565  21.1   ********************************
    5  1783889  17.8   ***************************
    6  1245149  12.5   ******************
    7   765283   7.7   ***********
    8   433642   4.3   ******
    9   232340   2.3   ***
   10   120919   1.2   *
   11    60947   0.6   :
   12    30650   0.3   :
   13    15225   0.2   :
   14     7482   0.1   :
   15     3655   0.0   :
   16     1814   0.0   :
   17      876   0.0   :
   18      408   0.0   :
   19      217   0.0   :
   20      102   0.0   :
   21       49   0.0   :
   22       22   0.0   :
   23       15   0.0   :
   24        3   0.0   :
   25        2   0.0   :
   27        1   0.0   :
   28        1   0.0   :
   29        1   0.0   :

The relative frequency of odd Omega values was then 0.5001034
(and 0.4998966 for even values).
Since the standard error of these estimates is 0.00016, the
deviation from 0.5 is less than this standard error.

Thus "Omega coin tossing" works like a fair coin.

In number theory the function lambda(n)=(-1)^Omega(n), which takes the values -1 and +1 "randomly", is known as the Liouville function, and it corresponds completely to "Omega coin tossing". The sums of lambda(n) have been studied experimentally in Sign changes in sums of the Liouville function by Borwein, Ferguson, and Mossinghoff (2010).
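
As a small illustration, Omega(n), the Liouville function and the standard error quoted above can be computed in Python. This is only a sketch under stated assumptions: plain trial division is used, so it suits small integers and not the 16-digit sample handled with Mathematica, and mapping odd Omega to HEADS is an arbitrary choice made here.

```python
from math import sqrt


def big_omega(n):
    """Number of prime factors of n counted with multiplicity (trial division)."""
    count = 0
    d = 2
    while d * d <= n:
        while n % d == 0:
            n //= d
            count += 1
        d += 1
    if n > 1:  # whatever remains is a single prime factor
        count += 1
    return count


def liouville(n):
    """lambda(n) = (-1)^Omega(n)."""
    return -1 if big_omega(n) % 2 else 1


def omega_coin(n):
    """Assumed convention: odd Omega gives 'H', even Omega gives 'T'."""
    return "H" if big_omega(n) % 2 else "T"


# Standard error of a proportion near 0.5 estimated from 10 million observations.
standard_error = sqrt(0.25 / 10**7)

print([omega_coin(n) for n in range(2, 21)])
print(round(standard_error, 5))
```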

Musicians (who often claim that they know nothing about mathematics)
are clever in recognizing intervals and chords even when the sound is
not pure. Thus when hearing an interval x as a major third they
unconsciously find that 5/4 is the best approximation for x.

The VAR and PLAY commands of Survo give an opportunity to create and play sound files (in WAV format). For example, the neutral third and pure intervals close to it can be listened to as follows:

s(X):=sin(X*ORDER)
f1=11/9 1.2222222222222... Interval_11_9
f2=sqrt(5/4*6/5) 1.2247448713916... Neutral_third
f3=5/4 1.25 Major_third
f=0.2 'basic frequency'
FILE MAKE Test,1,24000,X,2 / creates data file Test.
VAR X=10000*(s(f)+s(f*f1)) TO Test / computes the wave form
PLAY DATA Test,X / WAV=Interval_11_9 / converts the wave form into WAV
VAR X=10000*(s(f)+s(f*f2)) TO Test
PLAY DATA Test,X / WAV=Neutral_third
VAR X=10000*(s(f)+s(f3*f)) TO Test
PLAY DATA Test,X / WAV=Major_third
PLAY SOUNDS / plays sound files created
Interval_11_9
Neutral_third
Major_third

Slightly off the topic: The neutral third sqrt(3/2) is recognized correctly from its approximate value 1.2247448713916 by the INTREL command

INTREL 1.2247448713916 / giving
X=1.2247448713916 is a root of 2*X^2-3=0

The dissonance function diss(c,x,m,n) is plotted for various c values
as a family of curves in
User Guide
(1992) on page 330.

Other demos related to this topic:

Dissonance functions

Tuning roots of algebraic equations by "listening"

This demo in YouTube

The graph is a modification of an old Survo application
presented in
User Guide
(1992) on page 334. This 'animated' version is inspired by
Life A User's Manual
(La Vie mode d'emploi) by Georges Perec.

In the plotting scheme the jumps are triggered by int() functions in X(t) and Y(t) expressions.

HEADER= HOME=0,0 WHOME=0,0 WSIZE=652,381 WSTYLE=0
MODE=652,381 XSCALE=-14,14 YSCALE=-10,10 FRAME=3
SIZE=652,381 XDIV=0,1,0 YDIV=0,1,0
a=53 t=0,50*pi,pi/150 pi=3.14159
GPLOT X(t)=12*sin(int(0.35*t))+0.8*sin(a*t)*sin(t)+0*A,
      Y(t)=8*sin(int(0.3*t))+0.8*sin(a*t)*cos(t)
A=0(1)4 PALETTE=BGY3 COLORS=[background(8)] COLOR_CHANGE=A,3
SLOW=100 slowing down the plotting speed

This is an abbreviated version of a Finnish teaching program made in
1998.

I wanted to show how simple the 'theory' behind
randomization tests is when compared to that of the t test, for example,
and how a time-consuming permutation test can be replaced by its
randomized alternative.

The COMB operation and the Survo matrix interpreter are the key components in this experiment. Also a few other special features of Survo are applied.
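
The step from a full permutation test to its randomized alternative can be sketched in Python. This is not the Survo scheme of the demo (which relies on COMB and the matrix interpreter); the test statistic (absolute difference of group means), the function name and the number of rounds are assumptions chosen for illustration.

```python
import random


def randomized_permutation_test(sample_a, sample_b, n_rounds=10000, seed=1):
    """Estimate the two-sided P value by random relabelling of the pooled data."""
    rng = random.Random(seed)
    n_a = len(sample_a)

    def mean_difference(values):
        group_a, group_b = values[:n_a], values[n_a:]
        return abs(sum(group_a) / len(group_a) - sum(group_b) / len(group_b))

    pooled = list(sample_a) + list(sample_b)
    observed = mean_difference(pooled)
    hits = 0
    for _ in range(n_rounds):
        rng.shuffle(pooled)  # one random split instead of listing all splits
        if mean_difference(pooled) >= observed - 1e-12:
            hits += 1
    return hits / n_rounds


a = [1, 2, 3, 4, 5, 6]
b = [101, 102, 103, 104, 105, 106]
print(randomized_permutation_test(a, b))
```

With two samples of six observations the complete permutation test would go through all 924 splits; the randomized version replaces this enumeration by a fixed number of random splits, which matters when the number of splits is astronomically large.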

This demo in YouTube

This is an abbreviated version of a Finnish teaching program made in
1998. It is shown how the winning probabilities and expected values
can be derived simply by considering them conditionally with respect
to the first pair of coin-flips.
The probabilities are found by solving a system of four linear
equations.

This example is continued by a simulation experiment HH - HT game (simulation)

This demo in YouTube

This is an abbreviated version of a Finnish teaching program made in
1998 and a continuation of
HH - HT game (analysis).

I presented Survo Puzzle in 2006. More information is available on the home page.

The puzzle solved here has a moderate degree of difficulty (400).
It was selected randomly by using a
Java applet.
By clicking the applet and entering the serial number #352-23824,
this Survo puzzle can be solved more easily by the
Swapping method.
(This is shown in another
demo.)
But when using this technique the uniqueness of the solution cannot be
confirmed.

If your browser does not support Java, a corresponding Javascript version is available.

Survo supports solving of these puzzles in many ways. Besides the COMB operation also Editorial computing and Touch mode are useful tools. The editorial interface of Survo is also suitable for general book-keeping during the solving process.

This demo in YouTube

This is an illustration related to my paper

On lines through a given number of points in a rectangular grid of points.

On the cover page of that paper a picture of
all L(16,4)=548 lines connecting exactly four points in a 16x16 grid
is presented.

A related example
Although efficient computing of the numbers L(n,j), i.e. the number of lines going through exactly j points in an nxn grid of points, is not trivial, making a list of these lines is a still more demanding task for large n values. However, such a list for small n is easy to generate by brute force on current computers. Thus the Survo program module GRIDP simply starts from all pairs of points in the grid and sorts out the required lines.
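
The brute-force idea (start from all pairs of points and sort out the lines) can be sketched in Python. This is not the GRIDP module; the function name and the way a line is normalized (integer coefficients a, b, c of a*x+b*y=c reduced by their common divisor) are choices made for this illustration.

```python
from collections import defaultdict
from itertools import combinations
from math import gcd


def lines_through_exactly(n, j):
    """Count the lines that pass through exactly j points of an n x n grid."""
    points = [(x, y) for x in range(n) for y in range(n)]
    points_on_line = defaultdict(set)
    for (x1, y1), (x2, y2) in combinations(points, 2):
        # Line a*x + b*y = c through both points, reduced to a unique form.
        a, b = y2 - y1, x1 - x2
        g = gcd(a, b)
        a, b = a // g, b // g
        if a < 0 or (a == 0 and b < 0):
            a, b = -a, -b
        c = a * x1 + b * y1
        points_on_line[(a, b, c)].update([(x1, y1), (x2, y2)])
    return sum(1 for pts in points_on_line.values() if len(pts) == j)


print(lines_through_exactly(3, 3))  # rows, columns and diagonals of a 3x3 grid
```

Since every pair of points is visited once, the work grows like n^4, which is unproblematic for grids of the size shown in the demo.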

The graph is a typical example of how in Survo complicated pictures are compiled of several overlaying parts, here starting from a black background, then drawing the line segments, and finally setting the points as small 'hollow' circles.

Combining Survo PostScript files

This demo in YouTube

When a Survo puzzle is solved by the swapping method there is no guarantee
that the solution is the only possible. The same puzzle is solved in
another
demo systematically showing at the same time that
the solution is unique.

If this puzzle is given as an open Survo puzzle without any fixed numbers in the form

      A  B  C  D  E
 1    *  *  *  *  *  41
 2    *  *  *  *  *  28
 3    *  *  *  *  *  51
     34  8 13 28 37

it also has another solution, which is obtained by three swaps. Try to find that solution by going to http://www.survo.fi/swap/puzzles, clicking the game board, and typing #352-23824 followed by ENTER.

If your browser does not support Java, a corresponding Javascript version is available.

The prime numbers are found by using a variation of the 'trial division'
method:

Given a number n, one divides n by all numbers m less than or equal to
the square root of that number. If any of the divisions come out as an
integer, then the original number is not a prime. Otherwise, it is a
prime.

After numbers 2, 3, and 5 are listed, both n values (Wnumber in PRIMES) and m values (Wfactor) are selected so that they are not divisible by 2 or 3.
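
The selection rule can be sketched in Python. This is not the PRIMES sucro itself: the function names are hypothetical, and the alternating step of 2 and 4 is used here to skip all multiples of 2 and 3 among both the candidates and the trial divisors.

```python
def has_no_divisor(n):
    """Trial division of n (not divisible by 2 or 3) by m = 5, 7, 11, 13, ..."""
    m, step = 5, 2
    while m * m <= n:
        if n % m == 0:
            return False
        m += step
        step = 6 - step  # alternates between 2 and 4
    return True


def primes_up_to(limit):
    """List the primes not exceeding limit."""
    primes = [p for p in (2, 3, 5) if p <= limit]
    n, step = 7, 4
    while n <= limit:
        if has_no_divisor(n):
            primes.append(n)
        n += step
        step = 6 - step  # candidates 7, 11, 13, 17, 19, 23, 25, ...
    return primes


print(primes_up_to(60))
```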

A sucro program cannot be efficient in purely numerical problems
since all objects processed by a sucro are presented as strings of
characters. The 'values' of variables are saved in a 'sucro memory'
which is simply a string.

For example, at the end of the current application
this string is

5000@5003@71@29@4@2@5041@

giving values

WN=5000

Wnumber=5003 (first integer exceeding 5000: 4999+Wi)

Wdivisor=71

Wremainder=29 (last accepted prime 4999 mod 71 is 29)

Wi=4 (oscillating between 2 and 4)

Wj=2 (similarly)

Wsquare=5041 (71^2=5041)

It is evident that repeated conversions of strings to numerical values
and vice versa slow down the computation.

In typical applications of sucros, like teaching programs and demos (like these GIF animations) and combining several Survo operations, this feature is unimportant. Although Survo program modules are written in C, many system routines are sucros.

A general description of the sucro language is given in User Guide (1992) chapter 12 (pages 399 - 443).

Background information about the Ulam spiral is available in Wikipedia, for example.

When making the spiral the main task is to map values of n to x,y
coordinates.

I derived the formulas

x(n)=x(n-1)+sin(mod(int(sqrt(4*(n-2)+1)),4)*pi/2)

y(n)=y(n-1)-cos(mod(int(sqrt(4*(n-2)+1)),4)*pi/2)

by observing that the turning points of the spiral may be described in
this way

12 11 11 11 11 11 11
12  8  7  7  7  7 10  .
12  8  4  3  3  6 10  .
12  8  4  1  2  6 10 14
12  8  5  5  5  6 10 14
12  9  9  9  9  9 10 14
13 13 13 13 13 13 13 14

1,2,3,3,4,4,5,5,5,6,6,6,7,7,7,7,8,8,8,8,9,9,9,9,9,...

with a general term a(n)=int(sqrt(4n+1)), n=0,1,2,...

This is verified easily by observing that a(n) grows exactly on
values n=k^2 and n=k(k+1), k=0,1,2,... and then the gaps between
growing points are 1,1,2,2,3,3,4,4,... leading to the sequence in
question.

The increments in x coordinates are 1,0,-1,-1,0,0,1,1,1,0,0,0, following the same pattern of value changes as a(n) but with cyclic variation 1,0,-1,0 of length 4. Therefore the increment in x values can be expressed as sin(mod(int(sqrt(4*(n-2)+1)),4)*pi/2),

thus as a composition of four 'elementary' functions.

This is a bit of an overuse of trigonometric functions but convenient for the VAR operation of Survo. The increments of the y coordinates follow the same pattern when sin is replaced by -cos.
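
The recurrences for x(n) and y(n) can be checked with a short Python sketch. Two assumptions are made here: the spiral starts from x(1)=y(1)=0, and the values of sin and -cos at multiples of pi/2 are taken from small lookup tables instead of calling trigonometric functions, which keeps the arithmetic exact.

```python
from math import isqrt

# sin(k*pi/2) and -cos(k*pi/2) for k = 0, 1, 2, 3
DX = (0, 1, 0, -1)
DY = (-1, 0, 1, 0)


def spiral_coordinates(n_max):
    """Map n = 1..n_max to the (x, y) coordinates of the spiral."""
    x, y = 0, 0
    coords = {1: (x, y)}
    for n in range(2, n_max + 1):
        k = isqrt(4 * (n - 2) + 1) % 4  # mod(int(sqrt(4*(n-2)+1)),4)
        x += DX[k]
        y += DY[k]
        coords[n] = (x, y)
    return coords


coords = spiral_coordinates(25)
print([coords[n] for n in range(1, 10)])
```

The first nine points fill the 3x3 block around the origin, and no point of the plane is visited twice.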

This is a typical example of how by means of editorial computing and
text processing a general computation scheme is created for a particular
application.

The Fisher zeta transformation makes the correlation coefficient approximately normally distributed with standard deviation 1/sqrt(n-3). The cumulative normal distribution function is available as a library function N.F(m,s^2,x).
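
The test can be sketched in Python as follows. This is an illustration and not the Survo template of the demo: the function names are hypothetical, the normal distribution function is built from math.erf with its arguments in the same order (m,s^2,x) as N.F, and a two-sided P value is assumed.

```python
from math import atanh, erf, sqrt


def normal_cdf(m, variance, x):
    """Cumulative distribution function of N(m, variance) at x."""
    return 0.5 * (1.0 + erf((x - m) / sqrt(2.0 * variance)))


def correlation_test(r, n, rho0=0.0):
    """Test H0: rho = rho0 from a sample correlation r based on n observations."""
    z = atanh(r)        # Fisher transformation 0.5*log((1+r)/(1-r))
    z0 = atanh(rho0)
    sd = 1.0 / sqrt(n - 3)
    statistic = (z - z0) / sd
    p_value = 2.0 * (1.0 - normal_cdf(0.0, 1.0, abs(statistic)))
    return statistic, p_value


print(correlation_test(0.5, 28))
```

For r=0.5 and n=28 the standard deviation is 1/sqrt(25)=0.2, so the test statistic is atanh(0.5)/0.2, about 2.75.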

If you like to use this template in your own Survo,

please copy/paste it from here.

The age pyramid (TYPE=PYRAMID) is one of the types of bar charts in Survo.

Shakespeare's 154 Sonnets were imported from

http://www.shakespeares-sonnets.com/allsonn.htm.

The same textual data is studied in demos
Shakespeare's Sonnets as a Markov chain and

The most common words in Shakespeare's Sonnets.

It is shown how Survo can deal with literal and partially disorganized
data. In many cases such data sets have to be scanned, filtered,
and purified from unsystematic features. In this example, various
forms of the
LINEDEL
command of Survo were useful.

The letter frequencies in Shakespeare's Sonnets counted in this demo seem to deviate significantly from current English at least for some common letters like a,c,h,t as one can see in the following table.

Letter  Sonnets  English  Difference
            %        %
a          6.8      8.2      -1.4
b          1.7      1.5       0.2
c          1.8      2.8      -1.0
d          3.8      4.3      -0.5
e         12.5     12.7      -0.2
f          2.3      2.2       0.1
g          1.9      2.0      -0.1
h          7.0      6.1       0.9
i          6.4      7.0      -0.6
j          0.1      0.2      -0.1
k          0.8      0.8       0.0
l          4.2      4.0       0.2
m          2.9      2.4       0.5
n          6.2      6.7      -0.5
o          7.8      7.5       0.3
p          1.4      1.9      -0.5
q          0.1      0.1       0.0
r          5.7      6.0      -0.3
s          6.8      6.3       0.5
t          9.9      9.1       0.8
u          3.2      2.8       0.4
v          1.3      1.0       0.3
w          2.6      2.4       0.2
x          0.1      0.2      -0.1
y          2.7      2.0       0.7
z          0.0      0.1      -0.1

Shakespeare's 154 Sonnets were imported from
http://www.shakespeares-sonnets.com/allsonn.htm.

The same textual data is studied in demos
Letter frequencies in Shakespeare's Sonnets and

The most common words in Shakespeare's Sonnets.

The simulations were made according to a technique presented by
Claude Shannon in

A Mathematical Theory of Communication
(1948).
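
A letter-level Markov chain simulation in the spirit of Shannon's approximations can be sketched in Python. This is an assumption-laden illustration, not the Survo procedure of the demo: the order of the chain, the function names and the short sample string (standing in for the full text of the Sonnets) are all chosen here.

```python
import random
from collections import defaultdict


def build_transitions(text, order=2):
    """For each block of `order` letters, collect the letters that follow it."""
    table = defaultdict(list)
    for i in range(len(text) - order):
        table[text[i:i + order]].append(text[i + order])
    return table


def simulate(text, length, order=2, seed=1):
    """Generate text whose (order+1)-letter blocks all occur in the source text."""
    rng = random.Random(seed)
    table = build_transitions(text, order)
    start = rng.randrange(len(text) - order)
    out = text[start:start + order]
    while len(out) < length:
        followers = table.get(out[-order:])
        if not followers:  # the block occurs only at the very end of the text
            break
        out += rng.choice(followers)
    return out


sample = "shall i compare thee to a summer's day thou art more lovely and more temperate"
print(simulate(sample, 40))
```

Because followers are stored with repetitions, each next letter is drawn with the same relative frequencies with which it follows the current block in the source text.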

The matrix formulas used in this demo are derived, for example, in
Survo User Guide
pp. 377-378. The
REGDIAG
program of Survo uses the same algorithm based
on orthogonalization of the regressor matrix.
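
The general idea of regression by orthogonalization can be sketched in plain Python. This is not the REGDIAG algorithm nor the derivation in the User Guide; it assumes a modified Gram-Schmidt factorization X=Q*R of the regressor matrix followed by back substitution in R*b=Q'*y, and the function name is hypothetical.

```python
from math import sqrt


def regress_by_orthogonalization(X, y):
    """Least squares coefficients b of y on the columns of X via X = Q*R."""
    p = len(X[0])
    Q = [[row[j] for row in X] for j in range(p)]  # columns of X
    R = [[0.0] * p for _ in range(p)]

    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))

    for j in range(p):
        for i in range(j):  # remove the components along earlier columns
            R[i][j] = dot(Q[i], Q[j])
            Q[j] = [q - R[i][j] * u for q, u in zip(Q[j], Q[i])]
        R[j][j] = sqrt(dot(Q[j], Q[j]))
        Q[j] = [q / R[j][j] for q in Q[j]]

    qty = [dot(Q[j], y) for j in range(p)]
    b = [0.0] * p
    for j in reversed(range(p)):  # back substitution in R*b = Q'y
        rest = sum(R[j][k] * b[k] for k in range(j + 1, p))
        b[j] = (qty[j] - rest) / R[j][j]
    return b


X = [[1.0, float(x)] for x in range(5)]  # constant term and one regressor
y = [1.0 + 2.0 * x for x in range(5)]
print(regress_by_orthogonalization(X, y))
```

The sketch assumes that the columns of X are linearly independent; otherwise a diagonal element of R becomes zero.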

The automatic labelling of matrix rows and columns has been possible already in the matrix interpreter of SURVO 76 (in 1977). It is important to notice the rules for labels in derived matrices. For example, labels are transposed not only when a matrix is transposed but also when it is inverted, etc. A simple label 'algebra' ensures that in the matrix of regression coefficients the names of regressors appear as row labels and the names of regressands as column labels.

Shakespeare's 154 Sonnets were imported from
http://www.shakespeares-sonnets.com/allsonn.htm.

The same textual data is studied in demos
Letter frequencies in Shakespeare's Sonnets and

Shakespeare's Sonnets as a Markov chain.

The most important tools were the WORDS, STAT, and SORT commands.

The order of the most common words differs from that of common English for obvious reasons.

Multiple discriminant analysis is performed in Survo by computing
covariance structures (correlations, means and standard deviations)
for each group of observations by the
CORR operation. Then the
actual analysis takes place using these results by the sucro command
/DISCRI. The computations are made
automatically by the MAT commands (matrix interpreter) of Survo.
/DISCRI saves the results as matrix files and lists suitable
commands in the edit field for retrieving them.

One of those commands is

MAT LOAD DISCRXR.M,END+2 / Correlations variables/discriminators

giving in this case

MATRIX DISCRXR.M
Correlations_between_variables_and_discriminators
///        Discr1   Discr2
Sepal_L  -0.79189 -0.21759
Sepal_W   0.53076 -0.75799
Petal_L  -0.98495 -0.04604
Petal_W  -0.97281 -0.22290

and shows that the dominant discriminator depends essentially on the petal size of the flower.

The discriminant scores were computed by a
LINCO command

LINCO Iris,DISCRL.M(D1,D2)

(as suggested by /DISCRI).

The same data is studied in Cluster analysis of Iris flower data set by classifying the observations into three groups without any prior information about the species of flowers. It turns out that clustering according to Wilks' Lambda criterion is then identical to that obtained by reclassification of the original observations according to Mahalanobis distances after discriminant analysis.

Survo offers alternative means for making
cluster analysis.
In this case the best result was achieved by statistical clustering
based on Wilks' Lambda criterion.

The same data is studied in
Discriminant analysis of Iris flower data set.
Pekka Korhonen has presented an effective stepwise procedure for
computation of lambda values in his doctoral thesis "A stepwise
procedure for multivariate clustering", Computing Centre, University of
Helsinki, Research Reports N:o 7 (1979).

In Korhonen's research a pivot operation plays an essential part
in a form presented earlier by Hannu Väliaho in his doctoral thesis
"A synthetic approach to stepwise regression analysis",
Comm.Phys.Math., vol.34, No.12, 91-132 (1969).

In the CLUSTER program of Survo, the dual procedure of Korhonen's stepwise method is applied. I was the opponent at Korhonen's doctoral defence and then took on the task of checking his algorithms by implementing them in Survo.

In gif-animated demos (and in particular in this one) the size of
the 'window' limits possibilities of showing interplay between Survo
and other programs. Only a hands-on approach can tell the whole truth.
Recommended!

This demo in YouTube

A new feature (valid in SURVO MM from ver. 3.21) for denoting points
by arrowheads in scatter diagrams is presented.
New point types 21 (arrow) and 22 (filled arrow) are available
and the orientation of arrows is selected by a variable, say A, giving
the direction angle in degrees and determined by a code [rotation(A)]
in the
POINT
specification.

This demo in YouTube

The plotting setup

PLOT z(x,y)=abs(r*(1-w)+u*v)/w
u=sqrt(n/(n*n-1))*(x-mx)/sx
v=sqrt(n/(n*n-1))*(y-my)/sy
w=sqrt((1+u*u)*(1+v*v))
TYPE=CONTOUR SCREEN=NEG ZSCALING=20,0

The formulas are derived in my note.

I presented this example among others in my talk about Survo
in Compstat 1992 (Neuchatel).

This graph was used by the organizers
of Compstat as a cover page (upside down!!)

in the proceedings of the symposium.
http://en.wikipedia.org/wiki/Computational_statistics

In bar and pie charts, labels of variables can be written in the graph
by a LABELS specification and values of variables by a VALUES
specification. From the version 3.22 of SURVO MM these texts can be
colored individually by using an extended form of the SHADING
specification (here on line 8) referring to both fill colors and
to text colors (separated by a slash /).

The colors to be used are defined by COLOR(n) specifications telling the color components of each SHADING value n according to the CMYK color model.

Before SURVO MM (until year 2000) the mouse had practically no role
while using Survo. Thereafter natural functions of the mouse were
adopted. This example shows how almost everything may now be done
without the keyboard just by the mouse and a virtual keyboard.
Needless to say, practice of mouse-oriented use to such extent is not
very convenient.

The user may edit soft buttons and also create new ones while using Survo.
The default set of soft buttons is defined in the edit field

<Survo>\U\SUR-SOFT.EDT specifying, for example,
the main button line EXIT in the form:

Most of the soft buttons lead to activation of a Survo macro (sucro).
For example, clicking the START button activates sucro /SURVO-START
(see lines 60-61).

A cycloid is a curve defined by the path of a point on the edge of
a circular wheel as the wheel rolls along a straight line.

See: The origin of Survo Editor
The first Survo Editor (1979) was originally programmed for input
and editing of musical manuscripts and for converting them into
a printable form. The slurs (arched curves connecting a group of
notes) were then plotted as slightly modified cycloids.

This 67th demo is a tribute to
Mr. Cole who found by numerical
computations in 1903 that the Mersenne number M67=2^67-1 was not
a prime number but a product of integers 193707721 and 761838257287.
Finding of these factors was made easier e.g. by the fact that it was
known beforehand that each potential factor has the form c*67+1
since 67 is a prime. In this case 193707721=2891160*67+1 and
761838257287=11370720258*67+1.
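
The arithmetic above is easy to verify with exact integer computations, for instance in Python. The helper name below is hypothetical; it returns c when a factor has the form c*n+1.

```python
M67 = 2**67 - 1
factors = (193707721, 761838257287)


def multiplier_c(factor, n):
    """Return c if factor = c*n + 1, otherwise None."""
    c, remainder = divmod(factor - 1, n)
    return c if remainder == 0 else None


print(factors[0] * factors[1] == M67)
print([multiplier_c(f, 67) for f in factors])
```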

It seems that, in general, numbers of type m^n-1 typically have many (and sometimes all) prime factors of the form c*n+1. By plain numerical calculations I have tried to study their abundance and found some systematic results reported in my paper. These results may have been proved already before. Thus, if somebody knows about such proofs, please let me know.

Editorial computing in Survo makes inventing and testing of this kind of numerical hypotheses easy and comfortable according to the style used in this demo. Making of suitable sucros (Survo macros) is also helpful. The most important sucro /MPN used in this connection has the following listing in a Survo edit field:

*TUTSAVE MPN
/ /MPN m_max,n / SM 4.12.2010
/ assuming that n is a prime number
/ computes the prime factors of numbers (m^n-1)/(m-1) for
/ m=2,3,...,m_max
/ and represents them in the form c*n+1.
/ If m-1 divides n, the smallest factor is n.
/
/ See: ../papers/MustonenPrimes.pdf
/
/ def Wmax=W1 Wn=W2 Wm=W3 Wprod=W4 Wc=W5 Wfactor=W6 Wpow=W7
/
*{init}{tempo 0}{disp off}{Wm=2}{R}
*int(exp(log(9000000000000000)/{print Wn}))={act}{l} {save word Wc}
*{line start}{erase}{u}{disp on}{tempo 2}
- if Wmax <= Wc then goto A
*{Wmax=Wc}
+ A: {R}
*({print Wm}^{print Wn}-1)/({print Wm}-1)={act}{l} {line end}
*(10:factors)={act}
/ Remove text "(10:factors)=":
*{l13}{del12}{r}{ref set 1}
/
*{save line Wprod}{erase}{R}{print Wprod}{line start}
/ Replace *'s by spaces:
+ B: {r}{save char Wc}
- if Wc '=' {sp} then goto C
- if Wc '<>' * then goto B
/ Replace * by a space:
* {goto B}
+ C:
/ Each factor to a separate line:
*{home}{u}{ins line}TRIM 1{act}{del line}{home}
*{save char Wc}
- if Wc '<>' {sp} then goto D
*{del line}
/
+ D: {save word Wfactor}{Wpow=1}
- if Wfactor = 0 then goto G
- if Wfactor > Wn then goto D1
/ Wfactor is n
*{form}{goto F}
+ D1:
/ Search for ^
+ D2: {r}{save char Wc}
- if Wc '=' ^ then goto D3
- if Wc '<>' {sp} then goto D2 else goto D4
+ D3: {save word Wpow}{home}{save word Wfactor}
+ D4: {line start}{erase}({print Wfactor}-1)/{print Wn}={act}
/
*{l} {save word Wc}{line start}{erase}({print Wc}*{print Wn}+1)
/
*{line start}{save word Wfactor}
+ F: {ref jump 1}{write Wfactor}
- if Wpow = 1 then goto F2
*^{print Wpow}
+ F2:
*{ref set 1}{R}{del line}{goto D}
+ G: {ref jump 1}{l}{del}
/
- if Wm = Wmax then goto E
*{Wm=Wm+1}{goto A}
+ E: {end}

This demo is created by Kimmo Vehkalahti.

Functions of the Survo matrix interpreter are shown in connection with Magic Squares.

A system of linear equations

X1+X2=27
X2+X3=32
X3+X4=32
X1+X3=25

is represented as a matrix equation A*X=B by saving matrices

MATRIX A
///  X1 X2 X3 X4
r12   1  1  0  0
r23   0  1  1  0
r34   0  0  1  1
r13   1  0  1  0

MATRIX B
/// freq
r12   27
r23   32
r34   32
r13   25
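
For comparison, the same system A*X=B can be solved outside Survo with a few lines of Python. The sketch below is not the matrix interpreter of Survo; it assumes Gauss-Jordan elimination with exact fractions, and the function name is hypothetical.

```python
from fractions import Fraction


def solve_linear_system(A, B):
    """Solve A*X = B for a square nonsingular A by Gauss-Jordan elimination."""
    n = len(A)
    M = [[Fraction(v) for v in row] + [Fraction(b)] for row, b in zip(A, B)]
    for col in range(n):
        # Bring a row with a nonzero element in this column to the pivot position.
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [v / M[col][col] for v in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [a - factor * b for a, b in zip(M[r], M[col])]
    return [row[n] for row in M]


A = [[1, 1, 0, 0],   # r12
     [0, 1, 1, 0],   # r23
     [0, 0, 1, 1],   # r34
     [1, 0, 1, 0]]   # r13
B = [27, 32, 32, 25]
print([int(x) for x in solve_linear_system(A, B)])
```

The result X1=10, X2=17, X3=15, X4=17 can be checked by substitution into the four equations.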

An essential tool for making polynomial regression is the POWERS program.
It computes powers of selected variables up to a given degree as new
variables. Thereafter polynomial regression analysis is carried out
by standard tools like LINREG.

In this example, polynomial regression is applied for determining unknown coefficients of a certain polynomial of two variables from a sample of values.

Originally, this calculation was presented in my
note (in Finnish) (2004)
related to computation of a distribution of
the city block distance
D between two random points in a grid of N x N points.

F(N,K)/N^4 is then the probability P[D=K] for K=1,2,...,N-1.

Shadow characters play an essential role in the editorial interface
of Survo. In fact, each line in the Survo Editor may have an optional
line consisting of shadow characters. Their existence is indicated
by various display effects. For example, '1' as a shadow character
makes the corresponding (main) character red and when lines are
printed, these red characters typically appear in boldface.

Survo has certain tools for management of shadow lines. The SHADOW SET command is the most recent one (included at the end of year 2011). It enables filling columns of tables with selected shadow characters thus enhancing their appearance.

In fact, this new SHADOW SET does exactly the same job for the shadow lines as the 'classic' SET command for ordinary edit lines.

Consider a data set x_1, x_2,..., x_n where each observation is
an approximate integral multiple of one of positive numbers
q_1, q_2,..., q_k where typically k=1 or another small integer.

Our task is to estimate the values of quanta q_1, q_2,..., q_k on the condition that each of them exceeds a certain minimum value q_min.

D.G.Kendall has in his paper Hunting Quanta (Royal Society of London. Mathematical and Physical Sciences A 276, 231-266) proposed using a "cosine quantogram" of the form

$$\phi(q) = \sqrt{2/n}\,\sum_{i=1}^{n} \cos\bigl(2\pi\,\mathrm{eps}(i)/q\bigr) \qquad \text{(Kendall 1974)}$$

where 0<=eps(i)<q is the remainder when x_i is divided by q.
The q-values of highest upward peaks of this function will be considered
as candidates for quanta.
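The cosine quantogram is easy to evaluate on a grid of q values. The following Python sketch is only an illustration: the function names, the grid step, and the small artificial data set (noisy multiples of 2.5) are assumptions made for this example.

```python
import math


def quantogram(xs, q):
    """Kendall's cosine quantogram phi(q) for the observations xs."""
    n = len(xs)
    # x % q is the remainder eps(i) with 0 <= eps(i) < q.
    return math.sqrt(2.0 / n) * sum(
        math.cos(2.0 * math.pi * (x % q) / q) for x in xs
    )


def best_quantum(xs, q_min, q_max, step=0.01):
    """Return the grid value of q in [q_min, q_max] with the highest peak."""
    count = int(round((q_max - q_min) / step)) + 1
    candidates = [q_min + i * step for i in range(count)]
    return max(candidates, key=lambda q: quantogram(xs, q))


# Artificial data: integer multiples of 2.5 with small errors.
MULTIPLES = [3, 4, 7, 9, 10, 13, 15, 18, 21, 22]
ERRORS = [0.02, -0.03, 0.01, 0.0, -0.02, 0.03, -0.01, 0.02, 0.0, -0.01]
DATA = [2.5 * k + e for k, e in zip(MULTIPLES, ERRORS)]

if __name__ == "__main__":
    print(best_quantum(DATA, 1.5, 4.0))
```

The lower bound of the grid plays the role of q_min: without it, submultiples such as q/2 would produce peaks of the same height.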

My idea is that the quanta are estimated by a selective, conditional least squares method where the sum

$$ss(q_1,\dots,q_k) = \sum_{i=1}^{n} \min\bigl[g(x_i,q_1)^2,\dots,g(x_i,q_k)^2\bigr] \qquad \text{(SLS 2005)}$$

where g(x,q) is the least absolute remainder when x is divided by q,
is to be minimized with respect to q_1,...,q_k on the condition that
each q_i is at least q_min.
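The selective least squares criterion can be written down in a few lines. This Python sketch only evaluates the sum and, for k=1, minimizes it by a plain grid search; the function names and the grid search are assumptions made for illustration and do not describe the minimization method used in Survo.

```python
def least_abs_remainder(x, q):
    """g(x,q): remainder of x with respect to the nearest multiple of q."""
    return x - q * round(x / q)


def selective_ss(xs, quanta):
    """Sum over observations of the smallest squared remainder among the quanta."""
    return sum(min(least_abs_remainder(x, q) ** 2 for q in quanta) for x in xs)


def fit_single_quantum(xs, q_min, q_max, step=0.001):
    """Grid search for one quantum q in [q_min, q_max] minimizing selective_ss."""
    count = int(round((q_max - q_min) / step)) + 1
    candidates = [q_min + i * step for i in range(count)]
    return min(candidates, key=lambda q: selective_ss(xs, [q]))


if __name__ == "__main__":
    data = [7.52, 9.97, 17.51, 22.5, 24.98, 32.53, 37.49, 45.02, 52.5, 54.99]
    print(fit_single_quantum(data, 1.5, 4.0))
```

The condition q >= q_min is essential here too, because the remainders (and hence the sum) can be made arbitrarily small by letting q approach zero.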

A more detailed description is found in my paper Hunting multiple quanta by selective least squares.

It was a rather simple task to implement this technique (available
in SURVO MM from ver.3.35).

The lines in the edit field of Survo are displayed by using pointer
variables of the C language. Thus when switching edit lines, no line is
actually moved; only their pointers are temporarily 'updated'.

All Survopoint lines are indicated by the '~' (tilde) character in the control column. At the end of such a line a marking of type ~x must exist. x is any of the lowercase characters a,b,...,z. For any x, a line having x in its control column must exist in the same edit field (typically outside the Survo window) and this line tells how the corresponding Survopoint line is displayed.

For example, in this demo the display mode of English proverbs (appearing in the latter part of the demo) is defined as follows:

    e 30 159 S
    * The road to hell is paved with good intentions.
    * He laughs best who laughs last.
    * A smooth sea never made a skilled mariner.
    * Truth is stranger than fiction.
    * A friend to all is a friend to none.
    * Be swift to hear, slow to speak.
    * Knowledge in youth is wisdom in age.
    ...

On the 'e' line, 30 indicates the rate of change (this Survopoint line is altered only once in 30 successive refreshes of the display), 159 is the number of proverbs in the list, and 'S' indicates a systematic change.

My
lecture notes (1995)
on multivariate statistical methods (in Finnish)
include an appendix about multidimensional hyperspheres and
hypercubes.
The main purpose is to show properties of such abstract
objects and give an idea how things become more complicated in higher
dimensions but are still tractable.

One of the illustrations is a graph of a 4-dimensional cube represented as 2-dimensional projections. This graph resembles a draftsman's plot (scatterplot matrix) of multivariate statistical data.

Here this graph is generated by a series of Survo operations triggered by an /ACTIVATE sucro command. Below is a complete description (an extract from a Survo edit field) about how the graph has been created:

     65 *
     66 *The following sucro command activates all commands having a '+' in the
     67 *control column and thus the final graph will be automatically created:
     68 *
     69 */ACTIVATE + (Activated commands are displayed here in red.)
     70 *It is possible to draw each 2-dimensional projection of a 4-dimensional
     71 *cube as a single line graph of edges since the degree of each vertex
     72 *is 4. Then there exists an Eulerian circuit where each edge is
     73 *traversed just once.
     74 *Consider the cube in a 4-dimensional space so that vertices have
     75 *coordinates (x_1,x_2,x_3,x_4) where each x_i is either 0 or 1.
     76 *Then the following matrix gives an Eulerian circuit in this
     77 *4-dimensional cube:
     78 *
     79 *MATRIX C4 ///
     80 *0 0 0 0
     81 *0 0 0 1
     82 *0 0 1 1
     83 *0 0 1 0
     84 *0 0 0 0
     85 *0 1 0 0
     86 *0 1 0 1
     87 *0 1 1 1
     88 *0 1 1 0
     89 *0 1 0 0
     90 *1 1 0 0
     91 *1 1 0 1
     92 *0 1 0 1
     93 *0 0 0 1
     94 *1 0 0 1
     95 *1 1 0 1
     96 *1 1 1 1
     97 *0 1 1 1
     98 *0 0 1 1
     99 *1 0 1 1
    100 *1 0 0 1
    101 *1 0 0 0
    102 *1 1 0 0
    103 *1 1 1 0
    104 *0 1 1 0
    105 *0 0 1 0
    106 *1 0 1 0
    107 *1 0 1 1
    108 *1 1 1 1
    109 *1 1 1 0
    110 *1 0 1 0
    111 *1 0 0 0
    112 *0 0 0 0
    113 *
    114 +MAT SAVE C4
    115 +MAT TRANSFORM C4 BY X#-0.5 / Centering (0,1) -> (-0.5,0.5)
    116 +MAT CLABELS "X" TO C4 / Column labels X1,X2,X3,X4
    117 *
    118 *The regular 2-dimensional projections of this hypercube are plain
    119 *squares and thus not very interesting.
    120 *
    121 *A better view is obtained by making an "arbitrary" 4-dimensional
    122 *rotation:
    123 *
    124 +MAT T=ZER(4,4)
    125 +MAT TRANSFORM T BY sin(31*I#*J#) / "arbitrary" T
    126 *
    127 +MAT GRAM-SCHMIDT DECOMPOSITION OF T TO Q,R / Orthogonalization of T
    128 +MAT K=C4*Q / Rotation of the hypercube by orthogonal Q
    129 +MAT CLABELS "dim" TO K / Column labels dim1,dim2,dim3,dim4
    130 *
    131 *Combining the rotated and original cube into one matrix KB:
    132 *
    133 +MAT KB=ZER(33,8)
    134 +MAT KB(1,1)=K
    135 +MAT KB(1,5)=C4
    136 *
    137 *.......................................................................
    138 *Plotting all six 2-dimensional projections separately:
    139 *
    140 *SIZE=1000,1000 SCALE=-1,1 HEADER= XDIV=0,1,0 YDIV=0,1,0 FRAME=3
    141 *FRAMES=F F=0,0,1000,1000 PEN=[SwissB(30)]
    142 *XLABEL= YLABEL= LINE=[line_type(2)][line_width(0.2)],1 TEXTS=T
    143 *
    144 +PLOT KB.MAT,dim1,dim2 / DEVICE=PS,A12.PS T=1_-_2,750,50
    145 +PLOT KB.MAT,dim1,dim3 / DEVICE=PS,A13.PS T=1_-_3,750,50
    146 +PLOT KB.MAT,dim1,dim4 / DEVICE=PS,A14.PS T=1_-_4,750,50
    147 +PLOT KB.MAT,dim2,dim3 / DEVICE=PS,A23.PS T=2_-_3,750,50
    148 +PLOT KB.MAT,dim2,dim4 / DEVICE=PS,A24.PS T=2_-_4,750,50
    149 +PLOT KB.MAT,dim3,dim4 / DEVICE=PS,A34.PS T=3_-_4,750,50
    150 *
    151 *.......................................................................
    152 *Plotting two opposite 3-dimensional cubes in different colors (blue and
    153 *red):
    154 *
    155 *SIZE=1000,1000 SCALE=-1,1 HEADER= XDIV=0,1,0 YDIV=0,1,0 FRAME=0
    156 *XLABEL= YLABEL=
    157 * *blue=[color(1,1,0,0)],1 *red=[color(0,1,1,0)],1
    158 +PLOT KB.MAT,dim1,dim2 / DEVICE=PS,B12.PS IND=X1,-0.5 LINE=*blue
    159 +PLOT KB.MAT,dim1,dim2 / DEVICE=PS,C12.PS IND=X1,0.5 LINE=*red
    160 +PLOT KB.MAT,dim1,dim3 / DEVICE=PS,B13.PS IND=X1,-0.5 LINE=*blue
    161 +PLOT KB.MAT,dim1,dim3 / DEVICE=PS,C13.PS IND=X1,0.5 LINE=*red
    162 +PLOT KB.MAT,dim1,dim4 / DEVICE=PS,B14.PS IND=X1,-0.5 LINE=*blue
    163 +PLOT KB.MAT,dim1,dim4 / DEVICE=PS,C14.PS IND=X1,0.5 LINE=*red
    164 +PLOT KB.MAT,dim2,dim3 / DEVICE=PS,B23.PS IND=X1,-0.5 LINE=*blue
    165 +PLOT KB.MAT,dim2,dim3 / DEVICE=PS,C23.PS IND=X1,0.5 LINE=*red
    166 +PLOT KB.MAT,dim2,dim4 / DEVICE=PS,B24.PS IND=X1,-0.5 LINE=*blue
    167 +PLOT KB.MAT,dim2,dim4 / DEVICE=PS,C24.PS IND=X1,0.5 LINE=*red
    168 +PLOT KB.MAT,dim3,dim4 / DEVICE=PS,B34.PS IND=X1,-0.5 LINE=*blue
    169 +PLOT KB.MAT,dim3,dim4 / DEVICE=PS,C34.PS IND=X1,0.5 LINE=*red
    170 *
    171 *Coloring the projections:
    172 +EPS JOIN K12,A12,B12,C12
    173 +EPS JOIN K13,A13,B13,C13
    174 +EPS JOIN K14,A14,B14,C14
    175 +EPS JOIN K23,A23,B23,C23
    176 +EPS JOIN K24,A24,B24,C24
    177 +EPS JOIN K34,A34,B34,C34
    178 *
    179 *Entering coordinates for projections in the final setup:
    180 *K12=K12,0,2000
    181 *K13=K13,0,1000 K23=K23,1000,1000
    182 *K14=K14 K24=K24,1000,0 K34=K34,2000,0
    183 *
    184 *Combining the parts:
    185 +EPS JOIN CUB4,K12,K13,K14,K23,K24,K34
    186 *
    187 *Creating the result
    188 *as a PostScript file:
    189 +PRINT CUR+1,X TO Cube4.PS
    190 % 1500
    191 - picture CUB4.PS,*,*,0.47,0.47
    192 X
    193 *Making and displaying
    194 *a PDF file Cube4.PDF:
    195 +/GS-PDF Cube4.PS
    196 *
    197 *The result is converted
    198 *and displayed here
    199 *as an EMF file.
    200 *

The previous extract from an edit field is a typical example
of templates created for advanced applications.
It is one of the options of Survo for
self documenting
and
literate programming
available as essential features of the editorial approach since 1979.
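The claim that matrix C4 in the template describes an Eulerian circuit of the 4-dimensional cube can be checked mechanically. The following Python sketch does this outside Survo; the 33 rows are copied from lines 80-112 of the extract, and the function name `is_eulerian_circuit` is hypothetical.

```python
# Rows of MATRIX C4 (lines 80-112 of the extract), one vertex per entry.
PATH = """
0000 0001 0011 0010 0000 0100 0101 0111 0110 0100 1100
1101 0101 0001 1001 1101 1111 0111 0011 1011 1001 1000
1100 1110 0110 0010 1010 1011 1111 1110 1010 1000 0000
""".split()


def is_eulerian_circuit(path, dim):
    """Check that path is a closed walk using every edge of the dim-cube once."""
    if path[0] != path[-1]:
        return False
    edges = set()
    for a, b in zip(path, path[1:]):
        # Consecutive vertices must differ in exactly one coordinate.
        if sum(x != y for x, y in zip(a, b)) != 1:
            return False
        edge = frozenset((a, b))
        if edge in edges:
            return False  # the same edge would be traversed twice
        edges.add(edge)
    # A dim-cube has dim * 2^(dim-1) edges.
    return len(edges) == dim * 2 ** (dim - 1)


if __name__ == "__main__":
    print(is_eulerian_circuit(PATH, 4))
```

The 33 rows correspond to the 32 edges of the hypercube plus the return to the starting vertex, which is why KB is defined as a 33 x 8 matrix.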

Another example related to hypercubes

In the
appendix (pp. 181-)
it is also shown e.g. that the number of m-cubes in an n-cube
is K(n,m)=C(n,m)2^(n-m), m=0,1,2,...,n. Thus, for example, the number of
edges in a cube is C(3,1)*2^(3-1)=12 and the number of cubes in a
4-dimensional cube is C(4,3)*2^(4-3)=8. These 8 cubes are shown below
as 2-dimensional projections to first two coordinate axes.

The first pair is the same as in the 1-2 plot defined on lines 157-159 in the template above and the remaining three pairs are obtained by changing X1 on lines 158 and 159 to X2,X3,X4, respectively.

The generating function of K(n,n-m) numbers is f(s)=(s+2)^n in the same way as (s+1)^n is the generating function of the binomial coefficients C(n,m). The total number of "parts": vertices (m=0), edges (m=1), faces (m=2), cubes (m=3), etc. in an n-dimensional cube is then f(1)=(2+1)^n=3^n.
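The counts quoted above are easy to reproduce. This short Python sketch evaluates K(n,m)=C(n,m)*2^(n-m) and checks the total 3^n; the function name `subcube_count` is chosen only for this illustration.

```python
from math import comb


def subcube_count(n, m):
    """Number of m-dimensional cubes in an n-dimensional cube."""
    return comb(n, m) * 2 ** (n - m)


if __name__ == "__main__":
    for n in (3, 4):
        parts = [subcube_count(n, m) for m in range(n + 1)]
        print(n, parts, sum(parts), 3**n)
```

For n=4 the counts are 16 vertices, 32 edges, 24 faces, 8 cubes and the hypercube itself, 81 parts in all.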

Copies of various items in the edit field can be made in various ways.

Traditional means are the COPY command, the key alt-F4 for rectangular
blocks, and the key alt-F2 for text.

For 'words' (contiguous strings separated by blanks) the best method
from version 3.37 onwards is based on two mouse-clicks:

1. Click the word to be copied by the

2. Select the place where to copy the word by the

Immediately after the first copy, more copies can be made by the
leftmost mouse button.

If the mouse is pointing at a blank space between existing 'words', the copy is inserted between these words.

If the mouse is pointing at a 'word' (a non-blank character), this 'word' is replaced by the copy.

3. The copying process is terminated by the DEL key.

### Edges and diagonals of a regular n-sided polygon

Example #77

### Examples from a presentation in 1987

Example #78

### Thurstone's box problem

Example #79

- - - - - - - - - -

When using Survo there is no absolute need for working with a mouse.
For many people, operating with a classical mouse is a nuisance causing
physical stress and pain.

For many years, at least for me, a 'RollerMouse' (coupled with the splendid
IBM PC/AT
keyboard) has been much better
as a pointing device. When using it there is no need to
move the hands or wrists away from the keyboard. All mouse functions can be
executed by minimal moves of the fingertips.

By means of Survo and Mathematica I have found certain properties related to lengths of
edges and diagonals of a regular n-sided polygon inscribed in a unit
circle.

In particular, in this demo it was shown experimentally that in the regular heptagon the squared lengths of the chords are roots of an algebraic equation

X^3-7*X^2+14*X-7=0 (known already by Kepler).

In general, I found experimentally that the squared lengths of the diagonals in a regular n-sided polygon are roots of an algebraic equation with coefficients as simple expressions of binomial coefficients.

These results were formally proved in my paper with Pentti Haukkanen and Jorma Merikoski.
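The heptagon equation can be checked numerically in a few lines. In a unit circle the chord spanning k sides has length 2*sin(k*pi/n), so its squared length is 4*sin(k*pi/n)^2. The Python sketch below, with function names chosen only for this illustration, substitutes the three distinct squared chord lengths of the heptagon into the cubic.

```python
import math


def squared_chords(n):
    """Distinct squared chord lengths of a regular n-gon in a unit circle."""
    return [4.0 * math.sin(k * math.pi / n) ** 2 for k in range(1, n // 2 + 1)]


def heptagon_polynomial(x):
    """Left-hand side of X^3-7*X^2+14*X-7=0."""
    return x**3 - 7.0 * x**2 + 14.0 * x - 7.0


if __name__ == "__main__":
    for x in squared_chords(7):
        print(x, heptagon_polynomial(x))
```

The coefficients can also be read off from the roots: their sum is 7 and their product is 7, in agreement with the equation.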

See also

http://www.survo.fi/papers/Polygons2013.pdf

http://www.survo.fi/papers/Roots2013.pdf.

and

Mustonen, S., Haukkanen, P., Merikoski, J. (2014). Some polynomials associated with regular polygons. Acta Univ. Sapientiae, Mathematica, 6, 2, 178-193. http://www.acta.sapientia.ro/acta-math/

More extensive demos about the same subject:

Equation for the sum of chord lengths in a regular polygon

Regular polygons: Solving riddle of q coefficients

Regular polygons: Testing roots

This demo in YouTube

This is a partial reproduction of my talk
"Editorial approach in statistical computing"
in the Second International Tampere Conference in Statistics (1987).

This demo gives two examples about usage of Survo.

In the original video many details related to these examples are difficult to see.

The last example '**Estimation of a circle**' of my talk is also available in YouTube.

The final paper in conference proceedings does not include these examples.

This demo in YouTube

The same in better resolution YouTube

Thurstone's problem is presented in a more general form. Values of the derived variables have substantial 'measurement' errors.

Thurstone's original experiment is described, for example, in Richard L. Gorsuch: Factor Analysis pp. 10-11.

This demo in YouTube and also as a flash demo.

At first sight, one could assume that the number of lines going through
at least two points in an n x n regular grid could be
presented by a simple algebraic formula. However, this is not possible
since the number of lines with a given slope u/v (u,v integers) depends
essentially on the divisibility of integers v=2,3,...,n-1.
Therefore Euler's totient function plays an essential role in 'residual
terms' of the formulas.

Although these recursive formulas are more complicated than the direct double sum formula (presented in the beginning) they give results much faster. For example, already when n=10^4 recursive formulas are over 1000 times faster than the double sum formula. Furthermore, the recursive formulas are applied iteratively so that results are obtained at the same time also for all integers less than n and it is efficient to continue iteration for greater n values step by step on the basis of values L(n,n), L(n-1,n), and R1(n).

By these means I have computed the L(n,n) values for all n <= 10^8 in 100 batches of a million n values each; on my current PC this would take less than 3 hours. The same task using the double sum formula would take over 100'000 years!

In 2015 I extended this calculation for all n <= 10^11.

More information about this topic is given in Grid lines where the asymptotic behaviour of L(n,n) numbers is reported.
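For small n the number of lines through at least two grid points can be verified by brute force. The Python sketch below compares a direct enumeration with a double sum over primitive direction vectors. This double sum is a standard counting argument written here as an assumption; it is not necessarily identical to the formula used in the demo, and the recursive formulas are not reproduced.

```python
from math import gcd


def lines_double_sum(n):
    """Lines through >= 2 points of an n x n grid via a sum over directions."""
    total = 0
    for a in range(-(n - 1), n):
        for b in range(-(n - 1), n):
            if gcd(a, b) == 1:  # primitive direction vector (a,b)
                # Points P with P+(a,b) in the grid, minus those for which
                # P-(a,b) is also in the grid: one starting point per line.
                total += (n - abs(a)) * (n - abs(b)) - max(n - 2 * abs(a), 0) * max(
                    n - 2 * abs(b), 0
                )
    return total // 2  # each line was counted for (a,b) and (-a,-b)


def lines_brute_force(n):
    """Enumerate all point pairs and collect the distinct lines a*x+b*y=c."""
    points = [(x, y) for x in range(n) for y in range(n)]
    lines = set()
    for i, (x1, y1) in enumerate(points):
        for x2, y2 in points[i + 1:]:
            a, b = y2 - y1, x1 - x2
            g = gcd(a, b)
            a, b = a // g, b // g
            if a < 0 or (a == 0 and b < 0):
                a, b = -a, -b  # unique sign for the normal vector
            lines.add((a, b, a * x1 + b * y1))
    return len(lines)


if __name__ == "__main__":
    print([lines_double_sum(n) for n in range(1, 8)])
```

The condition gcd(a,b)=1 in the sum is where the divisibility properties, and thereby Euler's totient function, enter the problem.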

See also

This demo in YouTube

A typical Survo computation scheme (template) is presented for
testing the correlation coefficient by the Fisher z transformation.

This is one of the oldest examples (from 1981) used for demonstrating 'self-documenting' and 'literate programming' in the Survo Editor. These terms were unknown to me at that time, since apparently they were introduced later, e.g. by Donald Knuth.

It should be noted that all information (formulas and data) needed for computation of test statistics is given within the text typed in the edit field.
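For readers without Survo, the test itself can be sketched in Python. The block below is a generic textbook form of the Fisher z test, not a transcription of the Survo template; the function name, the two-sided alternative, and the example values r=0.5, n=28 are assumptions made for this illustration.

```python
import math


def fisher_z_test(r, n, rho0=0.0):
    """Test H0: rho = rho0 for a sample correlation r from n observations.

    Returns the approximately N(0,1) test statistic and the two-sided p-value.
    """
    # Fisher's transformation z = atanh(r) has standard error 1/sqrt(n-3).
    z = (math.atanh(r) - math.atanh(rho0)) * math.sqrt(n - 3)
    p_two_sided = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_two_sided


if __name__ == "__main__":
    z, p = fisher_z_test(0.5, 28)
    print(round(z, 4), round(p, 5))
```

As in the Survo template, all formulas and data needed for the result are visible in the text of the example itself.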

This demo in YouTube

This demo was a part of my talk in

**The Eighth International Workshop on Matrices and Statistics (1999)**

at the University of Tampere, Finland.

The matrix interpreter is an essential tool for making extended
calculations from the results given by statistical Survo operations,
for example. As shown at the end of this demo, such operations often
give their results also as matrix files.

It is easy to convert a matrix file to a Survo data file by

    FILE SAVE MAT <matrix_file> TO <Survo_data_file>

and conversely any Survo data (table or file) to a matrix file by

    MAT SAVE DATA <Survo_data> TO <matrix_file>

but Survo matrix files (like COUNTRIES.MAT in this demo) can also be used directly as data in statistical operations.

The matrix interpreter is also useful for teaching methods related to linear models, for example.

The automatic labelling of matrix rows and columns has been available since the matrix interpreter of SURVO 76 (1977). It is important to notice the rules for labels in derived matrices. For example, labels are transposed not only when a matrix is transposed but also when it is inverted, etc. A simple label 'algebra' ensures that in the matrix of regression coefficients the names of regressors appear as row labels and the names of regressands as column labels.

This demo in YouTube

This is a variation of an earlier demo, displaying partially randomly selected forms. Each figure is a 2x2
setup of graphs from the family of curves defined by the following
plotting scheme in the Survo edit field. (This represents the first example above.)

    GPLOT x(t)=X0+R*(sin(t)+r*sin(A*t)+r^2*sin(B*t)),
          y(t)=Y0+R*(cos(t)+r*cos(A*t)+r^2*cos(B*t))
    r=(sqrt(5)-1)/2 s=1/(1+r+r^2) R=3*r/2
    t=0,2*pi,pi/1000 pi=3.141592653589793
    SCALE=-4,4 OUTFILE=A
    XDIV=0,1,0 YDIV=0,1,0 SIZE=381,381 HEADER= FRAME=3 MODE=381,381
    i=0(1)3 a=5 b=15
    T1=5;15,170,170 TEXTS=T1
    PEN=[color(0.2,0.4,1,0)][SwissB(20)] FILL(-2)=0.9,0.6,0.3,0
    LINETYPE=[line_width(2)],1 SLOW=300
    A=x0*a+1 X0=2*x0 x0=int(sqrt(2)*(sin(pi/2*(i+0.5)))+0.5)
    B=y0*b+1 Y0=2*y0 y0=int(sqrt(2)*(cos(pi/2*(i+0.5)))+0.5)
    WSIZE=381,381 WHOME=800,0 WSTYLE=0
    FRAMES=F F=0,0,381,381,-2

The setup of graphs corresponds to values (here A=5, B=15) in this way:

    -A,B    A,B
    -A,-B   A,-B

The parameters (a,b,PEN,FILL,LINETYPE, etc.) in the plotting scheme
are controlled by a sucro (Survo macro) taking the values from a matrix file

    MATRIX AB
    ///   a   b     C     M     Y     c     m     y  L   W    S
      1   5  15 0.200 0.400 1.000 0.900 0.600 0.300  1  40  300
      2   0   0 1.000 1.000 1.000 0.000 0.000 0.000  1   7   50
      3   1   1 1.000 1.000 1.000 0.000 0.000 0.000  1   4   20
      4   2   2 1.000 1.000 1.000 0.000 0.000 0.000  1   3   10
      5   3   3 1.000 1.000 1.000 0.000 0.000 0.000  1   2    5
    ...  ..  .. ..... ..... ..... ..... ..... .....  .  ..  ...
     95   9   7 0.298 0.498 1.000 0.872 0.605 0.204  1   1    0
     96   9  15 0.260 0.397 1.000 0.858 0.655 0.175  1   1    0
     97  11  32 0.323 0.374 1.000 0.917 0.711 0.211  1   1    0
     98  16  96 0.313 0.427 1.000 0.838 0.537 0.237  1   1    0
     99  45  50 0.170 0.401 1.000 0.918 0.640 0.245  1   8  100
    100  10  60 1.000 1.000 1.000 0.000 0.000 0.000  1  25  300

This demo in YouTube

MNSIMUL in Survo is a general tool for generating observations from
multivariate normal distribution with a given matrix of correlations
and vectors of expected values and standard deviations.

This is a replica of my flash demo created in 2006. The most significant difference is that the computation times (including loading and saving) measured in Survo by TIME COUNT START - TIME COUNT END commands were now only about a third of those obtained eight years earlier. The program code has not changed, but PCs have become somewhat faster.

This demo in YouTube

As a special case of the central limit theorem it is shown how
the distribution of
linear combinations of independent, uniformly distributed variables
tends to multivariate normal distribution.

Throughout the calculations the matrix interpreter of Survo is used.

The simple formula for the correlation coefficient in the current case is derived as follows:

It is also noteworthy that the value of the correlation coefficient
depends on the parameters m and s only through their ratio m/s.

A related demo: Genesis of multivariate normal distribution 2

In my experience, the most clear-cut way to define the multivariate normal distribution is in general by a linear transformation of independent N(0,1) variables.

This demo in YouTube

Defining
the multivariate normal distribution through a linear
transformation of independent N(0,1) variables has been my favorite for
a long time.
This definition is much more comprehensible in teaching
(at least on undergraduate level) than starting from the density or
characteristic function.

For example, from this standpoint one can readily understand that there can be only linear dependencies between component variables, or see that its marginal and conditional distributions are (multi)normal.

In my lecture notes on multivariate statistical methods (in Finnish) almost everything is derived on this basis without a need for working with integrals, etc. The creation of the two-dimensional normal distribution is characterized there (p.16) in this way:

The main tool in calculations is here the matrix interpreter of Survo.

Graphics were generated by Survo plotting schemes in the following way:

(here the first figure Z of a sample of N2(0,I) distribution)

    *
    *Matrix Z of independent N(0,1) variables:
    *MAT Z=ZER(100000,2) / Z is a data frame.
    *MAT #TRANSFORM Z BY #RAND(20140) / Fill with uniform[0,1] values,
    *MAT #TRANSFORM Z BY probit(X#) / convert to independent N(0,1) values
    *
    A Common specifications:
    *WHOME=271,0 WSIZE=381,381 HEADER= XDIV=0,1,0 YDIV=0,1,0 XLABEL= YLABEL=
    *WSTYLE=0 FRAME=3 LINETYPE=[line_width(3)],1 POINT=11 MODE=1024,1024
    *SCALE=-5,5
    B
    *..................................
    *Coordinate axes: (common background for each graph)
    *GPLOT X(t)=c*t,Y(t)=(1-c)*t / SPECS=A,B
    *c=0,1,1 t=-9,9,9
    *OUTFILE=Z0
    *..................................
    *Scatter diagram of two independent N(0,1) variables, 100'000 cases
    *GPLOT Z.MAT,1,2 / SPECS=A,B CONTOUR=[RED],0.001,0.5
    *PEN=[BLACK][SwissB(100)] TEXTS=T T=Z,20,900
    *INFILE=Z0 OUTFILE=Z1
    *
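The definition used in this demo, a linear transformation of independent N(0,1) variables, can be sketched outside Survo as follows. This Python example is an illustration under stated assumptions: the transformation matrix is taken to be the Cholesky factor of the target covariance matrix (one of many possible choices), and the function names `cholesky` and `sample_mvn` are hypothetical.

```python
import math
import random


def cholesky(sigma):
    """Lower triangular A with A*A' = sigma (sigma symmetric positive definite)."""
    n = len(sigma)
    a = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sigma[i][j] - sum(a[i][k] * a[j][k] for k in range(j))
            a[i][j] = math.sqrt(s) if i == j else s / a[j][j]
    return a


def sample_mvn(mean, sigma, size, seed=0):
    """Draw observations X = mean + A*Z where Z has independent N(0,1) components."""
    rng = random.Random(seed)
    a = cholesky(sigma)
    n = len(mean)
    sample = []
    for _ in range(size):
        z = [rng.gauss(0.0, 1.0) for _ in range(n)]
        sample.append(
            [mean[i] + sum(a[i][k] * z[k] for k in range(i + 1)) for i in range(n)]
        )
    return sample


if __name__ == "__main__":
    for row in sample_mvn([0.0, 0.0], [[1.0, 0.5], [0.5, 1.0]], 5):
        print(row)
```

Because every component of X is a linear combination of the same independent Z variables, only linear dependencies between the components can arise, as noted above.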

A related demo: Genesis of multivariate normal distribution 1

This demo in YouTube

FILE MEDIT is the most extensive program in SURVO MM for
displaying and editing of data files.

Data of many variables are displayed on a set of consecutive pages defined automatically or specified by the user.

Each page may contain fields for variables and free-format textual comments. Comments may be conditional, depending on the observation at hand.

Also derived fields containing functions of original variables may appear.

Sound effects, voice comments and graphical displays can be inserted. General checking facilities for data integrity are provided. Also various search facilities are available.

A related demo: Temperature in Helsinki

In the second half of this demo (displaying graphs), another Survo session is called to create graphics by PLOT commands. The instructions for the entire application are given in an edit field HELTERA3.EDT

This demo in YouTube

I started my work with computers in 1960 by programming statistical
software for the Elliott 803 computer. One of the special devices
in this computer was a tiny loudspeaker which received a pulse
each time the program executed a jump instruction.
Since computer programs always contain many jumps creating loops of
fixed or variable lengths,
each program generated a characteristic sequence
of sounds.

This was a useful feature because at that time a typical statistical analysis would take a long time, often more than ten minutes, and one learned to recognize the various stages of standard programs just by listening to the sound.

During my summer holiday in 1962 I decided to create a program just for producing sound sequences created more or less randomly. The program included subroutines for random 'trills' and 'glissandos', for example. It was also able to make random 'variations' on a 'theme' given by the operator from the keyboard or by the program itself.

It may have been the first program for both generating 'music' and
playing it in real time. I had a rough estimate that it would
take about 10^50 years before the program starts to repeat
itself **☺**.

The most serious drawback was that the highest tone
was only about H=1135 Hz (caused by the shortest possible loop) and all
other tones were H/2, H/3, H/4, ... Hz so that the scale of tones
was primitive indeed.

This example contains short excerpts of the sound output generated
by this program. The samples were taken by Erkki Kurenniemi on a recorder
of the Department of Music at the University of Helsinki in 1962.
They are now available on a CD "On-Off" produced by Petri Kuljuntausta.

See also
Peter Onion's video
about my program in YouTube. In this video the program code punched
on a paper tape is fed into the ferrite core memory of Elliott 803
and the program starts
immediately thereafter according to default settings.

This demo in YouTube

Use that YouTube version if you want to pause and/or slow down. Then
you have time for a more detailed study.

In the snapshot above numbers Wfirst=-0.4 and Wnext=1.4 are to be compared by an if statement on line 36. Since now Wfirst is not greater than Wnext, the program will continue on line 37 and inserts -0.4 in the beginning of line 45.

By a new 'echo' option it is possible to watch how a sucro program is working. This is a useful feature for debugging a sucro program and for teaching sucro programming.

The 'echo' option is available on various levels, the most stringent one demonstrated here. On this level the sucro code has to be visible in the Survo main window when the sucro is running and the sucro has to be saved just before it is activated with the ECHO2 parameter without scrolling the main window.

If this is not possible, the parameter ECHO has to be used and then only the information displayed on the bottom line, current command and values of (selected) variables, is available.

    *TUTSAVE SORT
    / /SORT sorts numbers below the command line in ascending order.
    / Defining variables:
    / def Wfirst Wnext Wr Wc Wc2
    /
    / Initialization and setting maximum speed:
    *{init}{tempo 0}
    *
    / Making room for the sorted list:
    *{R}{ins line}
    /
    / Setting a 'wall' |:
    * |
    /
    / Setting a space in front of the original list:
    *{line start}{u}{ins} {ins}{l}{Wc=1}
    /
    / Finding next number and recording its location:
    + A: {next word}{ref set 1}{save cursor Wr,Wc2}
    /
    / If no next number found, going to End:
    - if Wc2 = Wc then goto End
    /
    / Saving new number:
    *{Wc=Wc2}{save word Wfirst}{R}
    /
    / Finding next word on result line:
    + B: {next word}{save word Wnext}
    /
    / If it is the wall, inserting value of Wnext to the end:
    - if Wnext '=' | then goto C
    /
    / Repeating from B if proper location for Wfirst not yet found:
    - if Wfirst > Wnext then goto B
    /
    / Writing newest number to its place in the sorted list:
    + C: {ins}{write Wfirst} {ins}{ref jump 1}{goto A}
    /
    / Finishing by removing the wall and the original list:
    + End: {R}{del}{line end}{l}{del}{line start}{u}{del line}
    *{end}
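The logic of the /SORT sucro is an insertion sort with a sentinel ('wall') at the end of the result line. The following Python sketch mirrors that logic only; it is not a translation of the sucro keystrokes, and the names `sucro_style_sort` and `WALL` are hypothetical.

```python
WALL = None  # plays the role of the '|' character on the result line


def sucro_style_sort(numbers):
    """Insertion sort: place each number before the first larger-or-equal one."""
    result = [WALL]
    for first in numbers:                      # label A: take the next number
        position = 0
        # Label B: step along the result line while Wfirst > Wnext.
        while result[position] is not WALL and first > result[position]:
            position += 1
        result.insert(position, first)         # label C: write Wfirst in place
    result.pop()                               # label End: remove the wall
    return result


if __name__ == "__main__":
    print(sucro_style_sort([1.4, -0.4, 3, 2, 2]))
```

With Wfirst=-0.4 and Wnext=1.4, as in the snapshot, the comparison fails at once and -0.4 is written at the beginning of the sorted list.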

This demo in YouTube

All prime numbers below 10^6 are found by means of a sucro /SIEVE
using a simple sieve method.

A crucial step in the elimination of composite numbers is application of a new extended form of the SET command where an extra parameter (default is 1) gives a gap between lines to be treated.

That command appears on the line 33 above in the form

    SET A+{write Wstart},A+{write Wmax},A,{write Wprime}

and it takes actual forms

    SET A+4,A+1000000,A-1,2
    SET A+9,A+1000000,A-1,3
    SET A+25,A+1000000,A-1,5
    SET A+49,A+1000000,A-1,7

in the first four rounds.

The speed of this process is much higher than in the older example, but even here the fact that integers are represented as character strings and converted to double precision floating point numbers slows down the computation dramatically.

Even so, this approach is very slow when compared with a corresponding task carried out by a pure C program presented as a Survo command SIEVE below. Finding all primes below a million and loading them to the edit field (see lines 80- in the C code below) takes only 0.245 seconds and is thus about 100 times faster.

Thus in pure numerical computations the sucro technique is inefficient. Sucros are at their best when making tutorials like these demos or when the task at hand is a sequence of standard Survo operations and the same job has to be repeated many times from various initial conditions. See, for example, Combining Survo operations by sucros.

       1 *SAVE SIEVE
       2 *
       3 a/* _sieve.c 8.4.2015/SM (8.4.2015) */
       4 *
       5 *#include <stdio.h>
       6 *#include <stdlib.h>
       7 *#include <malloc.h>
       8 *#include <math.h>
       9 *#include <survo.h>
      10 *#include <survoext.h>
      11 *
      12 *unsigned int n;
      13 *char *prime;
      14 *int j,h;
      15 *
      16 *void main(argc,argv)
      17 *int argc; char *argv[];
      18 *    {
      19 *    unsigned int i,n,max,p,count,output;
      20 *    if (argc==1) return;
      21 *    s_init(argv[1]); // initializing Survo environment for this program
      22 *    if (g<2)
      23 *        {
      24 *        sur_print("\nUsage: SIEVE N [output=0,1,2]");
      25 *        WAIT; return;
      26 *        }
      27 *    n=atoi(word[1]);
      28 *    prime=(char *)malloc(n+1); // reserving space for prime indicators
      29 *    for (i=0; i<n+1; ++i) prime[i]='1'; // at the start all are primes
      30 *    max=(int)sqrt((double)n); // max number to be tested is sqrt(n)
      31 *    p=1; output=2;
      32 *    if (g>2) output=atoi(word[2]); // selecting scope of output
      33 *    while (p<max)
      34 *        {
      35 *        ++p;
      36 *        while (prime[p]=='0') ++p; // finding next prime p
      37 *        for (i=p*p; i<=n; i+=p) prime[i]='0'; // multiples of p composite
      38 *        } // all primes found
      39 *    count=0;
      40 *    j=r1+r; new_line();
      41 *    for (i=2; i<n+1; ++i)
      42 *        if (prime[i]=='1')
      43 *            {
      44 *            ++count; // counting number of primes
      45 *            if (output==2)
      46 *                {
      47 *                h+=sprintf(sbuf+h,"%u ",i); // collecting primes on a line
      48 *                if (h>c-7) // visible line length - 7
      49 *                    {
      50 *                    out();
      51 *                    new_line();
      52 *                    }
      53 *                }
      54 *            }
      55 *    if (output>0)
      56 *        {
      57 *        if (output==2) out();
      58 *        new_line();
      59 *        sprintf(sbuf,"Number of primes < %u is %u.",n,count);
      60 *        out();
      61 *        s_end(argv[1]); // output to be caught by the editor
      62 *        }
      63 *    return;
      64 *    }
      65 *
      66 *new_line()
      67 *    {
      68 *    ++j; h=0; *sbuf=EOS; // output=2
      69 *    return(1);
      70 *    }
      71 *
      72 *out()
      73 *    {
      74 *    edwrite(space,j,1);
      75 *    edwrite(sbuf,j,1);
      76 *    return(1);
      77 *    }
      78 A
      79 *
      80 *TIME COUNT START / Continuous activation by F2 ESC
      81 *SIEVE 1000000
      82 *TIME COUNT END 0.245
      83 *2 3 5 7 11 13 17 19 23 29 31 37 41 43 47 53 59 61 67 71 73 79 83 89
      84 *97 101 103 107 109 113 127 131 137 139 149 151 157 163 167 173 179
      85 *181 191 193 197 199 211 223 227 229 233 239 241 251 257 263 269 271
      86 *277 281 283 293 307 311 313 317 331 337 347 349 353 359 367 373 379
      87 *383 389 397 401 409 419 421 431 433 439 443 449 457 461 463 467 479
      88 *487 491 499 503 509 521 523 541 547 557 563 569 571 577 587 593 599
      89 *601 607 613 617 619 631 641 643 647 653 659 661 673 677 683 691 701
      90 *709 719 727 733 739 743 751 757 761 769 773 787 797 809 811 821 823
      91 *827 829 839 853 857 859 863 877 881 883 887 907 911 919 929 937 941
      92 *947 953 967 971 977 983 991 997 1009 1013 1019 1021 1031 1033 1039
    - - - - - - - - - - -
    7817 *999529 999541 999553 999563 999599 999611 999613 999623 999631 999653
    7818 *999667 999671 999683 999721 999727 999749 999763 999769 999773 999809
    7819 *999853 999863 999883 999907 999917 999931 999953 999959 999961 999979
    7820 *999983
    7821 *Number of primes < 1000000 is 78498.
    7822 *
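The sieve used both in the sucro (SET with a gap between lines) and in the C program above can be sketched in Python, where an extended slice with a step plays the role of the gap parameter. This is an illustration only, with the hypothetical function name `primes_below`; it makes no claim about relative speed in Survo.

```python
def primes_below(limit):
    """Return all primes smaller than limit by a simple sieve."""
    if limit < 2:
        return []
    is_prime = bytearray([1]) * limit        # indices 0..limit-1, 1 = prime
    is_prime[0] = is_prime[1] = 0
    p = 2
    while p * p < limit:
        if is_prime[p]:
            # Mark p*p, p*p+p, p*p+2p, ... as composite in one step,
            # like SET A+p*p,A+max,A-1,p in the sucro.
            count = len(range(p * p, limit, p))
            is_prime[p * p::p] = bytes(count)
        p += 1
    return [i for i in range(limit) if is_prime[i]]


if __name__ == "__main__":
    primes = primes_below(1000000)
    print(f"Number of primes < 1000000 is {len(primes)}.")
```

The count agrees with the last line of the Survo output above.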

This demo in YouTube

This as flash demo

In many demanding Survo applications sucros play an essential role.
When the user finds that a certain task consisting of a series
of Survo operations is encountered repeatedly, it is profitable to
let Survo save all the actions belonging to that task in the
tutorial mode in a sucro file.
Usually such a sucro needs some editing and polishing. This is
easily done by loading the sucro code into the edit field (TUTLOAD)
and, after modifications, saving the code back to a sucro file
(TUTSAVE).

Practically all demos in this collection have been created as sucros.

For example, the entire sucro code of this demo is:

      11 *
      12 */M
      13 *TUTSAVE M
      14 *{tempo -1}{init}{jump 1,1,1,1}SCRATCH {act}{line start}
      15 /
      16 *COLX W20{act}{line start}{erase}{tempo 2}{wait 100}{tempo -1}
      17 *{R}
      18 * {form7} Combining Survo operations by sucros {R}
      19 *{R}
      20 *Sucros are at their best when the task at hand is a sequence of{R}
      21 *standard Survo operations and the same job has to be repeated{R}
      22 *many times from various initial conditions.{R}
      23 *For example, sucros have been created for performing some multistage{R}
      24 *forms of statistical analysis.{R}
      25 *Here a sucro /FACTOR is presented. It carries out the standard steps{R}
      26 *of factor analysis:{R}
      27 *{R}
      28 *1. Computing correlations CORR.M{R}
      29 *2. Computing eigenvalues by spectral decomposition of CORR.M{R}
      30 *3. The number of factors f is determined as follows:{R}
      31 * Eigenvalues e(1)>=e(2)>=e(3)>=...{R}
      32 * Ratios s(i)=e(i+1)/e(i), if e(i)>=0.9, s(i)=1 else{R}
      33 * Let e(j)>=1 and e(j+1)=<1 and s(k)=min(s(j),s(j+1),s(j+2)){R}
      34 * Then f=k.{R}
      35 *4. Computing the maximum likelihood solution FACT.M by FACTA{R}
      36 *5. Computing the rotated factor matrix AFACT.M by ROTATE{R}
      37 *{R}
      38 */FACTOR is now applied to the dataset DECA on the 48 best athletes{R}
      39 *of the world in 1973.
      40 /
      41 *{tempo 2}{90}{R}
      42 *{d5}{tempo 0}{d14}{u19}{tempo 2}{10}
      43 *
      44 *The 10 event{tempo 0} variables will be considered
      45 * and thus set active in DECA:{tempo 2}{20}{R}
      46 *FILE ACTIVATE DECA{keys 2}{act}
      47 /
      48 *--AAAAAAAAAA--{exit}
      49 /
      50 *{keys 0}{10}{R}
      51 *Sucro /FACTOR{tempo 0} is activated with dataset DECA:
      52 *{tempo 2}{10}{R}
      53 */FACTOR DECA{keys 2}{act}{keys 0}{30}
      54 *
      55 *{d14}{20}{d14}{20}{del line}{d21}{u18}{10}
      56 /
      57 *The rotated{tempo 0} factor matrix is made
      58 * more informative by another sucro:{tempo 2}{20}
      59 *{R}
      60 */LOADFACT{keys 2}{act}{keys 0}{30}
      61 /
      62 *{d17}{u2}
      63 *Typically{tempo 0} for Survo, the work has documented itself and it may be {R}
      64 *repeated.{tempo 2}{20}
      65 * Now e.g. {tempo 0}a four-factor solution can be obtained 'manually'
      66 *{R}
      67 *and another rotation technique can be adopted:
      68 *{tempo 2}{30}{home}{5}{u55}{5}{r13}{10}4{r}35 {keys 2}{act}{keys 0}
      69 /
      70 *{10}{d}{l6}4{r}49{erase}{5} / ROTATION=ORTHO_CLF (by Jennrich)
      71 *{keys 2}{act}{keys 0}{10}{R}
      72 *{d28}{20}{d22}{u13}{10}SCRATCH{10}{act}{10}{R}
      73 *{u2}{keys 2}{act}{keys 0}{end}
      74 *

The source code of each demo in this collection lives in its own folder
and therefore a fixed short name **M** for the file (see lines 12,13 above)
is selected. The first lines (14-19), except the header text, are common
for all these demos.

The code of /FACTOR, as loaded into the edit field, is shown at the end of these comments.

In sucros intended for teaching or demonstrating it is important to
regulate the timing of the process.
This takes place by **wait** codes (**{wait 20}** or simply
**{20}** means
a wait of two seconds) and **tempo** codes
(**{tempo 0}** sets the fastest
speed for 'writing' and **{tempo 2}** a normal speed).
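
As a rough illustration of the wait codes, the following Python sketch (not part of Survo) adds up the explicit pauses found in a piece of sucro code. It assumes only what is stated above, namely that **{wait N}** and **{N}** both mean N tenths of a second; the function name `total_wait_seconds` is hypothetical, and the effect of tempo codes on the real running time is ignored.

```python
import re

# Assumption (from the text): {wait N} and the short form {N} both pause
# for N tenths of a second. Tempo codes are ignored in this sketch.
WAIT_CODE = re.compile(r"\{(?:wait )?(\d+)\}")

def total_wait_seconds(sucro_text):
    """Sum all explicit wait codes in a piece of sucro code, in seconds."""
    return sum(int(n) for n in WAIT_CODE.findall(sucro_text)) / 10.0

# Lines 16 and 55 of the demo sucro listed above
line_16 = "COLX W20{act}{line start}{erase}{tempo 2}{wait 100}{tempo -1}"
line_55 = "{d14}{20}{d14}{20}{del line}{d21}{u18}{10}"
print(total_wait_seconds(line_16), total_wait_seconds(line_55))
```

Codes such as {d14} or {tempo 2} are not counted, because the pattern accepts only a bare number or the word wait followed by a number.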

An essential part of sucro programming is the 'cursor choreography'
i.e. how the cursor is moved in the edit field. Also this sucro
contains plenty of codes like **{d14}** (14 steps downwards) or
**{r6}** (6 steps to the right).
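
The bookkeeping behind such cursor movements can be sketched in Python. This is only an illustration: it assumes that {dN}, {uN}, {rN} and {lN} move the cursor N steps down, up, right and left, that a missing N means one step (as in {d} or {r}), and it ignores the borders of the edit field. The name `net_cursor_shift` is hypothetical.

```python
import re

# Assumptions: {dN}/{uN}/{rN}/{lN} move the cursor N steps down/up/right/left
# and a missing N means one step. All other codes are ignored here.
MOVE_CODE = re.compile(r"\{([durl])(\d*)\}")
STEP = {"d": (1, 0), "u": (-1, 0), "r": (0, 1), "l": (0, -1)}

def net_cursor_shift(sucro_text):
    """Return the net (rows, columns) displacement caused by move codes."""
    row = col = 0
    for direction, count in MOVE_CODE.findall(sucro_text):
        n = int(count) if count else 1
        row += STEP[direction][0] * n
        col += STEP[direction][1] * n
    return row, col

# Line 42 of the demo sucro: scroll down through the text and come back
print(net_cursor_shift("{d5}{tempo 0}{d14}{u19}{tempo 2}{10}"))
```

For line 42 the downward and upward steps cancel, so the cursor returns to the line where it started.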

The **keys** codes control echoing of key strokes in the lower right corner
of the Survo window. **{keys 2}** starts echoing
and **{keys 0}** cancels it. This property is used on lines 46-50 when
selecting active variables from DECA and in most activations of
Survo commands.

Sucro /FACTOR, used here as a 'subroutine', is of a different nature:
it is a tool for rapid automatic execution of the typical stages of factor
analysis. It also contains conditional statements, for example for
determining the number of factors.

      11 *
      12 *TUTLOAD <Survo>\S\FACTOR
      13 / /FACTOR <data> / 10.6.1991/SM (13.5.1994)
      14 / or /FACTOR <data>,<number_of_factors>
      15 *{tempo -1}{init}{R}
      16 *SCRATCH {act}{home}
      17 - if W1 '=' ? then goto A
      18 - if W1 '<>' (empty) then goto S
      19 + A: /FACTOR <data>{R}
      20 *makes a factor analysis from active variables and observations of{R}
      21 *a Survo data <data>.{R}
      22 *The steps of analysis are:{R}
      23 *1. Computing correlations CORR.M{R}
      24 *2. Computing eigenvalues by spectral decomposition of CORR.M{R}
      25 *3. The number of factors f is determined as follows:{R}
      26 * Eigenvalues e(1)>=e(2)>=e(3)>=...{R}
      27 * Ratios s(i)=e(i+1)/e(i), if e(i)>=0.9{R}
      28 * s(i)=1 else.{R}
      29 * Let e(j)>=1 and e(j+1)=<1 and s(k)=min(s(j),s(j+1),s(j+2)){R}
      30 * Then f=k.{R}
      31 *4. Computing the maximum likelihood solution FACT.M by FACTA{R}
      32 *5. Computing the rotated factor matrix AFACT.M by ROTATE{R}
      33 *{R}
      34 *The user can also enter the number of factors f by activating{R}
      35 */FACTOR <data>,f{R}
      36 *{goto E}
      37 /
      38 + S: CORR {print W1}{act} / Correlation matrix saved as CORR.M{R}
      39 - if W2 > 0 then goto FAC
      40 *MAT SPECTRAL DECOMPOSITION OF CORR.M TO &S,&D{act}{R}
      41 *MAT DIM &S{act}{find =} {save word W3}
      42 - if W3 = 1 then goto F
      43 *{home}{erase}{ref}MAT LOAD &D,CUR+1{act}{R}
      44 *{d2}{W1=0}
      45 + Next_line: {R}
      46 *{W1=W1+1}{next word}{next word}{save word W2}
      47 - if W2 >= 1 then goto Next_line
      48 *{W4=W1-1}
      49 - if W4 < W3 then goto D
      50 + F: {R}
      51 *{ins line}Not a proper correlation matrix for factor analysis!
      52 *{goto E}
      53 + D: {}
      54 /
      55 / def We1=W3 We2=W4 We3=W5 We4=W6
      56 / def Wsmin=W7 Wf=W8 Ws=W9
      57 *{u}{save word We1}{d}{save word We2}{d}{save word We3}{d}
      58 *{save word We4}{Wsmin=We2/We1}{Wf=W1-1}
      59 - if We2 < 0.9 then goto C
      60 *{Ws=We3/We2}
      61 - if Ws > Wsmin then goto B
      62 *{Wsmin=Ws}{Wf=W1}
      63 + B: {}
      64 - if We3 < 0.9 then goto C
      65 *{Ws=We4/We3}
      66 - if Ws > Wsmin then goto C
      67 *{Wsmin=Ws}{Wf=W1+1}
      68 + C: {ref}{ref}{u}SCRATCH {act}{home}MAT &D=&D'{act}{home}{erase}MAT L
      69 *OAD &D,12.12,CUR{act}{home}{del line}{erase}MAT KILL &*{act}{home}
      70 *{erase}Eigenvalues of the correlation matrix CORR.M:{R}
      71 *{del9}{R}
      72 *{del9}{R}
      73 *{goto FAC2}
      74 /
      75 + FAC: {Wf=W2}
      76 + FAC2: {}
      77 - if Wf = 1 then goto F
      78 *FACTA CORR.M,{print Wf},END+2{act} / Factor matrix saved as FACT.M{R}
      79 *{ins line}{u}
      80 /
      81 *ROTATE FACT.M,{print Wf},END+2{act}
      82 /
      83 * / Rotated factor matrix saved as AFACT.M{R}
      84 + E: {tempo +1}{end}
      85 *
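
The rule of step 3 for choosing the number of factors can be restated outside Survo. The Python sketch below follows the verbal rule given in the listing (ratios s(i), the last eigenvalue e(j)>=1, and the minimum of s(j), s(j+1), s(j+2)); the function name `number_of_factors` and the example eigenvalues are hypothetical, ties are resolved towards the later index as the comparisons on lines 61 and 66 suggest, and the sucro's rejection of a one-factor result is left out.

```python
def number_of_factors(eigenvalues):
    """Number of factors f by the ratio rule described in /FACTOR (sketch)."""
    e = sorted(eigenvalues, reverse=True)    # e[0] corresponds to e(1)
    m = len(e)
    j = sum(1 for v in e if v >= 1)          # e(j) >= 1 and e(j+1) < 1
    if j == 0:
        raise ValueError("no eigenvalue reaches 1")

    def s(i):                                # i is 1-based as in the text
        if i >= m or e[i - 1] < 0.9:
            return 1.0                       # s(i)=1 when e(i) < 0.9
        return e[i] / e[i - 1]               # s(i)=e(i+1)/e(i)

    best_k, best_s = j, s(j)
    for k in (j + 1, j + 2):
        if k < m and s(k) <= best_s:         # ties go to the later index
            best_k, best_s = k, s(k)
    return best_k

# Hypothetical eigenvalues, not those of the DECA data
print(number_of_factors([3.2, 2.1, 1.4, 0.95, 0.5, 0.4, 0.3, 0.15]))
```

With these made-up eigenvalues j=3, but the steepest relative drop occurs after the fourth eigenvalue (0.95 is still at least 0.9), so the rule selects four factors.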

This demo in YouTube

In principle the same technique for finding prime numbers is used
as in an
earlier demo.

Now, instead of forming a long column, the integers are listed compactly on edit lines, which gives a better view of the process.

The sucro code with additional features (in red) used for tracing in the latter part of this demo can be studied here:

In this two-window mode, sucro SIEVE is saved by a sucro command /TUTSAVE (on line 24 above) and with parameter ECHO2.

It is possible to omit echoing for selected parts of the code by
control codes {-} and {+} appearing here on lines 34 and 41.

Thus the code on lines from 35 to 40 (used for finding next number
after the newest prime in the list) is not echoed.

This demo in YouTube

An old application presented in

(1965). Multiple Discriminant Analysis in Linguistic Problems. Statistical Methods in Linguistics, 4, 37-44.

is revisited now, 50 years later, by using a tenfold dataset.

Systematic samples of 3000 words from each of the languages Finnish,
Swedish, and English are collected from word lists
(Finnish, Swedish, and English).

When creating 43 numerical variables, some of them are based on (Finnish) hyphenation of a word. For this purpose a new key combination (F1 T) was introduced (in SURVO MM) for hyphenating the word touched by the cursor in the edit field.

In the old experiment certain words were plotted in the two-dimensional discriminant space as presented in this picture

and here is the result when the same words are plotted according to
this new experiment.

In the latter picture the vertical axis had to be reversed for a proper comparison.

My early (1965) experiment has been described recently (2013) by
Steve Pepper.

This demo in YouTube

The main source of this demo is my joint
paper
with Jari Pakkanen.

Certain probabilities related to the preserved columns of the ruined Temple of Zeus at Labraunda are obtained by using editorial computing in Survo.

This demo in YouTube

A study related to my joint
paper
with Jari Pakkanen is continued.

The values of probabilities obtained in the previous demo are compared to empirical frequencies now obtained by simulation.

This demo in YouTube

This study originated from a practical research problem of determining
the distribution of the distances travelled in Finland by postal parcels
in the early 1960s. A random sample of postal traffic in Finland was
collected for this and many other purposes.

I was then working as an assistant to Professor Leo Törnqvist, and
this practical problem gave us the idea for a more theoretical
problem: deriving the distribution of the distance between
two points chosen uniformly at random in a metric network.
Törnqvist arrived at the result through his phenomenal intuition,
without any formal derivation or proofs.
My task was to formalize the problem, verify the results, and generalize
them. This was done in my
doctoral thesis
in mathematics (1964).

Now I have selected a more practical approach. Due to the enormous
progress in computing speed and capacity over the past 50 years, the
distributions can now be studied by plain simulation, which also gives more
possibilities for generalizing the original problem.

Here a simple, regular network G(2) consisting of 2x2 unit squares is
studied for finding the density function of the length of the shortest
path (along the edges) between two random points selected according to
uniform distribution over the entire network of length 12 units.
The results were obtained by using results given in my dissertation.
In particular, the expected value of the distance is 5/3=1.666...

In the graph below the theoretical means for networks G(n), n=0,1,2,3,4,
are given

and it is obvious that generally the mean for G(n) is (2n+1)/3. When n grows, this mean divided by n approaches 2/3. This agrees with the fact that 2/3 is the expected value of the distance in the city-block metric between two random points inside a unit square, that is, 2 times the mean distance 1/3 in G(0), as explained below.

When we got interested in this topic (in the beginning of the 1960s),
it was natural to start by calculating means for the simplest networks,
like a single edge G(0) or a ring corresponding to G(1).

In the latter case it is easy to see that the first point may be fixed
and thereafter it is seen immediately that the mean is one fourth
of the total length. The fact that for an edge of length 1 the mean is
just 1/3 requires more effort. My favourite (but a little heuristic)
explanation was: "After selecting two random points on the edge,
let's select a third random point. The probability that it falls
between the two earlier ones is 1/3 for symmetrical reasons.
Due to uniformity, the last point covers 1/3 of the total length
'on average'."

This statement can even be generalized: If n random points are selected
from a unit interval, the expected value of the distance between the
extreme points is (n-1)/(n+1). This is 'proved' in just the same way
by selecting one more point.
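
The claim (n-1)/(n+1) is easy to check by simulation. The following Python sketch (not Survo code; the function name `mean_range` and the number of trials are arbitrary choices) estimates the mean distance between the extreme points of n uniform random points on the unit interval and prints it next to the theoretical value.

```python
import random

def mean_range(n_points, trials=100_000, seed=1):
    """Monte Carlo estimate of E(max - min) for n uniform points on (0,1)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        pts = [rng.random() for _ in range(n_points)]
        total += max(pts) - min(pts)
    return total / trials

for n in (2, 3, 5):
    print(n, round(mean_range(n), 3), (n - 1) / (n + 1))
```

For n=2 the estimate should be close to 1/3, the mean distance in G(0).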

A simple strict proof that for G(0) the mean is 1/3 goes as follows: Let x be the mean. When the unit edge is split into two equal parts of length 1/2, the probability of selecting the two points from the same half is 1/2, and from different halves also 1/2. The conditional means are x/2 in the first case (the same problem on an edge of half the length) and 1/2 in the second case (the distance between the midpoints of the halves). Then we get the equation x=1/2*x/2+1/2*1/2, from which x=1/3. (This was presented by Hannu Väliaho.)

When preparing my PhD thesis I made a program in Elliott Autocode
for the Elliott 803B computer. That program created the exact
density function, but required a lot of computer time.
For example, the G(10) case took about 3 hours.

Now, by using the GDIST operation, approximate results (10^6
simulations) are obtained in less than 10 seconds on my current (2015)
PC. For example, for G(10) I got the mean 6.995827, which is close enough to
the exact value 7.

The GDIST program calculates shortest distances between any points
on the edges of the network as follows. At first the distance matrix
between all end points is calculated by
Dijkstra's algorithm
and then, after selecting random points, the various distances through
the end points of the corresponding edges (4 alternatives) are
considered and the minimal distance is selected. As a special case,
the points may be selected on the same edge. Then as the fifth
alternative the distance between the points on that edge must be
considered, too. It should be noted that on 'curved' edges the
last alternative is not necessarily the shortest one.
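
The procedure described above can be imitated in a few lines of Python. This is not the GDIST program itself but a minimal sketch under simplifying assumptions: the network is the regular grid G(n) with n>=1, all edges are straight and of unit length (so an edge can be chosen uniformly), and the names `grid_network`, `dijkstra` and `simulate_mean_distance` are hypothetical.

```python
import heapq
import random

def grid_network(n):
    """Nodes and unit edges of G(n): an n x n array of unit squares (n >= 1)."""
    node = lambda r, c: r * (n + 1) + c
    edges = []
    for r in range(n + 1):
        for c in range(n + 1):
            if c < n:
                edges.append((node(r, c), node(r, c + 1), 1.0))
            if r < n:
                edges.append((node(r, c), node(r + 1, c), 1.0))
    return (n + 1) ** 2, edges

def dijkstra(n_nodes, edges, source):
    """Shortest distances from source to all end points."""
    adj = [[] for _ in range(n_nodes)]
    for a, b, w in edges:
        adj[a].append((b, w))
        adj[b].append((a, w))
    dist = [float("inf")] * n_nodes
    dist[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue
        for v, w in adj[u]:
            if d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist

def simulate_mean_distance(n, trials=100_000, seed=1):
    n_nodes, edges = grid_network(n)
    D = [dijkstra(n_nodes, edges, s) for s in range(n_nodes)]
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        i, j = rng.randrange(len(edges)), rng.randrange(len(edges))
        (a, b, w1), (c, d, w2) = edges[i], edges[j]
        t, u = rng.random() * w1, rng.random() * w2   # offsets from a and c
        # four alternatives through the end points of the two edges
        best = min(t + D[a][c] + u, t + D[a][d] + (w2 - u),
                   (w1 - t) + D[b][c] + u, (w1 - t) + D[b][d] + (w2 - u))
        if i == j:                                    # fifth alternative
            best = min(best, abs(t - u))
        total += best
    return total / trials

print(simulate_mean_distance(2))   # should be near 5/3 for G(2)
```

For networks with edges of different lengths the edge would have to be selected with probability proportional to its length, and on 'curved' edges the fifth alternative would use the distance along the edge.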

Distance distributions (Twin cities bridge problem)

Distance distributions (n-dimensional cube)

This demo in YouTube

As a more 'practical' example the new GDIST operation is used here for
optimization of a traffic network. The mean distance is minimized
by selecting the best place for a second bridge for connecting two
parts of a city.

In the A matrix

    MAT SAVE AS A
     1  2 10  1
     2  3  5  1
     3  4  5  1
     1  5  4  1
     2  6  4  1
     3  7  4  1
     4  8  4  1
     5  6 10  1
     6  7  5  1
     7  8  5  1
     7 12  6  0
     9 10  7  1
     -  -  -  -

the fourth element on each row tells the intensity of sending/receiving
traffic so that the probability of selecting a random point in the
network from a particular edge is proportional to the product of
its length and intensity. Thus in this example no journey has an origin
or a destination on the edge of a bridge (7 12 6 0).
The intensities can be any non-negative numbers.
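
The selection rule can be illustrated with a short Python sketch. It uses only the rows of A visible above (the matrix continues beyond them) and assumes that the first three elements of a row are the two end points and the length of the edge; the function name `random_point` is hypothetical and the code is not taken from GDIST.

```python
import random

# Visible rows of the A matrix. Assumed meaning of the columns:
# end point, end point, length, intensity.
A = [(1, 2, 10, 1), (2, 3, 5, 1), (3, 4, 5, 1), (1, 5, 4, 1),
     (2, 6, 4, 1), (3, 7, 4, 1), (4, 8, 4, 1), (5, 6, 10, 1),
     (6, 7, 5, 1), (7, 8, 5, 1), (7, 12, 6, 0), (9, 10, 7, 1)]

def random_point(edges, rng):
    """Pick an edge with probability proportional to length*intensity,
    then a uniform position on it (offset measured from the first end point)."""
    weights = [length * intensity for _, _, length, intensity in edges]
    a, b, length, _ = rng.choices(edges, weights=weights, k=1)[0]
    return a, b, rng.uniform(0, length)

rng = random.Random(2015)
sample = [random_point(A, rng) for _ in range(10_000)]
print(sum(1 for a, b, _ in sample if (a, b) == (7, 12)))   # bridge is never hit
```

Because the bridge row has intensity 0, its weight is zero and no random point falls on it, although the bridge can still be used as part of a route.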

Another generalization (not used in this example) is the possibility of adding a fifth element (1 or 2) to the A-rows, marking the edges denoted by 1 as sending regions and the edges denoted by 2 as receiving regions, so that only traffic between the two regions is considered.

When this possibility is applied to these "twin cities" so that only traffic between the upper and lower city is studied, in the original situation (no second bridge) the mean distance grows from 17.30 to 24.53 because the internal traffic in the two parts is excluded, but this modification does not change the optimal solution for the additional bridge.

It would be easy to modify the GDIST program so that a certain gravitation principle is adopted (the probability of taking a journey depends on the distance). Obviously this feature can be taken into account by a suitable transformation of the density function afterwards.

Distance distributions (Background information)

Distance distributions (n-dimensional cube)

This demo in YouTube

This is so far the only example where the distance distribution is
asymptotically normal. It is shown that in an n-dimensional unit cube
the distance (along the edges) between randomly selected vertices
is Bin(n,1/2), with mean n/2. The distribution of the distance between
random points on the edges has similar properties but it is more
complicated to handle.

The exact expected value for the distance (along the edges) between two random points on the edges of the 4-dimensional cube can be computed as follows: Because this 'network' is symmetric with respect to each edge, it is sufficient to assume that the first random point is selected from a given edge. Each element (vertex, edge, square, cube) of this hypercube can be denoted by a string of the form abcd where each of a, b, c, and d can be 0, 1, or x, where x covers the range (0,1). Thus, for example, 0000 is the first vertex (origin), x000 is the edge from the origin to (1,0,0,0), and xxx0 and x1xx are two of the eight cubes located in the hypercube.

The mean distances between x000 and the other edges can be easily computed and they are presented in the following table.

    edge  mean     edge  mean     edge  mean     edge  mean
    x000   1/3     0x00   1       00x0   1       000x   1
    x100   5/3     1x00   1       10x0   1       100x   1
    x010   5/3     0x10   2       01x0   2       010x   2
    x110   8/3     1x10   2       11x0   2       110x   2
    x001   5/3     0x01   2       00x1   2       001x   2
    x101   8/3     1x01   2       10x1   2       101x   2
    x011   8/3     0x11   3       01x1   3       011x   3
    x111  11/3     1x11   3       11x1   3       111x   3

The means in the first column are related to 'opposite' edges of x000. According to the terminology used in my dissertation (1964) they are mirror point sets for x000. It is clear that the mean inside x000 is 1/3, but why is the mean distance between random points on opposite edges of a unit square 5/3?

    .----.
    |    |
    |    |
    .----.

Let it be x. If two random points are selected from the edges of a unit square, the probability that they are selected from the same edge is 1/4, from opposite edges similarly 1/4, and from neighbouring edges 1/2. The means are 1/3, x, and 1, respectively. Since the total mean in a unit square is 1, we have the equation

    1 = 1/4*1/3 + 1/4*x + 1/2*1

giving x=5/3. The remaining means in the first column are thereafter easy to comprehend. The means in the remaining three columns are obvious, since those edges are either adjacent to x000 or adjacent through a route of a constant integer length.

Now the total expected value of the distance in the entire 4-dimensional cube is calculated as

    ((1*2+3*5+3*8+1*11-1)/3+2*3*(1*1+2*2+1*3))/(4*2^3)=2.03125

The general structure becomes still clearer in the 5-dimensional case where the expected value has the form

    ((1*2+4*5+6*8+4*11+1*14-1)/3+2*4*(1*1+3*2+3*3+1*4))/(5*2^4)=2.5291666666667

and

    2.5291666666667(12:ratio)=607/240 (0.00000000000003)

On the basis of these expressions it is obvious that in the n-dimensional case the expression for the mean distance can be presented in the form

    E(n)=((P(n+1)-1)/3+2*(n-1)*Q(n-2))/(n*2^(n-1))

Both P(n) and Q(n) are 'weighted' sums of binomial coefficients. In fact the P-sequence is an inverse binomial transform of the arithmetic sequence 2,5,8,11,14,... and the Q-sequence a similar transform of the natural numbers 1,2,3,4,5,...

The P(n) values for n=2,...,7 are

    n  P(n)
    2  1*2=2
    3  1*2+1*5=7
    4  1*2+2*5+1*8=20
    5  1*2+3*5+3*8+1*11=52
    6  1*2+4*5+6*8+4*11+1*14=128
    7  1*2+5*5+10*8+10*11+5*14+1*17=304

By consulting OEIS (The On-Line Encyclopedia of Integer Sequences) it is found that the sequence 2,7,20,52,128,304,... is A066373 and

    P(n)=(3*n-2)*2^(n-3), n=2,3,...

The Q(n) values for n=0,1,...,5 are

    n  Q(n)
    0  1*1=1
    1  1*1+1*2=3
    2  1*1+2*2+1*3=8
    3  1*1+3*2+3*3+1*4=20
    4  1*1+4*2+6*3+4*4+1*5=48
    5  1*1+5*2+10*3+10*4+5*5+1*6=112

and OEIS tells that 1,3,8,20,48,112,... is A001792 and

    Q(n):=(n+2)*2^(n-1), n=0,1,...

By substituting the expressions of P(n+1) and Q(n-2) into the previous formula of E(n) we obtain

    E(n)=(((3*n+1)*2^(n-2)-1)/3+2*(n-1)*n*2^(n-3))/(n*2^(n-1))

and this can be simplified into the form

    E(n)=n/2+(1-2^(2-n))/(6*n).
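
The chain of formulas above can be checked with exact rational arithmetic. The Python sketch below is an independent check, not Survo code: it builds P(n) and Q(n) directly as binomially weighted sums, evaluates E(n) from them, and compares the result with the closed form. The function names are hypothetical.

```python
from fractions import Fraction
from math import comb

def P(n):
    """Binomial weights C(n-2,k) applied to 2, 5, 8, ... (n >= 2)."""
    return sum(comb(n - 2, k) * (2 + 3 * k) for k in range(n - 1))

def Q(n):
    """Binomial weights C(n,k) applied to 1, 2, 3, ... (n >= 0)."""
    return sum(comb(n, k) * (k + 1) for k in range(n + 1))

def E(n):
    """Mean distance between random points on the edges of the n-cube."""
    return (Fraction(P(n + 1) - 1, 3) + 2 * (n - 1) * Q(n - 2)) / (n * 2 ** (n - 1))

def E_closed(n):
    """Closed form n/2+(1-2^(2-n))/(6*n)."""
    return Fraction(n, 2) + (1 - Fraction(4, 2 ** n)) / (6 * n)

for n in range(2, 8):
    print(n, E(n), E_closed(n), float(E(n)))
```

For n=4 and n=5 this reproduces 65/32=2.03125 and 607/240, and for n=2 it gives the mean 1 of the unit square used in the argument above.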

Distance distributions (Background information)

Distance distributions (Twin cities bridge problem)

Projections of a 4-dimensional cube

This demo in YouTube

I created this example as a
flash application
originally in 2006. There the weblink to Fisher's paper is not valid anymore.

Currently this paper "Bayes' theorem and the fourfold table" (1926) is available from https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2984620/.

This demo in YouTube

It is shown that although the pure minor triad is theoretically more
complicated than the corresponding major triad, this contrast is
essentially diminished when these triads are compared in equal
temperament. In the history of music the transition from meantone tuning
(with pure thirds) towards more practical temperaments has a substantial
role in the development of tonal western music, since most of the
composers have used keyboard instruments with a fixed intonation like
clavichord, harpsichord, organ or piano as tools in their work.

In my view, this is a simpler solution to the 'problem of the minor chord' than the purely theoretical explanations presented by many musicologists.

The oscillograms for pure triads were created by the following plotting schemes of Survo:

    HEADER=[Swiss(20)],Major_triad_(pure) HOME=0,175 SIZE=649,174
    XSCALE=0:_,10*pi:_ YSCALE=[SMALL],-3:_,3:_ pi=3.14159265
    X=0,10*pi,pi/60 XDIV=29,600,20 FRAME=0
    GPLOT Y(X)=sin(20*X)+sin(25*X)+sin(30*X)      20:25:30=4:5:6
    ........................................................
    HEADER=[Swiss(20)],Minor_triad_(pure) HOME=0,0 SIZE=649,174
    XSCALE=0:_,10*pi:_ YSCALE=[SMALL],-3:_,3:_ pi=3.14159265
    X=0,10*pi,pi/60 XDIV=29,600,20 FRAME=0
    GPLOT Y(X)=sin(20*X)+sin(24*X)+sin(30*X)      20:24:30=10:12:15
    ........................................................

In the corresponding tempered triads the proportions 20:25:30 and 20:24:30 were replaced by 20:20*2^(4/12):20*2^(7/12) and 20:20*2^(3/12):20*2^(7/12).

The sound file for the pure A Major triad was built in the standard Waveform Audio File (WAV) format as follows:

At first a Survo data file TEST of 100000 observations for an integer variable X was created by

    FILE MAKE TEST,1,100000,X,2

A fundamental tone of frequency F Hz at sampling rate RATE is described by a sine function of the form

    S(x):=sin(x*ORDER*2*pi*F/RATE)

where ORDER is the index of an observation in the data file TEST. When

    F=440, RATE=44100, and pi=3.141592653589793

S(1) will represent a sample of tone A=440 Hz, and a sample of a pure A Major triad is saved in the data file TEST as variable X by the VAR operation

    VAR X=1000*(S(1)+S(5/4)+S(3/2)) TO TEST

where 1000 gives the sound volume. Now this TEST file is converted to a WAV file MAJOR.WAV by the Survo operation

    PLAY DATA TEST,X / WAV=MAJOR

and thereafter it can be played as a sound of length 100000/RATE=2.267... seconds by activating

    PLAY SOUND MAJOR

The other triads were obtained by replacing S(1)+S(5/4)+S(3/2) in the VAR operation above by

    S(1)+S(6/5)+S(3/2)             pure minor
    S(1)+S(2^(4/12))+S(2^(7/12))   tempered major
    S(1)+S(2^(3/12))+S(2^(7/12))   tempered minor
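
For readers without Survo, the same construction can be sketched with the Python standard library. The sketch mirrors the VAR formula above (F=440, RATE=44100, 100000 observations, volume 1000); the 16-bit mono sample format and the names `triad_samples` and `to_wav_bytes` are assumptions of this sketch, not details taken from the PLAY operation.

```python
import io
import math
import struct
import wave

RATE = 44100      # sampling rate (Hz)
F = 440.0         # fundamental tone A
N = 100000        # number of observations, as in the data file TEST
VOLUME = 1000

def triad_samples(ratios, n=N):
    """Integer samples of VOLUME*(S(r1)+S(r2)+S(r3)) for ORDER=1,...,n."""
    out = []
    for order in range(1, n + 1):
        phase = order * 2 * math.pi * F / RATE
        out.append(int(round(VOLUME * sum(math.sin(r * phase) for r in ratios))))
    return out

def to_wav_bytes(samples, rate=RATE):
    """Pack samples as a mono WAV file held in memory."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)          # 16-bit samples (assumption of this sketch)
        w.setframerate(rate)
        w.writeframes(struct.pack("<%dh" % len(samples), *samples))
    return buf.getvalue()

major = to_wav_bytes(triad_samples([1, 5 / 4, 3 / 2]))                 # pure major
tempered_minor = to_wav_bytes(triad_samples([1, 2 ** (3 / 12), 2 ** (7 / 12)]))
print(len(major), N / RATE)
```

Writing `major` to a file with a .wav extension gives a sound of the same length, 100000/RATE seconds, as in the Survo example.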

This demo in YouTube

In 1972 when reading the classical treatise "On the Sensations of Tone"
by Helmholtz (1863)

I noticed his
graphical presentation

on the consonance of various musical intervals based on practical
experiments on violin.

I formulated these results as a theoretical model as follows:

    Minimum points of the dissonance function

                                    "Accuracy of the ear"
                                    c=5     c=4     c=3     c=2
    Unison                          1:1     1:1     1:1     1:1
    Just minor semitone             18:17
    Minor diatonic semitone         17:16
    Just diatonic semitone          16:15
    Septimal diatonic semitone      15:14
    Lesser tridecimal 2:3-tone      14:13
    Greater tridecimal 2:3-tone     13:12
    Neutral second                  12:11
    Neutral second                  11:10
    Major second                    10:9    10:9
    Major second                    9:8     9:8
    Septimal major second           8:7     8:7
                                    15:13
    Septimal minor third            7:6     7:6
                                    13:11
    Just minor third                6:5     6:5     6:5
    Neutral third                   11:9
    Major third                     5:4     5:4     5:4
    Diminished major third          14:11
    Septimal major third            9:7     9:7
    Diminished fourth               13:10
    Perfect fourth                  4:3     4:3     4:3     4:3
                                    15:11
    Eleventh harmonic               11:8
    Lesser septimal tritone         7:5     7:5
    Greater septimal tritone        10:7    10:7
                                    13:9
    Inversion of 11th harmonic      16:11
    Perfect fifth                   3:2     3:2     3:2     3:2
                                    17:11
    Septimal minor sixth            14:9
    Undecimal minor sixth           11:7    11:7
    Just minor sixth                8:5     8:5
    Tridecimal neutral sixth        13:8
    Just major sixth                5:3     5:3     5:3
                                    17:10
    Septimal major sixth            12:7
    Harmonic seventh                7:4     7:4     7:4
    Small just minor seventh        16:9
    Greater just minor seventh      9:5     9:5
    Undecimal neutral seventh       11:6    11:6
                                    13:7
    Just major seventh              15:8
                                    17:9
                                    19:10
                                    21:11
    Octave                          2:1     2:1     2:1     2:1

My simple intuitive approach to this problem is conceptually different from the results presented by

Plomp, Levelt, and Terhardt, for example, which are based on psychoacoustics and auditory experiments.

An excellent source for this topic is the book

Tuning, Timbre, Spectrum, Scale (Second edition 2005) by William A. Sethares.

In Survo the same algorithm is available also for Rational approximations of decimal numbers by 'listening'.

This demo in YouTube

In 2001 I programmed a PLAY operation for making various acoustic
experiments in Survo.

This demo shows how synthetic WAV files are generated by using PLAY and Survo operations related to data files (FILE MAKE, VAR, SER).

For example, it was interesting to see how a sequence of random tones, when tamed by a simple moving average transformation, becomes more 'melodic' (corresponding to the Slutsky-Yule effect in economic time series).
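
The smoothing effect can be seen numerically without any sound. In the Python sketch below (an illustration only, with a hypothetical scale of 11 tone indices and a window of 4) a sequence of independent random tones is passed through a simple moving average, and the lag-1 autocorrelation is computed before and after.

```python
import random

def moving_average(xs, window):
    """Simple moving average of consecutive windows."""
    return [sum(xs[i:i + window]) / window for i in range(len(xs) - window + 1)]

def lag1_autocorr(xs):
    """Lag-1 autocorrelation coefficient of a sequence."""
    m = sum(xs) / len(xs)
    num = sum((a - m) * (b - m) for a, b in zip(xs, xs[1:]))
    den = sum((a - m) ** 2 for a in xs)
    return num / den

rng = random.Random(2001)
tones = [rng.randrange(11) for _ in range(2000)]   # hypothetical tone indices 0..10
smooth = moving_average(tones, 4)
print(round(lag1_autocorr(tones), 2), round(lag1_autocorr(smooth), 2))
```

The raw sequence has practically no serial correlation, whereas consecutive smoothed values share three of their four terms and therefore move in small steps, which is what makes the result sound more like a melody.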

This demo in YouTube

In Survo artificial sound files can be generated by defining waveforms
by means of the VAR function.
Combinations of trigonometric functions are suitable for such
applications.

The harmony of sounds is here distorted by jumps triggered by an int function (on edit lines 13,14) rounding numbers down to the nearest integer.

A similar effect plays an important role visually in

Lissajous curve variation (Knitting a carpet)

This demo in YouTube

It is shown how the PLAY operation may be a useful tool for
finding structural features and for detecting outliers in statistical
data sets.

I created this feature in a restricted form in 1987 for SURVO 84C.

There it was working in connection with the FILE SHOW operation.

Series of values of any numerical variable in a Survo data (list, table, or file) can be converted to musical tones and played by the PLAY DATA operation.

A tone file TRIADS.TON has an essential role in this task. Each tone file is a standard text file with a specific structure. For example, the tone file TRIADS.TON was created in the Survo edit field in this way

(by activating the SAVEP command on line 11):

      11 *SAVEP CUR+1,E,TRIADS.TON
      12 *1 / Type
      13 *44100 / Rate Hz
      14 *11 / # of tones
      15 *9 / # of partials
      16 * 0  110 1 S1:.9 S2:.3 S3:.3 S4:.2 S5:.2 S6:.1 S7:.1 S8:.05 S9:.05
      17 * 1  137 1 S1:.9 S2:.3 S3:.3 S4:.2 S5:.2 S6:.1 S7:.1 S8:.05 S9:.05
      18 * 2  165 1 S1:.9 S2:.3 S3:.3 S4:.2 S5:.2 S6:.1 S7:.1 S8:.05 S9:.05
      19 * 3  220 1 S1:.9 S2:.3 S3:.3 S4:.2 S5:.2 S6:.1 S7:.1 S8:.05 S9:.05
      20 * 4  275 1 S1:.9 S2:.3 S3:.3 S4:.2 S5:.2 S6:.1 S7:.1 S8:.05 S9:.05
      21 * 5  330 1 S1:.9 S2:.3 S3:.3 S4:.2 S5:.2 S6:.1 S7:.1 S8:.05 S9:.05
      22 * 6  440 1 S1:.9 S2:.3 S3:.3 S4:.2 S5:.2 S6:.1 S7:.1 S8:.05 S9:.05
      23 * 7  550 1 S1:.9 S2:.3 S3:.3 S4:.2 S5:.2 S6:.1 S7:.1 S8:.05 S9:.05
      24 * 8  660 1 S1:.9 S2:.3 S3:.3 S4:.2 S5:.2 S6:.1 S7:.1 S8:.05 S9:.05
      25 * 9  880 1 S1:.9 S2:.3 S3:.3 S4:.2 S5:.2 S6:.1 S7:.1 S8:.05 S9:.05
      26 *10 1100 1 S1:.9 S2:.3 S3:.3 S4:.2 S5:.2 S6:.1 S7:.1 S8:.05 S9:.05
      27 *

The first four lines (12-15) define the general structure of the file.

So far "Type" 1 is the only alternative. "Rate Hz" is the sampling rate used when creating Waveform File Format (WAV) audio files by PLAY DATA. "# of tones" is the number of tones defined in the tone file and "# of partials" gives the number of partials (fundamental tone + overtones) for each tone.

Each tone is defined on a line of its own (lines 16-26 above) and has the form

index 0,1,2,...,

fundamental tone in Hz,

relative volume (1 for each tone in this case),

list of partials with their relative frequencies as multipliers of
the fundamental tone and their relative powers.

For example, tone 6 above is 440 Hz with relative volume 1, and in the list of partials S3:.3 denotes the third partial (3 times the fundamental tone) with a relative power of 0.3.

Here 'S' refers to a sinusoidal partial.

In this case, all tones have the same composition of partials, but in general the partials may be different for each tone and they need not be integer multiples of the fundamental tone.
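
The structure of a tone line can be made concrete with a small parser. The following Python sketch is not Survo code: it reads one tone definition in the form described above and returns the index, the fundamental tone, the relative volume and the list of partials. The function name `parse_tone_line` is hypothetical, and the sketch assumes that the first character of a partial is its type and the rest its frequency multiplier.

```python
def parse_tone_line(line):
    """Split a tone definition such as '6 440 1 S1:.9 S2:.3 ...' into parts."""
    fields = line.split()
    index, fundamental, volume = int(fields[0]), float(fields[1]), float(fields[2])
    partials = []
    for item in fields[3:]:
        kind_and_multiplier, power = item.split(":")
        kind = kind_and_multiplier[0]                 # 'S' = sinusoidal partial
        multiplier = float(kind_and_multiplier[1:])   # multiple of the fundamental
        partials.append((kind, multiplier, float(power)))
    return index, fundamental, volume, partials

# Tone 6 of TRIADS.TON (edit line 22 above)
index, hz, volume, partials = parse_tone_line(
    " 6 440 1 S1:.9 S2:.3 S3:.3 S4:.2 S5:.2 S6:.1 S7:.1 S8:.05 S9:.05")
for kind, multiplier, power in partials:
    print(kind, multiplier * hz, power)   # frequency in Hz and relative power
```

For tone 6 the third partial comes out at 1320 Hz with relative power 0.3. Because the multiplier is read as a decimal number, non-integer partials would be handled in the same way.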

This demo in YouTube

The entire statistical time series of the monthly mean temperature
and rainfall in Helsinki from 1829 to the end of 2015
(187 consecutive years) is here presented in 226 seconds as
a sequence of two voices.

Thus the time elapses here over 26 million times faster than the real time.
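
The speed-up factor follows from simple arithmetic, sketched here in Python. The average year length of 365.25 days is an assumption of the sketch. The second figure, monthly observations played per minute, is added only because it lands close to the TEMPO value 596 mentioned further below; reading TEMPO as tones per minute is my guess, not something stated in the text.

```python
years = 187
playing_seconds = 226
seconds_per_year = 365.25 * 24 * 3600          # average year length (assumption)

speedup = years * seconds_per_year / playing_seconds
months_per_minute = years * 12 / playing_seconds * 60

print(round(speedup), round(months_per_minute, 1))
```

The first number is a little above 26 million, in agreement with the statement above.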

The snapshot above shows in detail how this demo was generated.

For both voices the following simple "tone generator" TRIADS1.TON was used (see line 10 above). Its properties are given at the end of these comments.

The specifications Temp and Rain (lines 11-12) tell how voices related to Temp and Rain variables would appear.

The parameters in these definitions are in general:

data variable creating the voice (Temp, Rain)

tone generator (index from TONES specification)

channel selection (0=left 1=right)

relative volume (1,1)

decay rate of volume <= 1 (default 1 = no decay)

SCALING=1 (line 14) tells that the values of the data variables in use are linearly mapped to the indices (0,1,2,...) of the corresponding tone generator. If it is not given, the values of the variables must already be integers 0,1,... within the scale of the corresponding tone generator.
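
A linear mapping of this kind can be sketched in Python as follows. How PLAY DATA actually chooses the end points and rounds the result is not stated here, so mapping the observed minimum to index 0 and the maximum to the highest index, with rounding to the nearest integer, is an assumption of this sketch, as is the name `scale_to_tone_indices`.

```python
def scale_to_tone_indices(values, n_tones):
    """Map values linearly onto tone indices 0..n_tones-1 (assumed rule)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0 for _ in values]
    return [round((v - lo) / (hi - lo) * (n_tones - 1)) for v in values]

# Hypothetical monthly temperatures mapped to a generator of 11 tones
print(scale_to_tone_indices([-10, 0, 20], 11))
```

With 11 tones the lowest value is played as tone 0 and the highest as tone 10.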

The specifications on the line 15 determine observations of the data to be used (RECORDS), tempo of "music" (TEMPO), and general volume (VOLUME).

In this case I have wanted Survo to display the current year as a text during the fast flow of years.

This takes place by calling another Survo session to play the sound by 'activating' the PLAY DATA command (line 16) by a special key sucro (modification of PRE M Z).

Then the 'main' Survo is free to do anything during the play and in this case displays the years on line 20 by a simple sucro.

To get the sound and the display of the year synchronized, the TEMPO specification has been adjusted to value 596 just by trial and error.
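
The figures quoted above can be checked with a few lines of Python. The sketch assumes 12 observations per year and an average year of 365.25 days. Reading TEMPO as tones per minute is also an assumption, although it is consistent with the value 596 found by trial and error.

```python
YEARS = 187                 # 1829 to the end of 2015
DURATION_S = 226            # length of the demo in seconds
SECONDS_PER_YEAR = 365.25 * 24 * 3600   # assumed average year length

speedup = YEARS * SECONDS_PER_YEAR / DURATION_S
tones_per_minute = YEARS * 12 / DURATION_S * 60   # one tone per monthly observation

print(f"speed-up factor: {speedup:,.0f}")
print(f"tones per minute: {tones_per_minute:.1f}")
```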

Here are the properties of TRIADS1.TON:

    11 *SAVEP CUR+1,E,TRIADS1.TON
    12 *1 / Type
    13 *44100 / Rate Hz
    14 *11 / # of tones
    15 *3 / # of partials
    16 * 0  110 1 S1:1 S2:0.5 S3:0.25
    17 * 1  137 1 S1:1 S2:0.5 S3:0.25
    18 * 2  165 1 S1:1 S2:0.5 S3:0.25
    19 * 3  220 1 S1:1 S2:0.5 S3:0.25
    20 * 4  275 1 S1:1 S2:0.5 S3:0.25
    21 * 5  330 1 S1:1 S2:0.5 S3:0.25
    22 * 6  440 1 S1:1 S2:0.5 S3:0.25
    23 * 7  550 1 S1:1 S2:0.5 S3:0.25
    24 * 8  660 1 S1:1 S2:0.5 S3:0.25
    25 * 9  880 1 S1:1 S2:0.5 S3:0.25
    26 E10 1100 1 S1:1 S2:0.5 S3:0.25
    27 *
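
To make the tone definitions concrete, the following Python sketch synthesizes one tone as a sum of sinusoidal partials. It assumes that `Sk:a` means a sine wave at k times the fundamental frequency with relative amplitude a; whether Survo treats the weight as amplitude or as power is not specified here, and the function name `synthesize` is hypothetical.

```python
import math

RATE = 44100  # samples per second, as on line 13 of TRIADS1.TON


def synthesize(freq_hz, partials, seconds, volume=1.0):
    """Return samples of a tone built from sinusoidal partials.

    partials is a list of (multiple, weight) pairs, e.g. S2:0.5 -> (2, 0.5).
    Assumption: weights are relative amplitudes; the sum is divided by the
    total weight so that the peak cannot exceed the given volume.
    """
    total = sum(w for _, w in partials)
    n = round(RATE * seconds)
    return [
        volume * sum(w * math.sin(2 * math.pi * k * freq_hz * t / RATE)
                     for k, w in partials) / total
        for t in range(n)
    ]


# Tone 6 of TRIADS1.TON: 440 Hz with partials S1:1 S2:0.5 S3:0.25
samples = synthesize(440, [(1, 1.0), (2, 0.5), (3, 0.25)], seconds=0.1)
print(len(samples), max(abs(s) for s in samples))
```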

The monthly statistics of the mean temperature and the rainfall are presented below as plots of time series for the last 20 years.

Because the temperature varies rather systematically between low and high values during each year, it gives a clear cycle of 12 tones, whereas the tones from the rainfall are typically on a low level and high tones are like random peaks (cuckoos singing in the rain:).

This demo in YouTube

This recreation of SURVO 66 reproduces its essential features.

### Cross tabulations with SURVO66

Example #107 (Start the demo by clicking the picture!)

### Printing a small document

Example #108 (Start the demo by clicking the picture!)

### Circle estimation

Example #109 (Start the demo by clicking the picture!)

See also: Estimation of nonlinear regression models

### Contour ellipses on a graph paper

Example #110 (Start the demo by clicking the picture!)

### Sampling from a discrete uniform distribution

Example #111 (Start the demo by clicking the picture!)

### Merits of slow plotting

Example #112 (Start the demo by clicking the picture!)

### Tuning roots of algebraic equations by "listening"

Example #113 (Start the demo by clicking the picture!)

Operations on polynomials

Matrix operations

The example is taken from Statistical programming language SURVO 66

by T.Alanko, S.Mustonen, M.Tienari (1968). BIT, 8, 69-85.

A more comprehensive document is

Tilastollinen tietojenkäsittelyjärjestelmä SURVO 66

(in Finnish) by S.Mustonen (1967).

The SURVO66 operation is mainly a historical account of the origin of Survo. However, in cases where a great number of two-dimensional tables of frequencies, means, and standard deviations have to be made from large data, the SURVO66 operation is more efficient than the TAB operation since everything is obtained in one pass of the data set.

SURVO66 does its job through three stages T1, T2, and T3:

T1: Reading and storing the program. Storage space is allocated for each operation and the sum locations are cleared.

T2: Reading the data so that only one observation is in the core memory at a time. (This is typical in SURVO MM also.)

The entire SURVO66 program is obeyed for each observation and 'sufficient' statistics are collected.

T3: After all observations have been processed, the SURVO66 program is scanned once more. 'Sufficient' statistics are processed into final form of the output and saved as a text file SURVO66.TXT.
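
The three stages can be mimicked in a few lines of Python. This is only a sketch of the one-pass idea, not a reconstruction of SURVO66: the function names and the tiny data set are hypothetical. For each table cell the 'sufficient' statistics n, Σx and Σx² are accumulated while the data are read once, and means and standard deviations are formed afterwards.

```python
import math
from collections import defaultdict


def one_pass_tables(observations):
    """Stage T2: read each (row_class, col_class, x) once and update the sums."""
    sums = defaultdict(lambda: [0, 0.0, 0.0])   # T1: cleared n, sum, sum of squares
    for row, col, x in observations:
        cell = sums[(row, col)]
        cell[0] += 1
        cell[1] += x
        cell[2] += x * x
    return sums


def finalize(sums):
    """Stage T3: turn the sums into (n, mean, stddev) for each cell."""
    result = {}
    for key, (n, s, ss) in sums.items():
        mean = s / n
        # The sample standard deviation is undefined for a single observation
        sd = math.sqrt(max(ss - n * mean * mean, 0.0) / (n - 1)) if n > 1 else None
        result[key] = (n, mean, sd)
    return result


# Hypothetical data: (cost class, year, speed)
data = [("CHEAP", 63, 2.0), ("CHEAP", 63, 4.0), ("MODER", 67, 154.842)]
tables = finalize(one_pass_tables(data))
print(tables)
```

A cell with a single observation gets no standard deviation, which corresponds to a '-' entry in a table of standard deviations.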

Functions of SURVO MM are used in certain operations of SURVO66, like output formatting in CORREL and computations and output of REGRAN (based on sufficient statistics from CORREL).

The example presented in this demo does not contain any conditional operations (IF, EQUAL, LESS, BETWEEN, OR, AND, NOT) available also in the SURVO66 operation.

See also Cross tabulations with SURVO66

Sources and recordings related to Elliott 803 computer:

Elliott 803

Peter Onion's video

Early sound experiment on Elliott 803 computer in 1962

Using Elliott 803 (in Finnish) (Martti Tienari and Seppo Mustonen)

The basic contents of this example:

A data set from a statistical research by Dr. Knight on computer characteristics is used and it has the following contents as a text file KNIGHT.TXT (N=91):

    SHOW KNIGHT.TXT
    %
    % Date         Scientific       Commercial       Inverse unit
    % introduced   power (ops/sec)  power (ops/sec)  cost (sec/$)
    % month year
        4    63       21420.000        9079.000         44.54
        4    65      224374.000      118154.000         15.20
        7    63       67660.000       23420.000         23.98
        4    65        1768.000         990.500        230.90
        1    63       68690.000       58880.000         13.86
        4    65       68497.000       29571.000        103.90
       --    --      ----------      ----------        ------

The main target in this example was to see how well Grosch's law P=kC^2 fits to Knight's data by using a linear model

    ln(P) = a0 + a1*ln(C) + a2*T

where P is 1000 ops/sec, C is $/hour, and T is the age in months. This law states that a1 should be close to 2.

The program code (with comments in parentheses) is:

    SURVO66 KNIGHT.TXT
    Evolving Computer performance 1963-1967
    M@5 (number of variables)
    CALL@X1 MONTH (rename variables)
    @X2 YEAR
    DEF@MONTH L:1 U:12 (set limits)
    @YEAR L:63 U:67
    DIV@SPEED X3 1000 (SPEED=X3/1000)
    DIV@COST 3600 X5 (COST=3600/X5)
    SUB@Y1 68 YEAR (Y1=68-YEAR)
    MULT@Y2 12 Y1 (Y2=12*Y1)
    SUB@AGE Y2 MONTH (AGE= age in months)
    LOG@LSPEED SPEED (LSPEED=log(SPEED))
    @LCOST COST
    CLASS@COSTCL (classification COSTCL)
    CHEAP 30 (upper limit for CHEAP is 30)
    MODER 90
    EXPNS 500
    TABLE@YEAR - (column variable, default classes 63-67)
    DEVEL COST COSTCL (table DEVEL, row variable COST with COSTCL)
    T:SPEED (tables of means and stddevs of SPEED)
    CORREL@LSPEED LCOST AGE N:CORR (correlations of variables LSPEEED-AGE)
    REGRAN@LSPEED LCOST AGE N:CORR (regression analysis using CORR)
    END@

The original names X1,X2,... of the variables are renamed in the code and the code is activated by the SURVO66 command with the name of the data set (KNIGHT.TXT) as the only parameter:

The results have been saved in a text file SURVO66.TXT. They are now loaded into the edit field by

    LOADP SURVO66.TXT
    Evolving Computer performance 1963-1967
    M@5 (number of variables)
    CALL@X1 MONTH (rename variables)
    @X2 YEAR
    DEF@MONTH L:1 U:12 (set limits)
    @YEAR L:63 U:67
    DIV@SPEED X3 1000 (SPEED=X3/1000)
    DIV@COST 3600 X5 (COST=3600/X5)
    SUB@Y1 68 YEAR (Y1=68-YEAR)
    MULT@Y2 12 Y1 (Y2=12*Y1)
    SUB@AGE Y2 MONTH (AGE= age in months)
    LOG@LSPEED SPEED (LSPEED=log(SPEED))
    @LCOST COST
    CLASS@COSTCL (classification COSTCL)
    CHEAP 30 (upper limit for CHEAP is 30)
    MODER 90
    EXPNS 500
    TABLE@YEAR - (column variable, default classes 63-67)
    DEVEL COST COSTCL (table DEVEL, row variable COST with COSTCL)
    T:SPEED (tables of means and stddevs of SPEED)
    Table: DEVEL  Column variable: YEAR  Row variable: COST
    Frequencies
               63   64   65   66   67  Total
    CHEAP       6    4   10    7    4     31
    MODER       7   11    9    5    1     33
    EXPNS       6    6    6    6    3     27
    Total      19   21   25   18    8     91
    Chi2=6.0617 df=8 P=0.64032
    Means of SPEED
                  63        64        65        66        67     Total
    CHEAP   5.529187  2.078415  20.80887  1.658377  36.56175  13.14300
    MODER   13.25171  54.88778  50.50800  439.0866  154.8420  106.1023
    EXPNS   198.3025  1371.591  1123.889  1875.849  1419.726  1173.221
    Total   69.25011  421.0299  296.2399  747.8964  570.0332  391.0525
    Standard deviations of SPEED
                  63        64        65        66        67     Total
    CHEAP   9.650197  2.826062  28.63729  2.051873  55.15149  26.80980
    MODER   15.16012  59.95012  47.31911  484.0829         -  231.6778
    EXPNS   170.9234  2770.993  1412.064  2187.189  1567.665  1823.350
    Total   127.8364  1517.006  801.2234  1472.591  1095.507  1114.315
    CORREL@LSPEED LCOST AGE N:CORR (correlations of variables LSPEEED-AGE)
    Means and standard deviations
    MATRIX CORR_M.M
    S66MSN
    ///         Mean    Stddev         N
    LSPEED   3.05430   3.11669  91.00000
    LCOST    3.90558   1.23418  91.00000
    AGE     32.97802  14.36120  91.00000
    Correlation coefficients
    MATRIX CORR_R.M
    S66CORR
    ///       LSPEED     LCOST       AGE
    LSPEED   1.00000   0.80688  -0.17920
    LCOST    0.80688   1.00000   0.05398
    AGE     -0.17920   0.05398   1.00000
    REGRAN@LSPEED LCOST AGE N:CORR (regression analysis using CORR)
    LINREG S66DATA>.M,CUR+1 / RESULTS=0
    Linear regression analysis: Data S66DATA>.M, Regressand LSPEED  N=91
    Variable  Regr.coeff.  Std.dev.       t    beta
    AGE       -0.048483    0.012672  -3.826  -0.223
    LCOST      2.068079    0.147458   14.02   0.819
    constant  -3.423891    0.716240  -4.780
    Variance of regressand LSPEED=9.713751390 df=90
    Residual variance=2.972162239 df=88
    R=0.8372 R^2=0.7008
    END@

Grosch's law seems to fit well to Knight's data.
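
That a regression analysis needs nothing more than the sufficient statistics from CORREL can be illustrated with the Python sketch below. It is not the REGRAN code; it simply solves the normal equations for two regressors from the means, standard deviations and correlations printed above (rounded to five decimals, so the last digits may differ slightly).

```python
# Sufficient statistics printed by CORREL above
means = {"LSPEED": 3.05430, "LCOST": 3.90558, "AGE": 32.97802}
sds = {"LSPEED": 3.11669, "LCOST": 1.23418, "AGE": 14.36120}
r_yc, r_ya, r_ca = 0.80688, -0.17920, 0.05398   # LSPEED-LCOST, LSPEED-AGE, LCOST-AGE

# Standardized coefficients (beta) from the 2x2 normal equations
det = 1.0 - r_ca ** 2
beta_cost = (r_yc - r_ca * r_ya) / det
beta_age = (r_ya - r_ca * r_yc) / det

# Back to the original units of the variables
b_cost = beta_cost * sds["LSPEED"] / sds["LCOST"]
b_age = beta_age * sds["LSPEED"] / sds["AGE"]
const = means["LSPEED"] - b_cost * means["LCOST"] - b_age * means["AGE"]
r2 = beta_cost * r_yc + beta_age * r_ya

print(f"LCOST {b_cost:.6f}  AGE {b_age:.6f}  constant {const:.6f}  R^2 {r2:.4f}")
```

The coefficient of LCOST comes out close to 2, in line with Grosch's law.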

This demo in YouTube

It is shown that SURVO66 is considerably faster than the TAB operation of SURVO MM when many cross tabulations have to be made from a large data set.

See also Resurrection of SURVO 66

This demo in YouTube

In the window above the PRINT command on line 29 prints lines from CUR+1=30 to E=39 in a PostScript file DOC1.PS.

The plain text on lines 30-33 is printed by using the default font [Times(12)] and with line spacing [line_spacing(12)].

The graphs (histograms) are included on lines 34-36.

% 500 on line 34 allocates vertical space of 500 Points (1 Point = 0.3528 mm) for these graphs.

The graphs are included by - picture commands of the form

- picture name_of_PS_file,x,y,scale_x,scale_y

where x and y give coordinates of the left lower corner of the graph in units of 0.1 mm. A plain * indicates just the current position on the current page.

For example, on the line 36 x is *+850 thus indicating that the second graph should be moved 850 units to the right so that the histogram of the rainfall will be positioned correctly without overlaying the histogram of the temperature.

scale_x and scale_y are scaling coefficients of the graph. In this case the size of both graphs will be 70 per cent of the original size in both directions.

Thus the following document has been created:

and it can then be printed by means provided by Adobe Acrobat (Reader).

The supporting freeware programs Ghostscript and Acrobat Reader do not belong to SURVO MM. They must be downloaded from the net. The latest version of Ghostscript (either a 32-bit or a 64-bit version) can be loaded as a self-extracting EXE file.

When installing it, please use default settings.

When /GS-PDF is activated for the first time, Survo locates Ghostscript (this search for the Ghostscript program may take several seconds) and saves the location of Gswin32.exe or Gswin64.exe as a text file <Survo>\U\SYS\GSPATH.SYS

When making PostScript documents containing text and graphics the PRINT program module of Survo uses a 'driver' PS.DEV which is a standard text file located on the path <Survo>\U\SYS\ .

Normally this file should not be altered. The main default settings (font type, font size, line_spacing, etc.) are set on the last two lines of this file.

The user may override any setting by inserting new definitions. For example, if we insert a new line of the form

    - [ArialB(11)][line_spacing(14)]

next after the PRINT line in the previous example, the result will be

I made the first version of this PostScript driver in 1987. Before that drivers had been made for various printers like Epson dot matrix printers and Canon laser printers.

In 1997 Kimmo Vehkalahti created a driver for making HTML pages,

and in 2004 another driver similarly for making LaTeX documents.

All these drivers can be used for controlling the same PRINT program module of Survo.

Chapters 8. Graphics and 9. Printing of reports in my book

give all the essential information about making graphs and multi-page documents when using Survo.

This and many other documents about Survo and other topics have been created by using these capabilities of Survo.

This demo in YouTube

ESTIMATE is a powerful tool for linear and nonlinear regression
analysis.

Also fairly general problems of ML estimation can be solved by this operation.

It permits the user to enter the model in normal mathematical notation.

Before computations ESTIMATE analyzes the model function and evaluates its symbolic derivatives up to second order with respect to parameters to be estimated. The estimation and computation procedures are then selected according to this analysis and on the basis of the user's specifications.

For example, if all derivatives of second order vanish, ESTIMATE 'knows' that the model is linear and selects Newton's method, which leads to the solution rapidly in as many steps as there are parameters to be estimated.

In other cases the Davidon-Fletcher-Powell method is the default one.

The user may override these selections by a METHOD specification.

I created this demo during the Compstat 82 Conference (Toulouse 1982).

It is shown how the location and radius of a circle can be estimated by the ESTIMATE operation from a data set having observations which are located approximately on the circumference of a circle.

It is also shown how a bias due to an erroneous observation can be eliminated by using L1 estimation instead of the standard least squares (L2) method.

This example is also available as a flash demo.

Apparently the same problem has been treated in various ways by other statisticians. One example is the paper

Finding the circle that best fits a set of points by Luc Maisonobe (2007).

There one numerical example of 5 observations is given and solved by the Levenberg-Marquardt method.

Below is the corresponding least squares solution by ESTIMATE using the Davidon-Fletcher-Powell method and starting from initial values X0=Y0=R=0.

    DATA CIRCLE
      X   Y
     30  68
     50  -6
    110 -20
     35  15
     45  97

    MODEL Cmodel
    sqrt((X-X0)^2+(Y-Y0)^2)=R

    ESTIMATE CIRCLE,Cmodel,CUR+1
    Estimated parameters of model Cmodel:
    X0=96.0759 (1.69426)
    Y0=48.1352 (1.11286)
    R=69.9602 (1.33064)
    n=5 rss=3.126753 nf=120
    Correlations:
            X0     Y0      R
    X0   1.000  0.611  0.892
    Y0   0.611  1.000  0.675
    R    0.892  0.675  1.000

The results are the same (to 3 decimal places) as those obtained by Maisonobe.
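
The same least squares problem can be reproduced outside Survo. The Python sketch below does not use the Davidon-Fletcher-Powell method or the starting values X0=Y0=R=0; as a swapped-in technique it takes starting values from an algebraic (linear) circle fit and refines them by Gauss-Newton iterations on the residuals sqrt((X-X0)^2+(Y-Y0)^2)-R. All function names are hypothetical.

```python
import math

POINTS = [(30, 68), (50, -6), (110, -20), (35, 15), (45, 97)]


def solve3(a, b):
    """Solve the 3x3 system a*x = b by Gaussian elimination with partial pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, 3):
            f = m[r][col] / m[col][col]
            for c in range(col, 4):
                m[r][c] -= f * m[col][c]
    x = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):
        x[r] = (m[r][3] - sum(m[r][c] * x[c] for c in range(r + 1, 3))) / m[r][r]
    return x


def least_squares(rows, rhs):
    """Minimise |rows*x - rhs|^2 for three unknowns via the normal equations."""
    ata = [[sum(row[i] * row[j] for row in rows) for j in range(3)] for i in range(3)]
    atb = [sum(row[i] * v for row, v in zip(rows, rhs)) for i in range(3)]
    return solve3(ata, atb)


def fit_circle(points, iterations=50):
    # Algebraic starting values: x^2 + y^2 = 2*a*x + 2*b*y + c is linear in (a, b, c)
    a, b, c = least_squares([[2.0 * x, 2.0 * y, 1.0] for x, y in points],
                            [x * x + y * y for x, y in points])
    r = math.sqrt(c + a * a + b * b)
    for _ in range(iterations):
        dist = [math.hypot(x - a, y - b) for x, y in points]
        # Jacobian of the residuals d_i - r with respect to (a, b, r)
        jac = [[-(x - a) / d, -(y - b) / d, -1.0] for (x, y), d in zip(points, dist)]
        step = least_squares(jac, [-(d - r) for d in dist])
        a, b, r = a + step[0], b + step[1], r + step[2]
        if max(abs(s) for s in step) < 1e-12:
            break
    rss = sum((math.hypot(x - a, y - b) - r) ** 2 for x, y in points)
    return a, b, r, rss


x0, y0, radius, rss = fit_circle(POINTS)
print(f"X0={x0:.4f} Y0={y0:.4f} R={radius:.4f} rss={rss:.6f}")
```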

More information and another demo about ESTIMATE

This demo in YouTube

The first version of this graph was given on page 40 of my document about SURVO 84* in 1984. It indicates that I had already then found the formulas for contour ellipses on the confidence level P of a general bivariate normal distribution, as well as the generalized Box-Müller formulas for generating observations of a general bivariate normal distribution with a correlation coefficient ρ from two independent Uniform(0,1) variables, and thus earlier than told in my document Two formulas related to two-dimensional normal distribution.

*Originally arcsin(ρ) had to be replaced by arctan(ρ/sqrt(1-ρ*ρ)) since arctan was the only inverse trigonometric function available in SURVO 84.
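
The generalized formulas themselves are given in the cited document and are not repeated here. The Python sketch below shows one construction that is consistent with the arcsin(ρ) footnote: the usual Box-Müller radius and angle are used, and the angle of the second variable is shifted by arcsin(ρ). That this matches the author's formulas exactly is an assumption, and the function names are hypothetical.

```python
import math
import random


def bivariate_normal(rho, rng):
    """One pair (X, Y) of standard normal variables with correlation rho.

    Assumed form: Box-Muller radius and angle, with the angle of Y
    shifted by arcsin(rho).
    """
    u1 = 1.0 - rng.random()          # in (0, 1], avoids log(0)
    u2 = rng.random()
    radius = math.sqrt(-2.0 * math.log(u1))
    angle = 2.0 * math.pi * u2
    return radius * math.cos(angle), radius * math.sin(angle + math.asin(rho))


def arcsin_via_arctan(rho):
    """arcsin(rho) written with arctan only, as in the SURVO 84 footnote."""
    return math.atan(rho / math.sqrt(1.0 - rho * rho))


def sample_correlation(pairs):
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    sxx = sum((x - mx) ** 2 for x, _ in pairs)
    syy = sum((y - my) ** 2 for _, y in pairs)
    return sxy / math.sqrt(sxx * syy)


rng = random.Random(2017)
pairs = [bivariate_normal(0.7, rng) for _ in range(20000)]
r = sample_correlation(pairs)
print(f"sample correlation: {r:.3f}")
```

The construction works because E[cos(θ)·sin(θ+α)] = sin(α)/2 for a uniformly distributed angle θ, so the correlation of the two variables is sin(arcsin(ρ)) = ρ.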

This demo in YouTube

Samples from distributions related to the discrete uniform
distribution are shown as dynamic histograms generated step by step.
In fact, these graphs are scatter plots of data sets of two variables.
The X variable is a discrete random variate and the values of the Y
variable are cumulative frequencies of distinct X values.

This setup is generated, for example, for a discrete uniform variable with values 1,2,3,4,5,6 by the MAT commands

    MAT A=ZER(n,1) / n is the sample size
    MAT #TRANSFORM A BY int(6*rand(2017)+1)
    MAT B=ZER(n,2)
    MAT B(1,1)=A
    MAT #CUMFREQ(B)

where the last command computes the cumulative sums of distinct values in the first column as elements of the second column.
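
The effect of these commands can be sketched in Python as follows. The function is a hypothetical stand-in for MAT #CUMFREQ, written from the description above: each observation is paired with the number of times its value has occurred so far, which is what makes the scatter plot grow like a histogram.

```python
import random


def cumulative_frequencies(xs):
    """Pair each value with the number of times it has occurred so far."""
    counts = {}
    points = []
    for x in xs:
        counts[x] = counts.get(x, 0) + 1
        points.append((x, counts[x]))
    return points


rng = random.Random(2017)   # Python's generator, not Survo's rand(2017)
sample = [int(6 * rng.random() + 1) for _ in range(10)]
print(cumulative_frequencies(sample))
```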

A new sucro /U_SAMPLE is used for creating scatter plots of such 'datasets' appearing as gradually growing histograms. The plotting process is slowed down by a SLOW specification in the plot setup. Thus SLOW=50, for example, makes the GPLOT operation plot each observation 50 times.

This slowing feature has been used in earlier demos

A closed curve,

Lissajous curve variation,

Color changing

related to plotting families of curves. This feature is now available also in scatter plots on screen.

The syntax of the /U_SAMPLE sucro command is

/U_SAMPLE m,s,n,seed,slow

where

m is the number of outcomes U in a single trial,

s is the number of independent outcomes U1,U2,...,Us giving U=U1+U2+...+Us,

n is the sample size,

seed is the seed value of the random number generator,

slow is the value of the SLOW specification.
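
What such a sample looks like can be sketched in Python. The function below is a hypothetical imitation of the data generation only (no plotting and no SLOW effect), and it assumes that each single outcome is uniform on the integers 1,2,...,m.

```python
import random
from collections import Counter


def u_sample(m, s, n, seed):
    """n observations of U = U1+...+Us, each Ui uniform on 1..m (assumed)."""
    rng = random.Random(seed)
    return [sum(rng.randint(1, m) for _ in range(s)) for _ in range(n)]


sample = u_sample(6, 2, 1000, 2017)   # sum of two dice, 1000 trials
hist = Counter(sample)
for value in sorted(hist):
    print(f"{value:2d} {'*' * (hist[value] // 10)}")
```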

This demo in YouTube

In the 1970s it was possible to create Survo graphics with the Wang 2272 digital drum plotter connected to a Wang 2200 minicomputer.

Drawing a graph like the one above then took several minutes and there was indeed plenty of time to watch details of the plotting process and detect possible peculiarities.

This advantage was lost when plotters were replaced by laser printers or graphic screens. On a laser printer, the entire plotting process takes place out of sight. On the screen everything happens too quickly.

The main target of this demo is to point out that there may be cases where it is meaningful to slow down the plotting process (by using the SLOW specification) so that the user is able to see potential interactions between observations and/or variables.

Other examples of slow plotting in Survo are given in

Sampling from a discrete uniform distribution.

This demo in YouTube

Recently (in March 2017) I have improved the C code for root finding in the POL R=ROOT(P) operation by implementing Laguerre's method.

The roots are found stepwise and after each step the degree of the polynomial is decreased by dividing it by (x-r) where r is the newest root.

When the coefficients of the polynomial are integers, I check whether the value r (and in case of a complex number, its real and imaginary part separately) can be replaced by a 'nice' rational number which solves the equation at least as accurately as r.

This method is described in

Rational approximations by listening
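
The procedure described above can be sketched in Python. This is not the C code of Survo. Laguerre's iteration and the deflation step follow the standard textbook form, while the search for a 'nice' rational number uses Python's `Fraction.limit_denominator` as a stand-in for the 'listening' method, and only real roots are handled. All function names are hypothetical.

```python
import cmath
from fractions import Fraction


def evaluate(coeffs, x):
    """Horner evaluation of p(x), p'(x), p''(x); coeffs run from the highest degree down."""
    p = dp = half_ddp = 0
    for c in coeffs:
        half_ddp = half_ddp * x + dp
        dp = dp * x + p
        p = p * x + c
    return p, dp, 2 * half_ddp


def laguerre_root(coeffs, x=0j, tol=1e-14, max_iter=200):
    """Find one root of the polynomial by Laguerre's method."""
    n = len(coeffs) - 1
    for _ in range(max_iter):
        p, dp, ddp = evaluate(coeffs, x)
        if p == 0:
            return x
        g = dp / p
        h = g * g - ddp / p
        root = cmath.sqrt((n - 1) * (n * h - g * g))
        denom = max(g + root, g - root, key=abs)   # larger denominator, smaller step
        if denom == 0:
            x += 1.0                               # flat spot: move and retry
            continue
        step = n / denom
        x -= step
        if abs(step) <= tol * max(1.0, abs(x)):
            return x
    return x


def deflate(coeffs, r):
    """Divide the polynomial by (x - r); the remainder is dropped."""
    out = [coeffs[0]]
    for c in coeffs[1:-1]:
        out.append(c + out[-1] * r)
    return out


def all_roots(coeffs):
    work = [complex(c) for c in coeffs]
    roots = []
    while len(work) > 1:
        r = laguerre_root(work)
        roots.append(r)
        work = deflate(work, r)
    return roots


def nice_rational(coeffs, r, max_den=1000):
    """Return a fraction replacing the real root r if it solves p(x)=0 at least as accurately."""
    q = Fraction(r).limit_denominator(max_den)
    exact = Fraction(0)
    for c in coeffs:
        exact = exact * q + c
    return q if abs(exact) <= abs(evaluate(coeffs, r)[0]) else None


coeffs = [6, -11, 6, -1]          # (x-1)(2x-1)(3x-1)
found = sorted(nice_rational(coeffs, r.real) for r in all_roots(coeffs))
print(found)
```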

See also

This demo in YouTube

Since 2013 I have been interested in certain metric properties of regular polygons. The most important result in my experimental and expository studies is a conjecture that, for each such polygon, the total length of all edges and chords is the greatest root of an algebraic equation with coefficients depending on binomial coefficients, and the other roots of that equation can be represented as linear combinations of the same entities with coefficients -1,0,1.

Furthermore, if the number of vertices of the regular polygon is a prime or a power of 2, the coefficients are -1 or 1.

I have also found an efficient algorithm for determining those coefficients.

These results are presented in my paper Lengths of edges and diagonals and sums of them in regular polygons as roots of algebraic equation (2013)

and some of these conjectures have been proved in my paper with **Pentti Haukkanen** and **Jorma Merikoski**, Some polynomials associated with regular polygons (2014).

After finding the essential results, I noticed (in March 2014) that the roots can also be given as simple expressions (see page 39 in my paper) and solving of an algebraic equation is avoided. However, it is still interesting to study these expressions (roots) as simple linear combinations of chord lengths leading also to certain trigonometric identities.

For example, equations C11*E11=R11 (divided by 2*11) lead to formulas

    sin(5π/11)+sin(4π/11)+sin(3π/11)+sin(2π/11)+sin(1π/11) = cot(1π/22)/2
    sin(5π/11)-sin(4π/11)+sin(3π/11)+sin(2π/11)-sin(1π/11) = cot(3π/22)/2
    sin(5π/11)-sin(4π/11)+sin(3π/11)-sin(2π/11)+sin(1π/11) = cot(5π/22)/2
    sin(5π/11)+sin(4π/11)-sin(3π/11)-sin(2π/11)-sin(1π/11) = cot(7π/22)/2
    sin(5π/11)-sin(4π/11)-sin(3π/11)+sin(2π/11)+sin(1π/11) = cot(9π/22)/2

and equations C15*E15=R15 (divided by 2*15) to formulas

     sin(7π/15)+sin(6π/15)+sin(5π/15)+sin(4π/15)+sin(3π/15)+sin(2π/15)+sin(1π/15) = cot(1π/30)/2
                sin(6π/15)                      +sin(3π/15)                       = cot(3π/30)/2
                           sin(5π/15)                                             = cot(5π/30)/2
     sin(7π/15)-sin(6π/15)+sin(5π/15)-sin(4π/15)+sin(3π/15)-sin(2π/15)+sin(1π/15) = cot(7π/30)/2
                sin(6π/15)                      -sin(3π/15)                       = cot(9π/30)/2
    -sin(7π/15)+sin(6π/15)-sin(5π/15)+sin(4π/15)+sin(3π/15)-sin(2π/15)+sin(1π/15) = cot(11π/30)/2
    -sin(7π/15)+sin(6π/15)+sin(5π/15)-sin(4π/15)-sin(3π/15)+sin(2π/15)+sin(1π/15) = cot(13π/30)/2

The elements of the vector En are lengths of edges and chords (multiplied by n)

    e_i = 2*sin(((n+1)/2-i)π/n), i=1,2,...,(n-1)/2

for odd n and

    e_i = 2*sin((n/2+1-i)π/n), i=1,2,...,n/2

for even n. In the latter case e_1=2 is replaced by e_1=1.

The elements of the vector Rn are

    r_i = n*cot((2*i-1)π/(2n)), i=1,2,...,⌊n/2⌋

and found originally as the square roots of the roots of equation

    Σ_{i=0}^{(n-k)/2} (-1)^i*C(n,2*i+k)*n^(n-2*i-k)*x^i = 0

where k=0 if n is even and k=1 if n is odd.
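
These relations are easy to check numerically. The Python sketch below (function names are hypothetical) verifies for a few values of n that the squares of r_i = n*cot((2*i-1)π/(2n)) satisfy the equation above, and that n times the sum of the elements e_i equals r_1, which is the first row of Cn*En=Rn with all coefficients +1.

```python
import math


def e_vector(n):
    """Chord lengths e_i as defined above (before multiplication by n)."""
    if n % 2:
        return [2 * math.sin(((n + 1) / 2 - i) * math.pi / n)
                for i in range(1, (n - 1) // 2 + 1)]
    e = [2 * math.sin((n / 2 + 1 - i) * math.pi / n) for i in range(1, n // 2 + 1)]
    e[0] = 1.0                      # for even n, e_1=2 is replaced by e_1=1
    return e


def r_vector(n):
    """Elements r_i = n*cot((2i-1)*pi/(2n)), i = 1..floor(n/2)."""
    return [n / math.tan((2 * i - 1) * math.pi / (2 * n)) for i in range(1, n // 2 + 1)]


def relative_residual(n, x):
    """Left side of the equation at x, relative to the size of its terms."""
    k = 0 if n % 2 == 0 else 1
    terms = [(-1) ** i * math.comb(n, 2 * i + k) * n ** (n - 2 * i - k) * x ** i
             for i in range((n - k) // 2 + 1)]
    return abs(sum(terms)) / sum(abs(t) for t in terms)


for n in (11, 12, 15):
    worst = max(relative_residual(n, r * r) for r in r_vector(n))
    gap = abs(n * sum(e_vector(n)) - r_vector(n)[0])
    print(n, worst, gap)
```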

The essential tools for finding the Cn matrices are the MAT #ARFIND and MAT #SPREAD commands of SURVO MM. MAT #ARFIND,n,A finds the Cn coefficients for 'roots' unique for n as linear combinations of chord lengths with coefficients +1,-1. The general setup related to n is saved as a matrix file A.MAT and the coefficients of the linear combinations in a matrix file Cn.MAT.

According to my conjecture the valid coefficients related to row i of Cn (in a 'unique' case) are to be selected from

c_ij=±sgn(cos(q_ni*pi*(2*j-k)/(2*i-1))), i,j=1,2,...,⌊n/2⌋

where k=1 for odd n and k=2 for even n and q_ni is a positive integer. The correct value of q_ni is selected from the alternatives q_ni=1,2,...,⌊n/2⌋ so that the linear combination of chord lengths with coefficients c_ij gives the i'th 'root'. The selected q value and its sign coefficient appear as the first two columns of A.MAT. Then, according to this conjecture, the correct coefficients are found in n/4 trials on average. Without relying on this conjecture, about 2^(n-2) alternatives would have to be tested, which is an essentially harder task.

The undefined rows of Cn (related to factors of n) remain filled with zeros and the origin of these rows is revealed by the 'factor' and 'index' columns of the matrix A. The details of MAT #ARFIND can be found in its current C code. By creating C matrices for the factors given in matrix A by repeated applications of MAT #ARFIND, the 'empty' rows in the original Cn matrix can be filled by the MAT #SPREAD operation.

**Table of q_ni coefficients**

By defining for positive integers n,k, n>=k

amod(n,k) = mod(n,k) if mod(n,k)<=⌊k/2⌋, and amod(n,k) = k - mod(n,k) otherwise,

I have concluded experimentally that the q_ni values depend on n only through the m=amod(n,i) values. Then it has been possible to create a table of q values of the following form:

| i/m | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2 | 1 | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . |
| 3 | 2 | 1 | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . |
| 4 | 3 | 2 | 1 | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . |
| 5 | 4 | 2 | *3 | 1 | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . |
| 6 | 5 | 3 | 2 | 4 | 1 | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . |
| 7 | 6 | 3 | 2 | 5 | 4 | 1 | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . |
| 8 | 7 | 4 | *3 | 2 | *5 | *6 | 1 | . | . | . | . | . | . | . | . | . | . | . | . | . | . |
| 9 | 8 | 4 | 3 | 2 | 5 | 7 | 6 | 1 | . | . | . | . | . | . | . | . | . | . | . | . | . |
| 10 | 9 | 5 | 3 | 7 | 2 | 8 | 4 | 6 | 1 | . | . | . | . | . | . | . | . | . | . | . | . |
| 11 | 10 | 5 | *3 | 8 | 2 | *6 | *7 | 4 | *9 | 1 | . | . | . | . | . | . | . | . | . | . | . |
| 12 | 11 | 6 | 4 | 3 | 7 | 2 | 5 | 10 | 9 | 8 | 1 | . | . | . | . | . | . | . | . | . | . |
| 13 | 12 | 6 | 4 | 3 | *5 | 2 | 9 | 11 | 7 | *10 | 8 | 1 | . | . | . | . | . | . | . | . | . |
| 14 | 13 | 7 | *3 | 10 | 8 | *6 | 2 | 5 | *9 | 4 | 11 | *12 | 1 | . | . | . | . | . | . | . | . |
| 15 | 14 | 7 | 5 | 11 | 3 | 12 | 2 | 9 | 8 | 13 | 4 | 6 | 10 | 1 | . | . | . | . | . | . | . |
| 16 | 15 | 8 | 5 | 4 | 3 | 13 | 11 | 2 | 12 | 14 | 7 | 9 | 6 | 10 | 1 | . | . | . | . | . | . |
| 17 | 16 | 8 | *3 | 4 | 10 | *6 | 7 | 2 | *9 | 5 | *11 | *12 | 14 | 13 | *15 | 1 | . | . | . | . | . |
| 18 | 17 | 9 | 6 | 13 | *5 | 3 | *7 | 11 | 2 | *10 | 8 | 16 | 4 | *14 | *15 | 12 | 1 | . | . | . | . |
| 19 | 18 | 9 | 6 | 14 | 11 | 3 | 8 | 7 | 2 | 13 | 5 | 17 | 10 | 4 | 16 | 15 | 12 | 1 | . | . | . |
| 20 | 19 | 10 | *3 | 5 | 4 | *6 | 14 | 17 | *9 | 2 | 16 | *12 | *13 | 7 | *15 | 11 | 8 | *18 | 1 | . | . |
| 21 | 20 | 10 | 7 | 5 | 4 | 17 | 3 | 18 | 16 | 2 | 13 | 12 | 11 | 19 | 15 | 9 | 6 | 8 | 14 | 1 | . |
| 22 | 21 | 11 | 7 | 16 | 13 | 18 | 3 | 8 | 12 | 15 | 2 | 9 | 5 | 20 | 10 | 4 | 19 | 6 | 17 | 14 | 1 |

Row i in the table is a permutation of the integers 1,2,...,i-1. The numbers preceded by *'s (being the same as the column numbers) extend row i to a permutation, but cannot appear as amod() values due to common factors with 2*i-1. Let's call them dummy values. The column 'q' in the A matrix obtained by MAT #ARFIND,n,A gives the pertinent q_ni values and the table may be extended by using them.

The same table extended to row 75
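The folded remainder amod() defined above can be written out in a few lines. This is a minimal Python sketch based only on the definition in the text. The choice of Python is an assumption of this example and has nothing to do with how SURVO MM computes the value.

```python
def amod(n, k):
    """Folded remainder from the text:
    mod(n,k) if mod(n,k) <= floor(k/2), otherwise k - mod(n,k)."""
    r = n % k
    return r if r <= k // 2 else k - r


if __name__ == "__main__":
    # Folded remainders of 5..10 with respect to k=5
    print([amod(n, 5) for n in range(5, 11)])
```

The result always lies between 0 and ⌊k/2⌋, which is why a single column index m is enough to look up a q value.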

In April 2017 I found an efficient algorithmic solution for calculating the table of q_ni numbers and this approach is described in

Regular polygons: Solving riddle of q coefficients

Earlier demo on the same subject:

Edges and diagonals of a regular n-sided polygon

This demo in YouTube

Already in 2013 I tried to find an algorithm for computing the table of the q_ni numbers appearing in the coefficients

c_ij=±sgn(cos(q_ni*pi*(2*j-k)/(2*i-1))), i,j=1,2,...,⌊n/2⌋

of the linear combinations needed in the previous demo. Having noticed that row i of the table of q's is a permutation of the integers 1,2,...,i-1, it was natural to search for a direct rule for selecting the right permutations, and maybe that is still possible. For example, it is tempting to think that the rows could be related to residues (mod i) of some functions of i. I have not found such a direct formula.

It is surprising that now an extension of the table by simple arithmetical tricks leads to these permutations and thus we seem to have an 'algorithmic formula'. The q_ni values depend on n only through the m=amod(n,i) values as explained in the previous demo. If the sequence of integers in column m of the table of q's is denoted by q(i,m), i=1,2,..., the recursive relation

q(i,m)=2*q(i-m,m)-q(i-2*m,m), i=1,2,...

seems to be generally valid and the table of q's can be completed in the following way by using that recursion (and readily available permutations until i=43):

It is crucial to see that row i (i=2,3,...) starts with a specific permutation of the numbers 1,2,...,i-1 (in red), which is followed by the same numbers in reversed order (in green) and then by one dummy value, and thereafter this scheme is repeated 'forever'. Dummies may also appear in the permutations (typically as multiples of the 'correct' number) but this is not harmful since they cannot appear as q coefficients. For simplicity, dummies can be replaced by 0's.

It is also essential to notice that any column can be continued upwards by a still simpler recursion so that

q(-i,m)=-q(i+1,m), i=0,1,2,...

after setting plain 0's to row 1. For example, for m=4 we have

| i | ... | -4 | -3 | -2 | -1 | 0 | 1 | 2 | 3 | 4 | 5 | ... |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| q(i,4) | ... | -1 | -1 | -2 | -1 | 0 | 0 | 1 | 2 | 1 | 1 | ... |

Then it is obvious that the table of q's can be generated simply row by row using the recursive relation. For example, assume that we have the rows down to 5 ready, with upward 'mirror' completions for the first 5 columns:

| i/m | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| -5 | | | | | |
| -4 | -4 | -2 | 0 | -1 | -1! |
| -3 | -3 | -2 | -1 | -1 | -2 |
| -2 | -2 | -1 | -1 | -2! | 0 |
| -1 | -1 | -1 | 0 | -1 | -1 |
| 0 | 0 | 0 | 0! | 0 | 0 |
| 1 | 0 | 0 | 0 | 0 | 0! | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 2 | 1 | 1! | 0 | 1! | 1 | 0 | 1 | 1 | 0 | 1 | 1 | 0 | 1 | 1 | 0 | 1 | 1 | 0 | 1 | 1 | 0 |
| 3 | 2 | 1 | 1! | 2 | 0 | 2 | 1 | 1 | 2 | 0 | 2 | 1 | 1 | 2 | 0 | 2 | 1 | 1 | 2 | 0 | 2 |
| 4 | 3! | 2! | 1 | 1 | 2 | 3 | 0 | 3 | 2 | 1 | 1 | 2 | 3 | 0 | 3 | 2 | 1 | 1 | 2 | 3 | 0 |
| 5 | 4! | 2 | 0 | 1 | 1 | 0 | 2 | 4 | 0 | 4 | 2 | 0 | 1 | 1 | 15 | 2 | 4 | 0 | 4 | 2 | 0 |

Then the start of the next row emerges for the first 5 elements recursively as (!'s after the numbers used in the recursion)

6: 5 3 2 4 1

giving the permutation, and the row is completed by the rule given above:

6: 5 3 2 4 1 1 4 2 3 5 0 5 3 2 4 1 1 4 2 3 5

On the basis of these findings it was possible to create an essentially faster algorithm for computing the Cn matrices demonstrated here. This new algorithm is now available in SURVO MM as the MAT #QFIND(n) operation, and computing the table of q numbers is at least ten times faster than before. The MAT #ARFIND operation has been replaced by the MAT #QRFIND operation, which works like MAT #ARFIND but does the job much faster by using a readily computed large table of q values.

By using MAT #QRFIND I have computed the linear combinations (with coefficients ±1) for the roots of the equation (presented in the preceding demo) for all prime numbers n less than 10000. At the same time I have checked that the coefficients really are either +1 or -1 and that the linear combinations give the true roots. It has also been verified that each row i in the table of q's gives a permutation of the numbers 1,2,...,i-1 (when each possible 0 is replaced by the column index m) and that each permutation is of order 2 with i-1 as its first element and 1 as the last one.
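The row-by-row generation described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the MAT #QFIND implementation: dummies are stored as 0's, row 1 is empty (all zeros), row 2 is seeded with the permutation (1), and the names `row_value` and `q_permutations` are hypothetical.

```python
def row_value(perm, m):
    """Column m (1-based) of a row that starts with perm, continues with perm
    reversed, then one dummy 0, and repeats this scheme forever."""
    k = len(perm)
    r = (m - 1) % (2 * k + 1)
    if r < k:
        return perm[r]
    if r < 2 * k:
        return perm[2 * k - 1 - r]
    return 0


def q_permutations(max_row):
    """Leading permutations (dummies as 0) of rows 1..max_row of the q table."""
    perms = {1: [], 2: [1]}          # assumed seeds: row 1 is all zeros

    def q(i, m):
        if i <= 0:                   # upward mirror: q(-j,m) = -q(j+1,m)
            return -q(1 - i, m)
        return row_value(perms[i], m)

    for i in range(3, max_row + 1):
        # q(i,m) = 2*q(i-m,m) - q(i-2*m,m) for the first i-1 columns
        perms[i] = [2 * q(i - m, m) - q(i - 2 * m, m) for m in range(1, i)]
    return perms


if __name__ == "__main__":
    for i, perm in q_permutations(8).items():
        print(i, perm)
```

Each new row needs only rows with smaller index, because for m<=i-1 the mirrored row 2*m-i+1 is at most i-1.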
Although the table of q's was computed only once in this experiment, and it takes a few seconds, the entire checking process lasted on my current PC about 15 hours (a lot of matrix manipulations).

These new findings are reported in

On the roots of an algebraic equation related to regular polygons (2017).

The first part of this demo is

Equation for the sum of chord lengths in a regular polygon

The calculation process is described in

Regular polygons: Testing roots

C code for SURVO MM operations MAT #QFIND and MAT #QRFIND

This demo in YouTube

Here is the sucro code for numerical checking of the equation CE=R.

    *
    *TUTSAVE C_TEST
    / def Wn=W1 Wdivisor=W2 Wremainder=W3 Wsquare=W4 Wt=W5 Wt2=W6
    *{tempo 1}{R}
    /
    *{ref set 1}
    + S: n={print Wn} m=({print Wn}-1)/2{R}
    *MAT #QRFIND {print Wn},Q5000.TXT,A{act}{R}
    / Testing that coefficients Cn are 1 or -1
    *MAT A=C{print Wn}{act}{R}
    *MAT TRANSFORM A BY abs(X#){act}{R}
    *MAT B=CON(m,m){act}{R}
    *MAT A=A-B{act}{R}
    *MAT TRANSFORM A BY abs(X#){act}{R}
    *MAT A=SUM(SUM(A)'){R}
    *MAT_A(1,1)={act}{l} {save word Wt}{R}
    /
    *MAT R=ZER(m,1){act} pi=3.141592653589793{R}
    *MAT E=R{act}{R}
    *MAT TRANSFORM R BY cot((2*I#-1)*pi/(2*n)){act}{R}
    *MAT TRANSFORM E BY 2*sin(((n+1)/2-I#)*pi/n){act}{R}
    *MAT A=MTM(C{print Wn}*E-R){act}{R}
    *MAT_A(1,1)={act}{l} {save word Wt2}{R}
    *COPY CUR+1,CUR+1 TO C_TEST.TXT{R}
    *{erase}{print Wn} {print Wt} {print Wt2}{home}{u}{act}
    / Next prime number
    + A: {Wn=Wn+2}{Wdivisor=1}
    + B: {Wdivisor=Wdivisor+2}{Wremainder=Wn%Wdivisor}
    - if Wremainder = 0 then goto A
    *{Wsquare=Wdivisor*Wdivisor}
    - if Wsquare < Wn then goto B
    *{ref jump 1}{goto S}{end}
    *

Before using this sucro the q coefficients have to be calculated as a text file Q5000.TXT by the command MAT #QFIND(5000). Thereafter

/C_TEST 3

starts from number 3 and scans consecutive prime numbers, checks for each of them the structure of C and CE=R, and saves the results in a text file C_TEST.TXT in the form

    3 0 4.9303806576313e-032
    5 0 2.0954117794933e-031
    7 0 6.1629758220392e-032
    11 0 3.5745259767827e-031
    13 0 3.4266145570538e-030
    17 0 2.3611901118188e-030
    19 0 5.3810482646179e-030
    23 0 1.4687141755897e-029
    29 0 1.4881468087286e-029
    ....
    9931 0 2.5192799756962e-022
    9941 0 1.2926419087749e-022
    9949 0 6.8344417909304e-022
    9967 0 1.2351984585185e-022
    9973 0 5.3219957169430e-022

until interrupted by the user. The zero after the prime number indicates that all elements in C are either 1 or −1, and the floating point number is the sum of squares of the elements in CE−R calculated in double precision. The largest sums were obtained for the last primes for obvious reasons. The sums are close enough to zero to indicate the validity of CE=R.

So the presentation of roots as linear combinations of edge and chord lengths was confirmed by this sucro for all primes less than 10000.
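For very small primes the relation CE=R can also be checked by brute force, without any table of q values. The following Python sketch is an illustration only. The vectors E and R are defined as in the MAT TRANSFORM lines of the sucro above, while the exhaustive search over all sign vectors and the names `chord_lengths`, `cot_roots` and `sign_rows` are assumptions of this example, not the method used by MAT #QRFIND or CTEST.

```python
import math
from itertools import product


def chord_lengths(n):
    """E_i = 2*sin(((n+1)/2 - i)*pi/n), i = 1..(n-1)/2."""
    m = (n - 1) // 2
    return [2 * math.sin(((n + 1) / 2 - i) * math.pi / n) for i in range(1, m + 1)]


def cot_roots(n):
    """R_i = cot((2*i-1)*pi/(2*n)), i = 1..(n-1)/2."""
    m = (n - 1) // 2
    return [1 / math.tan((2 * i - 1) * math.pi / (2 * n)) for i in range(1, m + 1)]


def sign_rows(n, tol=1e-9):
    """For each root R_i, list every +1/-1 vector s with sum(s_j*E_j) = R_i."""
    e = chord_lengths(n)
    rows = []
    for target in cot_roots(n):
        hits = [s for s in product((1, -1), repeat=len(e))
                if abs(sum(c * v for c, v in zip(s, e)) - target) < tol]
        rows.append(hits)
    return rows


if __name__ == "__main__":
    for row in sign_rows(7):
        print(row)
```

The search tests 2^((n-1)/2) sign vectors per root, so it is usable only for small n. That growth is the reason a table of q values is needed for larger primes.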

Although the table of q's was computed only once in this experiment by the MAT #QFIND(5000) command giving Q5000.TXT and it takes a few seconds, the entire checking process lasted on my current PC about 15 hours (a lot of matrix manipulations).

I have also checked that the linear combinations of chord lengths with coefficients 1 or −1 are unique for primes ≤79. For this task a Survo operation CTEST has been made. For example, in case of n=79, 2^((n−1)/2−1)=274'877'906'944 (positive) combinations had to be tested and it took about 100 hours on my PC.

The newest freeware version of SURVO MM can be downloaded from here.

It includes all functions related to the current topic.

Earlier demos on the same subject:

Regular polygons: Solving riddle of q coefficients

Equation for the sum of chord lengths in a regular polygon

C code for SURVO MM operations MAT #QFIND and MAT #QRFIND

This demo in YouTube

The original version was created in 2006 as a flash application
(in Finnish).

This demo in YouTube

In three of them the Game of Life is active. All these Survo sessions are initiated in a fourth Survo session. Thus in each of the first three child processes the Game of Life is started by the main process.

The original game is here modified so that the playground is limited, the rules of transitions are generalized, and the color of each generation is selected randomly.

In this extended form the game is purely recreational without any serious purpose.

All actions take place in a Survo window and thus essentially in a MS-DOS console window.

I programmed the Game of Life as a Survo operation LIFEGAME in 1990 and here it employs the following setup in the edit field:

There, for example, the ON and OFF specifications determine rules for 'births' and 'deaths' of elements and the LIFE specification gives the color palette. Also the borders of the upper part of the playground are visible in that picture. The startup setting in this example is always symmetric and so the game stays symmetric 'forever'.

The game is started by the LIFEGAME command appearing at the end of the third edit line.

It was generated by activating the sucro of the current example twice
and by moving the three 'game windows' of the second session downwards
below the three first windows by the mouse. The recording was started
after these activations and moves.

In this version of the demo, over 10'000 patterns will be shown in an hour.

This demo in YouTube

**'Widescreen' version**

For those who really like watching 'Game of Life' in this 're-creational' form, a new version with 8 concurrent cases and lasting up to 2 hours is now available in YouTube.

In this version of the demo, over 10'000 patterns will be shown in an hour.

This demo in YouTube

Fitting alternative regression models to heterogeneous data

In 1976 I got interested in the problem of making a regression analysis when the dataset is heterogeneous so that the observations seem to be divided into two partially overlapping clusters caused by an unrecorded factor.

I tried to solve this problem by making an iterative program for simultaneous clustering and regression model estimation in clusters and called this approach digression analysis.

Thus, starting from proper initial values of the regression coefficients of two models, the program computes for each observation its deviation from both models, uses the value of the smaller deviation in the sum of residual squares, and tries to minimize this sum by gradual modification of the regression coefficients.

I have now reimplemented this method as a new Survo operation DIGRES and show here how it works in a particular case.
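The criterion described above (each observation contributes the smaller of its two squared deviations) can be illustrated with a short Python sketch for two straight-line models. This is not the DIGRES algorithm. Instead of gradual modification of the coefficients, the sketch alternates between assigning each observation to the nearer line and refitting both lines by ordinary least squares. That alternation is a technique swapped in for illustration, and all function names are hypothetical.

```python
def fit_line(points):
    """Ordinary least squares fit y = a + b*x; returns (a, b)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    b = sxy / sxx
    return mean_y - b * mean_x, b


def digression_rss(points, models):
    """Sum over observations of the smaller squared deviation from the models."""
    return sum(min((y - (a + b * x)) ** 2 for a, b in models) for x, y in points)


def digress(points, models, max_iter=50):
    """Alternate assignment to the nearer model and refitting of each model."""
    models = list(models)
    for _ in range(max_iter):
        groups = [[] for _ in models]
        for x, y in points:
            devs = [(y - (a + b * x)) ** 2 for a, b in models]
            groups[devs.index(min(devs))].append((x, y))
        # refit only groups that contain at least two distinct x values
        new_models = [fit_line(g) if len({x for x, _ in g}) >= 2 else old
                      for g, old in zip(groups, models)]
        if new_models == models:
            break
        models = new_models
    return models, digression_rss(points, models)


if __name__ == "__main__":
    # two parallel lines y=1+2x and y=4+2x mixed in one sample (made-up data)
    data = [(x, 1 + 2 * x) if x % 2 == 0 else (x, 4 + 2 * x) for x in range(10)]
    print(digress(data, [(0.0, 2.0), (5.0, 2.0)]))
```

As in the text, the result depends on reasonable initial values for the coefficients of both models.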

I wrote the original document (in Finnish) about digression analysis in 1976 after making a Basic program where the definition of the model was directly inserted into the program code.

In 1978 I gave a presentation about this topic in the Compstat 1978 conference in Leiden.

I have also written a short entry 'Digression analysis' in the Encyclopedia of Statistical Sciences.

This "Parallel regression lines" example has been presented in all above documents.

The validity of a digression model can be studied by simulation experiments.

To study the accuracy of analysis in this case, I created 1000 homogeneous samples of size 100 from a multivariate normal distribution with parameters obtained from the data DATA100 and made a corresponding digression and regression analysis for each of those homogeneous samples.

Let SR be the residual sum of squares in regression analysis and SD the residual sum of squares in digression analysis. Then the ratio SD/SR seems to be a good measure for the degree of digression so that low values of SD/SR indicate a high degree of digression. In this example we have SD=23.4535 and SR=102.5013 giving SD/SR=0.2288.

Now for each homogeneous sample, both SD and SR were recorded and the SD/SR values calculated. The smallest of the 1000 simulated values was SD/SR=0.2518, which is clearly larger than the value SD/SR=0.2288 in DATA100. This is a strong indicator of heterogeneity.

In this simple case the cluster analysis with the Wilks' Lambda criterion carried out by the CLUSTER operation (created for Survo much later than the original version of DIGRES) leads practically to the same grouping of observations.

In more complicated cases like this one, regression and cluster analysis cannot compete with digression analysis.

The DIGRES operation is freely available in SURVO MM (ver.3.58).

This demo in YouTube

Digression analysis may be useful in situations where the data is so complicated that neither cluster analysis nor regression analysis works on its own.

Even when using DIGRES, one must know in advance a lot about the form of the regression functions related to the topic.

Here Survo was used for generating an artificial data set in which two short, periodic data sets containing errors are mixed.

DIGRES worked in this example as follows:

    *Now the model is defined as
    *MODEL M
    *Y=sin(A*X+B) or Y=sin(C*X+D)
    *
    *and starting from 'poor' initial values
    *A=8 B=12 C=15 D=0
    *The DIGRES operation gives the results
    *
    *DIGRES SIN100,M,CUR+1
    *Estimated parameters of model M:
    *A=9.82352
    *B=10.1049
    *C=12.2938
    *D=-0.154894
    *n=100 rss=4.017864 R^2=0.93380 nf=671
    *
    *The estimates are close to the true values 10,10,12,0.
    *

DIGRES does not give standard errors for the parameter estimates. These numbers are obtained by making an ordinary regression analysis (here by the ESTIMATE operation) in the subgroups found by DIGRES.

In this example we have

    *
    *MODEL M1
    *Y=sin(A*X+B)
    *
    *MASK=AA--A--
    *A=10 B=10 RESULTS=0
    *ESTIMATE SIN100,M1,CUR+1 / IND=G,1 First group
    *Estimated parameters of model M1:
    *A=9.82352 (0.11347)
    *B=10.1049 (0.0657281)
    *n=51 rss=1.653080 R^2=0.93311 nf=39
    *
    *MODEL M2
    *Y=sin(C*X+D)
    *MASK=AA--A--
    *
    *C=12 D=0
    *ESTIMATE SIN100,M2,CUR+1 / IND=G,2 Second group
    *Estimated parameters of model M2:
    *C=12.2938 (0.161625)
    *D=-0.154893 (0.0904904)
    *n=49 rss=2.364784 R^2=0.92972 nf=33

The estimated parameters are the same as in digression analysis and their standard errors are valid for DIGRES. Therefore the results can be completed into the form:

    *
    *DIGRES SIN100,M,CUR+1
    *Estimated parameters of model M:
    *A=9.82352 (0.11347)
    *B=10.1049 (0.0657281)
    *C=12.2938 (0.161625)
    *D=-0.154894 (0.0904904)
    *n=100 rss=4.017864 R^2=0.93380 nf=671
    *

Information about computations and generating the two graphs appearing in this demo is available in a Survo edit file

http:\\www.survo.fi\demos\MIXED120.EDT

This demo in YouTube

The roots of Diophantine equations of the form X^a+Y^b=cZ are studied empirically by selecting parameters a,b, and c randomly and plotting the roots X,Y as a 'scatter plot'. Besides the trivial roots X=c*i, Y=c*j, i,j=1,2,..., for many a,b,c combinations interesting patterns of other roots can be seen.

This is the first of 7 demos

This demo in YouTube

According to my experiments, nontrivial roots for Diophantine equations X^4+Y^4=cZ are obtained only for primes of the form 8n+1 and their multiples.

Here the roots are calculated and plotted for these c values up to 433.

This is the second of 7 demos

This demo in YouTube

The connections between the roots of the Diophantine equation X^4+Y^4=17*Z are studied and it will be seen that they are located in the integer points of lines of type

Y=17+k*(X-17*i), k=-2, -1/2, 1/2, 2

and thus forming two slanted grids of squares.

For c=8*11+1=89 the grid lines have slopes k=-7/5, 5/7, -5/7, 7/5.

For c=8*9+1=73 the grid lines have slopes k=-7/3, 3/7, 7/3, -3/7.

For c=8*54+1=433 the grid lines have slopes k=-1/3, 3, 1/3, -3.

Graph of case c=697=17*41:

Combination of mod(X^4+Y^4,c)=0 and mod(X^2+Y^4,c)=0 for c=1049:

Above a 4196x4196 grid with a greenish background is colored so that
the point (X,Y) is red if the integer X^4+Y^4 is divisible by 1049
and it is black if X^2+Y^4 is divisible by 1049. Otherwise it remains
green.

For example, (4192,1303) is red since (4192^4+1303^4)/1049=297128793673 and (3964,2185) is black since (3964^2+2185^4)/1049=21728541529.

This is the third of 7 demos

This demo in YouTube

According to my experiments, "interesting" symmetric configurations
of roots (X,Y) for Diophantine equations X^n+Y^n=c*Z are obtained when

n = k*2^m and c = i*2^(m+1)+1 is a prime, k,m,i = 1,2,...

Here the roots are plotted for X,Y=0,1,2,...,2c. Then a common feature in all these graphs is that the square determined by points (0,0) and (2c,2c) is divided into four subsquares of size c having identical configuration of points. In any of them, say in the square (0,c) the points are symmetric to both diagonals since when mod(X^n+Y^n,c)=0 also mod(Y^n+X^n,c)=0 and mod((c-X)^n+(c-Y)^n,c)=0 when n is even. Then, for example, the configuration of points in the triangle with vertices (0,0),(0,c/2),(c/2,c/2) determines the entire graph by rotations and translations.

Due to these symmetry properties, a unique symmetric pattern (B) is located around the middle point (c,c) and symmetric patterns of another kind (A) are located around the middle points of the four corner squares.

This basic structure covers the entire XY space.

When computing numerical values of mod(X^n+Y^n,c) modular exponentiation is used in the Survo module DIOPH.
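The role of modular exponentiation can be shown with a small Python sketch. It relies on Python's built-in three-argument `pow`, so the huge integers X^n and Y^n are never formed. The function names `power_sum_mod` and `roots` are hypothetical, and nothing here reproduces the DIOPH module itself.

```python
def power_sum_mod(x, y, a, b, c):
    """mod(x^a + y^b, c) computed with modular exponentiation."""
    return (pow(x, a, c) + pow(y, b, c)) % c


def roots(n, c, limit):
    """All (X,Y) with 0 < X,Y < limit and mod(X^n + Y^n, c) = 0."""
    return [(x, y)
            for x in range(1, limit)
            for y in range(1, limit)
            if power_sum_mod(x, y, n, n, c) == 0]


if __name__ == "__main__":
    r = set(roots(4, 17, 17))
    print(len(r))
    # symmetries stated in the text for even n: (Y,X) and (c-X,c-Y) are also roots
    print(all((y, x) in r and (17 - x, 17 - y) in r for x, y in r))
```

With different exponents a and b the same helper covers mixed cases such as mod(X^2+Y^4,c).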

This is the fourth of 7 demos

This demo in YouTube

The graphs created in the previous demos related to roots (X,Y) of
Diophantine equations X^n+Y^n=cZ have a lot of symmetrical
(kaleidoscopic) features.

Using the case X^32+Y^32=641*Z as an example it is shown how the graph of roots (X,Y) for X,Y=0,1,...,2*641=1282 can be generated from a small triangular part of it by using that part or its transpose as a 'building block' 32 times.

If the same 'symmetrization' is performed by starting from a corresponding triangular part but filled randomly by points, graphs like the four ones below are obtained:

The overall structure in them is similar, but there are no accurate linear dependencies between points.

**Linear clusterings of points in the graph**

For finding linear clusterings numerically a sucro /DIOPH was created.

    *TUTSAVE DIOPH
    /
    / def Wm=W1 Wm2=W2 Wprime=W3 Wn=W4
    /
    *{R}
    *SCRATCH {act}{home}
    *DIOPH {write Wm},{write Wm2},{write Wprime},{write Wn},CUR+4{act}
    *{R}
    *FILE COPY DIOPH1 TO NEW DIOPH{R}
    *DATA DIOPH1{R}
    *p q{R}
    / The results of DIOPH are saved{R}
    / in a Survo data file DIOPH as variables p,q.{R}
    *{u3}{act}
    *SCRATCH {act}{home}
    *VAR G:2=gcd(p,q) TO DIOPH{act}{R}
    *VAR C1:1=if(p>q)then(0)else(1) TO DIOPH{act}{R}
    *VAR C2:1=if(G>1)then(0)else(1) TO DIOPH{act}{R}
    *VAR C3:1=C1*C2 TO DIOPH{act}{R}
    *FILE COPY DIOPH TO NEW DIOPH2 / IND=C3,1{act}{R}
    *VAR R:2=sqrt(p*p+q*q) TO DIOPH2{act}{R}
    *FILE SORT DIOPH2 BY R TO DIOPH3{act}{R}
    *FILE LOAD DIOPH3 / IND=ORDER,2,20 VARS=p,q,R{act}
    *{end}

When applied to the current case mod(X^32+Y^32,641)=0 /DIOPH gives the following result:

    /DIOPH 32,32,641,100
    DIOPH 32,32,641,100,CUR+4 / n_comb=533
    The results of DIOPH are saved
    in a Survo data file DIOPH as variables p,q.
    VAR G:2=gcd(p,q) TO DIOPH
    VAR C1:1=if(p>q)then(0)else(1) TO DIOPH
    VAR C2:1=if(G>1)then(0)else(1) TO DIOPH
    VAR C3:1=C1*C2 TO DIOPH
    FILE COPY DIOPH TO NEW DIOPH2 / IND=C3,1
    VAR R:2=sqrt(p*p+q*q) TO DIOPH2
    FILE SORT DIOPH2 BY R TO DIOPH3
    FILE LOAD DIOPH3 / IND=ORDER,2,20 VARS=p,q,R
    DATA DIOPH3*,A,B,C
     p  q  R
     1  2  2
     1  5  5
     4  5  6
     1  8  8
     9 13 15
     5 16 16
     1 20 20
     2 25 25
     8 25 26
    17 27 31
     1 32 32
    21 31 37
    13 36 38
    19 33 38
    23 33 40
    25 32 40
     3 41 41
    12 41 42
    29 31 42

In this case the directions p/q: 1/2, 1/5, 4/5, 1/8 are the most prominent. Drawing grids of lines for them in colors 1/2 (black), 1/5 (red), 4/5 (green), 1/8 (yellow) gives the following graph:

Plotting also lines related to four next p/q directions, i.e.

9/13, 5/16, 1/20, 2/25

these first eight directions seem to cover all points (below by lines drawn in red in the first quarter of the previous graph):

Thus in the case X^32+Y^32=641*Z the set of solutions (X,Y) can be covered by 2*8 regular square grids determined by ratios

1/2, 1/5, 4/5, 1/8, 9/13, 5/16, 1/20, 2/25.

In the general case mod(X^n+Y^n,P)=0 it seems to be true that for n=2^m the maximum number of regular square grids needed to cover the set of solutions (X,Y) is 2^(m-1).

For example, for lower m values and certain P values we get by using /DIOPH:

| m | 2^m | P | ratios |
|---|---|---|---|
| 2 | 4 | 17 | 1/2 |
| 3 | 8 | 97 | 1/8 5/7 |
| 4 | 16 | 929 | 11/12 4/17 7/25 14/29 |

This is the fifth of 7 demos

This demo in YouTube

By using the Survo operation DIOPH2 it is now possible to study how
the roots of the Diophantine equation mod(X^n+Y^n,P)=0 can be computed
in the "P square" 0<=X,Y<=P
by starting from 'minimal solutions' obtained
by the sucro /DIOPH and using the basic properties:

If (X,Y) is a root then (k*X,k*Y), k=1,2,... and

(Y,X), (X,P-Y), (P-Y,X), (P-X,Y), (Y,P-X), (P-X,P-Y), (P-Y,P-X)

are also roots.

The roots (X,Y) located in the P square and
related to a minimal solution (p,q) are equidistantly on
parallel lines

(1) X=iPX_0+qt, Y=iPY_0+pt, i=-q,-q+1,...,p-1,p

where X_0 and Y_0 are obtained from the Diophantine equation
pX_0+qY_0=1.
The line for a given value of i intersects the
diagonal of the P square with end points (0,P) and (P,0)
in the point determined by

t=P*(1-i*(X_0+Y_0))/(p+q).

When t is rounded to int(t)=t_0,
the point of solution

X_1=iPX_0+qt_0, Y_1=iPY_0+pt_0

is located
in the P square (or in its corner for i=-q and i=p).
Then it is easy to determine all points of solution in the P square
related to minimal solution p,q by using (1) and
by the basic properties given above.
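The construction of one solution point on each line (1) can be sketched in Python as follows. This is an illustration under assumptions, not the DIOPH2 implementation: X_0 and Y_0 are taken from the extended Euclidean algorithm (the equation pX_0+qY_0=1 has many solutions and the text does not say which one DIOPH2 uses), and the names `extended_gcd` and `line_point` are hypothetical.

```python
def extended_gcd(a, b):
    """Return (g, x, y) with a*x + b*y = g = gcd(a, b)."""
    if b == 0:
        return a, 1, 0
    g, x, y = extended_gcd(b, a % b)
    return g, y, x - (a // b) * y


def line_point(P, p, q, i):
    """Point (X_1, Y_1) on line i of (1), taken near the diagonal X+Y=P."""
    g, x0, y0 = extended_gcd(p, q)       # p*x0 + q*y0 = 1 when p,q are coprime
    if g != 1:
        raise ValueError("p and q must be coprime")
    t = P * (1 - i * (x0 + y0)) / (p + q)
    t0 = int(t)                          # rounding as in the text: int(t)
    return i * P * x0 + q * t0, i * P * y0 + p * t0


if __name__ == "__main__":
    # case mod(X^4+Y^4,17)=0 with the ratio 1/2 from /DIOPH, i.e. p=1, q=2
    for i in range(-2, 2):
        X, Y = line_point(17, 1, 2, i)
        print(i, (X, Y), (pow(X, 4, 17) + pow(Y, 4, 17)) % 17)
```

Every such point is a root whenever (p,q) is one, because X and Y are congruent to q*t_0 and p*t_0 modulo P.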

In the next demo a general procedure will be presented for determining
how many minimal solutions are needed in each case. The p,q values
will be processed in the order given by /DIOPH until in the
set of solutions (X,Y) the next pair (p,q) already appears among
those solutions.

Thus my conjecture is that the top values in the list of p,q values from
/DIOPH give minimal solutions. This view is supported e.g. by the fact
that a small distance sqrt(p^2+q^2) between consecutive roots implies
a denser covering of roots than a large distance. Thus this principle
guarantees the most 'economical' approach for finding all roots.

The following graphs show these minimal solutions in two cases:

The minimal solutions are located below the red arc of a circle.

In the latter case the 32 optimal p/q ratios are

5/6, 6/7, 1/17, 11/16, 12/19, 9/22, 13/22, 2/27,

16/23, 3/29, 15/26, 21/26, 6/37, 17/33, 2/39,

9/43, 15/41, 17/40, 13/43, 9/46, 21/41, 13/46,

1/48, 25/42, 18/47, 3/53, 26/47, 32/43, 2/55,

30/49, 17/56, 31/50

in the form given by /DIOPH.

This is the sixth of 7 demos

This demo in YouTube

By using this technique I have solved over 100 cases with various
n and P values.
The results of these experiments support the following conjecture about
the number of p,q pairs needed:

Assume that P is a prime number.
The trivial roots (X,Y) of mod(X^n+Y^n,P)=0 are

(iP,jP), i,j=...,-2,-1,1,2,...

Let P-1 be divisible by 2^k but not by 2^(k+1).
Then nontrivial roots can appear only when n<2^k.
For any even n<2^k giving also nontrivial roots,
the number of p,q combinations needed for straight lines to cover
the roots (X,Y) in the XY plane is

N(n,P)=ceil[gcd(n,P-1)/4].

Example: N(64,641)=16
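The conjectured count is easy to evaluate. The Python sketch below computes k and N(n,P) as defined above. The function names are hypothetical and the code only restates the formula. It does not check that P is a prime and does not test the conjecture itself.

```python
from math import gcd


def two_adic_order(v):
    """Largest k such that 2^k divides v (v > 0)."""
    k = 0
    while v % 2 == 0:
        v //= 2
        k += 1
    return k


def slope_count(n, P):
    """N(n,P) = ceil(gcd(n,P-1)/4) for even n < 2^k, where 2^k || P-1."""
    k = two_adic_order(P - 1)
    if n % 2 != 0 or n >= 2 ** k:
        raise ValueError("the formula is stated only for even n < 2^k")
    return (gcd(n, P - 1) + 3) // 4      # integer form of ceil(g/4)


if __name__ == "__main__":
    for n, P in [(4, 17), (8, 97), (16, 929), (32, 641), (64, 641)]:
        print(n, P, slope_count(n, P))
```

For the cases (4,17), (8,97), (16,929) and (32,641) the values agree with the numbers of p/q ratios reported by /DIOPH in the preceding demos.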

The above N(n,P) formula holds in all cases I have tested and
they are listed in

# of roots and slopes in 120 cases and lists of some interesting p,q values

In some cases, as for n=64, P=641, not all 'optimal' p,q combinations
appear in the order determined by sqrt(p^2+q^2). Tables of the p,q
combinations of such cases are also given at the end of this list.

This is the last of 7 demos

Copyright © Survo Systems 2001-2017. All rights reserved.

Updated 2018-07-14 by webmaster'at'survo.fi.
