Adriana Vlad, Adrian Mitrea * Estimating Conditional Probabilities and Digram Statistical Structure in Printed Romanian
Digram (i,j) | Probability (j/i) | Signal to noise ratio | Relative error | Probability (i,j) | Cumulated relative error | *) Probability (i,j) | **) Probability (i,j) |
AC | 0.0755 | 20.98 | 0.0934 | 0.0076 | 0.1189 | 0.0084 | 0.0089 |
AL | 0.1109 | 25.93 | 0.0756 | 0.0111 | 0.1006 | 0.0098 | 0.0102 |
AR | 0.1621 | 32.30 | 0.0607 | 0.0163 | 0.0854 | 0.0152 | 0.0161 |
AS | 0.0697 | 20.10 | 0.0975 | 0.0070 | 0.1231 | 0.0068 | 0.0067 |
AT | 0.1272 | 28.04 | 0.0699 | 0.0128 | 0.0948 | 0.0114 | 0.0115 |
CA | 0.1478 | 22.15 | 0.0885 | 0.0075 | 0.1252 | 0.0080 | 0.0081 |
CE | 0.1775 | 24.71 | 0.0793 | 0.0090 | 0.1157 | 0.0097 | 0.0090 |
CU | 0.1443 | 21.84 | 0.0898 | 0.0073 | 0.1265 | 0.0075 | 0.0071 |
DE | 0.4472 | 38.16 | 0.0514 | 0.0146 | 0.0959 | 0.0149 | 0.0145 |
DI | 0.1839 | 20.14 | 0.0973 | 0.0060 | 0.1438 | 0.0060 | 0.0058 |
EA | 0.1373 | 32.69 | 0.0600 | 0.0167 | 0.0821 | 0.0166 | 0.0167 |
EC | 0.0887 | 25.57 | 0.0766 | 0.0108 | 0.0991 | 0.0103 | 0.0096 |
EL | 0.0806 | 24.26 | 0.0808 | 0.0098 | 0.1034 | 0.0095 | 0.0090 |
EN | 0.0691 | 22.33 | 0.0878 | 0.0084 | 0.1105 | 0.0091 | 0.0092 |
ER | 0.1035 | 27.84 | 0.0704 | 0.0126 | 0.0928 | 0.0120 | 0.0122 |
ES | 0.0896 | 25.72 | 0.0762 | 0.0109 | 0.0987 | 0.0108 | 0.0106 |
IA | 0.0787 | 22.22 | 0.0882 | 0.0083 | 0.1128 | 0.0087 | 0.0089 |
IC | 0.0952 | 24.65 | 0.0795 | 0.0101 | 0.1039 | 0.0102 | 0.0104 |
IE | 0.0709 | 21.01 | 0.0933 | 0.0075 | 0.1180 | 0.0067 | 0.0069 |
IN | 0.1438 | 31.15 | 0.0629 | 0.0152 | 0.0869 | 0.0155 | 0.0152 |
IT | 0.0789 | 22.25 | 0.0881 | 0.0083 | 0.1127 | 0.0077 | 0.0077 |
LA | 0.1459 | 21.53 | 0.0910 | 0.0070 | 0.1288 | 0.0068 | 0.0068 |
LE | 0.2115 | 26.98 | 0.0726 | 0.0102 | 0.1098 | 0.0105 | 0.0101 |
LU | 0.1374 | 20.79 | 0.0943 | 0.0066 | 0.1321 | 0.0064 | 0.0066 |
NE | 0.1269 | 22.70 | 0.0863 | 0.0083 | 0.1184 | 0.0079 | 0.0085 |
NT | 0.1399 | 24.01 | 0.0816 | 0.0091 | 0.1135 | 0.0096 | 0.0099 |
OR | 0.2346 | 26.26 | 0.0746 | 0.0099 | 0.1145 | 0.0101 | 0.0107 |
PE | 0.2066 | 20.79 | 0.0943 | 0.0066 | 0.1413 | 0.0067 | 0.0063 |
PR | 0.2217 | 21.74 | 0.0901 | 0.0070 | 0.1370 | 0.0073 | 0.0073 |
RA | 0.1147 | 23.09 | 0.0849 | 0.0087 | 0.1144 | 0.0085 | 0.0088 |
RE | 0.2745 | 39.47 | 0.0497 | 0.0209 | 0.0782 | 0.0208 | 0.0206 |
RI | 0.1827 | 30.33 | 0.0646 | 0.0139 | 0.0936 | 0.0137 | 0.0143 |
SE | 0.1723 | 21.90 | 0.0895 | 0.0071 | 0.1305 | 0.0072 | 0.0076 |
ST | 0.2418 | 27.10 | 0.0723 | 0.0099 | 0.1126 | 0.0095 | 0.0089 |
TA | 0.1222 | 22.11 | 0.0887 | 0.0076 | 0.1216 | 0.0078 | 0.0073 |
TE | 0.2492 | 34.14 | 0.0574 | 0.0155 | 0.0895 | 0.0156 | 0.0157 |
TI | 0.1151 | 21.37 | 0.0917 | 0.0071 | 0.1248 | 0.0064 | 0.0068 |
TR | 0.1193 | 21.81 | 0.0899 | 0.0074 | 0.1229 | 0.0077 | 0.0074 |
UL | 0.2008 | 28.80 | 0.0681 | 0.0118 | 0.1014 | 0.0111 | 0.0107 |
UN | 0.1587 | 24.96 | 0.0785 | 0.0093 | 0.1122 | 0.0088 | 0.0086 |
ÂN | 0.6100 | 23.70 | 0.0827 | 0.0041 | 0.1855 | 0.0039 | 0.0038 |
ÎN | 0.8808 | 71.31 | 0.0275 | 0.0105 | 0.1004 | 0.0107 | 0.0107 |
ŞI | 0.6657 | 37.42 | 0.0524 | 0.0089 | 0.1230 | 0.0083 | 0.0086 |
ŢI | 0.6101 | 31.99 | 0.0613 | 0.0067 | 0.1396 | 0.0072 | 0.0070 |
*) Calculated as the ratio of the number of occurrences of the digram to the total number of digrams in the whole X text.
**) Calculated on a periodic sample drawn from the whole X text with a step of 200 letters.
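
The two footnote estimators can be illustrated with a minimal sketch. It assumes the text is already reduced to the alphabet of interest and that the periodic sample takes the digram starting at every 200th letter. The function names, the `offset` parameter and the exact sampling rule are hypothetical and are not taken from the paper.

```python
from collections import Counter


def digram_probabilities(text):
    """Footnote *): occurrences of each digram divided by the
    total number of (overlapping) digrams in the whole text."""
    digrams = [text[k:k + 2] for k in range(len(text) - 1)]
    counts = Counter(digrams)
    total = len(digrams)
    return {d: n / total for d, n in counts.items()}


def sampled_digram_probabilities(text, step=200, offset=0):
    """Footnote **): the same ratio, computed only on digrams that
    start every `step` letters (assumed sampling rule)."""
    sample = [text[k:k + 2] for k in range(offset, len(text) - 1, step)]
    counts = Counter(sample)
    total = len(sample)
    return {d: n / total for d, n in counts.items()}
```

On a long text the two functions should give close values for frequent digrams, which is the comparison the last two columns of the table report.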