Log-linear models of migration flows

Author:

JS Little and J Raymer

Introduction to model applications

The log-linear modelling framework provides several valuable techniques for studying and estimating migration flows within a network of regions. To date, these methods have been applied most often to internal migration systems where regions are defined as sub-national administrative units. However, they need not be restricted to domestic migration and may be applied to international systems of migration as well (Raymer 2007).

A migration flow is defined as the number of migrations from one region to another over the course of a specified time frame. There are several different ways to count migrations and each one could yield a different result. For example, Rees and Willekens (1986) make the distinction between registration systems that count the number of inter-regional residential moves over a reference period and censuses that count persons who reside in a place at the time of the census that is different from the place of residence at the beginning of the reference period.

Regardless of the method used to count migration flows, it is conventional to present them in contingency tables. These are square tables that report the flow counts between origin and destination regions. The flows in the migration table can be perfectly reproduced by the multiplicative component model, which is a saturated log-linear model (i.e., one with as many estimated parameters as there are data points). It has been used by Willekens (1983), Rogers, Willekens, Little et al. (2002) and Rogers, Little and Raymer (2010) to represent the matrix of flows between regions, and by Raymer and Rogers (2007), Raymer, Bonaguidi and Valentini (2006) and Rogers, Little and Raymer (2010) to capture the structure of inter-regional flows within age categories. The multiplicative components are interpretable and conveniently used to define the structure of migration between the regions of interest (Rogers, Willekens, Little et al. 2002). If calculated for more than one set of inter-regional flows, defined for different time periods, for example, or for different age, sex or race categories, multiplicative components are useful for comparing migration regimes across these populations.

Log-linear methods may be used to justify simplified representations of migration structure that are more parsimonious than the saturated model. The appropriateness of a reduced model is determined by fitting the predicted flows to the observed flows and by using statistical methods to evaluate the goodness of fit. If the reduced form has merit, i.e., fits the data well, the model may be used to estimate indirectly the flows. The independence model, for example, assumes inter-regional flows are distributed according to the pattern that could have been predicted based on the marginal distributions of flows across origin and destination regions. If the independence model is confirmed, inter-regional flows are predictable and can be estimated indirectly, but accurately, if the total sending and receiving flows of each region are given.

Sometimes the structure of migration is hypothesized to be invariant with respect to factors such as time, age, sex, and race. These hypotheses can be represented and tested with log-linear models. Allowing for changes in the level of migration, studies have documented remarkable stability in migration structures, in particular the rates of migration by age, over time (Mueser 1989; Nair 1985; Snickars and Weibull 1977). Other studies have shown consistency in the age patterns of inter-regional migration over time (Raymer and Rogers 2007). Moreover, the migration structure of the youngest ages, which can be inferred from birthplace-specific population stocks, has, in certain contexts, proven to be a “proxy” for the level of migration and allowed the estimation of migration of the older age groups (Raymer and Rogers 2007; Rogers, Little and Raymer 2010).

These studies have set the stage for establishing the method of offsets as a successful tool for indirectly estimating migration flows. It is a special application of log-linear modelling that forces a known migration structure onto a system that may have missing or unreliable inter-regional flow data. Using this method, the migration structure known for one time period can be borrowed to estimate the flows of another period. In addition, when flows are disaggregated by age, the structure of age-specific inter-regional flows of one time period can be applied to another period. Furthermore, Raymer and Rogers (2007) showed that the level of infant lifetime migration can be applied, using the method of offsets, to estimate indirectly the migration flows of the older ages.

Applications of log-linear models, and the related assumptions, are detailed in the sections that follow, beginning with the two-variable case, i.e., origin and destination. In this section, the log-linear model is defined in the context of two-dimensional flow tables, and multiplicative forms as well as additive forms of the saturated model are derived and interpreted. The log-linear model of independence and the “migrants only” quasi-independence model are set out, including illustrations and a brief description of the methods for evaluating goodness-of-fit.

The section concludes with an illustration of the method of offsets for indirectly estimating the inter-regional flow data of one period based on the migration flow patterns of another. When flow data are available for two periods, the period-invariance assumption can be tested with a log-linear model and the method of offsets. Models that disaggregate the origin and destination of flows into age categories are considered. This is followed by an illustration of how the multiplicative model with age can be applied, using the method of offsets, to estimate indirectly the age-specific inter-regional flows for another period.

Applications of the two-variable model

To illustrate the two-variable log-linear model, consider the 1973 and 1976 migrations in the Netherlands between types of municipalities categorized into six different groups based on degree of urbanization. These were published by Willekens (1983) and are presented in Table 1. In this context, there are two variables, region of origin (O) and region of destination (D). Neither is identified as the dependent variable. The outcome variable may be either the inter-regional migration flow, denoted n_ij, in the multiplicative form of the model, or the natural logarithm of the flow, denoted ln(n_ij), in the additive form of the model.

Decompositions of the saturated model, each one perfectly regenerating the observed data, are described in the subsections presenting the multiplicative component model and the additive linear model. Three indirect estimation techniques are then illustrated in the subsections describing the independence model, the quasi-independence model and the method of offsets.

Table 1 Migration between municipalities by degree of urbanization,* the Netherlands, 1973 and 1976

A. 1973 Migration table
		Destination
	Origin	1	2	3	4	5	6	Total
	1	50,498	23,829	8,566	21,846	16,264	18,856	139,859
	2	25,005	27,536	6,953	14,326	16,212	18,282	108,314
	3	15,675	10,710	13,874	6,266	9,819	19,701	76,045
	4	23,457	14,169	4,431	10,209	9,386	10,973	72,625
	5	29,548	25,267	11,802	13,160	15,979	20,406	116,162
	6	46,815	39,123	42,399	25,012	26,830	23,304	203,483
	Total	190,998	140,634	88,025	90,819	94,490	111,522	716,488
B. 1976 Migration table
		Destination
	Origin	1	2	3	4	5	6	Total
	1	14,473	14,327	6,077	11,689	10,618	9,897	67,081
	2	14,833	36,258	13,289	17,391	20,899	21,869	124,539
	3	8,330	17,764	25,113	10,489	18,171	29,220	109,087
	4	11,315	16,498	8,935	10,537	10,762	12,519	70,566
	5	11,875	24,370	19,151	12,312	16,724	22,591	107,023
	6	16,582	32,336	52,415	22,264	28,182	27,810	179,589
	Total	77,408	141,553	124,980	84,682	105,356	123,906	657,885
		*1: rural municipalities
		2: industrial rural municipalities
		3: specific resident municipalities of commuters
		4: rural towns and small towns
		5: medium-sized towns
		6: large towns of more than 100,000 inhabitants
	Source: Central Bureau of Statistics, The Hague

Application 1: The multiplicative component model

The multiplicative expression of the saturated log-linear model, called the multiplicative component model, reproduces the elements of the flow table as follows:

n_{i j} = (T) (O_{i}) (D_{j}) (O D_{i j}) .

Equation 1

Like all saturated models, it is, strictly speaking, not a model but a way of representing the data. n_ij is the observed flow of migration from region i to region j, and the effect parameters are T, O_i, D_j and OD_ij. Therefore, any i to j flow found in the interior 6 by 6 sub-matrices of Table 1 can be expressed by an equation of the same form as Equation 1 with the corresponding set of parameters. T gives the overall effect, O_i gives the effect of origin i, D_j gives the effect of destination j, and OD_ij gives the effect of the association between O_i and D_j. Taken together, the parameters of the saturated model represent the spatial structure of migration (Rogers, Willekens, Little et al. 2002).

Two different sets of parameters that satisfy the multiplicative component model have been used in migration studies and both are presented here. Each one offers a different way of representing and interpreting the migration structure. The first is called geometric mean effect coding (Knoke and Burke 1980; Willekens 1983) and the second is called total sum reference coding (Raymer and Rogers 2007; Rogers, Little and Raymer 2010). A third multiplicative component model is derived in the subsection presenting the log-linear additive model.

Application 2: Geometric mean effect coding

Geometric mean effect coding was the first decomposition of Equation 1 used for migration analysis. It was proposed by Birch (1963) and is formally equivalent to the gravity model of migration (Willekens 1983). Table 2 shows the multiplicative components resulting from geometric mean effect coding of the Netherlands data from Table 1. Note that the overall component (T) is set out in the grand total locations of the table, the origin components (O_i) are set out in the row-total locations, the destination components (D_j) are set out in the column-total locations, and the origin-destination interaction components (OD_ij) are set out in the cells of the interior sub-matrices.

Table 2 Multiplicative components using geometric mean effect coding

A. 1973 Migration table
		Destination
	Origin	1	2	3	4	5	6	Total
	1	1.457	0.940	0.656	1.352	0.933	0.882	1.180
	2	0.885	1.332	0.653	1.087	1.140	1.048	0.962
	3	0.771	0.720	1.811	0.661	0.959	1.570	0.692
	4	1.275	1.052	0.639	1.190	1.014	0.966	0.627
	5	0.943	1.102	1.000	0.901	1.013	1.055	1.067
	6	0.838	0.957	2.015	0.960	0.954	0.676	1.903
	Total	1.711	1.252	0.644	0.798	0.861	1.056	17,168.003
B. 1976 Migration table
		Destination
	Origin	1	2	3	4	5	6	Total
	1	1.753	0.984	0.571	1.317	0.979	0.787	0.656
	2	0.986	1.366	0.686	1.075	1.057	0.954	1.195
	3	0.655	0.792	1.533	0.767	1.088	1.508	1.010
	4	1.277	1.055	0.783	1.106	0.925	0.927	0.704
	5	0.900	1.047	1.127	0.868	0.965	1.124	1.048
	6	0.769	0.850	1.888	0.960	0.995	0.847	1.712
	Total	0.768	1.354	0.989	0.825	1.008	1.169	16,401.919

The overall effect, T, is described as the constant of proportionality or the size main effect (Willekens 1983). It is the geometric mean of all inter-regional flow values:

T = \left[ \prod_{i,j} n_{ij} \right]^{\frac{1}{m \times m}} ,

where m is the number of origin regions (rows), which equals the number of destination regions (columns). T equals 17,168.003 for 1973 and 16,401.919 for 1976.

For a particular region i, the main effect of that region of origin is the ratio of the geometric mean of flows originating from i divided by the overall geometric mean:

O_{i} = \frac{1}{T} \left[ \prod_{j} n_{ij} \right]^{\frac{1}{m}} .

The main effect, O_i, shows the relative importance of region i as a source of migrations (Alonso 1986). For example, based on the 1973 data, the effect of originating in Category 4 is equal to:

O_{4} = \frac{1}{17168.003} \left[ 23457 \times 14169 \times 4431 \times 10209 \times 9386 \times 10973 \right]^{\frac{1}{6}} = 0.627 .

This is the smallest of the origin (row) effects, which suggests that Category 4 was the least important source of migrations in 1973.

Similarly, the destination main effect, D_j, gives the relative importance of region j as an attractor of migrants. It is the ratio of the geometric mean of column j to the overall geometric mean and the formula is:

D_{j} = \frac{1}{T} \left[ \prod_{i} n_{ij} \right]^{\frac{1}{m}} .

For example, for municipalities in Category 4, the destination effect in 1973 is equal to:

D_{4} = \frac{1}{17168.003} \left[ 21846 \times 14326 \times 6266 \times 10209 \times 13160 \times 25012 \right]^{\frac{1}{6}} = 0.798 .

All other row and column effects can be derived in the same way. Each is the geometric mean of the row (or column) elements divided by the overall geometric mean, and they are equivalent to the balancing factors in the gravity model (Willekens 1983).
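The calculations of T, O_i and D_j can be checked numerically. The following is a minimal Python sketch rather than part of the accompanying workbook; the function names `geometric_mean` and `main_effects` are illustrative assumptions. It uses the 1973 flows from Panel A of Table 1 and should agree with the values of T, O_4 and D_4 reported in Table 2, up to rounding.

```python
import math

# 1973 flows from Panel A of Table 1 (rows = origins, columns = destinations).
FLOWS_1973 = [
    [50498, 23829, 8566, 21846, 16264, 18856],
    [25005, 27536, 6953, 14326, 16212, 18282],
    [15675, 10710, 13874, 6266, 9819, 19701],
    [23457, 14169, 4431, 10209, 9386, 10973],
    [29548, 25267, 11802, 13160, 15979, 20406],
    [46815, 39123, 42399, 25012, 26830, 23304],
]


def geometric_mean(values):
    """Geometric mean, computed on the log scale to avoid huge products."""
    values = list(values)
    return math.exp(sum(math.log(v) for v in values) / len(values))


def main_effects(flows):
    """Return (T, O, D) under geometric mean effect coding."""
    m = len(flows)
    T = geometric_mean(n for row in flows for n in row)
    # Origin effect: geometric mean of row i relative to the overall mean.
    O = [geometric_mean(flows[i]) / T for i in range(m)]
    # Destination effect: geometric mean of column j relative to the overall mean.
    D = [geometric_mean(flows[i][j] for i in range(m)) / T for j in range(m)]
    return T, O, D


T, O, D = main_effects(FLOWS_1973)
print(f"T   = {T:.3f}")
print(f"O_4 = {O[3]:.3f}")  # Category 4 is index 3 (zero-based)
print(f"D_4 = {D[3]:.3f}")
```

Working on the log scale is a design choice: the raw product of 36 flows is far too large to handle comfortably, whereas the mean of the logarithms is well behaved.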

They can be compared across regions and across time periods. For example, Category 6 was the most important source of migrations in 1973 (1.903 is greater than the other origin effects), and in 1976 (1.712 is greater than the other origin effects). Category 1 was less important as a destination in 1976 than in 1973 (0.768 is less than 1.711), and, in 1973, it was less important as a source of migrations than as a destination for migrations (1.180 is less than 1.711).

Panels A and B in Table 2 are sometimes called the spatial interaction matrices. The elements are the OD_ij interaction effects in Equation 1 and each one is equal to the observed flow between i and j divided by the expected flow, which is the product of the other three parameters. The formula is:

O D_{i j} = \frac{n_{i j}}{(T) (O_{i}) (D_{j})} .

Each OD_ij expresses the departure of the observed flow, n_ij, from the expected flow based on the assumption of no association between the destination region j and the origin region i, i.e., (T)(O_i)(D_j). They have been interpreted as indicators of the accessibility, the ease of interaction, or the attractiveness between two regions (Rogers, Willekens, Little et al. 2002).

Values equal to 1.0 indicate independence, i.e., no association between the origin and the destination. As implied by Equation 1, if an OD_ij parameter is equal to 1.0, n_ij is determined by the values of T, O_i and D_j alone. A departure from 1.0 in either direction is an indication of an association between the destination and the origin. Values greater than 1.0 indicate higher than expected levels of accessibility/attractiveness and values less than 1.0 indicate less than expected accessibility/attractiveness.

Since the 1973 diagonal effects are generally greater than 1.0, it appears migrants were unexpectedly attracted to destinations in the same category of municipality. Category 6 was an exception. Migrants from large towns of more than 100,000 inhabitants (i.e., Category 6) were more attracted to commuter municipalities (i.e., Category 3) than to other large towns (2.015 is greater than 0.676).

Table 2 shows all the parameters necessary for reproducing the 1973 and 1976 flows. To verify that any flow in Table 1 can be reproduced by the multiplicative components, take, for example, the 1973 flow from Category 2 to Category 3:

n_{23} = 17168.003 \times 0.962 \times 0.644 \times 0.653 \approx 6953 .

The parameter values, however, are not all independent of each other: some can be derived from the others. For one year of data, Table 2 reports 36 interaction effects, 6 origin main effects, 6 destination main effects, and one overall effect. These 49 parameters were derived from only 36 observed flows, so 13 of them must be redundant. In other words, 13 of the 49 parameters can be calculated from the other 36, and the relationship between parameters is determined by the following constraints associated with geometric mean effect coding. The first set of constraints forces the product of the origin main effects, and the product of the destination main effects, to be equal to 1. This is expressed as

\prod_{i} O_{i} = 1 and \prod_{j} D_{j} = 1 .

The second set of constraints is imposed on the interaction elements of each row and column, making the products of the interior elements in each row (and column) equal to 1. In other words, if five of the interaction effects associated with a particular origin (or destination) are given, the sixth interaction effect would be implied.

This is expressed as

\prod_{i} O D_{ij} = 1 for every destination j, and \prod_{j} O D_{ij} = 1 for every origin i.

In general, if there are m regions there are m² linearly independent parameters and 1+m+m+(m×m) multiplicative components. For all of the geometric mean effect coding computations, see Table 2 in the Multiplicative Components sheet of the accompanying workbook.
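The constraints and the parameter count can be verified directly. The sketch below is an illustration only (the function `geometric_mean_coding` is a hypothetical helper, not taken from the workbook). It computes the full decomposition for 1973, checks that each constrained product equals 1, and reproduces the flow from Category 2 to Category 3 from its four components.

```python
import math

# 1973 flows from Panel A of Table 1 (rows = origins, columns = destinations).
FLOWS_1973 = [
    [50498, 23829, 8566, 21846, 16264, 18856],
    [25005, 27536, 6953, 14326, 16212, 18282],
    [15675, 10710, 13874, 6266, 9819, 19701],
    [23457, 14169, 4431, 10209, 9386, 10973],
    [29548, 25267, 11802, 13160, 15979, 20406],
    [46815, 39123, 42399, 25012, 26830, 23304],
]


def geometric_mean(values):
    values = list(values)
    return math.exp(sum(math.log(v) for v in values) / len(values))


def geometric_mean_coding(flows):
    """Return T, O, D and the OD matrix for a square flow table."""
    m = len(flows)
    T = geometric_mean(n for row in flows for n in row)
    O = [geometric_mean(row) / T for row in flows]
    D = [geometric_mean(row[j] for row in flows) / T for j in range(m)]
    # Interaction effect: observed flow divided by T * O_i * D_j.
    OD = [[flows[i][j] / (T * O[i] * D[j]) for j in range(m)] for i in range(m)]
    return T, O, D, OD


T, O, D, OD = geometric_mean_coding(FLOWS_1973)
m = len(FLOWS_1973)

# Each of these products should equal 1 up to floating-point error.
prod_O = math.prod(O)
prod_D = math.prod(D)
row_products = [math.prod(OD[i]) for i in range(m)]
col_products = [math.prod(OD[i][j] for i in range(m)) for j in range(m)]
print("prod O_i:", round(prod_O, 6), " prod D_j:", round(prod_D, 6))
print("row products:", [round(p, 6) for p in row_products])
print("col products:", [round(p, 6) for p in col_products])

# Reproduce the flow from Category 2 to Category 3 (zero-based indices 1 and 2).
n_23 = T * O[1] * D[2] * OD[1][2]
print("n_23 =", round(n_23))

# 1 + m + m + m*m components are reported, but only m*m are independent.
n_components = 1 + m + m + m * m
print("components:", n_components, " independent:", m * m)
```

With unrounded components the flow is reproduced exactly, which is why hand calculations with the three-decimal values in Table 2 differ from the observed flow by a few migrations.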

Application 3: Total sum reference coding

Geometric mean effect coding, which uses the geometric mean as the reference value, was the earliest log-linear decomposition used to describe migration (Rogers, Willekens, Little et al. 2002; Willekens 1983). More recently, however, total sum reference coding has become more standard (Raymer and Rogers 2007; Rogers, Little and Raymer 2010). While both decompositions satisfy Equation 1, the effects under total sum reference coding are more transparent. For example, the overall effect, T, is now the total number of migrants, denoted n₊₊. O_i is now the proportion of all migrants leaving from region i (i.e., n_i+/n₊₊), and D_j is the proportion of all migrants moving to region j (i.e., n_+j/n₊₊). The interaction component OD_ij is defined as n_ij/[(T)(O_i)(D_j)], the ratio of the observed number of migrants, n_ij, to the expected number, (T)(O_i)(D_j). All effects taken together provide another way to represent the spatial structure of migration.

The multiplicative components derived from total sum reference coding are set out in Table 3. Consider, for example, the 8566 migrations from Category 1 to Category 3 in 1973 disaggregated into the four multiplicative components:

\begin{array}{l} n_{13} = (T) (O_{1}) (D_{3}) (O D_{13}) \\ = n_{+ +} (\frac{n_{1 +}}{n_{+ +}}) (\frac{n_{+ 3}}{n_{+ +}}) [\frac{n_{13}}{(n_{+ +}) (\frac{n_{1 +}}{n_{+ +}}) (\frac{n_{+ 3}}{n_{+ +}})}] \\ = (716488) (\frac{139859}{716488}) (\frac{88025}{716488}) (\frac{8566}{17183}) \\ \approx 716488 (0.195) (0.123) (0.499) \\ \approx 8566 . \end{array}

The interpretations of these components are relatively straightforward. The overall component is the reported total number of migrations in 1973, i.e., 716,488. The origin component represents the share of all migrations from each region, e.g., about 20 per cent of all migrations originated in Category 1. The destination component represents the share of all migrations to each region, e.g., about 12 per cent of all migrations had Category 3 as the destination. Finally, the interaction component represents the ratio of observed migrations to expected migrations, and there were roughly 50 observed migrations from Category 1 to Category 3 for every 100 expected. The expected flow is based on the marginal total information, i.e., (T)(O₁)(D₃).

Table 3 Multiplicative components using total sum reference coding

A. 1973 Migration table
		Destination
	Origin	1	2	3	4	5	6	Total
	1	1.354	0.868	0.499	1.232	0.882	0.866	0.195
	2	0.866	1.295	0.523	1.043	1.135	1.084	0.151
	3	0.773	0.718	1.485	0.650	0.979	1.664	0.106
	4	1.212	0.994	0.497	1.109	0.980	0.971	0.101
	5	0.954	1.108	0.827	0.894	1.043	1.129	0.162
	6	0.863	0.980	1.696	0.970	1.000	0.736	0.284
	Total	0.267	0.196	0.123	0.127	0.132	0.156	716,488
B. 1976 Migration table
		Destination
	Origin	1	2	3	4	5	6	Total
	1	1.834	0.993	0.477	1.354	0.988	0.783	0.102
	2	1.012	1.353	0.562	1.085	1.048	0.932	0.189
	3	0.649	0.757	1.212	0.747	1.040	1.422	0.166
	4	1.363	1.087	0.667	1.160	0.952	0.942	0.107
	5	0.943	1.058	0.942	0.894	0.976	1.121	0.163
	6	0.785	0.837	1.536	0.963	0.980	0.822	0.273
	Total	0.118	0.215	0.190	0.129	0.160	0.188	657,885

Like geometric mean effect coding, the decomposition based on total sum reference coding gives more parameters than original data points. The constraints that define the relationships between parameters, and thus allow the redundant parameters to be derived, are as follows:

\begin{array}{l} \sum_{i} O_{i} = 1 ; \quad \sum_{j} D_{j} = 1 ; \\ \frac{\sum_{i} O_{i} \sum_{j} O D_{ij}}{m} = 1 , \text{ and} \\ \frac{\sum_{j} D_{j} \sum_{i} O D_{ij}}{m} = 1 , \end{array}

where m is the number of regions (Raymer, Bonaguidi and Valentini 2006).

For all of the total sum reference coding computations, see Table 3 in the Multiplicative components sheet of the accompanying workbook.
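As an illustration, total sum reference coding can be computed in a few lines. The sketch below is an assumption-laden example, not the workbook's implementation; `total_sum_coding` is a hypothetical name. It uses the 1973 flows from Panel A of Table 1, so the printed values can be compared with Panel A of Table 3. The two weighted-sum checks weight the row sums of OD_ij by the origin shares O_i and the column sums of OD_ij by the destination shares D_j.

```python
# 1973 flows from Panel A of Table 1 (rows = origins, columns = destinations).
FLOWS_1973 = [
    [50498, 23829, 8566, 21846, 16264, 18856],
    [25005, 27536, 6953, 14326, 16212, 18282],
    [15675, 10710, 13874, 6266, 9819, 19701],
    [23457, 14169, 4431, 10209, 9386, 10973],
    [29548, 25267, 11802, 13160, 15979, 20406],
    [46815, 39123, 42399, 25012, 26830, 23304],
]


def total_sum_coding(flows):
    """Return T, O, D and the OD matrix under total sum reference coding."""
    m = len(flows)
    row_totals = [sum(row) for row in flows]
    col_totals = [sum(row[j] for row in flows) for j in range(m)]
    T = sum(row_totals)                 # n++, the total number of migrations
    O = [r / T for r in row_totals]     # n_i+ / n++
    D = [c / T for c in col_totals]     # n_+j / n++
    # Observed flow divided by the flow expected from the marginal totals.
    OD = [[flows[i][j] / (T * O[i] * D[j]) for j in range(m)] for i in range(m)]
    return T, O, D, OD


T, O, D, OD = total_sum_coding(FLOWS_1973)
m = len(FLOWS_1973)

# Components of the Category 1 to Category 3 flow (zero-based indices 0 and 2).
print(f"T = {T}, O_1 = {O[0]:.3f}, D_3 = {D[2]:.3f}, OD_13 = {OD[0][2]:.3f}")

# Constraint checks: each quantity should equal 1 up to floating-point error.
sum_O = sum(O)
sum_D = sum(D)
weighted_rows = sum(O[i] * sum(OD[i]) for i in range(m)) / m
weighted_cols = sum(D[j] * sum(OD[i][j] for i in range(m)) for j in range(m)) / m
print(sum_O, sum_D, weighted_rows, weighted_cols)
```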

Comparing two multiplicative component models

If the same decomposition scheme is applied to two sets of flow data from a given system of regions, all but the T parameter are scale free. This means that taking the ratios of two sets of components provides a simple method for examining stability in migration structure without confounding the effects of growth or decline in overall levels of migration (Rogers, Willekens, Little et al. 2002). In Table 4, ratios of the 1976 to 1973 components are displayed. Several depart substantially from 1 indicating the migration structure changed in the three years between 1973 and 1976. For example, the ratio of the components for OD₁₁ is equal to 1.354, implying that migration within Category 1 was more attractive in 1976 than in 1973. In contrast, the ratio of the components for OD₃₃ is equal to 0.816, suggesting migration within Category 3 was less attractive in 1976 than in 1973.

Table 4 Ratios of 1976 to 1973 multiplicative components

	Destination
Origin	1	2	3	4	5	6	Total
1	1.354	1.144	0.957	1.099	1.121	0.904	0.522
2	1.169	1.045	1.075	1.040	0.923	0.860	1.252
3	0.839	1.055	0.816	1.149	1.062	0.854	1.562
4	1.125	1.093	1.342	1.046	0.972	0.970	1.058
5	0.988	0.955	1.139	1.000	0.936	0.993	1.003
6	0.909	0.854	0.906	0.993	0.980	1.117	0.961
Total	0.441	1.096	1.546	1.015	1.214	1.210	0.918
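The ratios in Table 4 can be recomputed from the two panels of Table 1. The following sketch assumes total sum reference coding for both years (which is consistent with the values in Table 3); the helper names are illustrative and not drawn from the original study.

```python
# Flows from Table 1 (rows = origins, columns = destinations).
FLOWS_1973 = [
    [50498, 23829, 8566, 21846, 16264, 18856],
    [25005, 27536, 6953, 14326, 16212, 18282],
    [15675, 10710, 13874, 6266, 9819, 19701],
    [23457, 14169, 4431, 10209, 9386, 10973],
    [29548, 25267, 11802, 13160, 15979, 20406],
    [46815, 39123, 42399, 25012, 26830, 23304],
]
FLOWS_1976 = [
    [14473, 14327, 6077, 11689, 10618, 9897],
    [14833, 36258, 13289, 17391, 20899, 21869],
    [8330, 17764, 25113, 10489, 18171, 29220],
    [11315, 16498, 8935, 10537, 10762, 12519],
    [11875, 24370, 19151, 12312, 16724, 22591],
    [16582, 32336, 52415, 22264, 28182, 27810],
]


def total_sum_coding(flows):
    """Return T, O, D and the OD matrix under total sum reference coding."""
    m = len(flows)
    row_totals = [sum(row) for row in flows]
    col_totals = [sum(row[j] for row in flows) for j in range(m)]
    T = sum(row_totals)
    O = [r / T for r in row_totals]
    D = [c / T for c in col_totals]
    OD = [[flows[i][j] / (T * O[i] * D[j]) for j in range(m)] for i in range(m)]
    return T, O, D, OD


def component_ratios(later, earlier):
    """Element-wise ratios of the later period's components to the earlier."""
    T1, O1, D1, OD1 = total_sum_coding(later)
    T0, O0, D0, OD0 = total_sum_coding(earlier)
    m = len(later)
    ratio_O = [O1[i] / O0[i] for i in range(m)]
    ratio_D = [D1[j] / D0[j] for j in range(m)]
    ratio_OD = [[OD1[i][j] / OD0[i][j] for j in range(m)] for i in range(m)]
    return T1 / T0, ratio_O, ratio_D, ratio_OD


ratio_T, ratio_O, ratio_D, ratio_OD = component_ratios(FLOWS_1976, FLOWS_1973)
print(f"T ratio     = {ratio_T:.3f}")
print(f"OD_11 ratio = {ratio_OD[0][0]:.3f}")
print(f"OD_33 ratio = {ratio_OD[2][2]:.3f}")
```

Only the ratio for T reflects the change in the overall level of migration; the remaining ratios compare shares and observed-to-expected ratios, which is why they isolate changes in structure.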

Application 4: The log-linear additive model

Another form of the saturated log-linear model, which is an alternative to the multiplicative component model, is the linear additive model. Whether using the linear additive or the multiplicative form of the saturated log-linear model, the parameters represent the spatial structure of migration (Rogers, Willekens, Little et al. 2002) and each flow value can be fully reproduced by the parameters.

Because the multiplicative formulation is formally equivalent to the gravity model (Willekens 1983), it is considered to be more appropriate than the linear additive model for representing spatial migration structures. On the other hand, the linear additive form is often found in statistics and when a standard statistical package (e.g., SPSS, Stata, R) is used to estimate a log-linear model, the parameters are always reported in the linear additive form. For that reason, the conventional calculations and interpretations of the parameters in the linear additive model are described in this sub-section.

The additive formulation is a linear function of logarithms and it makes evident why the model came to be called the log-linear model (Knoke and Burke 1980). It is mathematically equivalent to the multiplicative component model and it results from taking logarithms of both sides of Equation 1 as follows:

\ln (n_{i j}) = \ln (T) + \ln (O_{i}) + \ln (D_{j}) + \ln (O D_{i j})

which can be expressed more concisely as:

\ln (n_{i j}) = λ + λ_{i}^{O} + λ_{j}^{D} + λ_{i j}^{O D} .

Equation 2

The λ values are simply the natural logarithms of the parameters appearing in Equation 1. The O, D, and OD superscripts are parameter descriptors (not exponents) and the subscripts i and j refer to the categories of the origin and destination variables, respectively.
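The equivalence of the two forms can be illustrated with a single flow. The sketch below is only an example under stated assumptions: it takes the 1973 flow from Category 1 to Category 3 and its marginal totals from Panel A of Table 1, forms the multiplicative components by total sum reference coding, and confirms that their logarithms add up to ln(n_13). The variable names are illustrative.

```python
import math

# 1973 values from Panel A of Table 1.
n_13 = 8566        # flow from Category 1 to Category 3
n_1plus = 139859   # all migrations leaving Category 1
n_plus3 = 88025    # all migrations arriving in Category 3
n_total = 716488   # all migrations in the system

# Multiplicative components (Equation 1) under total sum reference coding.
T = n_total
O_1 = n_1plus / n_total
D_3 = n_plus3 / n_total
OD_13 = n_13 / (T * O_1 * D_3)

# Additive parameters (Equation 2) are the logarithms of those components.
lam, lam_O, lam_D, lam_OD = (math.log(x) for x in (T, O_1, D_3, OD_13))
log_flow = lam + lam_O + lam_D + lam_OD

print(f"lambda terms: {lam:.4f} {lam_O:.4f} {lam_D:.4f} {lam_OD:.4f}")
print(f"sum of lambdas = {log_flow:.4f}, ln(n_13) = {math.log(n_13):.4f}")
print(f"exp(sum) = {math.exp(log_flow):.1f}")
```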

Applying natural logarithmic transformations to the parameters in Table 2 and Table 3 would result in sets of corresponding linear additive parameters. However, just as there are at least two decompositions of the multiplicative component model, i.e., geometric mean effect coding and total sum reference coding, there are multiple strategies for arriving at sets of parameters that satisfy the linear additive model (Powers and Xie 2008), and the approaches taken by the standard statistical packages are not simply logarithmic transformations of the multiplicative components derived earlier.

Recall that a migration system with m regions has m×m linearly independent parameters. The multiplicative component models described above give an interpretable value for 1+m+m+(m×m) parameters, though they are not linearly independent of each other. On the other hand, statistical routines in SPSS, Stata, and R calculate and report only linearly independent parameters, resulting in 1 value for λ^T, m-1 values for λ_i^O, m-1 values for λ_j^D, and (m-1)×(m-1) values for λ_ij^OD.

The particular set of parameter values that is calculated and reported depends on the contrast coding scheme used by the software. Contrast coding blocks out one region by fixing all linear additive parameters for that region equal to 0. SPSS, for example, fixes the parameters for the last region, i.e., the region assigned the highest numeric value, m, in this case:

λ_{m}^{O} = λ_{m}^{D} = λ_{m j}^{O D} = λ_{i m}^{O D} = 0 .

The parameters of the Netherlands data reported by SPSS are displayed in Table 5. The SPSS commands that generate these results for the 1973 migration table, along with the SPSS output, are presented in Appendix 1 [1]. Table 5, with the Excel formulae for calculating the parameters, is available in the Contrast coding sheet of the accompanying workbook.

Table 5 Additive linear parameters using "last region" contrast coding

A. 1973 Migration table
		Destination
	Origin	1	2	3	4	5	6	Total
	1	0.288	-0.284	-1.388	0.076	-0.289	0.000	-0.212
	2	-0.384	-0.109	-1.565	-0.315	-0.261	0.000	-0.243
	3	-0.926	-1.128	-0.949	-1.216	-0.837	0.000	-0.168
	4	0.062	-0.262	-1.505	-0.143	-0.297	0.000	-0.753
	5	-0.327	-0.304	-1.146	-0.509	-0.385	0.000	-0.133
	6	0.000	0.000	0.000	0.000	0.000	0.000	0.000
	Total	0.698	0.518	0.598	0.071	0.141	0.000	10.056
B. 1976 Migration table
		Destination
	Origin	1	2	3	4	5	6	Total
	1	0.897	0.219	-1.122	0.389	0.057	0.000	-1.033
	2	0.129	0.355	-1.132	-0.007	-0.059	0.000	-0.240
	3	-0.738	-0.648	-0.785	-0.802	-0.488	0.000	0.049
	4	0.416	0.125	-0.971	0.050	-0.165	0.000	-0.798
	5	-0.126	-0.075	-0.799	-0.385	-0.314	0.000	-0.208
	6	0.000	0.000	0.000	0.000	0.000	0.000	0.000
	Total	-0.517	0.151	0.634	-0.222	0.013	0.000	10.233

Notice the parameters for the last region are equal to 0, and, therefore, make no contribution to Equation 2. Interpretation of the parameters in Table 5 is somewhat complicated since they are in logarithmic units. Conversion back to the multiplicative components by exponentiation gives yet another set of multiplicative components that satisfy Equation 1. These are presented in Table 6, and they are the multiplicative components associated with “last region” contrast coding. Generally, these are not used to describe the spatial structure of migration, but they are useful in describing migration systems because the interaction parameters, OD_ij, are equivalent to odds ratios.

Table 6 Multiplicative components using "last region" contrast coding

A. 1973 Migration table
		Destination
	Origin	1	2	3	4	5	6	Total
	1	1.333	0.753	0.250	1.079	0.749	1.000	0.809
	2	0.681	0.897	0.209	0.730	0.770	1.000	0.785
	3	0.396	0.324	0.387	0.296	0.433	1.000	0.845
	4	1.064	0.769	0.222	0.867	0.743	1.000	0.471
	5	0.721	0.738	0.318	0.601	0.680	1.000	0.876
	6	1.000	1.000	1.000	1.000	1.000	1.000	1.000
	Total	2.009	1.679	1.819	1.073	1.151	1.000	23,304
B. 1976 Migration table
		Destination
	Origin	1	2	3	4	5	6	Total
	1	2.453	1.245	0.326	1.475	1.059	1.000	0.356
	2	1.138	1.426	0.322	0.993	0.943	1.000	0.786
	3	0.478	0.523	0.456	0.448	0.614	1.000	1.051
	4	1.516	1.133	0.379	1.051	0.848	1.000	0.450
	5	0.882	0.928	0.450	0.681	0.731	1.000	0.812
	6	1.000	1.000	1.000	1.000	1.000	1.000	1.000
	Total	0.596	1.163	1.885	0.801	1.013	1.000	27,810

For example, the overall parameter from the 1973 migration data reported in Table 5, λ^T, gives the natural logarithm of the observed migrations within the reference region, ln(n₆₆) = 10.056, and from Table 6, the companion parameter T gives the n₆₆ migration flow, n₆₆ = exp(10.056) ≈ 23,304.

Another illustration from the 1973 migration table in Table 5 shows how the origin main effects, λ_i^O, are added to the overall parameter to reproduce the migrations from Category 1 to the reference destination, Category 6, reported in Table 1. For example, ln(n₁₆) = 10.056 - 0.212 ≈ 9.845 (using unrounded parameters), and the corresponding multiplicative components from Panel A of Table 6, T times O₁, give n₁₆ = 23,304 × 0.809 ≈ 18,856.

Using the same approach, the logarithms of all the migration flows can be reproduced by applying Equation 2 with the appropriate parameters from Table 5, or the observed flows can be reproduced by applying Equation 1 using the parameters in Table 6.
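For a saturated model, the "last region" parameters have simple closed forms that follow from setting the reference-region parameters to 0 in Equation 2. The sketch below uses those closed forms; it is not SPSS code, and the function name `last_region_parameters` is a hypothetical label. Its output can be compared with Panel A of Table 5 and, after exponentiation, with Panel A of Table 6.

```python
import math

# 1973 flows from Panel A of Table 1 (rows = origins, columns = destinations).
FLOWS_1973 = [
    [50498, 23829, 8566, 21846, 16264, 18856],
    [25005, 27536, 6953, 14326, 16212, 18282],
    [15675, 10710, 13874, 6266, 9819, 19701],
    [23457, 14169, 4431, 10209, 9386, 10973],
    [29548, 25267, 11802, 13160, 15979, 20406],
    [46815, 39123, 42399, 25012, 26830, 23304],
]


def last_region_parameters(flows):
    """Additive parameters with the last region as the reference category."""
    m = len(flows)
    ref = m - 1
    n_ref = flows[ref][ref]
    lam = math.log(n_ref)                                     # ln(n_mm)
    lam_O = [math.log(flows[i][ref] / n_ref) for i in range(m)]
    lam_D = [math.log(flows[ref][j] / n_ref) for j in range(m)]
    # Interaction: log of the cross-product ratio against the reference region.
    lam_OD = [
        [math.log(flows[i][j] * n_ref / (flows[i][ref] * flows[ref][j]))
         for j in range(m)]
        for i in range(m)
    ]
    return lam, lam_O, lam_D, lam_OD


lam, lam_O, lam_D, lam_OD = last_region_parameters(FLOWS_1973)
print(f"lambda      = {lam:.3f}")
print(f"lambda_1^O  = {lam_O[0]:.3f}")
print(f"lambda_23^OD = {lam_OD[1][2]:.3f}")

# Equation 2: reproduce ln(n_16), then Equation 1 via exponentiation.
log_n16 = lam + lam_O[0] + lam_D[5] + lam_OD[0][5]
n_16 = math.exp(lam) * math.exp(lam_O[0])  # T times O_1; D_6 and OD_16 equal 1
print(f"ln(n_16) = {log_n16:.3f}, n_16 = {n_16:.0f}")
```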

The association parameters in the linear form, λ_ij^OD, are logged odds ratios (LORs), which are the logarithm of the ratio of two odds: 1) the odds of migration to region j rather than the reference region, conditional on originating in region i; and 2) the odds of migration to region j rather than the reference region, conditional on originating in the reference region. For example, from the 1973 sub-matrix in Table 5, λ_23^OD = -1.565, which is calculated as:

λ_{23}^{O D} = \ln \left[ \frac{n_{23} / n_{26}}{n_{63} / n_{66}} \right] = \ln \left[ \frac{6953 / 18282}{42399 / 23304} \right] = - 1.565 .

In words, the parameter is described as the logged ratio of the odds of migration to Category 3, rather than to Category 6, between a migrant originating in Category 2 and one originating in Category 6.

Odds ratios measure the relative likelihood of one outcome to another, and because they are more standard than LOR, it may be easier to exponentiate the LORs and interpret the association parameters, presented in Table 6, as odds ratios. For example, the model parameter OD₂₃, for the 1973 data, is calculated as:

O D_{23} = \exp (- 1.565) = [\frac{\frac{n_{23}}{n_{26}}}{\frac{n_{63}}{n_{66}}}] = 0.209 .

In words, the odds that a migrant from Category 2 will choose Category 3 over Category 6 are approximately one fifth of the odds that a migrant from Category 6 will choose Category 3 over Category 6. Odds ratios are always positive and always depend on the choice of reference category. An odds ratio equal to 1 means a null relationship, i.e., statistical independence. Values higher than 1 mean a positive association and values less than 1 indicate a negative association.
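The odds-ratio interpretation can be reproduced from four cells of the flow table. The short sketch below is illustrative (the function `odds_ratio` is a hypothetical helper). It computes the 1973 odds ratio for origin Category 2 and destination Category 3 with Category 6 as the reference, and then repeats the calculation with Category 1 as the reference to show that the value depends on that choice.

```python
# 1973 flows from Panel A of Table 1 (rows = origins, columns = destinations).
FLOWS_1973 = [
    [50498, 23829, 8566, 21846, 16264, 18856],
    [25005, 27536, 6953, 14326, 16212, 18282],
    [15675, 10710, 13874, 6266, 9819, 19701],
    [23457, 14169, 4431, 10209, 9386, 10973],
    [29548, 25267, 11802, 13160, 15979, 20406],
    [46815, 39123, 42399, 25012, 26830, 23304],
]


def odds_ratio(flows, i, j, ref):
    """Odds of choosing j over ref for origin i, relative to origin ref.

    All indices are zero-based.
    """
    odds_from_i = flows[i][j] / flows[i][ref]
    odds_from_ref = flows[ref][j] / flows[ref][ref]
    return odds_from_i / odds_from_ref


# Origin Category 2 (index 1), destination Category 3 (index 2).
or_last = odds_ratio(FLOWS_1973, i=1, j=2, ref=5)   # reference: Category 6
or_first = odds_ratio(FLOWS_1973, i=1, j=2, ref=0)  # reference: Category 1
print(f"odds ratio, last region as reference : {or_last:.3f}")
print(f"odds ratio, first region as reference: {or_first:.3f}")
```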

Stata and R use a different contrast coding scheme from SPSS. Both of these statistical packages use “first region” contrast coding as opposed to the “last region” contrast coding used by SPSS. In these two programs, the parameters for the first region, i.e., the region assigned the lowest numeric value, are fixed to be equal to 0, i.e.,

λ_{1}^{O} = λ_{1}^{D} = λ_{1 j}^{O D} = λ_{i 1}^{O D} = 0 .

The Stata and R commands for generating the linear additive parameters, as well as the corresponding output, for the 1973 migration data can be downloaded from Appendix 1 [1].
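To see how the choice of reference region changes the parameters but not the fitted flows, the closed-form expressions implied by the zero constraints can be evaluated for either reference. The sketch below is an illustration and is not the output of Stata or R; `contrast_parameters` and `fitted_flow` are hypothetical helper names, and the values are those implied by the constraints for a saturated model.

```python
import math

# 1973 flows from Panel A of Table 1 (rows = origins, columns = destinations).
FLOWS_1973 = [
    [50498, 23829, 8566, 21846, 16264, 18856],
    [25005, 27536, 6953, 14326, 16212, 18282],
    [15675, 10710, 13874, 6266, 9819, 19701],
    [23457, 14169, 4431, 10209, 9386, 10973],
    [29548, 25267, 11802, 13160, 15979, 20406],
    [46815, 39123, 42399, 25012, 26830, 23304],
]


def contrast_parameters(flows, ref):
    """Additive parameters with region `ref` (zero-based) as the reference."""
    m = len(flows)
    base = flows[ref][ref]
    lam = math.log(base)
    lam_O = [math.log(flows[i][ref] / base) for i in range(m)]
    lam_D = [math.log(flows[ref][j] / base) for j in range(m)]
    lam_OD = [
        [math.log(flows[i][j] * base / (flows[i][ref] * flows[ref][j]))
         for j in range(m)]
        for i in range(m)
    ]
    return lam, lam_O, lam_D, lam_OD


def fitted_flow(params, i, j):
    """Apply Equation 2 and exponentiate to recover the flow from i to j."""
    lam, lam_O, lam_D, lam_OD = params
    return math.exp(lam + lam_O[i] + lam_D[j] + lam_OD[i][j])


first = contrast_parameters(FLOWS_1973, ref=0)  # "first region" coding
last = contrast_parameters(FLOWS_1973, ref=5)   # "last region" coding

print(f"overall parameter, first region: {first[0]:.3f}")
print(f"overall parameter, last region : {last[0]:.3f}")
# Both parameter sets reproduce the same flow from Category 2 to Category 3.
print(round(fitted_flow(first, 1, 2)), round(fitted_flow(last, 1, 2)))
```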

All forms of the saturated model and all statistical methods for estimating the interaction parameters are in agreement and provide substantively similar results. The formulae for the calculations of the parameters are available in the Linear Additive Parameters sheet of the accompanying workbook. Furthermore, tests that each linear additive interaction parameter is equal to 0 are done automatically by SPSS and Stata. These results are available from Appendix 1 [1] and they show that each non-redundant interaction parameter is statistically significant. See Agresti and Finlay (2009) and Powers and Xie (2008) for descriptions of the standard errors of the estimates.

Application 5: The independence model

All the models presented to this point have been saturated and, therefore, perfectly represent the observed flows. Generally, the substantively interesting parameters are the interaction parameters because they indicate associations between pairs of regions. The independence model, however, hypothesizes that the interaction parameters are uninteresting and unnecessary because all multiplicative interaction parameters, OD_ij, are equal to 1, or, equivalently, all linear additive interaction parameters, λ_{ij}^{OD}, are equal to 0. The independence model implies that the interaction terms should fall out of the model, reducing it to the most parsimonious form of a two-variable model, i.e.,

n_{i j} = (T) (O_{i}) (D_{j}),

or, equivalently,

\ln (n_{i j}) = λ + λ_{i}^{O} + λ_{j}^{D}.

Visual inspection of the interaction parameters in the saturated log-linear model is one strategy for investigating the independence hypothesis. Another method is to calculate row or column conditional distributions. If the conditional distributions within rows (origins) are identical, there is independence between origins and destinations. In addition, since independence is a symmetric property, if the conditional distributions within rows (origins) are identical, the distributions within columns (destinations) also will be identical (Agresti and Finlay 2009; Powers and Xie 2008). In the Independence sheet of the accompanying workbook, the percentages of the Netherlands migrations within columns (destinations) are calculated. The column percentages are quite varied, suggesting, like the interaction parameters, that statistical independence is unfounded in this example.

The independence hypothesis implies that each particular inter-regional flow can be determined by the sizes of the marginal flows. Let N_ij be the expected flow between regions i and j if the independence hypothesis is true. N_ij is then equal to the total number of flows in the migration system, n₊₊, multiplied by the proportion of all migrants leaving from region i, n_i+/n₊₊, times the proportion of all migrants moving to region j, n_+j/n₊₊, i.e., N_ij = n₊₊(n_i+/n₊₊)(n_+j/n₊₊). If independence can be assumed, a good estimate of an inter-regional flow is N_ij, and the problem of estimating inter-regional migration flows is greatly simplified.
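As an illustration of the formula for N_ij, the sketch below computes the expected flows from marginal totals alone. It uses the full United States table reported in Panel A of Table 7 rather than the Netherlands data, and the function name is an illustrative choice, not taken from the appendices.

```python
# Full U.S. native-born flow table, 1985-1990 (Table 7, Panel A).
# Rows are origins, columns are destinations: Northeast, Midwest, South, West.
flows = [
    [40_262_319, 336_091, 1_645_843, 479_819],
    [351_029, 50_677_007, 1_692_687, 958_696],
    [778_868, 1_197_134, 69_563_871, 1_150_649],
    [348_892, 668_979, 1_082_104, 37_872_893],
]


def expected_under_independence(table):
    """Return N_ij = n_i+ * n_+j / n_++ for every cell of a square table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


expected = expected_under_independence(flows)
for row in expected:
    print([round(value) for value in row])
```

The rounded results can be compared with the independence predictions in Panel A of Table 9. Only the row totals, column totals and grand total enter the calculation, which is what makes the independence model attractive when interior flows are unobserved.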

The differences between the observed flows, n_ij, and the expected flows, N_ij, form the basis of the goodness-of-fit evaluation and the Pearson Chi-Squared Statistic, denoted Χ², which is widely used to summarize these discrepancies. It is calculated as:

χ^{2} = \sum \frac{{(n_{i j} - N_{i j})}^{2}}{N_{i j}} ​,

where the summation is taken over all internal cells in the migration matrix. When there is perfect agreement between the observed and the expected flows, over all cells, Χ² equals 0, indicating that the independence model fits the data perfectly. Larger differences between n_ij and N_ij produce larger Χ² values and increasingly stronger evidence that the independence model is inadequate. In general, smaller values indicate a good fit and larger values a poor fit.

If the independence hypothesis is true, the Χ² statistic is governed by the Χ² probability distribution with (m-1)×(m-1) degrees of freedom. This distribution provides the basis for testing the significance of the Χ² statistic (Agresti 2007; Agresti and Finlay 2009). If the Χ² statistic falls in the right-sided extremes of its distribution, a value that large would occur with only a low probability, e.g., p<0.05, if the independence hypothesis were true, and the model is rejected. The Χ² values associated with the independence model applied to the Netherlands data in Table 1 are calculated and reported in the Independence sheet of the accompanying workbook. See Appendix 2 [1] for the SPSS, Stata and R commands for testing the independence model with the 1973 example data.

The Χ² value associated with the 1973 example data is 47,623, and the degrees of freedom (df) are 25. The associated p-value is less than 0.001, and the hypothesis of independence is rejected. (However, see the comments below about the limitations of this test when the sample size is large.) This is not surprising given the three multiplicative decompositions of the Netherlands data, presented in Table 2, Table 3 and Table 6. The evidence consistently shows strong associations between regions and many of the multiplicative association parameters are not close to 1. Furthermore, the standard errors reported in Appendix 1 [1] by SPSS and Stata indicate the linear additive interaction parameters are significantly different from 0.

One alternative to the Χ² statistic is called either the likelihood ratio statistic, the deviance, or the G² statistic. All are different names for the same test statistic, and which name is used is determined by the preferences of authors of text books and software packages. For simplicity, G² will be adopted here. It is similar to the Χ² in that values close to 0 indicate a well-fitting model and large values indicate a poor fit. If the hypothesized independence model holds, the G² statistic also has a Χ² distribution.

The G² statistic has general utility that goes well beyond the independence model in log-linear analysis. It is widely used for comparing a simpler model to a more complex model. The G² statistic is derived from the ratio of two likelihoods: 1) the likelihood that the constrained model (here the model of independence) fits the data; and 2) the likelihood that the unconstrained model (here the saturated model) fits the data. If the ratio is close to 1, the simpler, constrained, and more parsimonious model is preferred because it represents the data as well as the more complex model does.

The ratio of the two likelihoods does not have a Χ² distribution. However, when the ratio is transformed into natural logarithm units and multiplied by -2, it becomes G², which is a Χ² distributed variable with (m-1)×(m-1) degrees of freedom. If L_c is the likelihood associated with the constrained (i.e., independence) model, and L_u is the likelihood under the unconstrained (i.e., saturated) model, then G² is calculated as:

​ G^{2} = - 2 \ln (\frac{L_{c}}{L_{u}}) = - 2 \ln L_{c} + 2 \ln L_{u} ​ .

Because the saturated model fits the data perfectly, i.e., L_u = 1, G² = –2ln L_c. The values, based on the example and the statistical software, are reported in Appendix 2. The value is reported to be 46,477.63 and it is called “Deviance” by SPSS and Stata. It is rounded and reported to be equal to 46,480 by R, where it is called “Residual Deviance.” With 25 degrees of freedom the probability that the independence model holds is effectively 0.
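For count data, both statistics can be computed directly from the observed and expected flows. The sketch below assumes the standard cell-based form of the likelihood ratio statistic, G² = 2 Σ n_ij ln(n_ij / N_ij), which is not written out in the text, and applies it to the full United States table in Panel A of Table 7. Function names are illustrative.

```python
import math

# Full U.S. native-born flow table, 1985-1990 (Table 7, Panel A).
flows = [
    [40_262_319, 336_091, 1_645_843, 479_819],
    [351_029, 50_677_007, 1_692_687, 958_696],
    [778_868, 1_197_134, 69_563_871, 1_150_649],
    [348_892, 668_979, 1_082_104, 37_872_893],
]


def expected_under_independence(table):
    """Return N_ij = n_i+ * n_+j / n_++ for every cell."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def pearson_chi2(observed, expected):
    """Sum of (n_ij - N_ij)^2 / N_ij over all cells."""
    return sum(
        (n - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for n, e in zip(obs_row, exp_row)
    )


def likelihood_ratio_g2(observed, expected):
    """G^2 = 2 * sum of n_ij * ln(n_ij / N_ij); empty cells contribute 0."""
    return 2 * sum(
        n * math.log(n / e)
        for obs_row, exp_row in zip(observed, expected)
        for n, e in zip(obs_row, exp_row)
        if n > 0
    )


m = len(flows)
df = (m - 1) * (m - 1)  # degrees of freedom of the independence model
expected = expected_under_independence(flows)
chi2 = pearson_chi2(flows, expected)
g2 = likelihood_ratio_g2(flows, expected)
print(f"df={df}  X2={chi2:,.0f}  G2={g2:,.0f}")
```

The output can be set beside the values the statistical packages report for the same table in the quasi-independence discussion that follows (Χ² = 544,479,395 and G² = 461,411,576, each with 9 degrees of freedom).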

The Χ² and the G² statistics are asymptotically equivalent (Powers and Xie 2008) and they form the bases of the Pearson Chi-square and the likelihood ratio tests, respectively. As with all inferential tests, effective use requires attention to underlying assumptions as well as limitations. Both tests rely on the assumption that each inter-regional flow count in the migration table follows an independent Poisson distribution (Powers and Xie 2008) and both tests have important limitations related to sample size. The Χ² statistic is inflated by large samples. Therefore, the Pearson Chi-square test is not appropriate when the sample size is large. The G² statistic and the likelihood ratio test are preferred in this situation (Powers and Xie 2008). The Pearson Chi-square test is preferred when the expected frequencies average between 1 and 10, but neither statistic works well if most of the expected frequencies are less than 5 (Agresti and Finlay 2009; Powers and Xie 2008).

The G² statistic has also been criticized when samples are large (Raftery 1986, 1995), and there is growing consensus that information measures should be considered along with traditional significance tests in assessing model fit. The Bayesian Information Criterion (BIC) is closely related to G², and it is calculated by Stata as BIC = G² – df × ln(m×m), and by SPSS as:

​ B I C = - 2 \ln L_{c} + p \ln (m \times m) ​,

where p is the number of parameters estimated in the independence model, i.e., 2m-1. A low value suggests choosing the independence model over the saturated model (Powers and Xie 2008).

Akaike’s Information Criterion (AIC) is another alternative that takes on smaller values for better fitting models, since it judges how close the fitted values are to the expected values (Agresti 2007). In SPSS and R, it is calculated as:

​ A I C = - 2 (\ln L_{c} - p) ​,

where p is the number of parameters estimated in the independence model, i.e., 2m-1. In Stata, it is calculated as:

A I C = \frac{- 2 (\ln L_{c} - p)}{m \times m} .

As shown in Appendix 2 [1], SPSS and Stata report the BIC and AIC, and R reports only the rounded AIC. As previously stated, there are differences in the formulae used. The BIC reported by SPSS equals 46,934.237, and the BIC reported by Stata equals 46,388.04. The AIC reported by R equals 46,920, which is the rounded form of the value reported by SPSS, 46,916.818. Stata’s AIC value is substantially smaller and is equal to 1,303.245. All reported BIC and AIC values are large and add to the growing evidence that discredits the independence model for this example.
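The reported values are mutually consistent with the formulae given above, as the following arithmetic sketch suggests. It takes m = 6 regions for the Netherlands example (so df = 25, as reported) and backs out –2 ln L_c from the SPSS AIC; the variable names are illustrative. The backed-out –2 ln L_c is a different number from the reported deviance, so the two should not be interchanged when reproducing the software output.

```python
import math

m = 6                    # regions in the Netherlands example
cells = m * m            # 36 cells in the flow table
p = 2 * m - 1            # parameters estimated in the independence model
df = (m - 1) * (m - 1)   # residual degrees of freedom

g2 = 46_477.63           # deviance reported by SPSS and Stata
aic_spss = 46_916.818    # AIC reported by SPSS

# SPSS: AIC = -2 (ln L_c - p), so -2 ln L_c = AIC - 2p.
minus_two_log_lik = aic_spss - 2 * p

# SPSS: BIC = -2 ln L_c + p ln(m x m).
bic_spss = minus_two_log_lik + p * math.log(cells)

# Stata: BIC = G^2 - df ln(m x m), and AIC divided by the number of cells.
bic_stata = g2 - df * math.log(cells)
aic_stata = aic_spss / cells

print(round(bic_spss, 3), round(bic_stata, 2), round(aic_stata, 3))
```

Because the packages scale and centre these measures differently, comparisons of BIC or AIC are meaningful only between models fitted in the same package.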

The quasi-independence model

The independence model rarely provides an adequate fit to migration data. This is due, in part, to the overwhelming tendency to continue to reside in the same region. The quasi-independence model allows these “immobility” effects (Powers and Xie 2008) to be removed from the model, and this often results in improved predictions of inter-regional migration flows. The quasi-independence model has been applied effectively to migration data obtained from national censuses (Agresti 1990; Rogers, Little and Raymer 2010; Rogers, Willekens, Little et al. 2002), where persons who reported living in the same region at the time of the census as at the beginning of the reference period are represented in the diagonal elements of a migration table.

To illustrate, United States native-born migration data between 1985 and 1990 are reported in Panel A of Table 7. Clearly, the flows reported in the four diagonal elements of the interior sub-matrix are substantially larger than the off-diagonal elements, indicating that the propensity to maintain residence in the same region is much more typical than migration between regions.

The clustering along the diagonal cells contributes significantly to the poor fit of the independence model, and the dominating influence of the persons remaining in the region of origin has caused researchers to favour omitting them from the model. If migrants are defined as people changing their region of residence, this type of flow matrix is sometimes called a “migrants only” matrix. It is particularly useful for studying migration structure since it eliminates people who made no move or moved within the same region. Panel B of Table 7 displays the flow table with the diagonal elements set to 0, and the marginal totals adjusted accordingly.

Table 7 United States native-born migration flows, 1985-1990

A. Full migration table
	Destination
Origin	Northeast	Midwest	South	West	Total
Northeast	40,262,319	336,091	1,645,843	479,819	42,724,072
Midwest	351,029	50,677,007	1,692,687	958,696	53,679,419
South	778,868	1,197,134	69,563,871	1,150,649	72,690,522
West	348,892	668,979	1,082,104	37,872,893	39,972,868
Total	41,741,108	52,879,211	73,984,505	40,462,057	209,066,881
B. Migrants-only table
	Destination
Origin	Northeast	Midwest	South	West	Total
Northeast	0	336,091	1,645,843	479,819	2,461,753
Midwest	351,029	0	1,692,687	958,696	3,002,412
South	778,868	1,197,134	0	1,150,649	3,126,651
West	348,892	668,979	1,082,104	0	2,099,975
Total	1,478,789	2,202,204	4,420,634	2,589,164	10,690,791

The multiplicative components, using total sum reference coding, for the full migration table and the migrants-only table are reported in Table 8. The magnitude of the multiplicative component model parameters for the full data certainly departs from what is expected under the hypothesis of independence. They are substantially above 1.0 on the diagonal and the off-diagonal components are far below 1.0. In comparison, the diagonal multiplicative components for the migrants-only table are constrained to be equal to 0 in order to reproduce the structural zeros on the diagonal, and, as a result, the off-diagonal components are closer to 1.0.

Table 8 Multiplicative components* of United States native-born migration flows, 1985-1990

A. Full migration table
	Destination
Origin	Northeast	Midwest	South	West	Total
Northeast	4.720	0.031	0.109	0.058	0.204
Midwest	0.033	3.733	0.089	0.092	0.257
South	0.054	0.065	2.704	0.082	0.348
West	0.044	0.066	0.076	4.896	0.191
Total	0.200	0.253	0.354	0.194	209,066,881
B. Migrants-only table
	Destination
Origin	Northeast	Midwest	South	West	Total
Northeast	0.000	0.663	1.617	0.805	0.230
Midwest	0.845	0.000	1.363	1.318	0.281
South	1.801	1.859	0.000	1.520	0.292
West	1.201	1.547	1.246	0.000	0.196
Total	0.138	0.206	0.413	0.242	10,690,791
*Total sum reference coding
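The components in Table 8 can be reproduced from the counts in Table 7. The sketch below assumes that total sum reference coding sets T = n₊₊, O_i = n_i+/n₊₊, D_j = n_+j/n₊₊ and OD_ij = n_ij/(T·O_i·D_j), which is consistent with the tabulated values; the function name is illustrative.

```python
# U.S. native-born flows, 1985-1990 (Table 7): full and migrants-only tables.
full = [
    [40_262_319, 336_091, 1_645_843, 479_819],
    [351_029, 50_677_007, 1_692_687, 958_696],
    [778_868, 1_197_134, 69_563_871, 1_150_649],
    [348_892, 668_979, 1_082_104, 37_872_893],
]
# The migrants-only table is the full table with the diagonal set to 0.
migrants_only = [
    [0 if i == j else n for j, n in enumerate(row)] for i, row in enumerate(full)
]


def multiplicative_components(table):
    """Return (T, O, D, OD) under the assumed total sum reference coding."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    origin = [r / total for r in row_totals]
    destination = [c / total for c in col_totals]
    interaction = [
        [n / (total * origin[i] * destination[j]) for j, n in enumerate(row)]
        for i, row in enumerate(table)
    ]
    return total, origin, destination, interaction


T_full, O_full, D_full, OD_full = multiplicative_components(full)
T_mig, O_mig, D_mig, OD_mig = multiplicative_components(migrants_only)

print([round(x, 3) for x in OD_full[0]])  # Northeast row, full table
print([round(x, 3) for x in OD_mig[0]])   # Northeast row, migrants only
```

Multiplying T, O_i, D_j and OD_ij back together returns the observed count in every cell, which is the sense in which the component model is saturated.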

The quasi-independence model requires that only migrations between different regions satisfy the independence assumption. This is estimated in two different but equivalent ways. The first method takes the full migration table data as in Panel A of Table 8, and fixes the weights on the interactive effects, OD_ij, to be zero when the regions of origin and destination are the same, i.e., i=j, ensuring that n_ij=0. These are called structural zeros. When the origin and destination regions are different, i.e., i≠j, the interaction effects are fixed at 1.0, which is the familiar independence model and gives the predicted off-diagonal flows under the quasi-independence hypothesis. Implementation of this method in SPSS, Stata and R is illustrated in Appendix 3 [1] (available on the Tools for Demographic Estimation website).

The second method does not use the full migration data, but uses the migrants-only data as in Panel B of Table 7. It is best presented with the additive form:

\ln (n_{i j}) = λ + λ_{i}^{O} + λ_{j}^{D} + δ_{i} I,

where I is an indicator variable taking on values of 1 for the diagonal flows, i.e., when i=j, and values of 0 for the off-diagonal flows, i.e., when i≠j (Agresti 2002). Therefore, an extra parameter, δ_i, is necessary to estimate each diagonal flow, and for the other inter-regional flows the δ_i I term falls out and the quasi-independence model reduces to the independence model. Consequently, just like the independence model, the off-diagonal interaction terms are constrained to be equal to 0 in the additive form of the model (and equal to 1 in the multiplicative form). Application of this method in Stata is illustrated in Appendix 3 [1].

In the first method, the quasi-independence model fixes m parameters, OD_ii, for i = 1 to m, to be equal to 0. In the second method, m additional parameters, δ_i, are estimated, and when exponentiated will be very close to 0. Using either method, the quasi-independence model has m more parameters than the full independence model and the degrees of freedom are reduced by m.

Appendix 3 [1] (available on the Tools for Demographic Estimation website) illustrates how the quasi-independence model is estimated with statistical software packages SPSS, Stata and R, using the United States native-born migration flow data, 1985-1990. When the independence model is estimated with the full data, as expected, all goodness-of-fit indicators are extremely large: Χ² = 544,479,395 (df = 9); G² = 461,411,576 (df = 9); Stata values for BIC and AIC are 461,000,000 and 28,800,000, respectively. When the quasi-independence model is estimated, all values are reduced substantially: Χ² = 327,233 (df = 5); G² = 330,220 (df = 5); Stata values for BIC and AIC equal 330,207 and 27,535, respectively.

The inferential tests remain significant, and the quasi-independence model must be rejected as the true migration model. The independence and the quasi-independence models should not be compared, inferentially, with the likelihood ratio test because they are not nested models. However, the information measures may be compared directly. Both the BIC and AIC are reduced substantially, favouring the quasi-independence model over the independence model.

In addition, the predicted flows from the independence model can be contrasted with those from the quasi-independence model in Table 9. Visually comparing the predicted flows in Table 9 with the observed data in Table 7 reveals how much closer the quasi-independence model comes to representing the data. Two additional summary statistics are reported: R² and Mean Absolute Percent Error (MAPE). A comparison of the R² values shows the independence model explains 10% of the variation in the observed data and the quasi-independence model explains 95%. Furthermore, the average percent error for the quasi-independence model (MAPE=28) is dramatically reduced in comparison to the independence model (MAPE=2,492).

Since the fit of the quasi-independence model is not close enough to the observed data, it must be rejected as the “true” model. However, without observed migration data, the quasi-independence model may still offer a reasonable, but coarse, method for estimating inter-regional flows.

Table 9 Predicted United States native-born migration flows, 1985-1990, under independence and quasi-independence

A. Independence
	Destination
Origin	1	2	3	4
1	8,530,046	10,806,184	15,119,178	8,268,664
2	10,717,328	13,577,116	18,996,052	10,388,923
3	14,512,977	18,385,588	25,723,693	14,068,264
4	7,980,756	10,110,323	14,145,583	7,736,206
	R²=	0.104	MAPE=	2492.322
B. Quasi-independence
	Destination
Origin	1	2	3	4
1	0	535,839	1,349,561	576,353
2	442,768	0	1,793,640	766,005
3	720,681	1,159,163	0	1,246,806
4	315,340	507,201	1,277,434	0
	R²=	0.945	MAPE=	27.575
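The quasi-independence predictions in Panel B can be approximated without a statistical package. The sketch below swaps in iterative proportional fitting (IPF) for the Poisson log-linear routines used in Appendix 3: it starts from a seed of ones with structural zeros on the diagonal and rescales rows and columns until they match the migrants-only margins of Table 7, Panel B. The function name and the number of iterations are illustrative choices.

```python
# U.S. migrants-only flows, 1985-1990 (Table 7, Panel B).
migrants_only = [
    [0, 336_091, 1_645_843, 479_819],
    [351_029, 0, 1_692_687, 958_696],
    [778_868, 1_197_134, 0, 1_150_649],
    [348_892, 668_979, 1_082_104, 0],
]


def ipf(seed, row_targets, col_targets, iterations=500):
    """Rescale rows, then columns, repeatedly until the margins are matched."""
    fitted = [[float(x) for x in row] for row in seed]
    for _ in range(iterations):
        for i, target in enumerate(row_targets):
            total = sum(fitted[i])
            if total > 0:
                fitted[i] = [x * target / total for x in fitted[i]]
        for j, target in enumerate(col_targets):
            total = sum(row[j] for row in fitted)
            if total > 0:
                for row in fitted:
                    row[j] *= target / total
    return fitted


m = len(migrants_only)
row_totals = [sum(row) for row in migrants_only]
col_totals = [sum(col) for col in zip(*migrants_only)]

# Seed: 1 off the diagonal (no interaction), 0 on it (structural zeros).
seed = [[0.0 if i == j else 1.0 for j in range(m)] for i in range(m)]
predicted = ipf(seed, row_totals, col_totals)

df = (m - 1) * (m - 1) - m  # independence df reduced by m
for row in predicted:
    print([round(x) for x in row])
```

Zeros in the seed stay zero under rescaling, so the diagonal is never filled, while every off-diagonal cell keeps the product form of the independence model.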

Application 6: The method of offsets

The validity of the independence and quasi-independence models can be evaluated with the inferential test statistics that accompany the log-linear model output, and, even when the models are not supported with significance tests, these models may be applied, in some contexts, to produce meaningful estimates of migration flows. The method of offsets goes a step further by drawing on auxiliary data. It assumes the auxiliary data have an implied structure of inter-regional associations that resembles the unknown migration structure, and it borrows that structure to derive the estimates of the missing migration flow data.

In past research, the auxiliary information has typically been a table of migration flows from another period in history (Rogers, Little and Raymer 2010; Rogers, Willekens, Little et al. 2002; Rogers, Willekens and Raymer 2003; Willekens 1983), but it could be from another age group (Raymer and Rogers 2007), or another sex or race group. It could also be from another data source altogether, such as tax return data or motor vehicle registration data.

Given the auxiliary flow data, n_{ij}^{*}, the log-linear-with-offsets model is specified as:

\ln ({\hat{n}}_{i j}) = λ + λ_{i}^{O} + λ_{j}^{D} + \ln (n_{i j}^{*}).

This model will estimate flows, \hat{n}_{ij}, that have a migration structure that comes as close as possible to that of the auxiliary flow data, and, at the same time, the estimated flows are adjusted to sum to the marginal totals pre-specified by the researcher. In this way, the method of offsets is similar to the independence and quasi-independence models in that it provides an expected distribution of the flows such that the marginal row and column totals are equal to the a priori estimates.
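A minimal sketch of the idea, assuming iterative proportional fitting is used in place of the log-linear routines in Appendix 4: the auxiliary table serves as the seed, and rows and columns are rescaled until they match the pre-specified margins. The auxiliary flows are the United States migrants-only table from Panel B of Table 7; the target margins are hypothetical numbers invented for illustration, and the function names are not taken from the appendices.

```python
# Auxiliary flows n*_ij: U.S. migrants-only table, 1985-1990 (Table 7, Panel B).
auxiliary = [
    [0, 336_091, 1_645_843, 479_819],
    [351_029, 0, 1_692_687, 958_696],
    [778_868, 1_197_134, 0, 1_150_649],
    [348_892, 668_979, 1_082_104, 0],
]

# HYPOTHETICAL target margins (not from the source); both sum to 9,000,000.
row_targets = [2_000_000, 2_500_000, 3_000_000, 1_500_000]
col_targets = [1_500_000, 2_000_000, 3_500_000, 2_000_000]


def ipf(seed, row_targets, col_targets, iterations=500):
    """Rescale rows, then columns, repeatedly until the margins are matched."""
    fitted = [[float(x) for x in row] for row in seed]
    for _ in range(iterations):
        for i, target in enumerate(row_targets):
            total = sum(fitted[i])
            if total > 0:
                fitted[i] = [x * target / total for x in fitted[i]]
        for j, target in enumerate(col_targets):
            total = sum(row[j] for row in fitted)
            if total > 0:
                for row in fitted:
                    row[j] *= target / total
    return fitted


def cross_product_ratio(table):
    """Odds ratio for origins 1 and 4 across destinations 2 and 3."""
    return (table[0][1] * table[3][2]) / (table[0][2] * table[3][1])


estimated = ipf(auxiliary, row_targets, col_targets)

print([round(sum(row)) for row in estimated])
print(cross_product_ratio(auxiliary), cross_product_ratio(estimated))
```

Row and column rescaling changes the margins but leaves every odds ratio of the seed unchanged, which is how the estimated flows inherit the association structure of the auxiliary data.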

To illustrate the workings of the method of offsets, consider the Netherlands 1976 migration flow matrix in Table 1. Suppose we wish to keep the numerical values of the row and column marginal totals, but, at the same time, wish to replace the migration interaction effects observed during that year by those observed during 1973, using the method of offsets. What would be the corresponding set of log-linear parameters? Table 10 sets out the predicted flow matrix obtained by the method of offsets in Panel A, and Panel B presents the associated multiplicative components derived using the total sum reference coding. Note that the T, O_i and D_j values of the predicted matrix, i.e., Panel B of Table 10, are identical to those reported for the observed 1976 flow matrix in Panel B of Table 3. However, the other terms (i.e., the interaction effects, OD_ij) reflect the influence of the migration structure of the observed 1973 data, Panel A of Table 3, as well as the row and column totals taken from the 1976 data. Therefore, the method of offsets applies the structure of the auxiliary data, the 1973 data in this case, to the interior flows, and at the same time, preserves the total number of flows observed in the 1976 data.

Table 10 Inter-regional migration flows in the Netherlands (1976), predicted with the method of offsets from the marginal totals (1976) and the migration flow table (1973)

PANEL A: Predicted using method of offsets
	Destination
Origin	1	2	3	4	5	6	Total
1	12,344	13,769	6,890	12,199	10,361	11,518	67,081
2	13,329	34,695	12,195	17,445	22,522	24,353	124,539
3	9,728	15,711	28,330	8,883	15,881	30,553	109,087
4	11,281	16,107	7,011	11,216	11,764	13,187	70,566
5	12,609	25,486	16,570	12,828	17,770	21,760	107,023
6	18,116	35,786	53,984	22,110	27,058	22,535	179,589
Total	77,408	141,553	124,980	84,682	105,356	123,906	657,885
	R²=	0.966	MAPE=	8.364
Panel B. Multiplicative components using total sum reference coding
	Destination
Origin	1	2	3	4	5	6	Total
1	1.564	0.954	0.541	1.413	0.964	0.912	0.102
2	0.910	1.295	0.515	1.088	1.129	1.038	0.189
3	0.758	0.669	1.367	0.633	0.909	1.487	0.166
4	1.359	1.061	0.523	1.235	1.041	0.992	0.107
5	1.001	1.107	0.815	0.931	1.037	1.080	0.163
6	0.857	0.926	1.582	0.956	0.941	0.666	0.273
Total	0.118	0.215	0.190	0.129	0.160	0.188	657,885

The predicted results in Panel A of Table 10 were taken from the output of the SPSS, Stata, and R commands for implementing the method of offsets found in Appendix 4 [1]. See the Method of offsets sheet in the accompanying Excel spreadsheet for other calculations.

Since the flows were observed directly in 1976, there are several ways to evaluate the suitability of the method of offsets for predicting the data. One simple method is to inspect visually the ratios of the association multiplicative components, as demonstrated in Table 4. Another method is to use the inferential tests and information measures reported by the log-linear procedures. These would be testing the hypothesis that the structure of the migration flows, i.e., the interaction parameters, did not change from 1973 to 1976. In the example reported in Table 10, the corresponding G² statistic is equal to 5,914 (df=25), and the hypothesis that the auxiliary data represent the same migration process as the observed data must be rejected. The final method, of those suggested here, relies on the standard R² and MAPE statistics to assess the fit between the observed and the predicted flows. These are reported in Panel A of Table 10 and are equal to 0.97 and 8.36, respectively. These statistics, as well as the ratios in Table 4, suggest this application of the method of offsets offers a set of estimates for the migration flows in 1976 that may be quite satisfactory.

The importance placed on the goodness-of-fit statistics depends on the quality of the observed flows used as inputs to the method of offsets. If the method is to be useful in a practical situation, it must be applicable when the inter-regional flows are not directly observed. In the absence of flow data, the method would still require pre-estimates of the marginal totals. Furthermore, if the method is implemented as illustrated in Appendix 4 (available on the Tools for Demographic Estimation website), initial estimates of the inter-regional flows are required. Therefore, the pre-estimates of the row and column totals would need to be distributed across the internal cells of the flow matrix so they add up to the respective marginal totals. Table 11, Panel A, presents a typical scenario, albeit continuing to use the marginal totals from the Netherlands 1976 data, which were observed. A simple solution is to distribute the flows according to the independence model, i.e., N_ij = n₊₊(n_i+/n₊₊)(n_+j/n₊₊), which results in the initial estimates of the flows displayed in Panel B of Table 11.

As long as the initial inter-regional flows add up to the marginal totals, the predicted flows are not affected by the method used to distribute the flows within the cells. This is true because the flows will be predicted, ultimately, from the auxiliary data through the method of offsets, using the iterative proportional fitting algorithm (Agresti 1990; Deming and Stephan 1940). In other words, the initial estimates of the 1976 Netherlands flows, used as input to the offsets log-linear model, could be the internal cells of Table 1, Panel B, or those in Table 11, Panel B. Either set of initial estimates would yield the predicted flows that are reported in Table 10, Panel A.

On the other hand, it is important to note that the associated inferential test statistics and the information measures that accompany the method of offsets must be interpreted with respect to the initial flow estimates. For example, if the initial flows were taken from Panel B of Table 11, the associated Χ² and G² test statistics would be testing the hypothesis that the predicted data are distributed in a manner that is consistent with the independence model.

Table 11 The inputs to the method of offsets in the absence of observed flows

Panel A. Pre-estimation marginal totals from the Netherlands, 1976
	Destination
Origin	1	2	3	4	5	6	Total
1							67,081
2							124,539
3							109,087
4							70,566
5							107,023
6							179,589
Total	77,408	141,553	124,980	84,682	105,356	123,906	657,885

Panel B. Independence model distribution scheme for initial flow estimates
	Destination
Origin	1	2	3	4	5	6	Total
1	7,893	14,433	12,744	8,635	10,743	12,634	67,081
2	14,654	26,796	23,659	16,030	19,944	23,456	124,539
3	12,835	23,472	20,724	14,042	17,470	20,545	109,087
4	8,303	15,183	13,406	9,083	11,301	13,290	70,566
5	12,593	23,027	20,331	13,776	17,139	20,157	107,023
6	21,131	38,641	34,117	23,116	28,760	33,824	179,589
Total	77,408	141,553	124,980	84,682	105,356	123,906	657,885

It is a simple matter to modify the method of offsets to apply it to the problem of predicting a table of “migrants only.” The SPSS, Stata and R commands require minor modifications that are specified in comments in Appendix 4 (available on the Tools for Demographic Estimation website). A worked example is included in the Method of offsets, migrants only sheet of the accompanying workbook. It uses the observed U.S. flows, 1985-1990, to retrospectively estimate the 1975-80 migrant flows reported by Rogers, Willekens, Little et al. (2002).

References

Agresti A. 1990. Categorical Data Analysis. New York: Wiley.

Agresti A. 2002. Categorical Data Analysis. New York: Wiley-Interscience.

Agresti A. 2007. An Introduction to Categorical Data Analysis. Hoboken, NJ: Wiley-Interscience.

Agresti A and B Finlay. 2009. Statistical Methods for the Social Sciences. Upper Saddle River, NJ: Pearson Prentice Hall.

Alonso W. 1986. Systemic and log-linear models: From here to there, then to now, and this to that. Discussion paper 86-10. Cambridge, MA: Harvard University, Center for Population Studies.

Birch MW. 1963. "Maximum likelihood in three-way contingency tables", Journal of the Royal Statistical Society Series B-Statistical Methodology 25(1):220-233.

Deming WE and FF Stephan. 1940. "On a least squares adjustment of a sampled frequency table when the expected marginal totals are known", Annals of Mathematical Statistics 11(4):427-444. doi: http://dx.doi.org/10.1214/aoms/1177731829 [2]

Knoke D and PJ Burke. 1980. Log-linear Models. Beverly Hills, CA: Sage Publications.

Mueser P. 1989. "The spatial structure of migration: An analysis of flows between states in the USA over three decades", Regional Studies 23(3):185-200. doi: http://dx.doi.org/10.1080/00343408912331345412 [3]

Nair PS. 1985. "Estimation of period-specific gross migration flows from limited data: Bi-proportional adjustment approach", Demography 22(1):133-142. doi: http://dx.doi.org/10.2307/2060992 [4]

Powers DA and Y Xie. 2008. Statistical Methods for Categorical Data Analysis. Bingley, UK: Emerald.

Raftery AE. 1986. "Choosing models for cross-classifications", American Sociological Review 51(1):145-146. doi: http://dx.doi.org/10.2307/2095483 [5]

Raftery AE. 1995. "Bayesian model selection in social research", Sociological Methodology 25(1):111-163. doi: http://dx.doi.org/10.2307/271063 [6]

Raymer J. 2007. "The estimation of international migration flows: A general technique focused on the origin-destination association structure", Environment and Planning A 39(4):985-995. doi: http://dx.doi.org/10.1068/a38264 [7]

Raymer J, A Bonaguidi and A Valentini. 2006. "Describing and projecting the age and spatial structures of interregional migration in Italy", Population, Space and Place 12(5):371-388. doi: http://dx.doi.org/10.1002/psp.414 [8]

Raymer J and A Rogers. 2007. "Using age and spatial flow structures in the indirect estimation of migration streams", Demography 44(2):199–223. doi: http://dx.doi.org/10.1353/dem.2007.0016 [9]

Rees P and FJ Willekens. 1986. "Data and accounts," in Rogers, A and FJ Willekens (eds). Migration and Settlement: A Multiregional Comparative Study. Dordrecht: D. Reidel, pp. 19-58.

Rogers A, JS Little and J Raymer. 2010. The Indirect Estimation of Migration: Methods for Dealing with Irregular, Inadequate, and Missing Data. Dordrecht: Springer.

Rogers A, F Willekens, JS Little and J Raymer. 2002. "Describing migration spatial structure", Papers in Regional Science 81(1):29-48.

Rogers A, FJ Willekens and J Raymer. 2003. "Imposing age and spatial structures on inadequate migration-flow datasets", The Professional Geographer 55(1):56-69.

Snickars F and JW Weibull. 1977. "A minimum information principle: Theory and practice", Regional Science and Urban Economics 7(1-2):137-168. doi: http://dx.doi.org/10.1016/0166-0462(77)90021-7 [10]

Willekens F. 1983. "Log-linear modeling of spatial interaction", Papers of the Regional Science Association 52:187-205. doi: http://dx.doi.org/10.1007/BF01944102 [11]

Downloads

Appendices 1-4.pdf (16/09/2013) [12]

MI_LLM_appendices.pdf 373.46 KB