Task 1: Sampling



Article 1 of Commission Regulation No 3127/94 states that a prediction formula should be based on "... a representative sample of the national or regional pigmeat production concerned by the assessment method ...". National pig populations are generally very heterogeneous and include several sub-populations, such as different sexes and breeds. Moreover, the sampling units are carcasses, and carcasses can only be selected in slaughterhouses. The main problem is therefore how to choose slaughterhouses, and how to select carcasses within them, so as to obtain a representative sample.

Furthermore, the most popular statistical method used for the assessment of pig classification methods is linear regression. It is well known from regression theory that a wider range of x-values (predicting variables) gives more accurate estimates of the formula coefficients (intercept a and slope b). This conflicts with the notion of a representative sample, which can be understood as a sample drawn at random. It therefore has to be clearly defined what a representative sample should be for the assessment of pig classification methods.
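
The regression argument can be made concrete with a small sketch. In simple linear regression with fixed x-values and residual standard deviation sigma, the variance of the estimated slope b is sigma^2 divided by the sum of squared deviations of x from its mean. The fat-depth values below are hypothetical and are not data from the project; the function name `slope_variance` is also only illustrative.

```python
from statistics import mean


def slope_variance(x, sigma=1.0):
    """Variance of the least-squares slope for fixed x-values.

    Uses Var(b) = sigma^2 / sum((x_i - x_bar)^2), which holds for
    simple linear regression with independent, equal-variance errors.
    """
    x_bar = mean(x)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sigma ** 2 / sxx


# Hypothetical fat depths (mm), chosen only for illustration.
clustered = [14, 15, 15, 16, 16, 16, 17, 17, 18, 16]  # values close to the mean
spread = [8, 10, 12, 14, 16, 16, 18, 20, 22, 24]      # deliberately wide range

var_clustered = slope_variance(clustered)
var_spread = slope_variance(spread)
print(var_clustered, var_spread)
```

Both samples have the same size and the same mean, but the wider sample gives a slope variance twenty times smaller. This is the gain that selection on x-values aims at, and it is why such selection departs from purely random sampling.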

Finally, non-linear statistical methods have also been used. For these methods, the influence of sampling on the estimates is still not clear.

Solutions used until now  

Three different approaches have been used in the EU to deal with these sub-populations.

      Sub-populations are ignored.

      Sub-populations are ignored, except that the total numbers of animals for some sub-populations are fixed beforehand and chosen proportional to the numbers in the population.

      Sub-populations are taken into account using one sub-sample for each sub-population considered.
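
As an illustration of the second approach, the sketch below computes sample quotas proportional to the sub-population sizes. It is only one possible way to do so: the largest-remainder rounding rule, the function name `proportional_quotas` and the population figures are assumptions made for this example, not part of any national sampling plan.

```python
def proportional_quotas(population_counts, sample_size):
    """Split sample_size across sub-populations in proportion to their size.

    Rounding uses the largest-remainder rule (an assumption of this
    sketch) so that the quotas always add up to sample_size.
    """
    total = sum(population_counts.values())
    exact = {k: sample_size * n / total for k, n in population_counts.items()}
    quotas = {k: int(v) for k, v in exact.items()}
    remaining = sample_size - sum(quotas.values())
    # Give the leftover carcasses to the largest fractional parts.
    by_remainder = sorted(exact, key=lambda k: exact[k] - quotas[k], reverse=True)
    for k in by_remainder[:remaining]:
        quotas[k] += 1
    return quotas


# Hypothetical yearly slaughterings per sex, for illustration only.
population = {"females": 50_000, "castrates": 45_000, "entire_males": 5_000}
quotas = proportional_quotas(population, 120)
print(quotas)
```

With these figures the 120 carcasses are split into 60 females, 54 castrates and 6 entire males.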


Different strategies have also been used to select carcasses on the predicting variables:

      No selection on potential predicting variables.

      Selection on potential predicting variables with introduction of these variables in the formula.

      Selection on potential predicting variables without introduction of these variables in the formula.

      Selection on variables correlated with the predicting variables.
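
Selection on a potential predicting variable could, for instance, be organised by classes of that variable. The sketch below draws a fixed number of carcasses at random within each fat-depth class. The class boundaries, the number per class, the measurements and the function name `select_by_classes` are all hypothetical, and this is not presented as the scheme used by any Member State.

```python
import random


def select_by_classes(carcasses, boundaries, per_class, seed=0):
    """Draw up to per_class carcasses at random within each class.

    carcasses  : list of (carcass_id, fat_depth) pairs
    boundaries : increasing class limits on fat depth
    """
    rng = random.Random(seed)
    classes = {}
    for carcass_id, fat_depth in carcasses:
        # Class index = number of boundaries at or below the measurement.
        idx = sum(fat_depth >= b for b in boundaries)
        classes.setdefault(idx, []).append((carcass_id, fat_depth))
    selected = []
    for idx in sorted(classes):
        group = classes[idx]
        selected.extend(rng.sample(group, min(per_class, len(group))))
    return selected


# Hypothetical fat depths between 8 and 27 mm, for illustration only.
carcasses = [(i, 8 + (i * 7) % 20) for i in range(60)]
sample = select_by_classes(carcasses, boundaries=[13, 19], per_class=5)
print(len(sample))
```

Taking equal numbers per class over-represents the extreme classes relative to a random sample, which widens the range of x-values at the cost of strict proportionality.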

The main objective is to define clearly how a representative sample can be chosen for assessing pig classification methods. Among the different approaches used to deal with sub-populations on the one hand, and with selection on potential predicting variables on the other, it has to be determined which are correct.

Another objective is to define the sampling plans for WP1 and WP3.

Work in progress  

The sampling plans for WP1 and WP3 have been designed. The model for testing the operator effect on classification measurements is still under discussion.

A questionnaire describing national pig populations has been sent to all EU Member States and to the Newly Associated States. The information is being compiled.

Several documents have been written about sampling for instruments with a moderate number of carcass measurements:

  • sampling in linear regression
  • reducing costs in linear regression experiments
  • accounting for subpopulations in prediction, selection on weight when it is not a predictor, and oversampling within subpopulations
  • effect of the sampling method on the regression parameters.

Work is still in progress on the consequences of selection on estimated lean meat, the combination of global oversampling with quotas by subpopulation, and sampling for instruments with a large number of carcass measurements.