Bas Engel, Willem Buist

Institute for Animal Science and
Health (ID-Lelystad), Lelystad, The Netherlands.

Classification of pig carcasses in
the European Community is based on the lean meat percentage of the carcass. The
lean meat percentage is predicted from instrumental carcass measurements
obtained in the slaughterline. The prediction formula employed has to meet the
requirements for authorisation laid down in EC regulations. These requirements
involve the sampling procedure, the sample size and the accuracy of prediction.
Formulae are often derived by linear regression. We discuss a type of
sampling scheme that has been submitted for authorisation on a number of
occasions but lacks formal statistical justification when employed in
conjunction with linear regression. Our aim is to assess the performance of the
prediction formula that follows from the potentially faulty combination of the
sampling scheme and linear regression in relation to the requirements in the EC
regulations.

In linear regression, inference is based on the
distribution of the dependent variable, i.e. the lean meat percentage,
conditional upon the independent variables, i.e. instrumental carcass
measurements. To put it less formally, we make an educated guess of the
percentage lean on the basis of the likelihood of different lean meat
percentages that may correspond to the observed instrumental carcass
measurements, e.g. fat and muscle depth at specific places on the carcass. The
carcasses may be selected on the basis of the fat and muscle depth measurements
because our only interest is in the random variation of the percentage lean
given the observed fat and muscle depth measurements. We have no particular
interest in the random variation in the fat and muscle depth measurements
themselves. It is in fact well known that selection of carcasses with more
extreme fat or muscle depth measurements will improve the accuracy of prediction.
Therefore, carcasses are usually not selected randomly but according to a
sampling scheme that over-represents carcasses with more extreme instrumental
measurements.
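
The argument above can be illustrated with a small simulation. The sketch below is not the authors' simulation: the population model, all numerical values (mean and spread of fat depth, the slope of -0.6, the residual standard deviation, the sample size of 120) and the function names are hypothetical and chosen only for illustration. It compares random sampling with a scheme that selects the carcasses with the thinnest and the thickest fat depth, and reports the mean and standard deviation of the estimated regression slope over repeated samples.

```python
import random
import statistics


def ols_slope(xs, ys):
    """Least-squares slope of y on x (simple linear regression)."""
    mx = statistics.fmean(xs)
    my = statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sxx


def simulate_slopes(select_extremes, n_rep=300, n_pop=600, n_sample=120, seed=1):
    """Return (mean, sd) of the fitted slope over n_rep simulated samples.

    Hypothetical population: lean = 65 - 0.6 * fat + noise.
    Selection depends on the fat depth only.
    """
    rng = random.Random(seed)
    slopes = []
    for _ in range(n_rep):
        fat = [rng.gauss(16.0, 3.0) for _ in range(n_pop)]
        lean = [65.0 - 0.6 * f + rng.gauss(0.0, 2.0) for f in fat]
        if select_extremes:
            # Take the carcasses with the thinnest and the thickest fat depth.
            order = sorted(range(n_pop), key=lambda i: fat[i])
            half = n_sample // 2
            chosen = order[:half] + order[-half:]
        else:
            # Simple random sample from the population.
            chosen = rng.sample(range(n_pop), n_sample)
        slopes.append(ols_slope([fat[i] for i in chosen],
                                [lean[i] for i in chosen]))
    return statistics.fmean(slopes), statistics.stdev(slopes)


if __name__ == "__main__":
    print("random sample   (mean slope, sd):", simulate_slopes(False))
    print("extreme fat     (mean slope, sd):", simulate_slopes(True))
```

Under these assumptions both schemes should recover a slope close to -0.6 on average, while the scheme with extreme fat depths should show the smaller standard deviation, because selection on the independent variable leaves the conditional distribution of the percentage lean untouched and increases the spread of the fat depths.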

The EC regulations (EC, 1994) state that a sample
of pig carcasses should be representative of a national or regional pig
population. Possibly because of a misunderstanding about the intention of the
regulation, carcasses are regularly selected not only on the basis of the
instrumental measurements, but also on the basis of other variables, such as
carcass weight. These additional selection variables are not intended to be
included in the prediction formula. However, because of the additional selection
on, e.g., carcass weight, the sampled carcasses no longer give a correct
impression of the most likely values for the percentage lean given the observed
instrumental measurements. Inference based on traditional regression theory may
then be misleading. The
accuracy of prediction may actually be less than required in the regulations,
which means that a formula may be wrongly authorised. Obviously, the European
Community is not well served by the authorisation of inaccurate formulae, which
has adverse effects on harmonisation between countries. Conversely, when the
prediction formula is more accurate than it appears to be on the basis of
standard linear regression theory, it may mistakenly not be authorised. This is
quite problematic for the region or country involved since considerable effort
and expense are invested in the introduction of a new measurement instrument in
the slaughterline.
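
Both kinds of error can be reproduced in a toy setting. The following is a minimal sketch under stated assumptions, not the simulation study reported in this talk: the population model, every coefficient, the weight band of 85 to 95 and the names `fit_line` and `apparent_vs_actual_rmse` are hypothetical. Carcass weight is assumed to affect the percentage lean at a given fat depth, but it is left out of the prediction formula. The sketch compares the residual standard deviation computed from the selected sample (the apparent accuracy) with the root mean squared prediction error of the same formula over the whole population (the actual accuracy).

```python
import math
import random
import statistics


def fit_line(xs, ys):
    """Least-squares intercept and slope of y on x."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope


def apparent_vs_actual_rmse(scheme, n_pop=2000, n_sample=120, seed=7):
    """Return (apparent, actual) prediction error of lean regressed on fat.

    scheme: "narrow"   -> only carcasses in a narrow weight band,
            "extremes" -> the lightest and the heaviest carcasses,
            otherwise  -> simple random sample.
    All model values are hypothetical.
    """
    rng = random.Random(seed)
    pop = []
    for _ in range(n_pop):
        fat = rng.gauss(16.0, 3.0)
        weight = 90.0 + 0.5 * (fat - 16.0) + rng.gauss(0.0, 8.0)
        # Weight influences lean given fat, but is NOT in the formula.
        lean = 65.0 - 0.6 * fat + 0.25 * (weight - 90.0) + rng.gauss(0.0, 1.5)
        pop.append((fat, weight, lean))

    if scheme == "narrow":
        band = [c for c in pop if 85.0 <= c[1] <= 95.0]
        sample = rng.sample(band, n_sample)
    elif scheme == "extremes":
        by_weight = sorted(pop, key=lambda c: c[1])
        half = n_sample // 2
        sample = by_weight[:half] + by_weight[-half:]
    else:
        sample = rng.sample(pop, n_sample)

    a, b = fit_line([c[0] for c in sample], [c[2] for c in sample])
    # Apparent accuracy: residual standard deviation within the sample.
    rss = sum((c[2] - a - b * c[0]) ** 2 for c in sample)
    apparent = math.sqrt(rss / (n_sample - 2))
    # Actual accuracy: RMSE of the same formula over the whole population.
    actual = math.sqrt(
        statistics.fmean([(c[2] - a - b * c[0]) ** 2 for c in pop]))
    return apparent, actual


if __name__ == "__main__":
    for scheme in ("random", "narrow", "extremes"):
        print(scheme, apparent_vs_actual_rmse(scheme))
```

With these hypothetical values, restricting the sample to a narrow weight band should make the formula look more accurate than it is in the population, whereas over-representing very light and very heavy carcasses should make it look less accurate. Which direction occurs in practice depends on the actual selection scheme and on how strongly the selection variable is related to the percentage lean.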

In this talk we discuss the performance of the potentially faulty sampling scheme
on the basis of results from computer simulation. Initially, simulated data are
based on actual and historical data from The Netherlands. The instrumental
measurements are a fat depth and a muscle depth measurement obtained with the
Hennessy Grading Probe. The additional selection variable is carcass weight. We study
other data configurations as well. These are relevant, for instance, to formulae
for two measurement instruments derived from the same sample of dissected
carcasses. In that case selection may be based on instrumental measurements that
appear in one of the formulae but not in the other. For the latter instrument,
results from traditional regression will be in doubt, and our simulation results
indicate how the quality of prediction may be affected.