INFLUENCE OF THE SELECTION METHOD ON THE STANDARD ERROR OF THE REGRESSION PARAMETERS: PRACTICAL POINT OF VIEW

Maria Font i Furnols

IRTA-Meat Technology Centre. Granja Camps i Armet. E-17121 Monells (Girona)

 

Summary

Lean meat percentage (LMP) is a parameter used for pig carcass classification in the European Union. It is obtained from measurements taken with different kinds of automatic probes that have been calibrated beforehand. To obtain the calibration equation, the European Commission has laid down rules (Commission Regulation (EEC) No 2967/85 and its amendment No 3127/94) that establish the number of carcasses to be dissected (120) and require them to be representative of the national or regional pig meat production. In addition, the root mean square deviation of the errors (RMSE), measured about zero, has to be less than 2.5. However, the regulation does not specify the selection method, and selection can be done in different ways. The aim of this paper was to study, by means of resampling methods, the influence of the selection method on the regression parameters.

The lean meat percentage (LMP) was obtained by dissection and predicted by means of a regression in which the independent variables were the fat thickness (g34fom) and the lean depth (m34fom), both measured with the Fat-o-Meter between the 3rd and the 4th last ribs, at 60 mm from the mid-line. The methods studied were random selection and three other selection methods that differ in the percentage of samples taken from the groups with low (below m-s), medium and high (above m+s) g34fom, with m the mean and s the standard deviation of g34fom: 16%-68%-16%, which corresponds to a normal distribution (se166816), 33%-33%-33% (se333333) and 40%-20%-40% (se402040). The data were divided into two sets, each following a normal distribution. One set was used for the calibration equation; from it, different subsamples (n=120) were selected according to the four selection methods by means of the SURVEYSELECT procedure of the SAS software. The parameters of the regression equation were calculated from these subsamples. The validation set was used to obtain the root mean squared prediction error (RMSPE). This procedure was repeated 20 times.
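The resampling procedure described above can be sketched in Python as a rough illustration. This is not the SAS code used in the study: the data are synthetic, the coefficients, spreads and set sizes (600 calibration and 300 validation carcasses) are hypothetical, and the helper names `simulate`, `select` and `fit_ols` are introduced here only for the example. The RMSE is taken as the square root of the mean squared residual, which is an assumption about how "measured about zero" is computed.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate(n):
    """Synthetic stand-in data; all coefficients and spreads are hypothetical."""
    fat = rng.normal(16.0, 4.0, n)    # g34fom, fat thickness (mm)
    lean = rng.normal(55.0, 6.0, n)   # m34fom, lean depth (mm)
    lmp = 60.0 - 0.8 * fat + 0.15 * lean + rng.normal(0.0, 2.0, n)
    return np.column_stack([fat, lean]), lmp

X_cal, y_cal = simulate(600)   # calibration set (size is an assumption)
X_val, y_val = simulate(300)   # validation set (size is an assumption)

# Shares of the sample taken from the low / medium / high g34fom groups.
SCHEMES = {
    "random": None,
    "se166816": (0.16, 0.68, 0.16),
    "se333333": (1 / 3, 1 / 3, 1 / 3),
    "se402040": (0.40, 0.20, 0.40),
}

def select(fat, shares, n=120):
    """Return n indices drawn without replacement under one selection scheme."""
    if shares is None:
        return rng.choice(fat.size, size=n, replace=False)
    m, s = fat.mean(), fat.std(ddof=1)
    groups = [
        np.flatnonzero(fat < m - s),                     # low
        np.flatnonzero((fat >= m - s) & (fat <= m + s)), # medium
        np.flatnonzero(fat > m + s),                     # high
    ]
    counts = np.rint(np.array(shares) * n).astype(int)
    counts[1] += n - counts.sum()  # rounding remainder goes to the medium group
    return np.concatenate(
        [rng.choice(g, size=c, replace=False) for g, c in zip(groups, counts)]
    )

def fit_ols(X, y):
    """Ordinary least squares with intercept: coefficients, their s.e., and RMSE."""
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    sigma2 = resid @ resid / (len(y) - A.shape[1])
    se = np.sqrt(sigma2 * np.diag(np.linalg.inv(A.T @ A)))
    rmse = np.sqrt(np.mean(resid ** 2))
    return beta, se, rmse

se_fat = {name: [] for name in SCHEMES}   # s.e. of the g34fom slope
rmspe = {name: [] for name in SCHEMES}    # prediction error on the validation set

for _ in range(20):                       # 20 repetitions, as in the text
    for name, shares in SCHEMES.items():
        idx = select(X_cal[:, 0], shares)
        beta, se, _rmse = fit_ols(X_cal[idx], y_cal[idx])
        pred = beta[0] + X_val @ beta[1:]
        se_fat[name].append(se[1])
        rmspe[name].append(np.sqrt(np.mean((y_val - pred) ** 2)))

mean_se = {name: float(np.mean(v)) for name, v in se_fat.items()}
mean_rmspe = {name: float(np.mean(v)) for name, v in rmspe.items()}
for name in SCHEMES:
    print(f"{name:9s} mean s.e.(g34fom) = {mean_se[name]:.4f}  mean RMSPE = {mean_rmspe[name]:.3f}")
```

In this sketch the group boundaries are computed from the calibration set itself, and the subsample is drawn without replacement within each group; both are design choices of the example rather than details stated in the text.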

Only minor differences were detected in the distribution of the RMSE among the sampling methods. The main differences appeared in the distribution of the standard error (s.e.) of the regression parameters, the s.e. of g34fom being the most discriminant. Sampling with weighted extremes (se333333 and se402040) gave distributions with lower values, se402040 being the lowest. The distribution of the RMSPE also showed only minor differences among the selection methods. When only the 5% most extreme carcasses were taken into account, the RMSPE of se333333 and se402040 tended to show fewer high values.
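The lower standard errors under weighted extremes agree with the standard least-squares expression for the s.e. of a slope in a two-predictor regression. The notation below is introduced here for illustration and is not taken from the study: b_1 is the slope of g34fom, x_{i1} the g34fom value of carcass i, r_{12} the correlation between g34fom and m34fom in the subsample, and \hat{\sigma}^2 the residual variance.

```latex
\operatorname{s.e.}(b_1) =
  \sqrt{\frac{\hat{\sigma}^{2}}
             {\left(1 - r_{12}^{2}\right)\,\sum_{i=1}^{n}\left(x_{i1} - \bar{x}_{1}\right)^{2}}}
```

With n fixed at 120, selecting more carcasses from the low and high g34fom groups increases the sum of squares in the denominator, so the s.e. of the slope falls, provided the residual variance and r_{12} stay roughly unchanged.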

The influence of the sampling method on the standard error of the slopes of the regression variables has been demonstrated from a practical point of view. The expected influence of the sampling method on the RMSPE, when only the 5% most extreme carcasses were taken into account, could not be demonstrated, probably because the most extreme carcasses in the data set were not extreme enough.