A neural network based model for predicting musculoskeletal disorder risk associated with load lifting tasks

Main author:

Asensio-Cuesta, Sabina
Engineering Projects Department / Technical University of Valencia / Camino de Vera s/n / 46022 Valencia, Spain
+34 96 387 70 00 Ext. 85689 / sasensio@dpi.upv.es

Co-author:

Diego-Mas, José A.
Engineering Projects Department / Technical University of Valencia / Camino de Vera s/n / 46022 Valencia, Spain
+34 96 387 70 00 Ext. 85683 / jodiemas@dpi.upv.es

ABSTRACT

This work presents a neural network based model for predicting the risk of musculoskeletal disorders associated with lifting tasks. The model proves useful to the ergonomist as a diagnosis tool, and allows lifting tasks to be classified into two categories (high and low) on the basis of musculoskeletal low-back disorder risk. The model offers a higher proportion of accurate classifications than previous models, with an 83.9% rate of accuracy. The procedure used to build the model employs genetic algorithms to develop the neural networks, thus achieving a high degree of fit and generalization.

Keywords

Lifting tasks, neural networks, low back disorders

INTRODUCTION

Taking the results obtained in epidemiological studies as a starting point, it is possible to develop mathematical models for predicting the risk of low back disorders (LBDs) associated with lifting tasks. These mathematical models link a series of variables representing the risk factors in a specific occupation with the probability of its workers experiencing LBDs. A sufficiently accurate model can be used by ergonomists as a quick risk diagnosis tool based on some of the task variables.

However, developing an accurate model poses certain difficulties. For instance, it is first necessary to establish which risk factors are involved in the occurrence of LBDs, as well as the best way to measure their frequency of occurrence. Although several studies tackle these issues [1][2][3][4][5][6], there seems to be no agreement in the literature either on the relevant risk factors or on their possible interactions [7][8][9][10]. On the other hand, obtaining a predictive model requires substantial fieldwork in order to gather a sufficient number of cases with the right level of accuracy. In general, a large number of cases is required, and the variables need to be measured in a convenient and precise way for each case.

The models most commonly employed for LBD risk prediction tend to be based on statistical techniques such as logistic regression or generalized additive models [10][11][12][13][14]. Others, however, are based on artificial neural networks (ANNs) [15][16][17][18][19]. In these studies, ANN-based models seem to have a higher predictive capacity than those based on traditional statistical techniques, although this claim cannot be generalized [20]. The validity of the resulting model depends more on the risk factors considered and on the quantity and quality of the data obtained from the fieldwork than on the procedure employed to relate risk factors to the real existing risk.

In order to obtain an ANN suitable for classifying jobs according to LBD risk, the experimental variables collected in the fieldwork which are significant enough to be used as inputs for the model need to be determined. Furthermore, the most suitable architecture for the network has to be determined, i.e. how many layers of neurons it contains, the number of neurons per layer, the neuron activation function, the learning rule to be used, the initial synaptic weights and other parameters which largely condition the validity of the model.

Zurada et al. (1997) [17], Chen et al. (2000) [18] and Chen et al. (2004) [19] developed models for LBD prediction. These authors deliberately used ANNs with few neurons and a single hidden layer to avoid overfitting. Overfitting (also known as overtraining) occurs when a model captures the statistical noise in the data rather than the underlying signal [20], i.e. the model memorises the correct responses to each pattern rather than learning the relationships between inputs and outputs. When this happens, the model produces very accurate results with the data employed to train the network, but it is not able to generalise to new cases. This phenomenon generally occurs when the number of internal connections in the network is excessively large compared with the size of the available training data set. By reducing the number of neurons, the number of connections between them is also reduced and therefore there is less likelihood of overfitting. Conversely, reducing the size of the network reduces its capacity to establish the relationships between the model's inputs and outputs. On the other hand, a network with a large number of processing elements has greater tolerance to faults, such as input data noise or the disconnection of some of the network links. There is currently no systematic procedure for identifying the most suitable features of a network for the problem to be solved. Although there are certain generic rules on how the features of a network affect its behaviour, choosing a specific architecture rather than another can be hard to justify.

The aim of this paper is to obtain an ANN-based model able to classify industrial jobs according to the risk of low back disorders, improving on the results of previous works. Early stopping will be used to prevent overfitting. This procedure consists of using a reduced data set (validation set) to calculate the model error periodically during network training. In order to establish the ANN parameters, a Genetic Algorithm (GA) will be used. These algorithms perform a stochastic guided search based on the evolution of a set of structures, selecting the fittest ones in each cycle, as occurs in the evolution of natural species. More information about GAs can be found in [28][29][30][31][32].

For the development of this model, the data obtained from the fieldwork in [10] were employed, as described in [17]. The ANN obtained in this work presents a complex topology, with several hidden layers, a high number of neurons per layer and different activation functions in each neuron. This model affords better results than previous works.

The next section describes the operation and development principles of ANNs and GAs. The third section reviews the application of ANNs to the low back disorder problem in previous works. The fourth section describes the procedure used and the model developed in this study. Finally, the results are presented and discussed.

OVERVIEW OF ARTIFICIAL NEURAL NETWORKS AND GENETIC ALGORITHMS

Artificial neural networks

An ANN is a mathematical model that represents a distributed adaptive system built by means of multiple interconnected processing elements, just as real neural networks are. ANNs are used in many fields of research (psychology, robotics, biology or computer science, to name a few) [21][22] due to their ability to adapt, learn(1), generalise, organise or cluster data. In Feedforward Neural Networks (FNNs), the processing elements (neurons) are distributed in several layers (Figure 1). The intermediate layers are known as the hidden layers, while the first and the last layers are known as the input and output layers, respectively. In general terms, each neuron receives signals processed and transmitted by neurons in the preceding layer and in turn processes and transmits them on to the next layer. The number of layers and the way in which the neurons are connected determine the architecture of the network.


Figure 1: Example of the structure of a neural network with two hidden layers, n inputs and m outputs.

The input signals (i1, i2, …, in) are the values of the variables representing an instance of the phenomenon to be modelled. They are collected by the input layer, which transmits them through links to the neurons in the first hidden layer. The signals are scaled in each link according to an adjustable parameter, called the weight, associated with each connection between neurons. Usually, the initial weight of each link is set at random. Each neuron in the hidden layer collects the signals from its connections, adds them up and produces an output that is a function of the sum. The most commonly used functions are sigmoids, hyperbolic tangents and linear versions of these. The signals traverse the network from the input layer to the output layer, where the network response to the inputs is collected (o1, o2, …, om).
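To make the signal flow just described concrete, the following minimal sketch (in Python with NumPy; the layer sizes and random weights are purely illustrative and not those of the model presented later) propagates an input vector through a small FNN with logistic sigmoid activations:

```python
import numpy as np

def sigmoid(x):
    """Continuous unipolar (logistic sigmoid) activation function."""
    return 1.0 / (1.0 + np.exp(-x))

def forward(inputs, layers):
    """Propagate an input vector through an FNN.

    `layers` is a list of (W, b) pairs. Each link scales the incoming
    signal by its weight; each neuron sums its scaled signals (plus a
    bias) and applies the activation function to the sum.
    """
    signal = inputs
    for W, b in layers:
        signal = sigmoid(W @ signal + b)
    return signal

# Example: 5 inputs -> 4 hidden neurons -> 2 outputs, random initial weights.
rng = np.random.default_rng(0)
layers = [(rng.normal(size=(4, 5)), np.zeros(4)),
          (rng.normal(size=(2, 4)), np.zeros(2))]
outputs = forward(rng.normal(size=5), layers)
```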

Supervised learning networks are able to learn the relationships between the inputs and outputs through the repeated presentation of input data and the values of the corresponding outputs. Once trained, the network can generalise these relationships to new cases. The training process consists of presenting the network with a sufficient number of input cases and the desired output values. The output obtained by the network in each case is compared against the desired output, and the network error is calculated. Then, the weights of the neuron connections are modified according to the selected training algorithm in order to minimise this error. This process is repeated until a previously established criterion is reached, for example, when the error value falls below a threshold or stops decreasing. Although there are different training algorithms applicable to different types of networks, the one most commonly used to train FNNs is Back-Propagation (BP) [23]. Basically, the BP algorithm works as follows: once the network error for a given input has been calculated, the weights of the connections between the neurons in the last hidden layer and the output layer are modified according to the extent to which these connections have contributed to the error; the correction is then propagated backwards, layer by layer, towards the input layer.

BP is a gradient descent procedure which, ideally, requires infinitesimal variations in the connection weights. In practice, a parameter called the learning rate is used to determine the magnitude of the weight variations. The learning rate must be chosen carefully; an excessively small rate lengthens the time required to achieve convergence, while an excessively high rate causes the algorithm to oscillate. To prevent oscillations another parameter is used, the momentum, which gives the algorithm a certain amount of inertia by making each change in the weights depend on the previous one.

(1) Neural network learning or training is an adaptive procedure in which the weights of the connections between neurons are incrementally modified so as to improve the network performance until a specified criterion is reached.
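In this scheme the weight change at step t can be written as Δw(t) = -η ∂E/∂w + α Δw(t-1), where η is the learning rate and α the momentum. A minimal sketch of the update (our notation, not necessarily the exact formulation of the cited works):

```python
def update_weights(W, grad, prev_delta, eta=0.2, alpha=0.3):
    """Gradient-descent step with learning rate and momentum.

    eta   -- learning rate: sets the magnitude of the weight change.
    alpha -- momentum: adds a fraction of the previous change, giving
             the algorithm inertia that damps oscillations.
    Returns the updated weights and the change that was applied.
    """
    delta = -eta * grad + alpha * prev_delta
    return W + delta, delta
```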

Genetic algorithms

As mentioned above, genetic algorithms perform a stochastic guided search based on the evolution of a set of structures, as occurs in the evolution of natural species. The starting point is a set of problem solutions called individuals. This first set is randomly generated and is called the initial population. In the present work each individual is an ANN, and it is coded by a finite-length string called a chromosome.

Each individual is evaluated using an evaluation function to determine its suitability for the requirements of the problem. The population undergoes several transformations that yield a new population (a new generation). These transformations are guided by genetic operators, the most common being selection, crossover and mutation, which combine or modify the chromosomes representing the individuals.

Crossover and mutation operators are applied to create a new generation of individuals that inherit the best characteristics of their predecessors. For this purpose, the individuals that will participate in each of the genetic operators, and those that will survive and pass on to the following generation, are previously chosen by means of the selection operator. The process is repeated with the new set of individuals, making them evolve towards better solutions to the problem, until a certain number of iterations is reached or until a certain number of iterations has elapsed without finding a new best solution.
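The overall cycle can be summarised in the following sketch (the function names and generic structure are ours; the concrete operators used in this work are described in the Material and Methods section):

```python
import random

def genetic_algorithm(init_population, evaluate, select, crossover, mutate,
                      max_generations=30, pc=0.5, pm=0.01):
    """Generic GA loop: evaluate, select, cross over, mutate, repeat."""
    population = init_population()
    for _ in range(max_generations):
        fitness = [evaluate(ind) for ind in population]
        parents = select(population, fitness)          # e.g. roulette wheel
        offspring = []
        for mother, father in zip(parents[::2], parents[1::2]):
            if random.random() < pc:                   # crossover probability
                offspring.extend(crossover(mother, father))
            else:
                offspring.extend([mother, father])
        population = [mutate(ind) if random.random() < pm else ind
                      for ind in offspring]            # mutation probability
    return max(population, key=evaluate)
```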

PREVIOUS WORKS USING NEURAL NETWORKS TO PREDICT LOW BACK DISORDERS RISK

Marras et al. (1993) [10] developed a study which aimed to relate trunk movements and other aspects of the workplace in repetitive tasks with the risk of suffering LBDs. The study included 403 industrial lifting jobs which were classified into three groups: low-risk (124 cases), medium-risk (168 cases) and high-risk (111 cases). Risk of suffering LBDs was analysed to classify the jobs, using medical records from companies, injury logs and turnover rates. This model has been prospectively validated by reassessing jobs used in the original sample and comparing the reported injuries obtained from the OSHA 200 with the updated assessment values [27].

Zurada et al. (1997) [17] used the data from this fieldwork study to develop an FNN that was able to classify industrial jobs according to the risk of suffering LBDs. In order to train the networks, Zurada selected 74 jobs from Marras’ low-risk group and 74 jobs from the high-risk group at random. The remaining jobs from each group, 50 low-risk and 37 high-risk jobs, were used to test the networks. Medium-risk jobs were excluded from the study.

The data contained five independent variables representing risk factors for developing LBDs: lift rate in number of lifts per hour (LIFTR), peak twist velocity average (PTVAVG), peak moment (PMOMENT), peak sagittal angle (PSUB), and peak lateral velocity maximum (PLVMAX). The only dependent variable (RISK of LBDs) was a categorical variable with two possible values: 'low-risk' and 'high-risk'. The procedure used to determine the most appropriate network architecture was to carry out an unspecified number of tests with different topological configurations, all with a single hidden layer in which the number of neurons varied between 8 and 20. In all cases the number of input variables was kept the same, as were the neuron activation function (continuous unipolar), the steepness coefficient and the training constant.

Subsequently, [18] used the same experimental data in a heuristic procedure to determine which of the five variables in the jobs analysed in [10] were genuinely significant for the neural model, and which ones could be eliminated. In addition, the procedure established the appropriate number of neurons for the single hidden layer in the model. The same authors in [19] presented a new procedure for obtaining similar results with a smaller or incomplete data set.

The aforementioned authors measured the degree of generalisation of their models by calculating the proportion of the jobs in the test set which were correctly classified (PCC). Calling x0 the number of low-risk jobs correctly classified by the model, x1 the number of high-risk jobs correctly classified, and n0 and n1 the number of low and high-risk jobs respectively, the PCC is calculated as (x0 + x1)/(n0 + n1). The authors show the results through a confusion matrix where 0 represents the group of low-risk jobs and 1 the group of high-risk jobs (Table 1; a small computational sketch follows the table). PCC0 and PCC1 are defined as x0/n0 and x1/n1, representing, respectively, the proportion of correctly classified low and high-risk cases. The best results obtained for the test data in [17], [18] and [19] are shown in Table 3.

                            Real classification
                             0           1
Model classification   0     x0          n1 - x1
                       1     n0 - x0     x1
                             PCC0        PCC1        PCC

Table 1: Confusion matrix. PCC is the proportion of jobs which were correctly classified (PCC0 for the low-risk group, PCC1 for the high-risk group).
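As a computational companion to Table 1, the sketch below computes PCC0, PCC1 and PCC from the four counts; the example values are the test results of the approach presented later in this paper (last column of Table 3):

```python
def pcc(x0, x1, n0, n1):
    """Proportions of correctly classified jobs as defined in Table 1."""
    return x0 / n0, x1 / n1, (x0 + x1) / (n0 + n1)

pcc0, pcc1, total = pcc(x0=39, x1=34, n0=50, n1=37)
# pcc0 = 0.780, pcc1 = 0.919, total = 0.839
```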

The network developed in [17] was an FNN trained using BP, with a single hidden layer of 10 neurons and an output layer of 2 neurons. A continuous unipolar activation function was used for every neuron. The jobs were classified on the basis of the highest output value from the output layer in each case. The network was able to correctly classify 74.7% of the jobs in the test data set. In [18] and [19], the network which produced the best generalisation was an FNN with a four-neuron hidden layer that was able to correctly classify 79.3% of the jobs in the test data set. In this case, the authors established that the model provided better results when the LIFTR (lift rate in number of lifts per hour) variable was eliminated, so the network had one input fewer than that of [17].

In these studies, the number of processing elements in the ANNs is kept deliberately low in order to avoid overfitting. In accordance with Masters' rule [24], given the number of available training cases (148), the number of connections between neurons should not be more than 74. The best network obtained in [17] had 80 connections between neurons, a number slightly above this limit. In the case of [18] and [19], the number of connections in the best network was 30.

MATERIAL AND METHODS

ANNs development procedure

The development of an FNN-based model to solve a classification problem begins by determining the variables which, a priori, appear to have a bearing on the phenomenon to be modelled. The data collected must include the values of the independent variables and the desired result for each case. The data set is usually divided into two groups, the training set and the test set. The first set is used to train the neural network. The second set is used to validate that the network has been trained appropriately, and that it is able to generalise the relationships between inputs and outputs and provide the desired response in those cases which have not been used during the training process. A high number of cases is required for network training, and all the possible patterns of behaviour must be considered; otherwise, the neural network will not be able to determine some of the input-output relationships, thus failing to generalise them to the test data even if good results are obtained for the training data.

To avoid overfitting and obtain models with the highest possible degree of fit, regularisation procedures can be used, such as jitter (artificial noise deliberately added to the inputs during training), weight decay (which adds a penalty term to the error function) or early stopping. Early stopping is the use of a reduced data set (validation set) to calculate the model error periodically during training. These validation data are not used to train the network, but rather to determine the moment when the model stops learning and starts memorising the relationships between training patterns and their resulting outputs. Periodically, during the training stage, the weights of the connections between neurons are fixed and the validation set data are introduced, calculating the error of the model. In a typical network training process, the training error decreases continuously, whilst the validation error decreases at first, but then increases when the model's degree of generalisation starts to diminish (Figure 2).


Figure 2: Evolution of training error and validation error during training.

Since network weights do not change during validation, the network does not learn from such cases, but only from the training set ones. When both training and validation error curves go down, the network is learning input-output relationships and so is learning to generalise. But when the validation error curve starts going up (early stopping point), the network is memorising, by means of the weights of the connections between neurons, the correct responses for the training set. This information cannot be generalised to the validation set cases. So, although the training error could continue decreasing, the training process is stopped at the point where the validation error starts increasing, thus avoiding overfitting.
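The early stopping logic just described can be sketched as follows (the network object, its copy_weights/set_weights helpers and the checking interval are assumptions for illustration):

```python
def train_with_early_stopping(network, train_step, validation_error,
                              max_iterations=5000, check_every=50):
    """Stop training when the validation error starts to increase."""
    best_error = float("inf")
    best_weights = network.copy_weights()
    for iteration in range(max_iterations):
        train_step(network)                    # one BP pass over the training set
        if iteration % check_every == 0:
            err = validation_error(network)    # weights are not modified here
            if err < best_error:
                best_error, best_weights = err, network.copy_weights()
            else:
                break                          # validation error rising: stop
    network.set_weights(best_weights)          # keep the early stopping point
    return network
```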

The usual procedure is to divide the available data into three sets: the training set, used to train the network; the validation set, used to determine the early stopping point; and the test set, used to validate the degree of generalisation of the trained model. To allow comparison with previous studies, the fieldwork data of [10] were used as described in [17]. This latter work presents tables with the data of the 148 jobs (74 low-risk and 74 high-risk) used to train the model, and the 87 jobs used for testing (50 low-risk and 37 high-risk). In order to apply early stopping, it was necessary to divide the original training set into two groups, one used for training and the other for validation. The test set was not modified. There is no fixed rule for determining the appropriate number of cases to make up the validation set, although a study in this respect can be found in [25]. In this case, 30 of the 148 cases of the training set used in [17] were chosen at random (15 low-risk and 15 high-risk) to make up the validation set, representing approximately 20% of the available training data. Table 2 shows the distribution of cases in the different sets.

             Training set    Validation set    Test set    Total
Low-risk          54               15              50        124
High-risk         54               15              37        111
Total            108               30              87        235

Table 2: Distribution of cases in the different data sets.

Once the training, test and validation sets have been established, the architecture of the FNN must be determined. The size of the network (number of hidden layers and neurons per layer) affects the model's capacity to generalise. Establishing the appropriate number of neurons is a major problem for which there is no systematic procedure, although there are some simple rules, such as Masters' rule [24], which recommends that the minimum number of hidden neurons be the integer part of (i × o)^(1/2), where i is the number of network inputs and o is the number of outputs (for example, with i = 5 inputs and o = 1 output this gives 2 hidden neurons as a minimum). The common procedure is to start by trying networks with a small number of neurons, and then increase the size of the network until an appropriate result is obtained. The number of independent and dependent variables determines the number of neurons in the input and output layers. In the hidden layers, the appropriate number of neurons will depend on the number of independent and dependent variables in the problem, the quantity and quality of the training data available, the complexity of the relationships between the input and output patterns, the type of neuron activation function and the training algorithm used [20]. A high number of processing elements increases convergence speed and reduces error during training, but the error can be much higher during the test stage due to overfitting. On the other hand, an excessively small number of processing elements can cause greater training error and poor test results (underfitting), particularly when modelling complex functions.

The appropriate number of hidden layers for the network largely depends on the function to be fitted and on the activation functions of the neurons (used to transform the activation level of a neuron into an output signal). For complex target functions, with several hills or valleys, it is useful to employ several hidden layers in order to achieve a closer approximation [20]. The neurons in the second hidden layer allow the network to adjust each hill or valley more accurately. A network with several layers can fit a function more accurately than a network with a single hidden layer, using a smaller number of links, depending on the number of neurons in each layer. In contrast, a network with several hidden layers can more easily get trapped in local minima, thus requiring a global optimisation procedure or several random initialisations. Generally speaking, an FNN with two hidden layers is a universal approximator that can generalise any input-output relationship [22][26].

The input variables of the model that proved to be relevant, the number of network layers, the number of neurons per layer and the activation function of each neuron, among other parameters, were determined by means of a GA.

A genetic algorithm to obtain the network architecture

The algorithm starts by generating an initial population of individuals that represent different ANNs. Each individual is codified through a binary vector (chromosome). The number of network layers, the number of neurons per layer, the input variables to be employed, the type of activation function for each neuron, and the learning rate and momentum for each layer of neurons are encoded in each chromosome.
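A possible decoding of such a chromosome is sketched below (the field widths, ordering and ranges are our assumptions, chosen to be consistent with the parameter limits reported in the results section):

```python
ACTIVATIONS = ["linear", "logistic sigmoid", "hyperbolic tangent"]

def bits_to_int(bits):
    """Interpret a list of 0/1 values as an unsigned binary number."""
    return int("".join(str(b) for b in bits), 2)

def decode(chromosome):
    """Read network parameters from a binary chromosome (illustrative)."""
    n_hidden_layers = 1 + bits_to_int(chromosome[0:1])     # 1 or 2 hidden layers
    used_inputs = [bool(b) for b in chromosome[1:6]]       # which of the 5 variables
    neurons = [1 + bits_to_int(chromosome[6 + 4 * i: 10 + 4 * i])
               for i in range(n_hidden_layers)]            # 1..16 neurons per layer
    # The remaining bits would encode the activation function of each
    # neuron (an index into ACTIVATIONS) and the learning rate and
    # momentum of each layer, quantised to their allowed ranges.
    return n_hidden_layers, used_inputs, neurons
```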

The population size was set at 30 individuals. Each of the ANNs that make up the initial population is trained using BP as the training algorithm, and the PCC reached by each network on the test set data is used as the assessment function. When all the ANNs have been evaluated, a selection is made of those that will survive and pass on to the next generation or that will be used as reproducers. For this process, roulette wheel selection [32] is used, in which the probability of an individual being selected is related to the value obtained in its evaluation, such that the individuals who offer the best results have a greater chance of being selected.
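Roulette wheel selection can be sketched as follows (a standard formulation [32]; the function names are ours, and in Python random.choices with weights behaves equivalently):

```python
import random

def roulette_wheel(population, fitness, k):
    """Select k individuals with probability proportional to fitness."""
    total = sum(fitness)
    picks = []
    for _ in range(k):
        r, acc = random.uniform(0, total), 0.0
        for individual, f in zip(population, fitness):
            acc += f
            if acc >= r:               # the "wheel" stops on this individual
                picks.append(individual)
                break
    return picks
```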

Pairs of individuals are chosen from those selected in the previous stage. Reproduction is performed by means of a crossover, in which two new individuals are generated from the combination of the solutions represented by the original individuals. The new individuals replace their parents in the population. The parameter pc (crossover probability) determines the number of individuals in the next generation that will be created by crossover. A typical value for this parameter varies between 0.5 and 0.9 [33]. The crossover point is chosen at random. The descendants are generated by combining the chromosome segments that lie to the left and to the right of the crossover point in each of the parents.

The mutation operator is applied to individuals selected at random. The number of individuals that will mutate is determined by the parameter pm (mutation probability). The process consists of selecting two points in a given chromosome string and exchanging the values at those points.
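Both operators can be sketched as follows for chromosomes represented as lists of bits (a minimal illustration of the descriptions above):

```python
import random

def single_point_crossover(mother, father):
    """Combine the segments to the left and right of a random point."""
    point = random.randrange(1, len(mother))
    return (mother[:point] + father[point:],
            father[:point] + mother[point:])

def swap_mutation(chromosome):
    """Exchange the values at two randomly chosen positions."""
    mutated = list(chromosome)
    i, j = random.sample(range(len(mutated)), 2)
    mutated[i], mutated[j] = mutated[j], mutated[i]
    return mutated
```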

The process was repeated with each new set of individuals until a certain number of iterations was reached.

RESULTS AND DISCUSSION

The algorithm was executed for 30 iterations. This number of iterations turned out to be sufficient, considering that the best ANN was obtained in generation 18. The total execution time of the algorithm was 26 minutes on a PC with a 1.83 GHz processor and 1 GB of RAM. The minimum number of training passes for each network was 500, and the cutoff was 5000. The limit on hidden neurons in each layer was 16, and the maximum number of hidden layers was 2. The activation functions employed were linear, hyperbolic tangent and logistic sigmoid. The crossover probability was fixed at 0.5 and the mutation probability at 0.01. The number of input variables could range between a minimum of 2 and a maximum of 5. The learning rate could range between 0.1 and 0.4 for the hidden layers, and between 0.1 and 0.2 for the output layer. The momentum could range between 0.1 and 0.3 for the hidden layers and between 0.1 and 0.2 for the output layer.

The best network found (Figure 3) was obtained in generation 18, after 16 minutes of GA execution. It was a fast back-propagation network with five inputs, two hidden layers, 7 neurons in the first hidden layer and 3 in the second. In the first hidden layer, the transfer functions were logistic sigmoids for 3 neurons and linear functions for the other 4. In the second hidden layer, two neurons employed hyperbolic tangents and one employed a linear function. The output layer neuron employed a linear function.
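As an illustration of the per-neuron activation functions of this network, the sketch below (weights omitted; the assignment of functions to particular neurons is arbitrary, only the counts follow the description above) shows how a layer whose neurons use different transfer functions can be evaluated:

```python
import numpy as np

linear = lambda x: x
logsig = lambda x: 1.0 / (1.0 + np.exp(-x))

# Activation functions of the best network found by the GA:
hidden1 = [logsig] * 3 + [linear] * 4      # 7 neurons in hidden layer 1
hidden2 = [np.tanh] * 2 + [linear]         # 3 neurons in hidden layer 2
output = [linear]                          # 1 output neuron (RISK of LBDs)

def layer_forward(signal, W, activations):
    """Each neuron applies its own activation function to its weighted sum."""
    sums = W @ signal
    return np.array([f(s) for f, s in zip(activations, sums)])
```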


Figure 3: The ANN with the highest percentage of correct classifications found by the GA (inputs: LIFTR, PTVAVG, PMOMENT, PSUB and PLVMAX; output: RISK of LBDs).

The network attained a greater degree of generalisation than those described in similar previous studies. The PCC on the test data was around 6% better for the low-risk jobs and 13.5% better for the high-risk jobs than the values obtained in [17]. With respect to [18] and [19], the improvement was around 2% and 8.1% respectively (Table 3). As the authors of these works point out, it is difficult to obtain a greater level of generalisation from the data of [10], given the complex nature of LBD etiology, the fact that some factors were not taken into consideration in the study, and the possibility of wrong classifications in the fieldwork. Table 3 also shows the PCC obtained when applying the logistic regression of [12] to the test data set. The predictive capacity of the ANN-based models is greater than that of the model based on logistic regression, which correctly classifies 88% of the low-risk jobs but only 51% of the high-risk ones. This last type of misclassification error can have a high impact on workers' health.

                      Marras et al.    Zurada et al.    Chen et al.      Current
                      (1992)           (1997)           (2000, 2004)     approach

                      Real class.      Real class.      Real class.      Real class.
                       0       1        0       1        0       1        0       1
Model            0    44      18       36       8       38       6       39       3
classification   1     6      19       14      29       12      31       11      34

PCC0                  0.880            0.720            0.760            0.780
PCC1                  0.514            0.784            0.838            0.919
PCC                   0.724            0.747            0.793            0.839

Table 3: Comparison of the results of previous works and the current approach.

In [18], a backward elimination method was used to remove from the model the independent variables which had little bearing on job classification. Similarly, in [19] a procedure of forward selection of input variables was used. In both cases, the best results were obtained when the LIFTR (lift rate in number of lifts per hour) variable was removed from the study. In the present work, the significant input variables were established by means of the GA. 86 of the 100 best ANNs found by the GA used all the independent variables presented in [10]. Ten ANNs used 4 input variables (the peak twist velocity average was eliminated). Lastly, 4 ANNs used 3 independent variables, with the peak twist velocity average and the peak sagittal angle removed. The ANNs using all 5 independent variables achieved higher PCCs than those using fewer inputs.

A sensitivity analysis was carried out with the best network using all five variables in order to identify the effect of each of the network inputs on the network output. The mean value of each variable over the training set cases was calculated, the weights of the links in the trained network were fixed, and these mean values were used as inputs. A mean-centred random perturbation was applied to the value of each input variable whilst keeping the rest constant, and the output variation was measured. The result of the analysis, expressed as the percentage effect of each input on the risk associated with jobs, was: LIFTR 19.23%, PTVAVG 5.18%, PMOMENT 48.75%, PSUB 10.21%, and PLVMAX 16.63%. From these results it can be concluded that the peak moment is the variable with the greatest bearing on the risk classification of LBDs in the model obtained, and that lift rate and peak lateral velocity also have considerable bearing. In contrast to what is proposed in previous studies, the bearing of lift rate is significant enough for it to remain in the model. The peak twist velocity average appears to have a very slight bearing on the risk classification, so its elimination could be considered. Nevertheless, this does not mean that it is not a factor in the appearance of LBDs.
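The analysis can be sketched as follows (the trained network is represented by a callable returning the risk output; the perturbation scale and the number of perturbations are assumptions, as the exact values are not reported here):

```python
import numpy as np

def sensitivity(network, X_train, n_perturbations=1000, scale=0.1):
    """Mean-centred perturbation analysis of each input variable.

    With the trained weights fixed, each input is perturbed around its
    training-set mean while the others are held constant, and the mean
    absolute output variation is accumulated per input.
    """
    means = X_train.mean(axis=0)
    base = network(means)
    effects = np.zeros(len(means))
    for i in range(len(means)):
        for _ in range(n_perturbations):
            x = means.copy()
            x[i] += np.random.normal(0.0, scale * X_train[:, i].std())
            effects[i] += abs(network(x) - base)
    return 100.0 * effects / effects.sum()   # percentage effect of each input
```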

A noteworthy aspect of the approach followed here for obtaining an ANN-based model is the use of different activation functions in each neuron. It is common to employ the same activation function in all the network neurons (this is the procedure followed in the studies reviewed above), or the same type of activation function in all the neurons of a layer. In this research, the activation functions were allowed to differ from neuron to neuron, which seems to have yielded a better fit and makes the model somewhat more general in its application.

Although a high degree of generalisation of the relationships between the input variables and the corresponding risk classification has been achieved, the model fails to classify 16.1% of the cases (mean percentage over the test set). There may be a number of reasons for the incorrect classifications, and they are probably linked to the data used to train the network. Zurada et al. [17] attribute the network's job classification errors to fieldwork errors made in [10], mainly due to incomplete or erroneous company LBD records, or to unrecorded psychological factors which can result in the origin of certain injuries being wrongly recorded. Chen et al. [19] state that, leaving aside possible fieldwork errors, the use of BP algorithms for network training could be the cause of classification errors. However, the results of this study suggest that the BP learning algorithm is not the cause of the incorrect classifications, and that these are more likely to be caused by fieldwork errors.

The best ANN found has been implemented in a web application (Figure 4) which can be freely accessed by interested readers on the ergonomics website ergonautas.com (http://www.ergonautas.com/herramientas/rnergo/rnergo2.php). By introducing the values of the five model variables corresponding to the job to be classified, the risk classification of the job is obtained as output. It must be pointed out that the model presented in this work was obtained using data from jobs with very specific characteristics, i.e. stable jobs with highly repetitive and consistent manual material handling demands, in which biomechanical indicators can provide valuable risk information [10]. Therefore, the model must only be used to classify jobs with the same characteristics. Other types of tasks will require other, more appropriate approaches. On the other hand, it must be remarked that individual worker characteristics are not considered in the model, and such individual factors could reduce a worker's tolerance to risk. It should also be noted that the neural network model was developed using data from a previous fieldwork study, so the validity of this model is tied to the validity of the assumptions and procedures of that fieldwork. In this sense, further work, such as a prospective validation like that in [27], would be required to validate the model obtained here as a diagnosis system.

Figure 4: Web application implementing the neural model.

CONCLUSIONS

This study has introduced a genetic algorithm based approach for obtaining neural network models to classify the risk of low back disorders (LBDs) presented by certain lifting jobs involving manual material handling. Unlike previous approaches, instead of limiting the size of the networks to avoid overfitting, complex networks have been developed and early stopping has been used as the mechanism to prevent overfitting. A genetic algorithm has been employed to establish the neural network architecture, the significant model variables, and the number of layers and neurons per layer. Moreover, different activation functions have been used in the network neurons.

The neural network model developed using this approach has shown greater capacity for establishing generalised relationships between the characteristics of the jobs and the risk of causing LBDs, and greater accuracy than previous ones. Although this neural network model outperforms previous models and could be a useful tool for the ergonomist, further studies are needed to validate its capacity to correctly classify the risk of LBDs.

ACKNOWLEDGEMENTS

We would like to thank the Universidad Politécnica de Valencia for its assistance to carry out this research through its Support Programme for Research and Development and its Funding Projects PAID-06-09/2902 and PAID-05-09/4215.

REFERENCES

1. Elders, L.A.M. & Burdorf, A. (2001). Interrelations of risk factors and low back pain in scaffolders. Occupational and Environmental Medicine 58, 597-603.
2. Magora, A. (1970). Investigation of the relation between low back pain and occupation. Industrial Medicine 39, 465-471.
3. National Institute for Occupational Safety and Health (1981). Work practices guide for manual lifting. NIOSH Technical Report No. 81-122. Cincinnati: National Institute for Occupational Safety and Health.
4. Neumann, W.P., Wells, R.P., Norman, R.W., Frank, J., Shannon, H., Kerr, M.S. & the OUBPS Working Group (2001). A posture and load sampling approach to determining low-back pain risk in occupational settings. International Journal of Industrial Ergonomics 27 (2), 65-77.
5. Neumann, W.P., Wells, R.P., Norman, R.W., Kerr, M.S., Frank, J., Shannon, H. & OUBPS Working Group (2001). Trunk posture: reliability, accuracy, and risk estimates for low back pain from a video based assessment method. International Journal of Industrial Ergonomics 28 (6), 355-365.
6. Okunribido, O.O., Magnusson, M. & Pope, M.M. (2006). Delivery drivers and low back pain: A study of the exposures to posture demands, manual materials handling and whole-body vibration. International Journal of Industrial Ergonomics 36 (3), 265-273.
7. Battie, M.C., Bigos, S.J., Fisher, L.D., Spengler, D.M., Hansson, T.H., Nachemson, A.L. & Wortley, M.D. (1990). The role of spinal flexibility in back pain complaints within industry: A prospective study. Spine 15, 768-773.
8. Bigos, S.J., Spengler, D.M., Martin, N.A., Zeh, J., Fisher, L., Nachemson, A.L. & Wang, M.H. (1986). Back injuries in industry: A prospective study. II. Injury factors. Spine 11, 246-251.
9. Dempsey, P.G. (1998). A critical review of biomechanical, epidemiological, physiological and psychophysical criteria for designing manual materials handling tasks. Ergonomics 41, 73-88.
10. Marras, W.S., Lavender, S.A., Leurgans, S.E., Rajulu, S.L., Allread, W.G., Fathallah, F.A. & Ferguson, S.A. (1993). The role of dynamic three dimensional trunk motion in occupationally related low back disorders. Spine 18, 617-628.
11. Marras, W.S., Lavender, S.A., Leurgans, S.E., Fathallah, F.A., Ferguson, S.A., Allread, W.G. & Rajulu, S.L. (1995). Biomechanical risk factors for occupationally related low back disorder risk. Ergonomics 38 (2), 377-410.
12. Marras, W.S. (1992). Toward an understanding of dynamic variables in ergonomics. Occupational Medicine: State of the Art Reviews 7, 655-677.
13. Dempsey, P.G., Ayoub, M.M. & Westfall, P.H. (1995). The NIOSH lifting equations: a closer look. In: Bittner, A.C., Champney, P.C. (Eds.), Advances in Industrial Ergonomics and Safety VII. Taylor and Francis, Bristol, PA, 705-712.
14. Dempsey, P.G. & Westfall, P.H. (1997). Developing explicit risk models for predicting low-back disability: a statistical perspective. International Journal of Industrial Ergonomics 19 (6), 483-497.
15. Karwowski, W., Zurada, J., Marras, W.S. & Gaddie, P. (1994). A prototype of the artificial neural network-based system for classification of industrial jobs with respect to risk of low back disorders. In: Aghazadeh, F. (Ed.), Proceedings of the Industrial Ergonomics & Safety Conference. Taylor & Francis, London, 19-22.
16. Nussbaum, M. & Chaffin, D.B. (1996). Evaluation of artificial neural network modelling to predict torso muscle activity. Ergonomics 39 (12), 1430-1444.
17. Zurada, J., Karwowski, W. & Marras, W.S. (1997). A neural network-based system for classification of industrial jobs with respect to risk of low back disorders due to workplace design. Applied Ergonomics 28 (1), 49-58.
18. Chen, C.L., Kaber, D.B. & Dempsey, P.G. (2000). A new approach to applying feedforward neural networks to the prediction of musculoskeletal disorder risk. Applied Ergonomics 31, 269-282.
19. Chen, C.L., Kaber, D.B. & Dempsey, P.G. (2004). Using feedforward neural networks and forward selection of input variables for an ergonomics data classification problem. Human Factors and Ergonomics in Manufacturing 14 (1), 31-49.
20. Sarle, W.S. (Ed.) (1997). Neural Network FAQ, part 1 of 7: Introduction. Periodic posting to the Usenet newsgroup comp.ai.neural-nets. URL: ftp://ftp.sas.com/pub/neural/FAQ.html. Accessed 06/06/2009.
21. Maren, A., Harston, C. & Pap, R. (1990). Handbook of Neural Computing Applications. Academic Press, San Diego.
22. Principe, J.C., Euliano, N.R. & Lefebvre, W.C. (2000). Neural and Adaptive Systems: Fundamentals Through Simulations. John Wiley & Sons, New York.
23. Rumelhart, D.E., Hinton, G.E. & Williams, R.J. (1986). Learning internal representations by error propagation. In: Rumelhart, D.E. & McClelland, J.L. (Eds.), Parallel Distributed Processing: Explorations in the Microstructure of Cognition, Volume 1, 318-362. The MIT Press, Cambridge.
24. Masters, T. (1993). Practical Neural Network Recipes in C++. Academic Press, San Diego, CA.
25. Sarle, W.S. (1995). Stopped training and other remedies for overfitting. Proceedings of the 27th Symposium on the Interface of Computing Science and Statistics, 352-360.
26. Sontag, E.D. (1992). Feedback stabilization using two-hidden-layer nets. IEEE Transactions on Neural Networks 3, 981-990.
27. Marras, W.S., Allread, W.G., Burr, D.L. & Fathallah, F.A. (2000). Prospective validation of a low-back disorder risk model and assessment of ergonomic interventions associated with manual materials handling tasks. Ergonomics 43, 1866-1886.
28. Chambers, L.D. (1995). Practical Handbook of Genetic Algorithms: New Frontiers. 1st ed. Boca Raton: CRC Press.
29. Chambers, L.D. (1998). Practical Handbook of Genetic Algorithms: Complex Coding Systems. 1st ed. Boca Raton: CRC Press.
30. Chambers, L.D. (2000). Practical Handbook of Genetic Algorithms: Applications. 2nd ed. Boca Raton: CRC Press.
31. Davis, L. (1991). Handbook of Genetic Algorithms. New York: Van Nostrand Reinhold.
32. Goldberg, D.E. (1989). Genetic Algorithms in Search, Optimization and Machine Learning. Massachusetts: Addison-Wesley.
33. Srinivas, M. & Patnaik, L.M. (1994). Genetic algorithms: a survey. Computer 27, 17-26.