Smart vision-based analysis and error deduction of human pose to reduce musculoskeletal disorders in construction

Mahesh Babu Purushothaman (Department of Built Environment Engineering, School of Future Environments, Auckland University of Technology, Auckland, New Zealand)
Kasun Moolika Gedara (ECMS, Auckland University of Technology, Auckland, New Zealand)

Smart and Sustainable Built Environment

ISSN: 2046-6099

Article publication date: 22 August 2023


Abstract

Purpose

This pragmatic research paper aims to unravel the smart vision-based method (SVBM), an AI program that uses computer vision (recorded and live videos from mobile and embedded cameras) to aid manual lifting human pose deduction, analysis and training in the construction sector.

Design/methodology/approach

Using a pragmatic approach combined with a literature review, this study discusses the SVBM. The research method comprises a literature review, a pragmatic development approach and lab validation of the acquired data. Adopting this practical approach, the authors developed the SVBM, an AI program that correlates computer vision (recorded and live videos from mobile and embedded cameras).

Findings

Results show that SVBM observes the relevant events without additional attachments to the human body and compares them with the standard axis to identify abnormal postures using mobile and other cameras. Angles of critical nodal points are projected through human pose detection, and body part movement angles are calculated using a novel software program and mobile application. The SVBM demonstrates its ability to capture and analyse data in real time and offline using previously recorded videos, and it is validated for program coding and repeatability of results.

Research limitations/implications

The limitations of the literature review methodology include not keeping pace with the most up-to-date field knowledge. This limitation is offset by restricting the literature review to the last two decades. The review may not have captured all published articles because of database access restrictions and because the search was conducted only in English. The authors may also have omitted fruitful articles hidden in less popular journals. These limitations are acknowledged. The critical limitation is that trust, privacy and psychological issues are not addressed in SVBM, which is recognised. However, the benefits of SVBM naturally offset this limitation, allowing it to be adopted practically.

Practical implications

The practical implications include customised, individualistic prediction and the prevention of most posture-related hazardous behaviours before a critical injury happens. The theoretical implications include mimicking the human pose and performing lab-grade analysis without attaching sensors that naturally alter working poses. SVBM would help researchers develop more accurate data and theoretical models closer to actuals.

Social implications

By using SVBM, the possibility of early detection and prevention of musculoskeletal disorders is high; the social implications include the benefits of a healthier society and a health-conscious construction sector.

Originality/value

Human pose detection, especially joint angle calculation in a work environment, is crucial to the early detection of musculoskeletal disorders. Conventional digital technology-based methods to detect pose flaws focus on location information from wearables and laboratory-controlled motion sensors. For the first time, this paper presents novel computer vision (recorded and live videos using mobile and embedded cameras) and digital image-related deep learning methods, without attachment to the human body, for manual handling pose deduction and analysis of angles, neckline and torso line in an actual construction work environment.

Citation

Purushothaman, M.B. and Gedara, K.M. (2023), "Smart vision-based analysis and error deduction of human pose to reduce musculoskeletal disorders in construction", Smart and Sustainable Built Environment, Vol. ahead-of-print No. ahead-of-print. https://doi.org/10.1108/SASBE-02-2023-0037

Publisher: Emerald Publishing Limited

Copyright © 2023, Mahesh Babu Purushothaman and Kasun Moolika Gedara

License

Published by Emerald Publishing Limited. This article is published under the Creative Commons Attribution (CC BY 4.0) licence. Anyone may reproduce, distribute, translate and create derivative works of this article (for both commercial and non-commercial purposes), subject to full attribution to the original publication and authors. The full terms of this licence may be seen at http://creativecommons.org/licences/by/4.0/legalcode


1. Introduction

Physically demanding jobs such as those in construction have greater exposure to high-risk work environments and the highest number of work injuries compared to other New Zealand industries over recent years (ACC, 2023). Data from New Zealand's primary workplace health and safety regulator, WorkSafe, from June 2021 to May 2022 show that workers in the manufacturing industry had the highest number of injuries resulting in more than a week away from work (5,775 total ACC injury claims). The most common injury is "muscle stress due to lifting, carrying or putting down objects", also known as manual handling (WorkSafe, 2022). These injuries can lead to Musculoskeletal Disorders (MSD), which affect a person's muscles, nerves, tendons, joints, cartilage and spinal discs and commonly manifest as trauma, back pain and arthritis (USBJI, n.d.). Until the late 1990s, these disorders were widely believed to occur mainly in older people. However, the United States Centers for Disease Control and Prevention's (CDC) National Institute for Occupational Safety and Health (NIOSH) released evidence of Work-Related Musculoskeletal Disorders (WMSD) in 1997.

According to the CDC (2020), the conditions for WMSD compared to regular MSD are when

  1. the work environment and performance of work contribute significantly to the disease; and

  2. the condition worsens or persists longer due to work conditions (CDC, 2020).

WMSD results from lifting heavy objects and performing repetitive forceful tasks (CDC, 2020). This is evident in WorkSafe's Outcomes Dashboard presented in December 2019, in which a survey from 2004 to 2006 showed "repetitive tasks" to be the highest risk factor for WMSD, affecting nearly 70% of the general New Zealand workforce and 79% of Māori. "Lifting", another WMSD risk factor, was the fourth highest cause of work-related injury, affecting nearly 40% of the general workforce and almost 55% of Māori (WorkSafe NZ, 2019a, b).

According to NIOSH's equation for calculating the Recommended Weight Limit (RWL), seven factors are critical to manual lifting (Choi et al., 2012; Singh et al., 2014; VelocityEHS, 2020); a sketch of the calculation follows the list.

  1. Load weight (LW) – How heavy is the lifter's load?

  2. Horizontal distance (HD) – How far is the load from the body when lifting?

  3. Vertical distance (VD) – To what height is the load lifted (such as lifting from the floor)?

  4. Travelling distance (TD) – How far does the load need to be lifted?

  5. Frequency of lift (FL) – How often is the load lifted?

  6. Asymmetric turns (AT) – What angle does the lifter's body take when lifting (posture while lifting)?

  7. Coupling grip (CG) – What is the quality of hand grip on the load?
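As an illustrative aid (not drawn from the paper itself), the following minimal Python sketch shows the standard metric form of the NIOSH RWL equation; the frequency and coupling multipliers are assumed to be looked up from NIOSH's published tables and passed in, and the example load values are hypothetical.

```python
def recommended_weight_limit(H, V, D, A, FM=1.0, CM=1.0):
    """NIOSH Recommended Weight Limit, metric form (distances in cm).

    H  -- horizontal hand distance from the midpoint between the ankles
    V  -- vertical hand height at the lift origin
    D  -- vertical travel distance of the lift
    A  -- asymmetry angle in degrees
    FM -- frequency multiplier, looked up from the NIOSH tables
    CM -- coupling (grip quality) multiplier, from the NIOSH tables
    """
    LC = 23.0                          # load constant, kg
    HM = min(25.0 / H, 1.0)            # horizontal multiplier
    VM = 1.0 - 0.003 * abs(V - 75.0)   # vertical multiplier
    DM = min(0.82 + 4.5 / D, 1.0)      # travel distance multiplier
    AM = 1.0 - 0.0032 * A              # asymmetric multiplier
    return LC * HM * VM * DM * AM * FM * CM

LW = 10.0                              # hypothetical load weight, kg
rwl = recommended_weight_limit(H=30, V=40, D=60, A=30)
print(f"RWL = {rwl:.1f} kg, lifting index = {LW / rwl:.2f}")
```

A lifting index (load weight divided by the RWL) above 1.0 indicates an elevated risk of lifting-related injury.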

Government agencies such as WorkSafe New Zealand regulate workplace health and safety practices and minimise risks by assessing data and updating policies. However, institutional and academic research and innovation within the technology space have seen the rise of wearable technology (WT) – electronics worn on the body (Yasar, 2022) – which provides a more personalised approach to reducing these injuries.

Adding technology to the workplace can improve productivity and efficiency and can be used for health and safety benefits such as improving safety performance (Safety Champion, 2021; Karakhan et al., 2019). However, when adopting new technology, WorkSafe NZ (2019a, b) states that it is essential for employers to consider the health and safety risks of the technology itself, as the responsibility falls on the employer if a worker is injured using the technology.

Their guidelines are (WorkSafe NZ, 2019a, b) as follows:

  1. Consider whether the new technology is fit for its purpose.

  2. Check that the manufacturer/designer has considered the health and safety impact of the technology itself.

  3. Check that technology is proven and reliable.

  4. Consider whether the new technology adds additional health and safety risks or alters any health and safety risks.

Examples of technology currently being used in manufacturing are automation and robotics, augmented/virtual reality software and mobile apps, IIoT sensors and wearables (Getac, 2021). These technologies add a physical element to the working environment and the human body. Though they ease the lifting process, the added physical component is uncomfortable for humans, whose working and typical poses change over time, which may lead to WMSD in the long term. When done correctly, however, technology can improve a workplace's health and safety performance (Karakhan et al., 2019). There is currently a knowledge gap on technologies that attach no additional physical elements to humans to assist in manual lifting and reduce WMSD. This paper revolves around the following research question:

RQ1.

How can smart vision-based analysis and error deduction of human pose technologies, which attach no additional physical elements to humans, be adopted to assist in manual lifting and reduce WMSD?

The research objective was to develop and demonstrate the concept and use of the novel smart vision-based method (SVBM) for analysis and error deduction of human pose to reduce musculoskeletal disorders in construction; this paper highlights that concept and its use.

2. Literature review

SMART technologies aid in data collection, training and physical work aimed at reducing WMSD. Over time, training methods were developed to correct the pose during manual lifting, and digital technologies were subsequently deployed to enhance pose deduction and training. Employees receive advice and training on correct poses and actions during manual material handling at the working site. These trainings are based on physiotherapy principles and use minimal technology. However, most research in the past decade proposes getting the best results using SMART technologies. Automation and robotics are estimated to reduce workplace physical and psychological injury by 11% by 2030 (Horton et al., 2018). Since 2009, the use of robotics worldwide has increased rapidly due to declining cost and improving capacity and ability (Horton et al., 2018). Along with robotic machines, drones have been used to monitor workplaces and minimise health and safety risks by accessing hazardous locations such as tunnels, mines and storage tanks, either to monitor or to collect samples (Horton et al., 2018; Chubb, n.d.). They increase health and safety by replacing or assisting workers in dangerous tasks or relieving workers of the boredom of repetitive tasks (Horton et al., 2018).

Augmented reality (AR) overlays digital information on the real world using devices such as smart glasses or mobile phones (Getac, 2021). Virtual reality (VR) is a passive or interactive computer-generated simulation in which the user wears a VR headset (Getac, 2021). VR allows workers to build their knowledge and practise awareness to reduce incidents in the workplace (Strivr, n.d.), whereas AR allows that information and knowledge to be shown in real time and place (Daniels and Dustin, 2022). VR and AR have been used predominantly in training and educating workers for dangerous tasks in a completely safe space (Getac, 2021; Chubb, n.d.). Software and mobile apps are among the most used and most accessible technology systems implemented in the workplace because most workers have mobile devices (Safety Champion, 2021; Chubb, n.d.). Software and mobile apps began with connecting workers across the workplace, whether onsite or offsite, reporting health and safety hazards, and accessing real-time data (Safety Champion, 2021; Chubb, n.d.; Schulz, 2021). App technology benefits from utilising the devices already installed within mobile phones. The app features include the following (Schulz, 2021):

  1. Linking location data using QR codes scanned by the device's camera.

  2. Improving health and safety incident reporting by capturing via a camera or using voice-to-text to relay information.

  3. Using the camera's advanced motion-capture technology to make manual ergonomic assessments to reduce musculoskeletal disorders.

  4. Providing workers with accessible training and resources on hand.

  5. Giving workers more accountability in managing health and safety risks.

Further, the different technologies connect amongst themselves. IIoT (Industrial Internet of Things) is a sensor network that connects and communicates with computers and software, improving efficiency, automating processes and adding AI within the workplace (Ordr, n.d.). While IIoT shares much of its definition with IoT (Internet of Things), IIoT specialises in the manufacturing and industrial sector and includes technologies such as machine learning, big data, sensor data, automation and machine-to-machine communication (Kumar et al., 2019). Examples of how IIoT sensors are used include (Ordr, n.d.; Eshghi, 2022; Kumar and Iyer, 2019):

  1. Remote management – Being able to manage machines and workers from afar.

  2. Predictive maintenance – IIoT temperature and vibration sensors can monitor conditions to alert when a machine is close to expiring, acting out of its normal parameters, or needing maintenance.

  3. Remote monitoring – Especially used in production facilities, where workplaces can monitor variables such as time, input and power consumption for machines.

  4. Asset tracking – Using sensors such as GPS and RFID tags, workplaces can track and trace inventory, assets and supplies.

  5. Safe work environment – Particularly within facilities dealing with chemicals, IIoT air quality sensors can provide reassurance or alert when the air quality changes.

Furthermore, wherever manual tasks are unavoidable, essential technologies such as "wearables" blend with humans to help ease the work. Wearable technology (WT) is "an electronic device designed to be worn on the user's body" (Yasar, 2022). Its history can be dated to the 13th century, when eyeglasses were made, but its modern technological roots started in the 1960s. With its broad definition, many types of WT exist across multiple industries/sectors. Standard WT includes devices such as smart watches (Apple Watch), fitness trackers (Fitbit), AR devices (Google Glass), VR headsets (Oculus) and body-mounted sensors used in healthcare. A common factor of these devices is that they gather data from the user to display in a visual form (fitness trackers, healthcare body sensors) or display information/visuals in a more accessible way (smart watches, AR, VR headsets). WT can collect biological feedback data non-invasively (McDevitt et al., 2022); for example, wearable ECG monitors and biosensors can detect atrial fibrillation and measure blood pressure. An Augusta University Medical Center study found that WT reduces patient deterioration into preventable cardiac or respiratory arrest by 89% (Phaneuf, 2022). Further, WT is used in athletes' training, in-game performance, assessment of potential sport-related injuries, and recovery (Ohio University, 2020). However, its use in the workplace to prevent injuries, such as muscle sprains in the lower back from lifting heavy objects that can lead to Work-related Musculoskeletal Disorders (WMSD), is still in the research phase.

The recent development of Biomechanical Wearable Technology aims to assess performance during tasks and movements to help health and safety professionals, ergonomists, and workers prevent and identify potential health and safety risks. Poitras et al. (2019) describe that the current use of workplace assessments (such as questionnaires) is subjective, unlike WT, which gives a more personal approach. The research inclusion of WT within the workplace is due to industry workers having the same movements, user performance and prevention of injury goals as sports athletes (McDevitt et al., 2022).

The development of biomechanical WT devices falls into two categories (McDevitt et al., 2022):

  1. Assisting/performance enhancement: exoskeletons.

  2. Monitoring/risk assessment: pressure sensors and IMUs with multiple sensors such as accelerometers, gyroscopes, etc.

Exoskeletons are the most prominent biomechanical WT devices today. More than 7,000 units were sold in manufacturing alone in 2018, with an estimated growth rate of more than 50% between 2019 and 2024 (Ekso Bionics, 2020). McDevitt et al. (2022) define exoskeletons as "wearable machine devices that augment human performance, primarily for heavy lifting tasks". They were introduced for military use in 1965, but since the late 1990s, workplace use of exoskeletons has increased significantly. While most countries' health and safety policies encourage redesigning the workplace with an ergonomic approach, this is impossible in temporary workplaces. Exoskeletons help compensate for situations like this while also improving the quality of work (Ekso Bionics, 2020).

Exoskeletons use robotic technology to provide postural support while following the user's movements without misalignment or resistance. Exoskeletons reduce the mechanical energy needed to complete tasks (which helps reduce fatigue) while improving both the range of motion and muscle fatigue or activation (McDevitt et al., 2022). Using exoskeletons reduces stress on the shoulder muscle by 30%; the shoulder is the muscle most commonly impacted by injuries while taking the longest to heal and return to full function (Ekso Bionics, 2020). Exoskeletons can also support older workers in handling physically demanding tasks (Okpala et al., 2022). Exoskeletons can be either powered or passive and are currently used in three main ways (AmTrust, n.d.):

  1. Back-assist: exoskeletons support the lumbar spine while lifting.

  2. Shoulder and arm assist: exoskeletons support sustained overhead work.

  3. Leg-assist: exoskeletons support the ankle, knee and hip joints while carrying a load.

Large companies such as Toyota, Ford and Boeing have all adopted exoskeletons into the workplace, receiving positive worker feedback: less exertion, less discomfort and reduced injuries (Zelik, 2021). Ford Motor Company, which adopted the technology in 2011, has seen an 83% reduction in injuries among those who use the exoskeletons (Ekso Bionics, 2020). Rexbionics has developed exoskeletons for those with walking disabilities; however, they are still not used in NZ industries. Exoskeletons' main barriers are cost and usability, as only one person can use a unit at a time (McDevitt et al., 2022). While exoskeletons assist a person in a task, IMU devices monitor performance. Eleven existing companies produce exoskeletons that fit the upper body, semi-full body and whole body and that aid in picking, carrying, bending and lifting, prolonged standing, extended arm work and repeated motion (Okpala et al., 2022).

IMU devices consist of accelerometers, gyroscopes, and magnetometers that monitor a worker's task/posture for performance analysis, exposure to risk analysis, and to help with task redesign (McDevitt et al., 2022). These are used within sports, but due to their minimal size, they are beginning to see an increase in use within the manufacturing workplace as they provide real-time worker monitoring to help identify potential risks on the job. The types of IMUs are the following:

  1. Accelerometers quantify and monitor dynamic linear acceleration and are used to monitor biomechanical parameters of human movement.

  2. Gyroscopes monitor the angular rate of change to measure axial rotation and provide valuable positioning measures.

  3. Magnetometers measure the magnetic field and can determine Earth's north (Gleadhill, 2019). Combined with sensor fusion software, they can assess motion, orientation and head movement (Pao, 2018).

Research has shown that pressure sensors can also be applied with IMUs to measure fatigue and imbalance. Antwi-Afari and Li (2018) used IMU sensors to track balance loss through pressure sensors in the insole of the worker's shoe. The results showed differences in gait (a person's pattern of walking) during balance-loss situations. This combination can also detect issues such as asymmetries or specific limb movements indicating fatigue (McDevitt et al., 2022). Other research includes Akhmad et al. (2020), who created a device using nine IMUs to replicate NIOSH's lifting equation. While the team acknowledged that more work was needed, the study provided evidence that this could be possible.

Much research has been done on IMUs within a laboratory setting, but organisations such as DorsaVi (n.d.) provide a commercial WT solution using IMUs called ViSafe (Gleadhill, 2019). While there are many benefits to wearable technology, many challenges prevent it from being widely used within society. Kalia (2017) describes six significant challenges for WT and how they affect the user:

  1. Battery life: Because WT devices are relatively small, the battery must also be small, and with many WT devices worn constantly throughout the day, the battery drains quickly.

  2. Ergonomics: User comfort is paramount in WT, as in textile clothing. Some may find discomfort in having a device strapped around them for long periods, or the device's material may be uncomfortable, mainly since most WT devices include a rigid component to house the electronics, accompanied by fabric straps. Some WT devices can also heat up over time.

  3. Differentiating and providing value: People do not see the value of WT compared to other electronic devices, making adoption challenging.

  4. Sealing: Sealing WT devices against water and sweat is crucial, as sweat can corrode metal components.

  5. Miniaturisation and integration: With WT getting smaller and smaller, it is challenging to reduce the size of components such as radios/antennas, making it more difficult to maintain a strong signal.

  6. Safety, security, privacy: Most safety concerns come from the lithium batteries within WT devices, their proximity to the body and potential radiation emissions. WT devices are potentially hackable, threatening security and privacy.

Despite detailed research and statistics showing how wearable technology can reduce injuries related to work-related musculoskeletal disorders (WMSD), there is hesitation in the industry to adopt the technology. McDevitt et al. (2022) and Navarra (2022) discuss the trust issues and reluctance surrounding this new technology. Though WT provides accurate data while not injuring the worker, the discomfort, privacy issues and sense of constant watch concern users (Navarra, 2022; Kalia, 2017).

In laboratories, visual object tracking uses inertial measurement unit (IMU) sensors for pose deduction (Wei et al., 2021). In the past, construction workers' weight gain was detected and recognised using a single wearable IMU (Chen et al., 2021). Similarly, nine nodal points were tracked using multiple wearable radio-frequency identification (RFID) sensors to monitor human poses; notably, that system focuses on hand positions (Lee et al., 2019). In implementing sensors and devices, depth sensors take advantage of a portable, accurate, low-cost device for capturing human pose and reconstruction (Taddei et al., 2014). In large-scale working places, data capturing and environmental conditions were considered factors affecting the accuracy of the output (Pang et al., 2021). For example, construction's dynamic work nature needs constant material shifting; hence, manual lifting occurs more frequently in different surroundings. To overcome this barrier, researchers have recently used computer vision to analyse human pose errors. One computer vision method is 3D Mocap, which aligns digital video images using similar-pixel region segmentation based on pre-defined image frames to calculate human pose errors (Rogez et al., 2008). In another work, different pose outline measurements utilise augmented reality (AR) to gather human postural errors; this method does not use any sensors attached to the human body (Hellsten et al., 2021). However, the calculations are inaccurate as they only compare body outlines that do not specify individual nodal point (body part) movement errors. The main disadvantage is that the method relies only on standing frames and cannot be used for other positions.

Human pose estimation is closely related to analysing human motion from images and video (Poppe, 2010). Numerous studies on vision-based systems have been undertaken in the new millennium. Moeslund et al. (2006) refer to 350 articles between 2000 and 2006 that describe the initial vision-based works in this field. Researchers used a variety of methods for vision-based human pose deduction. For example, Jain et al. (2015) used red, green and blue colour components for each pixel, motion features and a convolutional network architecture to deduct human body pose in video. Xu et al. (2023) used 17 body points and multiple surveillance cameras for offline abnormal human-posture recognition analysis. Dang et al. (2019) comprehensively surveyed sensor- and vision-based human activity recognition. Their survey of 64 papers identified single- and multi-person pipeline analysis, heat map analysis and CNN analysis for various applications using 5–17 critical nodal human body points. Zheng et al.'s (2020) survey of 309 articles acknowledges offline action recognition, prediction, detection and tracking as key outputs of vision-based systems, which are handicapped by video resolution, body deformation compared to standard models and the number of parameters.

Hellsten et al. (2021) discuss the literature on the potential of computer vision-based marker-less human motion analysis for rehabilitation. They state "that most promising techniques from a physiotherapy point of view are 3D marker-less pose estimation based on a single view as these can perform advanced motion analysis of the human body while only requiring a single camera and a computing device". Lan et al.'s (2022) review of 153 articles concludes that vision-based systems have been widely applied to action analysis, human-computer interfaces, gaming, sports analysis, motion capture and computer-generated imagery. Kulkarni et al. (2023) discuss 49 papers on offline computer vision and machine learning algorithms, such as feed-forward neural networks, convolutional neural networks (CNN), OpenPose and MediaPipe, with the exception of fall deduction based on a single live surveillance camera. Through their review, Lan et al. (2022) identify that a gap still exists in vision systems for the analysis of human poses considering the wide diversity of the human body. Most of these works are indoors, using high-quality cameras and images, and are yet to be adopted in real-life situations (Hellsten et al., 2021; Lan et al., 2022).

Researchers have used computer vision to deal with musculoskeletal disorders since the 1990s. For example, Wang et al. (1996) initially analysed lower back issues using computer vision and a superimposed biomechanical model to identify stress points. Mehrizi et al. (2018) proposed a modified algorithm based on the Twin Gaussian Process (TGP) to extract the 3D pose from each frame of videos captured from two lab cameras, developing and validating a computer vision-based marker-less motion capture method to assess lifting tasks and reduce Work-related musculoskeletal disorders (WMSD). Snyder et al. (2021) suggested an analysis of an IMU-sensor-captured lifting dataset using 2D vision and a CNN; however, for real-world use, they suggest minimising the number of sensors, which would significantly advance practicality, reduce cost and eliminate the awkward placement of several sensors. Jung et al. (2022) developed a computer vision-based lifting task recognition method using a CNN and OpenPose with 17 nodal points. Earlier, Huang and Nguyen (2019) used multiple cameras and OpenPose to develop 2D and 3D skeleton movement tracking. However, OpenPose can detect persons in an image only if the nose or neck keypoint is not occluded, and it uses fewer nodal points.

The construction industry has also embraced vision-based technologies. Liu et al. (2017) use a convolutional neural network (CNN) to estimate human pose on sequential images from construction sites for unsafe behaviour monitoring, ergonomic analysis and productivity estimation. Roberts et al. (2020) used 317 annotated offline RGB video feeds of bricklaying and plastering operations to estimate each frame's pose-tracking body joints. However, the result display is potentially cluttered, and the work did not consider carrying movements or perform an ergonomics assessment. Luo et al. (2020) proposed a methodology framework to track construction equipment's location, pose and movement to avoid potential collisions and other accidents and achieve safer onsite conditions; however, they state that there are limited studies that automatically estimate the full body pose (Luo et al., 2020). The survey also revealed that smart vision-based analysis and error deduction of human pose to reduce musculoskeletal disorders in construction during manual lifting are yet to be developed.

Ultimately, the vision-based human pose estimation approach is still lab-based and needs to be implemented for applications in the real world (Lan et al., 2023). The existing methods of vision-based HPE are offline and based on lightweight neural networks of manual, heuristic design. Implementing these state-of-the-art neural networks in mobile or embedded devices incurs enormous computational costs and is yet to be operationalised (Lan et al., 2023). This literature survey identified a lack of real-time human pose deduction using mobile or embedded devices that can be used on construction sites. The current lab-based methods used multiple cameras and up to 17 nodal points (Xu et al., 2023). The survey also revealed that existing computer vision-based applications do not consider the combination of angles, neckline and torso line for manual handling pose deduction and analysis in an actual construction work environment. Thus, it is crucial to design real-time neural networks and vision-based systems for efficient human pose estimation using a single-camera mobile application that is cost-effective and accurate. This paper proposes the real-time smart vision-based method (SVBM), an AI program that correlates computer vision (recorded and live videos using mobile and embedded cameras) to aid manual lifting human pose deduction (using 33 nodal points), analysis (combining nodal points, angles, neckline and torso line) and real-time training in the construction sector.

3. The method

This research is based on the pragmatism approach, which evaluates theories or beliefs in terms of the success of their practical application; the solution takes a realistic approach (Smith, 1978). This differs from the quantitative paradigm (which relies on objectivism and positivism and depends on deduction and confirmation) and the qualitative paradigm in the sense that the outputs are proven for their practicality (Maarouf, 2019). Though the pragmatic approach does not support the assumptions made in the quantitative and qualitative techniques, it is the most common philosophical justification for practical research outputs (Maarouf, 2019). This pragmatic research aims to develop a smart vision-based analysis and error deduction of human pose technology that uses no additional physical elements on humans to analyse manual lifting poses and reduce WMSD, and that can be adopted practically. The method adopted is shown in Figure 1, and each step is explained in the following subsections.

3.1 Convolutional neural network and BlazePose

Convolutional neural network (CNN)-based image recognition and object detection is a key architecture that has revolutionised the object detection domain and is the backbone of human pose estimation (Kulkarni et al., 2023). Researchers have used AI-based CNNs and advanced computer algorithms that work with vision tracking to conduct human three-dimensional (3D) pose estimation. The most used software platforms for this purpose are OpenPose and BlazePose. While calculating the motion of human body parts, the video images are segmented into multiple single photos in 3D pose reconstruction platforms such as OpenPose (Pang et al., 2021). In some studies using OpenPose, additional algorithms were needed to calculate the pose reasonably; for example, Corin (2021) additionally used a triangulation algorithm to calculate limb joint kinematics from the videos. The challenge with OpenPose is that it requires camera calibration and takes more time to deliver outputs when videos from two or more cameras are analysed. Another work related to visual object tracking estimated 3D posture using three artificial neural networks in two different positions (Aghazadeh et al., 2020); though the results satisfied the required efficiency, hand locations needed to be input manually for pose estimation. BlazePose (a high-fidelity human pose tracking solution within the MediaPipe Pose software framework) is another CNN for human pose tracking, developed by Google, which detects 33 nodal points of the human body, more than alternatives such as COCO, BlazeFace and BlazePalm. The more closely spaced the key points, the more faithfully the human pose can be simulated; BlazePose offers 33 nodal points that are vital and closely spaced compared to other platforms. The nodal points are shown in Figure 2.

The BlazePose platform is better than OpenPose and supports mobile and laptop platforms (Bazarevsky et al., 2020). However, this platform does not consider the neck posture and backline of the human body, which is a disadvantage.

This research used a mobile camera and mobile-based application to capture pose errors and quantify the angles of 33 nodal points to help experts and workers analyse and correct mistakes. The primary disadvantages of existing vision-based analyses are that they do not consider the neck posture and backline of the human body, use advanced laboratory-based cameras to capture data, and remain lab-based. To assess the neck posture and backline of the human body, this research novelly combined BlazePose with the OpenCV platform to calculate the head, neck and shoulder positions in a given video frame. OpenCV, an image and video processing and vision recognition library, was used for video capturing, storing, camera calibration and geometric measurement data transfer for pose processing. Mainly, the research sought to use pose detection to obtain more accurate data. The data acquired and calculated include the following (an indicative code sketch follows the list):

  1. Visual human pose and movement

  2. Identification of nodal points and landmarks

  3. The location of critical human body nodal points

  4. The angle between the nodal points in a particular frame

  5. The angle between the neckline, torso line and the human body axis in a particular frame

  6. The angle and movement of nodal points over a period

  7. The distance between nodal points

  8. The hip-shoulder length change over a period
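The paper does not reproduce its source code, so the following is only an indicative Python sketch of how BlazePose (via the MediaPipe Pose framework) and OpenCV can be combined to acquire the 33 nodal points frame by frame; the file name lift.mp4 is a hypothetical placeholder.

```python
import cv2
import mediapipe as mp

mp_pose = mp.solutions.pose

# BlazePose via the MediaPipe Pose framework: 33 body landmarks per frame.
with mp_pose.Pose(static_image_mode=False,
                  model_complexity=1,
                  min_detection_confidence=0.5,
                  min_tracking_confidence=0.5) as pose:
    cap = cv2.VideoCapture("lift.mp4")   # hypothetical recorded lifting clip
    frames = []                          # per-frame lists of 33 (x, y) points
    while cap.isOpened():
        ok, frame = cap.read()
        if not ok:
            break
        # MediaPipe expects RGB input; OpenCV reads frames as BGR.
        result = pose.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        if result.pose_landmarks:
            h, w = frame.shape[:2]
            # Landmarks are normalised; scale them to pixel coordinates.
            frames.append([(lm.x * w, lm.y * h)
                           for lm in result.pose_landmarks.landmark])
    cap.release()
```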

3.2 Data acquisition

The data acquisition for manual handling was based on multiple box lifts captured on mobile cameras; the related videos were recorded, coded and stored. The size of the recordings was not limited, and different actions were captured. The footage was separated into frames to analyse results under various conditions. A volunteer participant mimicked the construction sector's lifting task, captured in 18 video clips using a Samsung S7 Edge phone with a 1220 × 960 pixel resolution. The camera was placed on the rear-left side at 135° from the sagittal plane of the box. The videos were stored on an HP laptop (11th Gen Intel(R) Core(TM) i7-1185G7 @ 3.00 GHz CPU, 32 gigabytes of RAM) embedded with the Matplotlib platform. The participant was asked to stand in front of the box, weighing 10 kilograms, and finish the lifting tasks without moving their feet (refer to Figure 3). The participant chose the initial distance between themselves and the box and the lifting speed. The participant performed three vertical lift sequences: floor to knuckle height, floor to shoulder height and knuckle to shoulder height. Additionally, each lifting sequence ended with a twist of 0°, 30° or 60° to the right-hand side of the starting position (Yamauchi and Iwamoto, 2010). Each lifting event was repeated twice.

3.3 Identification of nodal points and landmark

OpenCV was used to detect a person using a heatmap within the captured video frame, which helps isolate the human from other objects. The isolated pose is then superimposed with the nodal points of BlazePose. Next, for the landmarks, a relative position is used to determine the body parts, i.e. x- and y-axis deduction that gives an actual value, calculated using OpenCV. In the next step, the landmarks are used to calculate the angular motion of the human body parts and provide an output on the mobile in real time. This method tracked 33 human body nodal points (Figure 2), rendering landmarks and background segmentation. The landmarks help to identify the location of the human body within the video image, and background segmentation helps to isolate the human body from other objects in a work environment. Figure 4 shows the human landmark detection and angle calculation process flow; an indicative sketch of the background isolation step follows.
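MediaPipe Pose can optionally return a per-pixel segmentation mask alongside the landmarks. The sketch below is one plausible realisation of the background isolation step described above, under the assumption that thresholding this mask stands in for the authors' heatmap-based isolation.

```python
import cv2
import numpy as np
import mediapipe as mp

def isolate_person(frame_bgr, pose):
    """Black out non-person pixels and return the landmarks.

    `pose` is assumed to be an mp.solutions.pose.Pose instance created
    with enable_segmentation=True.
    """
    result = pose.process(cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2RGB))
    if result.segmentation_mask is None:
        return frame_bgr, None
    # Threshold the soft (0..1) mask to separate person from background.
    person = np.stack((result.segmentation_mask,) * 3, axis=-1) > 0.5
    isolated = np.where(person, frame_bgr, np.zeros_like(frame_bgr))
    return isolated, result.pose_landmarks
```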

3.4 Pose estimation using nodal points

The next step is pose estimation from the video frames or photos. This novel method of combining BlazePose and OpenCV explores the consistency of three main features: converting images to raw red, green and blue (RGB) samples based on a heatmap, 3D pose detection, and angle calculation. The novel method for human pose deduction using a video frame/photo involves three models that work in conjunction:

  1. a detector using a heat mapping principle on images captured using cameras,

  2. the location of the human body associated with a region of interest (ROI),

  3. the angle of the given nodal point of the human body.

The workflow is shown in Figure 5.

3.5 The angle between the nodal points in a frame

The next step is the initial angle calculation. The camera is aligned to capture the view of the human pose, from which the angle between the nodal points is measured. The digital architecture (workflow) is given in Figure 6.

For the calculation of the angle between two points, the tan⁻¹ function is used: the inverse tan of the rise (vertical distance) over the run (horizontal distance) gives the angle. For example, Figure 7 shows the nodal points (numbered as in Figure 2) and the corresponding x and y of the left shoulder, left elbow and left wrist.

The angle between the left shoulder and left elbow = tan⁻¹((y11 − y13)/(x11 − x13)). The angle between the left elbow and left wrist = tan⁻¹((y13 − y15)/(x13 − x15)).

Similarly, the angle between the left shoulder and left wrist = tan⁻¹((y11 − y15)/(x11 − x15)). Further, to find the angle between the left shoulder and left wrist keeping the left elbow as the pivotal point, the angle between the left shoulder and left elbow and the angle between the left elbow and left wrist were added.
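A minimal sketch of these angle calculations follows; math.atan2 is substituted for the bare tan⁻¹ division so that vertical segments (zero run) do not divide by zero, and `frames` is the per-frame landmark list from the earlier sketch.

```python
import math

def segment_angle(p, q):
    """Direction of the segment from p to q, in degrees (tan^-1 of rise
    over run); atan2 also handles a vertical segment (zero run)."""
    return math.degrees(math.atan2(q[1] - p[1], q[0] - p[0]))

def pivot_angle(p_a, pivot, p_b):
    """Inner angle at `pivot` between segments pivot->p_a and pivot->p_b,
    obtained by combining the two segment angles and normalising to
    the 0-180 degree range."""
    a = abs(segment_angle(pivot, p_a) - segment_angle(pivot, p_b)) % 360
    return 360 - a if a > 180 else a

# Indices follow Figure 2: 11 = left shoulder, 13 = left elbow, 15 = left wrist.
lm = frames[0]                          # first frame from the earlier sketch
elbow_angle = pivot_angle(lm[11], lm[13], lm[15])   # left elbow as pivot
```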

3.6 The angle between the neck and torso lines and the human body axis in a frame

The angle between the human shoulders and hip is challenging to detect using BlazePose and OpenCV because these have no practical neckline in anatomical detection. Therefore, this research used multiple methods to generate the angle of the neck and hip area. The primary deterministic factor for the posture is the angle subtended by the neckline and the torso line. The neckline connects the middle point of the shoulders and the middle point of the eyes, with the shoulder nodal points used as the pivotal point. Similarly, the torso line connects the hip and the shoulder, where the hip is considered the pivotal point. The inclination angle determines whether the person has bent beyond a threshold angle.

Taking the neckline as the base, the points are P1(x1, y1) (shoulder), P2(x2, y2) (eye) and P3(x3, y3) (any point on the vertical axis passing through P1). A vector approach was used to find the inner angle of the three points. The angle between the two vectors P12 and P13 is given by

θ = arccos((P12 · P13)/(|P12| |P13|))

Since P3 lies on the vertical axis through P1, solving for θ we get

(1) θ = arccos((y2 − y1)/√((x2 − x1)² + (y2 − y1)²))
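As a sketch of Equation (1) in code, the snippet below derives the shoulder, eye and hip midpoints from the BlazePose landmarks (indices per Figure 2) and computes the neckline and torso-line inclinations; it assumes the `frames` list from the earlier sketches.

```python
import math

def inclination(x1, y1, x2, y2):
    """Equation (1): angle in degrees between the line P1->P2 and the
    vertical axis through P1. In image coordinates the y-axis points
    down, so the sign convention may need flipping."""
    return math.degrees(math.acos((y2 - y1) / math.hypot(x2 - x1, y2 - y1)))

def midpoint(p, q):
    return ((p[0] + q[0]) / 2, (p[1] + q[1]) / 2)

# BlazePose indices: 2/5 = left/right eye, 11/12 = shoulders, 23/24 = hips.
lm = frames[0]
shoulder_mid = midpoint(lm[11], lm[12])
eye_mid = midpoint(lm[2], lm[5])
hip_mid = midpoint(lm[23], lm[24])

neck_angle = inclination(*shoulder_mid, *eye_mid)    # neckline, shoulder pivot
torso_angle = inclination(*hip_mid, *shoulder_mid)   # torso line, hip pivot
```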

3.7 The angle and movement of nodal points over a given period

Next, the angle and movement of nodal points at a given time are calculated using the segregation of video frames and analysis of pixels. Following the IBM architecture of CNNs (IBM, n.d.), the three-dimensional data for image classification and object recognition tasks were processed, as shown in Figure 9.

Distance, angle and the distance-angle relation between frames give the relative contraction or stretch over time, which satisfies:

(2) Xᵢ(t − 1) = {xᵢ¹(t − 1), xᵢ²(t − 1), …, xᵢᴰ(t − 1)}

where xᵢᴰ(t − 1) is the position of agent i, D is the search space dimension and t is the process's iteration time. Prediction of the pose location is enhanced with the equations below. As part of the implementation, the pooling layer diminishes the feature maps produced, a necessary step for extracting the human body pose detection features. The pooling window size and the stride are hyperparameters that can be adjusted to change the size of the output feature map, and the input feature map can be zero-padded to maintain its exact size.

(3) X̂ᵢ(t) = Xᵢ(t − 1)

(4) P̂(t) = P̂(t − 1) + Q̂

Equations (3) and (4) work as search agents to enhance the measuring-point accuracy of the prediction used to calculate the angles of the detection locations. The difference between the initial angle and the angle at a given time (video frame) was then used to calculate the bending movement of a particular nodal point (joint or body position), which was subsequently used to assess the relative pose and associated errors.
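An indicative sketch of this step for a single joint, omitting the search-agent prediction of Equations (3) and (4) and reusing the helpers from the earlier sketches:

```python
# Per-frame tracking: each frame's left elbow angle and its drift from the
# first frame, reusing pivot_angle() and `frames` from the sketches above.
baseline = pivot_angle(frames[0][11], frames[0][13], frames[0][15])
drift = []
for landmarks in frames:
    angle_t = pivot_angle(landmarks[11], landmarks[13], landmarks[15])
    drift.append(angle_t - baseline)   # bending movement relative to frame 1
```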

3.8 The distance between nodal points over a given period

Another major factor in determining the extent of spine bending during manual lifting is the change in the distance between the shoulder and hip nodal points. Distance is employed to measure the offset between two points. The fixed nodal points were the hip, eyes and shoulder, as these points are always more or less symmetric about the central axis of the human body. With this assumption, the alignment features are incorporated as

(5) distance = √((x2 − x1)² + (y2 − y1)²)

Let the initial video frame be F1 and the last frame after completion of the lift be Fn. From Figure 8, the nodal points of the left shoulder and the hip are P12 and P23, respectively. Let the landmarks for the left shoulder be (x1, y1) in frame F1 and (xn, yn) in frame Fn.

Then,

(6) Left shoulder movement distance, C = √((F1P12(x1) − FnP12(xn))² + (F1P12(y1) − FnP12(yn))²)

Similarly,

(7) Hip movement distance, D = √((F1P23(x1) − FnP23(xn))² + (F1P23(y1) − FnP23(yn))²)

3.9 The hip-shoulder length change over a given period

To calculate the change between the initial left shoulder-hip distance and the final left shoulder-hip distance:

First, calculate the F1 left shoulder-to-hip distance,

(8) E = √((F1P12(x1) − F1P23(x1))² + (F1P12(y1) − F1P23(y1))²)

Then, calculate the Fn left shoulder-to-hip distance,

(9) F = √((FnP12(xn) − FnP23(xn))² + (FnP12(yn) − FnP23(yn))²)

The change in the length of the hip-shoulder distance is then

(10) l = E − F
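Equations (5)-(10) condense to a few lines of code. The sketch below assumes the `frames` landmark lists from the earlier sketches and BlazePose indices 11 (left shoulder) and 23 (left hip):

```python
import math

def dist(p, q):
    """Equation (5): Euclidean distance between two landmark points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

first, last = frames[0], frames[-1]   # frames F1 and Fn
SH, HIP = 11, 23                      # left shoulder and left hip (Figure 2)

C = dist(first[SH], last[SH])         # Eq. (6): shoulder movement distance
D = dist(first[HIP], last[HIP])       # Eq. (7): hip movement distance
E = dist(first[SH], first[HIP])       # Eq. (8): initial shoulder-hip length
F = dist(last[SH], last[HIP])         # Eq. (9): final shoulder-hip length
l = E - F                             # Eq. (10): hip-shoulder length change
```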

Since the measurements are based on an individual's real work-life video, the results are customised to the individual. No specific methodology was employed for training the model since the research proposed capturing real-life experience; the participant was advised to lift and turn at his comfort. Though the study used a male participant, due to the use of BlazePose and CNNs, the dataset's size, diversity and representativeness can be equated to all genders and sizes when captured as full-body visuals. The calculated angles and human poses were validated using REBA (Rapid Entire Body Assessment), an evaluation method that considers human body postures, movements and actions. The images in Figures 3–12 were captured using the mobile camera, and the 33 nodal points and angles shown are those projected in real time with the aid of AI and the mobile application. Using mobile cameras and the application helps capture and display angles and other data in real time, in natural construction environments, and supports instant pose correction and training. The application can display angles to 1° (the AI program can be altered to be more precise if required) and isolate the backgrounds to capture the human pose. These features aid in capturing human poses in natural construction work environments without pre-settings.

4. Results

The visual and quantitative results from the experimentally captured data and the HMDB dataset are given in this section.

4.1 Experimental data and SVBM real-time vs offline analysis

The top row of Figure 10 shows four video frames from a clip. The middle row shows the superimposed pose nodal points and calculated angles displayed in the mobile application using SVBM. The last row shows the offline three-dimensional pose analysis that can be used for training and pose correction.

4.2 SVBM accuracy and low light intensity test

Figure 11 shows the SVBM's capability to display angle variation to a minimum accuracy of 1° (refer to the foot). Figure 12 shows the ability of SVBM to process video frames and images with low light intensity, using heatmap and segmentation. Figure 12 also displays the SVBM's ability to isolate the backgrounds and process the angle of different nodal points.

4.3 SVBM validation using HMDB dataset

Figures 13 and 14 show the SVBM's ability to process recorded videos. The images are from video clips of the HMDB dataset, an extensive, publicly available human motion database for human motion analysis, recognition and understanding research. It contains over 3,600 video clips of human actions in more than 50 action categories, such as walking, running, jumping and dancing. The videos were captured in various settings, such as indoor and outdoor scenes, and were recorded with multiple cameras to capture different viewpoints. The HMDB dataset has been widely used as a benchmark for evaluating the performance of human action recognition algorithms, and many state-of-the-art methods have been developed using it. Each video clip in the HMDB dataset is labelled with the action category it represents, and the dataset also includes information about each video's camera viewpoint, frame rate and resolution.

4.4 SVBM repeatability test

Figure 15 shows the human body angle variations from the experimental data at a fixed time during repeated lifting. The time was set at 3 s from the start of the manual handling, and the angles of torso inclination, neck, elbows, knees and ankles were plotted. Such analysis helps in understanding the pose variation at a given point during repetitive lifting and is helpful for pose correction and fatigue analysis. Figure 16 shows the same analysis of human body angle variations at a fixed time during repeated lifting using HMDB data.

The SVBM was validated for accuracy of angle, repeatability of angle results and the ability to show the angle over a period. The accuracy of the calculated angle was 1°. Repeatability was tested by running the program 32 times on a single 9-s video clip. The video was recorded at 30 frames/s and had 293 frames. Four nodal points and two line angles (left hip, right hip, left knee, right knee, torso inclination and neck inclination) were plotted for each frame of each run. The results were identical for each frame on every run, meaning the program could precisely return results to an accuracy of 1°. The program could plot nodal points and inclination lines for each frame, demonstrating its ability to acquire data over time. Figure 17 shows the angle movement of a single video clip.

4.5 Accuracy, precision and recall scores validation of model performance

Further, SVBM's results were validated for classification model performance metrics such as accuracy, precision, and recall scores (Figure 18). These performance metrics are commonly used to evaluate the performance of a classification model (program coding).

  1. The accuracy score is a performance metric that measures the overall accuracy of a classification model. It is calculated as the proportion of correct predictions (true positives and negatives) out of the total number of predictions.

  2. The precision score is a performance metric that measures the proportion of true positive predictions among all positive predictions. It is calculated as the number of true positives divided by the sum of true and false positives.

  3. Recall score is a performance metric that measures the proportion of true positive predictions among all actual positive cases. It is calculated as the number of true positives divided by the sum of true positives and false negatives.

Accuracy, precision and recall scores were plotted against the true positive rate (TPR) and false positive rate (FPR).

  1. TPR, also known as sensitivity, is the proportion of actual positive cases correctly identified as positive by a classification model. It is calculated as the number of true positives divided by the sum of true positives and false negatives.

  2. FPR is the proportion of actual negative cases incorrectly identified as positive by a classification model. It is calculated as the number of false positives divided by the sum of false positives and true negatives.

TPR and FPR are often used to create a Receiver Operating Characteristic (ROC) curve, a graphical representation of the trade-off between TPR and FPR for different thresholds of a classification model.
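These are standard classification metrics; the minimal scikit-learn sketch below shows how they can be computed, using hypothetical labels and scores in place of the paper's validation data.

```python
import numpy as np
from sklearn.metrics import (accuracy_score, precision_score,
                             recall_score, roc_curve)

# Hypothetical ground-truth labels (1 = abnormal pose) and model scores;
# placeholders for the actual validation data, which the paper does not list.
y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])
y_score = np.array([0.9, 0.2, 0.7, 0.6, 0.4, 0.1, 0.8, 0.3])
y_pred = (y_score >= 0.5).astype(int)        # classify at a 0.5 threshold

acc = accuracy_score(y_true, y_pred)         # (TP + TN) / all predictions
prec = precision_score(y_true, y_pred)       # TP / (TP + FP)
rec = recall_score(y_true, y_pred)           # TP / (TP + FN), i.e. the TPR

# ROC curve: TPR vs FPR across all score thresholds.
fpr, tpr, thresholds = roc_curve(y_true, y_score)
```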

5. Discussion

WMSD in workers caused by repetitive tasks and lifting heavy objects is a significant health and safety concern across all countries reviewed (New Zealand, Australia, the United States, China and the United Kingdom). The primary responsibility for employee care lies with the employer, with health and safety regulators providing guidelines on mitigating injury from manual handling. However, Poitras et al. (2019) state that the current techniques used for risk assessment, such as questionnaires and guideline handbooks, are subjective. Subjective analysis gives varied and generalised results over time. Since the manual lifting related to WMSD is individualistic, the risk assessment must be individual, objective and reliable (Singh et al., 2014). WT, such as exoskeletons and IMUs, offers employees and workers personalised quantitative data and assistance to reduce the risk of injury from lifting heavy objects. Irrespective of research and statistics showing the benefits of WT, there is still a lack of adoption within the industry due to inconvenience, trust and privacy concerns.

This research aimed to eliminate the inconvenience caused by attachments to the body using SVBM. Considering the potential benefits of SVBM and the advancement of digital security and reliability, trust and privacy concerns will reduce over time; however, they currently remain at large. With the SVBM, there is potentially an individualistic constant watch, although the potential benefits of SVBM could offset this concern. Unlike WT, SVBM requires no attachments to the body, thus reducing technology-related H&S concerns to a greater extent. Since the SVBM is packaged as a mobile phone application, workers can self-video-record their manual lifting and do self-assessments, which is impossible with manual training, WT and lab-based technologies.

SVBM offers quantitative data comparable to WT. Lab-based experiments generate more accurate data, but their prime disadvantage is that they are not conducted in a work environment, which is primarily dynamic in the construction sector. The SVBM, using a heat map and segmentation, offers quantitative data in real work environments comparable to lab-based results. Unlike WT and lab-based technologies, which use close-to-body sensors, the vision can be captured using mobile and installed cameras from close and long ranges. Unlike lab-based vision technologies that use multiple cameras installed at specific angles, SVBM offers single-camera analysis and results from mobile and installed cameras. The flexibility of placing the camera at any angle is available with SVBM, unlike lab-based technologies. Similar to lab-based technologies, however, the results are provided both live and through post-processing.

Though numerous articles combine computer vision with human pose estimation, the literature survey identified a lack of real-time human pose deduction using single-camera mobile or embedded devices that can be used on construction sites. The SVBM is a real-time neural network vision-based system for efficient human pose estimation that uses a single-camera mobile application and is cost-effective and accurate. The proposed SVBM application is adoptable on mobile and camera-embedded devices, which can be used at workplaces for real-time human pose analysis. SVBM, an AI program, correlates the angles, neckline and torso line using computer vision (recorded and live videos using mobile and embedded cameras) to aid manual lifting human pose deduction, analysis and training in the construction sector. Unlike other vision systems that use high-quality images, SVBM can analyse low-intensity images and display angles to an accuracy of 1° in real time; the accuracy of the angle can be improved if required. The existing computer vision-based analyses use up to 17 nodal points to calculate the human pose.

In contrast, SVBM gathers 33 critical nodal point data of the human body in real work situations and calculates the body part angles with respect to the x-axis and y-axis and the difference in angles over a period of time. This provides greater analysis accuracy and reliability. The survey also revealed that existing computer vision-based applications do not consider the combination of angles, neckline and torso line for manual handling pose deduction and analysis in an actual construction work environment. With SVBM, video capturing can be done in most work environments, provided the vision is not blocked. WT can be disturbed at times by the type of sensors used; for example, infrared sensors need a clear transmission path, and RFID has a limited range over which data can be transmitted. The SVBM, which isolates the human through frame segmentation and heatmaps, can handle most work-related backgrounds. However, it also has a range limitation; the data interpretation is more accurate at closer range. Previously, mobile cameras' advanced motion-capture technology was used to make ergonomic assessments to reduce musculoskeletal disorders through post-lab processing (Schulz, 2021). This research is novel because it provides quantitative data capturing, real-time analysis and visual capturing in actual work environments. This research demonstrated that, by combining the BlazePose and OpenCV platforms, more nodal points can be added.

Further, using heat maps, segmentation and model calculations, spinal stretch and contraction can be deduced and added in future. The ease of data capturing using the mobile allows actual data to be compared frequently for long-term effect analysis. The SVBM aids frequent real-life recording through mobile or camera-embedded devices and record keeping that can be used for training and treatment. Furthermore, SVBM video analysis can be used for monitoring recovery, as well as monitoring vulnerable workplace jobs and affected people in real environments. The SVBM considers all aspects of NIOSH's Recommended Weight Limit (RWL) equation except the weight lifted (Choi et al., 2012; Singh et al., 2014; VelocityEHS, 2020). Further, like WT and lab-based technologies, the SVBM does not consider psychological factors (Khalaf et al., 2021) or operator and environmental variables, as stated by Drury and Pfeil (1975), as these factors are qualitative and subjective.

6. Conclusion

WMSD in workers caused by repetitive tasks and lifting heavy objects is a significant health and safety concern in New Zealand, Australia, the United States, China and the United Kingdom. The primary health and safety responsibility lies with the employer, with regulators providing guidelines to mitigate injury from manual handling. Researchers in the past have used various techniques for risk assessment, such as questionnaires and providing training and guideline handbooks, that are subjective. In the recent digital era, WT, such as exoskeletons and IMUs, offers employees and workers personalised qualitative data and assistance to reduce the risk of injury from lifting heavy objects. To a reasonable extent, WT is used for training and pose correction during manual lifting in laboratories and workplaces. Irrespective of its benefits, WT lacks adoption within the industry due to attachment to the body, trust and privacy concerns. Attachments hinder operations, and the human pose must be adjusted to accommodate the extensions.

Moreover, with attachments, workers might find it difficult to work all day. In this research, using the novel SVBM and with the ease of using the application on mobile devices, the authors established that attachment to the body can be made redundant for training and pose correction during manual lifting, which reduces the health and safety risks of WT attachments. Further, the SVBM research highlights the use of commonly available computer vision-based systems, such as mobile cameras, and AI-based applications. Furthermore, the SVBM's real-time and offline analysis capabilities, low-intensity vision compatibility and background isolation method are also discussed in this article, ensuring its use in natural work-life environments. SVBM gathers 33 critical nodal point data of the human body in real work situations and calculates the body part angles with respect to the x-axis and y-axis, and the difference in angles over a period of time. The novelty of including the combination of angles, neckline and torso line for manual handling pose deduction and analysis in an actual construction work environment, which existing computer vision-based applications do not consider, enables real-time analysis that is more accurate and reliable. The offline analysis additionally yields the change in hip-shoulder distance, which can be used to calculate the arc of the spinal cord. Since the measurements are based on an individual's real work-life video, the results are customised to that individual. This would help measure the performance of the individual over a period and provide information on the change in pose pattern over the long run, which can be used for diagnosis, training and prescribing recovery.
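As an illustration of the hip-shoulder measure mentioned above, the short Python sketch below (a simplification under stated assumptions, not the authors' implementation) computes the distance between the shoulder and hip midpoints from the per-frame landmarks extracted earlier; a decreasing distance across frames indicates increasing trunk flexion and hence a larger spinal arc.

import numpy as np

def hip_shoulder_distance(lm):
    # Distance between shoulder and hip midpoints in normalised image coordinates;
    # in the 33-point model, indices 11/12 are the shoulders and 23/24 the hips
    shoulder_mid = np.mean([[lm[11].x, lm[11].y], [lm[12].x, lm[12].y]], axis=0)
    hip_mid = np.mean([[lm[23].x, lm[23].y], [lm[24].x, lm[24].y]], axis=0)
    return float(np.linalg.norm(shoulder_mid - hip_mid))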

In this paper, we have demonstrated the feasibility of SVBM for worker pose detection and live measurement of operations in real work-life situations using single-camera mobile and embedded devices. The SVBM can provide individualistic data useful for analysing an individual's health in relation to repetitive tasks. The practical uses of SVBM include training, pose estimation, pose variation analysis and posture analysis in actual work environments in real time. The theoretical implications include mimicking the human pose and lab-based analysis without attaching sensors that naturally alter working poses; this would help researchers develop more accurate data and theoretical models close to actuals. The critical limitation, as with WT, is that trust, privacy and psychological issues are not addressed in SVBM, which is acknowledged. However, the benefits of SVBM naturally offset this limitation, supporting its practical adoption. Future research could focus on adding more nodal points along the spinal cord to obtain a direct output on overstretch or contraction.

In conclusion, SVBM has the advantage of capturing the required data without interrupting everyday working styles in natural work-life settings. With no wearable items, workers can perform activities while data are captured. Instant analysis results can be obtained through the mobile application. SVBM also supports the analysis of recorded clips; high-resolution and high-zoom cameras can capture video from a distance, and detailed analysis is possible offline.

Figures

Figure 1. The method
Figure 2. The 33 pose landmarks of the human body (adopted from MediaPipe Pose)
Figure 3. The dataset of the human pose of lifting one and two boxes
Figure 4. The process flow of human landmark detection
Figure 5. The workflow of the calculation of the angle
Figure 6. Nodal point tracking network architecture
Figure 7. The angle using three nodal points
Figure 8. The measurement of neckline and torso inclination
Figure 9. The CNN classifier for human body pose detection and angle calculation
Figure 10. Captured video frames, real-time superimposed nodal points and offline analysis
Figure 11. Nodal points and angle variation
Figure 12. Low light intensity image and angle display
Figure 13. Analysis of HMDB dataset 1
Figure 14. Analysis of HMDB dataset 2
Figure 15. The analysis of human body angles of the experimental dataset
Figure 16. The analysis of human body angles of the HMDB dataset
Figure 17. Nodal points movement
Figure 18. Accuracy, precision and recall scores

This work was supported by the Auckland University of Technology.

References

ACC (2023), “Work injury statistics”, ACC, available at: https://www.acc.co.nz/newsroom/media-resources/work-injury-statistics/ (accessed 21 January 2023).

Aghazadeh, F., Arjmand, N. and Nasrabadi, A.M. (2020), “Coupled artificial neural networks to estimate 3D whole-body posture, lumbosacral moments, and spinal loads during load-handling activities”, Journal of Biomechanics, Vol. 102, 109332, doi: 10.1016/j.jbiomech.2019.109332.

Akhmad, S., Arendra, A., Findiastuti, W., Lumintu, I. and Pramudita, Y.D. (2020), “Wearable IMU wireless sensors network for smart instrument of ergonomic risk assessment”, in 2020 6th Information Technology International Seminar (ITIS), IEEE, Surabaya, 14-16 October 2020, pp. 213-218, doi: 10.1109/ITIS50118.2020.9321084.

AmTrust (n.d), “Exoskeleton and exosuits in the workplace”, available at: https://amtrustfinancial.com/blog/loss-control/exoskeleton-and-exosuits-in-the-workplace#:∼:text=Major%20corporations%20such%20as%20Toyota,in%20the%20groups%20using%20exoskeleton (accessed 12 August 2022).

Antwi-Afari, M.F. and Li, H. (2018), “Fall risk assessment of construction workers based on biomechanical gait stability parameters using wearable insole pressure system”, Advanced Engineering Informatics, Vol. 38, pp. 683-694, doi: 10.1016/j.aei.2018.10.002.

Bazarevsky, V., Grishchenko, I., Raveendran, K., Zhu, T., Zhang, F. and Grundmann, M. (2020), “Blazepose: on-device real-time body pose tracking”, arXiv preprint arXiv:2006.10204, doi: 10.48550/arXiv.2006.10204.

Bionics, E. (2020), How Exoskeletons Impact the Workplace, Ekso Bionics, San Rafael, CA, USA, available at: https://eksobionics.com/how-exoskeletons-impact-the-workplace/ (accessed 12 August 2022).

Centers for Disease Control and Prevention [CDC] (2020), “Work-related musculoskeletal disorders and ergonomics”, available at: https://www.cdc.gov/workplacehealthpromotion/health-strategies/musculoskeletal-disorders/index.html#:∼:text=Musculoskeletal%20disorders%20(MSD)%20are%20injuries,to%20the%20condition%3B%20and%2For (accessed 12 August 2022).

Chen, S., Bangaru, S.S., Yigit, T., Trkov, M., Wang, C. and Yi, J. (2021), “Real-time walking gait estimation for construction workers using a single wearable inertial measurement unit (IMU)”, in 2021 IEEE/ASME International Conference on Advanced Intelligent Mechatronics (AIM), Delft, Netherlands, 12-16 July 2021, IEEE, pp. 753-758, doi: 10.1109/AIM46487.2021.9517592.

Choi, S.D., Borchardt, J. and Proksch, T. (2012), “Translating academic research on manual lifting tasks observations into construction workplace good practices”, Journal of Safety, Health and Environmental Research, Vol. 8 No. 1, pp. 3-10, available at: https://web-s-ebscohost-com.ezproxy.aut.ac.nz/ehost/pdfviewer/pdfviewer?vid=3&sid=da62f4b0-d4b6-4e10-b7d9-ab6939adccba%40redis (accessed 12 August 2022).

Chubb (n.d.), “4 technologies to improve workplace safety”, available at: https://www.chubb.com/us-en/businesses/resources/4-technologies-to-improve-workplace-safety.html (accessed 12 August 2022).

Cronin, N.J. (2021), “Using deep neural networks for kinematic analysis: challenges and opportunities”, Journal of Biomechanics, Vol. 123, 110460, doi: 10.1016/j.jbiomech.2021.110460.

Dang, Q., Yin, J., Wang, B. and Zheng, W. (2019), “Deep learning based 2D human pose estimation: a survey”, Tsinghua Science and Technology, Vol. 24 No. 6, pp. 663-676, doi: 10.26599/TST.2018.9010100.

Daniels, N. and Dustin, C. (2022), “World Economic Forum 2022 - empowering the next-generation manufacturing workforce through AR innovation”, available at: https://www.weforum.org/agenda/2022/05/ar-manufacturing-next-generation-workforce/#:∼:text=Using%20AR%20in%20manufacturing%20enables,productivity%20and%20time%20to%20resolution (accessed 12 August 2022).

DorsaVi (n.d), “ViSafe”, available at: https://www.dorsavi.com/visafe/ (accessed 12 August 2022).

Drury, C.G. and Pfeil, R.E. (1975), “A task-based model of manual lifting performance”, The International Journal of Production Research, Vol. 13 No. 2, pp. 137-148.

Eshghi, B. (2022), “Top 6 Use cases of IoT in manufacturing”, available at: https://research.aimultiple.com/iot-manufacturing/#6-workforce-efficiency (accessed 12 August 2022).

Getac (2021), “How technology helps increase workplace safety in heavy industries”, available at: https://www.getac.com/intl/blog/safety-technologies-for-heavy-industries/#h-the-most-effective-technology-solutions-for-increasing-workplace-safety (accessed 12 August 2022).

Gleadhill, S. (2019), Validating New Wearable Technology Methods to Semi-automate Biomechanical Models for Primary Prevention of Low Back Disorders in the Workplace, Doctoral dissertation, Charles Darwin University, Haymarket, NSW, Australia.

Hellsten, T., Karlsson, J., Shamsuzzaman, M. and Pulkkis, G. (2021), “The potential of computer vision-based marker-less human motion analysis for rehabilitation”, Rehabilitation Process and Outcome, Vol. 10, 11795727211022330, doi: 10.1177/11795727211022330.

Horton, J., Cameron, A., Devaraj, D., Hanson, R.T. and Hajkowicz, S.A. (2018), Workplace Safety Futures: The Impact of Emerging Technologies and Platforms on Work Health and Safety and Workers' Compensation over the Next 20 Years, CSIRO, Canberra, ACT, available at: https://scholar.google.com/scholar?inst=4292032217678865036&q=Horton%2C+J.%2C+Cameron%2C+A.%2C+Devaraj%2C+D.%2C+Hanson%2C+R.+T.%2C+%26+Hajkowicz%2C+S.+A.+%282018%29.+Workplace+safety+futures%3A+the+impact+of+emerging+technologies+and+platforms+on+work+health+and+safety+and+workers%27+compensation+ver+the+next+20+years.+Canberra%2C+ACT%2C+Australia%3A+CSIRO (accessed 12 August 2022).

Huang, C.C. and Nguyen, M.H. (2019), “Robust 3D skeleton tracking based on OpenPose and a probabilistic tracking framework”, 2019 IEEE International Conference on Systems, Man and Cybernetics (SMC), Bari, doi: 10.1109/SMC.2019.8913977.

IBM (n.d), “Convolutional Neural Networks: what are convolutional neural networks?”, available at: https://www.ibm.com/topics/convolutional-neural-networks#:∼:text=The%20convolutional%20layer%20is%20the%20core%20building%20block%20of%20a,matrix%20of%20pixels%20in%203D (accessed 12 August 2022).

Jain, A., Tompson, J., LeCun, Y. and Bregler, C. (2015), “MoDeep: a deep learning framework using motion features for human pose estimation”, in Cremers, D., Reid, I., Saito, H. and Yang, M.-H. (Eds), Computer Vision – ACCV 2014: 12th Asian Conference on Computer Vision, Singapore, November 1-5, 2014, Revised Selected Papers, Part II, Springer, Cham, doi: 10.1007/978-3-319-16808-1_21.

Jung, S., Su, B., Wang, H., Lu, L., Xie, Z., Xu, X. and Fitts, E.P. (2022), “A computer vision-based lifting task recognition method”, Proceedings of the Human Factors and Ergonomics Society Annual Meeting, Vol. 66 No. 1, pp. 1210-1214, doi: 10.1177/1071181322661507.

Kalia, T. (2017), 6 Key Challenges of Wearable Product Development, Medium, London, UK, available at: https://outdesign.medium.com/6-key-challenges-of-wearable-product-development-49717d88c684 (accessed 12 August 2022).

Karakhan, A., Xu, Y., Nnaji, C. and Alsaffar, O. (2019), “Technology alternatives for workplace safety risk mitigation in construction: exploratory study”, in Mutis, I. and Hartmann, T. (Eds), Advances in Informatics and Computing in Civil and Construction Engineering, Springer, Cham, doi: 10.1007/978-3-030-00220-6_99.

Khalaf, T.M., Ramadan, M.Z., Ragab, A.E., Alhaag, M.H. and AlSharabi, K.A. (2021), “Psychophysiological responses to manual lifting of unknown loads”, Plos One, Vol. 16 No. 2, e0247442, doi: 10.1371/journal.pone.0247442.

Kuehne, H., Jhuang, H., Garrote, E., Poggio, T. and Serre, T. (2011), “HMDB: a large video database for human motion recognition”, 2011 International Conference on Computer Vision, Barcelona, pp. 2556-2563, doi: 10.1109/ICCV.2011.6126543.

Kulkarni, S., Deshmukh, S., Fernandes, F., Patil, A. and Jabade, V. (2023), “PoseAnalyser: a survey on human pose estimation”, SN Computer Science, Vol. 4 No. 2, p. 136, doi: 10.1007/s42979-022-01567-2.

Kumar, A.S. and Iyer, E. (2019), “An industrial IoT in engineering and manufacturing industries—benefits and challenges”, International Journal of Mechanical and Production Engineering Research and Development (IJMPERD), Vol. 9 No. 2, pp. 151-160, available at: https://www.researchgate.net/profile/Senthil-Kumar-Arumugam-2/publication/336216692_An_Industrial_IOT_in_Engineering_and_Manufacturing_Industries_-_Benefits_and_Challenges/links/5d94a462458515202b7c0557/An-Industrial-IOT-in-Engineering-and-Manufacturing-Industries-Benefits-and-Challenges.pdf (accessed 12 August 2022).

Lan, G., Wu, Y., Hu, F. and Hao, Q. (2023), “Vision-based human pose estimation via deep learning: a survey”, IEEE Transactions on Human-Machine Systems, Vol. 53 No. 1, pp. 253-268, doi: 10.1109/THMS.2022.3219242.

Lee, Y., Liu, X., Gummeson, J. and Lee, S.I. (2019), “A wearable RFID system to monitor hand use for individuals with upper limb paresis”, in 2019 IEEE 16th International Conference on Wearable and Implantable Body Sensor Networks (BSN), Chicago, IL, pp. 1-4, doi: 10.1109/BSN.2019.8771099.

Liu, M., Han, S. and Lee, S. (2017), “Potential of convolutional neural network-based 2D human pose estimation for onsite activity analysis of construction workers”, in Computing in Civil Engineering 2017: Proceedings of the ASCE International Workshop on Computing in Civil Engineering, Seattle, Washington, USA, June 25-27, 2017, pp. 141-149, doi: 10.1061/9780784480847.018.

Luo, H., Wang, M., Wong, P. K.-Y. and Cheng, J.C.P. (2020), “Full body pose estimation of construction equipment using computer vision and deep learning techniques”, Automation in Construction, Vol. 110, 103016, doi: 10.1016/j.autcon.2019.103016.

Maarouf, H. (2019), “Pragmatism as a supportive paradigm for the mixed research approach: conceptualising the ontological, epistemological, and axiological stances of pragmatism”, International Business Research, Vol. 12 No. 9, pp. 1-12, doi: 10.5539/ibr.v12n9p1.

McDevitt, S., Hernandez, H., Hicks, J., Lowell, R., Bentahaikt, H., Burch, R., Ball, J., Chander, H., Freeman, C., Taylor, C. and Anderson, B. (2022), “Wearables for biomechanical performance optimization and risk assessment in industrial and sports applications”, Bioengineering, Vol. 9 No. 1, p. 33, doi: 10.3390/bioengineering9010033.

Mehrizi, R., Peng, X., Xu, X., Zhang, S., Metaxas, D. and Li, K. (2018), “A computer vision based method for 3D posture estimation of symmetrical lifting”, Journal of Biomechanics, Vol. 69, pp. 40-46, doi: 10.1016/j.jbiomech.2018.01.012.

Moeslund, T.B., Hilton, A. and Krüger, V. (2006), “A survey of advances in vision-based human motion capture and analysis”, Computer Vision and Image Understanding, Vol. 104 No. 2, pp. 90-126, doi: 10.1016/j.cviu.2006.08.002.

Navarra, K. (2022), “New uses for wearable devices in the workplace”, available at: https://www.shrm.org/resourcesandtools/hr-topics/technology/pages/new-uses-wearable-devices-in-the-workplace.aspx (accessed 12 August 2022).

Okpala, I., Nnaji, C., Ogunseiju, O. and Akanmu, A. (2022), “Assessing the role of wearable robotics in the construction industry: potential safety benefits, opportunities, and implementation barriers”, in Jebelli, H., Habibnezhad, M., Shayesteh, S., Asadi, S. and Lee, S. (Eds), Automation and Robotics in the Architecture, Engineering, and Construction Industry, Springer, Cham, pp. 165-180. doi: 10.1007/978-3-030-77163-8_8.

Ordr (n.d), IoT in Manufacturing, How OT/IT Convergence Is Changing The Industry, Santa Clara, CA, available at: https://ordr.net/article/iot-in-manufacturing/ (accessed 12 August 2022).

Ohio University (2020), How Wearable Tech is Transforming a Coach's Decision-Making. OHIO Online, Ohio, USA, available at: https://onlinemasters.ohio.edu/blog/how-wearable-tech-is-transforming-a-coachs-decision-making/#:~:text=These%20new%20technologies%20are%20also,and%20recovery%20after%20an%20injury (accessed 30 January 2023).

Pang, G., Shen, C., Cao, L. and Hengel, A.V.D. (2021), “Deep learning for anomaly detection: a review”, ACM Computing Surveys (CSUR), Vol. 54 No. 2, pp. 1-38, doi: 10.1145/3439950.

Phaneuf, A. (2022), “Latest trends in medical monitoring devices and wearable health technology”, Insider Intelligence, New York, NY, USA, available at: https://www.insiderintelligence.com/insights/wearable-technology-healthcare-medical-devices/#:∼:text=wearable%20healthcare%20technology%3F-,Wearable%20technology%20in%20healthcare%20includes%20electronic%20devices%20that%20consumers%20can,healthcare%20professional%20in%20real%20time (accessed 30 January 2023).

Poitras, I., Bielmann, M., Campeau-Lecours, A., Mercier, C., Bouyer, L.J. and Roy, J.S. (2019), “Validity of wearable sensors at the shoulder joint: combining wireless electromyography sensors and inertial measurement units to perform physical workplace assessments”, Sensors, Vol. 19 No. 8, p. 1885, doi: 10.3390/s19081885.

Poppe, R. (2010), “A survey on vision-based human action recognition”, Image and Vision Computing, Vol. 28 No 6, pp. 976-990, doi: 10.1016/j.imavis.2009.11.014.

Roberts, D., Torres Calderon, W., Tang, S. and Golparvar-Fard, M. (2020), “Vision-based construction worker activity analysis informed by body posture”, Journal of Computing in Civil Engineering, Vol. 34 No. 4, 04020017, doi: 10.1061/(ASCE)CP.1943-5487.0000898.

Rogez, G., Rihan, J., Ramalingam, S., Orrite, C. and Torr, P.H. (2008), “Randomised trees for human pose detection”, in 23-28 June 2008 Anchorage, AK, USA IEEE Conference on Computer Vision and Pattern Recognition, IEEE, pp. 1-8, doi: 10.1109/CVPR.2008.4587617.

Safety Champion (2021), “Top health and safety tech trends in the manufacturing and industrial sector”, available at: https://www.safetychampion.com.au/safety-tech-trends-manufacturing/ (accessed 12 August 2022).

Schulz, M. (2021), “Mobile EHS technology tackles complex challenges and encourages more participation”, available at: https://www.ishn.com/articles/113174-mobile-ehs-technology-tackles-complex-challenges-and-encourages-more-participation (accessed 12 August 2022).

Singh, R.P., Batish, A. and Singh, T.P. (2014), “Determining safe limits for significant task parameters during manual lifting”, Workplace Health and Safety, Vol. 62 No. 4, pp. 150-160, available at: https://journals.sagepub.com/doi/pdf/10.1177/216507991406200404 (accessed 12 August 2022).

Smith, J.E. (1978), Purpose and Thought: The Meaning of Pragmatism, Yale University Press, New Haven, CT, ISBN 0300021712.

Snyder, K., Thomas, B., Lu, M.-L., Jha, R., Barim, M.S., Hayden, M. and Werren, D. (2021), “A deep learning approach for lower back-pain risk prediction during manual lifting”, PLoS One, Vol. 16 No. 2, e0247162, doi: 10.1371/journal.pone.0247162.

Taddei, P., Sánchez, C., Rodríguez, A.L., Ceriani, S. and Sequeira, V. (2014), “Detecting ambiguity in localisation problems using depth sensors”, IEEE, 2014 2nd International Conference on 3D Vision, Vol. 2, pp. 129-136, doi: 10.1109/3DV.2014.44.

United States Bone and Joint Initiative USBJI (n.d.), “Musculoskeletal diseases”, available at: https://www.boneandjointburden.org/#:∼:text=Trauma%2C%20back%20pain%2C%20and%20arthritis,and%20hospitals%20occur%20each%20year (accessed 12 August 2022).

VelocityEHS (2020), “A how-to guide: the NIOSH lifting equation”, available at: https://www.ehs.com/2020/03/a-how-to-guide-the-niosh-lifting-equation/#:∼:text=RWL%20%3D%20LC%20x%20HM%20x,FM%20x%20AM%20x%20CM&text=Calculating%20the%20RWL%20using%20this,the%20RWL%20to%20prevent%20injuries (accessed 12 August 2022).

Wang, M.J., Huang, G.J., Yeh, W.Y. and Lee, C.L. (1996), “Manual lifting task risk evaluation using computer vision system”, Computers and Industrial Engineering, Vol. 31 No. 3, pp. 657-660, doi: 10.1016/S0360-8352(96)00254-9.

Wei, W., Kurita, K., Kuang, J. and Gao, A. (2021), “Real-time 3D arm motion tracking using the 6-axis IMU sensor of a smartwatch”, 2021 IEEE 17th International Conference on Wearable and Implantable Body Sensor Networks (BSN), July 2021, doi: 10.1109/BSN51625.2021.9507012.

WorkSafe NZ (2022), Injuries Resulting in More than a Week Away from Work, WorkSafe New Zealand, Wellington, New Zealand, available at: https://data.worksafe.govt.nz/graph/summary/injuries_week_away (accessed 12 August 2022).

WorkSafe NZ (2019a), “New Zealand health and safety at work strategy Outcomes dashboard”, WorkSafe New Zealand, Wellington, New Zealand, available at: https://www.worksafe.govt.nz/dmsdocument/30158-new-zealand-health-and-safety-at-work-strategy-outcomes-dashboard/latest (accessed 12 August 2022).

WorkSafe NZ (2019b), Use of New Technology, WorkSafe New Zealand, Wellington, New Zealand, available at: https://www.worksafe.govt.nz/laws-and-regulations/operational-policy-framework/worksafe-positions/use-of-new-technology/ (accessed 12 August 2022).

Xu, M., Guo, L. and Wu, H.C. (2023), “Robust abnormal human-posture recognition using OpenPose and Multiview cross-information”, IEEE Sensors Journal, Vol. 23 No. 11, pp. 12370-12379, doi: 10.1109/JSEN.2023.3267300.

Yamauchi, M. and Iwamoto, K. (2010), “Combination of optical shape measurement and augmented reality for task support: i. Accuracy of position and pose detection by ARToolKit”, Optical Review, Vol. 17, pp. 263-268, doi: 10.1007/s10043-010-0046-z.

Yasar, K. (2022), “Wearable technology”, available at: https://www.techtarget.com/searchmobilecomputing/definition/wearable-technology (accessed 12 August 2022).

Zelik, K. (2021), “A weighty proposition: exoskeletons in the workplace”, available at: https://www.ehstoday.com/safety-technology/article/21154410/a-weighty-proposition-exoskeletons-in-the-workplace (accessed 12 August 2022).

Zheng, C., Wu, W., Chen, C., Yang, T., Zhu, S., Shen, J., Kehtarnavaz, N. and Shah, M. (2020), “Deep learning-based human pose estimation: a survey”, arXiv preprint arXiv:2012.13392, doi: 10.48550/arXiv.2012.13392.

Further reading

Antwi-Afari, M.F., Li, H., Edwards, D.J., Pärn, E.A., Seo, J. and Wong, A. (2017), “Effects of different weights and lifting postures on balance control following repetitive lifting tasks in construction workers”, International Journal of Building Pathology and Adaptation, Vol. 35 No. 3, pp. 247-263, doi: 10.1108/IJBPA-05-2017-0025.

Centers for Disease Control and Prevention [CDC] (2022), “About NIOSH”, available at: https://www.cdc.gov/niosh/docs/94-110/ (accessed 12 August 2022).

Cippitelli, E., Fioranelli, F., Gambi, E. and Spinsante, S. (2017), “Radar and RGB-depth sensors for fall detection: a review”, IEEE Sensors Journal, Vol. 17 No. 12, pp. 3585-3604, doi: 10.1109/JSEN.2017.2697077.

Niswander, W., Wang, W. and Kontson, K. (2020), “Optimisation of IMU sensor placement for the measurement of lower limb joint kinematics”, Sensors, Vol. 20 No. 21, p. 5993, doi: 10.3390/s20215993.

Corresponding author

Mahesh Babu Purushothaman is the corresponding author and can be contacted at: mahesh.babu@aut.ac.nz

About the authors

Mahesh Babu Purushothaman has a doctorate from AUT and is a lecturer in management in Built Environment Engineering in the School of Future Environments. Previously, Mahesh managed lean manufacturing units for 25 years. He is experienced in the fields of manufacturing, procurement, warehousing, marketing, demand planning and change management. He has implemented projects that combined digital and information technology integration to reduce stress in the manufacturing environment. He has executed projects in the USA, Europe, Australia, China, Southern Asia and Dubai. Mahesh has designed process architecture and was involved in developing IT projects for facilities, manufacturing and SCM. He is a lecturer at the Auckland University of Technology with research interests in the human factors influencing lean manufacturing systems and waste reduction.

Kasun Moolika Gedara is a PhD student at Auckland University of Technology, Auckland, New Zealand, with a research interest in information technology. He researches computer vision-based programming and AI.
