Markerless human motion capture has received much attention in the computer vision and computer graphics communities. A hierarchical skeleton template is frequently used to model the human body in the literature, because it reduces markerless human motion capture to the problem of estimating the human body shape and joint angle parameters. The proposed work establishes a skeleton-based markerless human motion capture framework comprising (1) an improved deformation skin model that is suitable for markerless motion capture while remaining compliant with the computer animation standard, (2) image segmentation using Gaussian mixture static background subtraction, and (3) nonlinear dynamic temporal tracking with an annealed particle filter. This framework efficiently represents markerless human motion capture as an optimisation problem in the temporal domain and solves it with a classic optimisation scheme. Several experiments illustrate its robustness and accuracy in comparison with the existing approach.
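To make the tracking component more concrete, the following is a minimal sketch of one time step of an annealed particle filter on a toy problem, not the implementation used in this work. It assumes a single scalar joint angle, a hypothetical quadratic cost in place of the image-based error, and fixed annealing exponents (`betas`) and diffusion scales (`sigmas`) chosen only for illustration; all names are assumptions.

```python
import math
import random


def annealed_particle_filter_step(particles, cost, betas, sigmas, rng):
    """One time step of a toy annealed particle filter.

    particles: list of float states (here, one joint angle in radians)
    cost:      function mapping a state to a non-negative error
    betas:     annealing exponents, increasing from broad to sharp
    sigmas:    diffusion standard deviations, one per layer
    """
    for beta, sigma in zip(betas, sigmas):
        # Weight each particle by the annealed likelihood exp(-beta * cost).
        costs = [cost(p) for p in particles]
        min_cost = min(costs)  # shift by the minimum for numerical stability
        weights = [math.exp(-beta * (c - min_cost)) for c in costs]
        # Resample in proportion to the weights, then diffuse with Gaussian noise.
        resampled = rng.choices(particles, weights=weights, k=len(particles))
        particles = [p + rng.gauss(0.0, sigma) for p in resampled]

    # Weight the final particle set and return its weighted mean as the estimate.
    costs = [cost(p) for p in particles]
    min_cost = min(costs)
    weights = [math.exp(-betas[-1] * (c - min_cost)) for c in costs]
    total = sum(weights)
    estimate = sum(w * p for w, p in zip(weights, particles)) / total
    return estimate, particles


# Hypothetical usage: recover an assumed "true" joint angle of 0.8 rad.
rng = random.Random(0)
true_angle = 0.8
cost = lambda angle: (angle - true_angle) ** 2  # stand-in for an image-based error
initial = [rng.uniform(-math.pi, math.pi) for _ in range(200)]
estimate, final_particles = annealed_particle_filter_step(
    initial, cost, betas=[1, 4, 16, 64], sigmas=[0.4, 0.2, 0.1, 0.05], rng=rng
)
```

Each layer sharpens the weighting function while shrinking the diffusion noise, so the particle set moves from a broad search towards a concentrated estimate. In a full tracker the state would be the complete vector of shape and joint angle parameters, and the final particles would seed the next frame.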