Quality estimation (QE) of machine translation (MT) aims to evaluate the quality of machine-translated sentences without references, and is important in practical applications of MT. Training QE models requires massive parallel data with hand-crafted quality annotations, which are time-consuming and labor-intensive to obtain. To address the absence of annotated training data, previous studies have attempted to develop unsupervised QE methods. However, very few of them can be applied to both sentence- and word-level tasks, and they may suffer from noise in the synthetic data. To reduce the negative impact of this noise, we propose a self-supervised method for both sentence- and word-level QE, which performs quality estimation by recovering masked target words. Experimental results show that our method outperforms previous unsupervised methods on several QE tasks across different language pairs and domains.
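The masked-word-recovery idea can be illustrated with a minimal sketch (an assumption for illustration only, not the paper's implementation): mask each target word in turn, ask a predictor to fill it in from the source and the remaining target context, and compare the prediction with the actual word. Unrecovered words are flagged BAD at the word level, and the sentence-level score is the fraction of recovered words. Here `toy_predict` is a hypothetical stand-in for a cross-lingual masked language model.

```python
def toy_predict(source_tokens, masked_target, mask_pos):
    """Hypothetical predictor for the masked target word.
    A real system would query a cross-lingual masked LM here;
    this toy uses a tiny position-aligned lexicon instead."""
    lexicon = {"Katze": "cat", "Hund": "dog", "Haus": "house"}
    if mask_pos < len(source_tokens):
        return lexicon.get(source_tokens[mask_pos], "<unk>")
    return "<unk>"

def estimate_quality(source_tokens, target_tokens):
    """Word-level OK/BAD labels and a sentence-level score
    via masked-word recovery."""
    word_labels = []
    for i, word in enumerate(target_tokens):
        # Mask the i-th target word and try to recover it.
        masked = target_tokens[:i] + ["<mask>"] + target_tokens[i + 1:]
        guess = toy_predict(source_tokens, masked, i)
        word_labels.append("OK" if guess == word else "BAD")
    # Sentence score: fraction of words the predictor recovered.
    sentence_score = word_labels.count("OK") / len(word_labels)
    return word_labels, sentence_score

labels, score = estimate_quality(["Katze", "Hund"], ["cat", "cow"])
# "cow" is not recovered, so it is flagged BAD and lowers the score
```

No quality annotations are needed: the supervision signal comes entirely from the target sentence itself, which is what makes the approach self-supervised.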